[AINews] ALL of AI Engineering in One Place
This is AI News! an MVP of a service that goes through all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜
Deep IRL networks are all you need! Jun 25-27 in SF.
AI News for 5/21/2024-5/22/2024. We checked 7 subreddits, 384 Twitters and 29 Discords (380 channels, and 7699 messages) for you. Estimated reading time saved (at 200wpm): 805 minutes.
Lots of nontechnical news: the California Senate passed SB 1047, Vox published more explosive news on OpenAI employee contracts alongside safetyist resignations, and though Mistral v0.3 was released, there are no evals or blogpost to discuss yet.
Given it's a technically quiet day, we take the opportunity to share our announcements of the initial wave of AI Engineer World's Fair speakers!
TLDR: we're giving a one-time discount to AI News readers: CLICK HERE and enter AINEWS before EOD Friday :)
The AI Engineer World's Fair (Jun 25-27 in SF)
The first Summit was well reviewed and now the new format is 4x bigger, with booths and talks and workshops from:
- Top model labs (OpenAI, DeepMind, Anthropic, Mistral, Cohere, HuggingFace, Adept, Midjourney, Character.ai etc)
- All 3 Big Clouds (Microsoft Azure, Amazon AWS, Google Vertex)
- BigCos putting AI in production (Nvidia, Salesforce, Mastercard, Palo Alto Networks, AXA, Novartis, Discord, Twilio, Tinder, Khan Academy, Sourcegraph, MongoDB, Neo4j, Hasura etc)
- Disruptive startups setting the agenda (Modular aka Chris Lattner, Cognition aka Devin, Anysphere aka Cursor, Perplexity, Groq, Mozilla, Nous Research, Galileo, Unsloth etc)
- The top tools in the AI Engineer landscape (LangChain, LlamaIndex, Instructor, Weights & Biases, Lambda Labs, Neptune, Datastax, Crusoe, Covalent, Qdrant, Baseten, E2B, Octo AI, Gradient AI, LanceDB, Log10, Deepgram, Outlines, Unsloth, Crew AI, Factory AI and many many more)
across 9 tracks of talks: RAG, Multimodality, Evals/Ops (new!), Open Models (new!), CodeGen, GPUs (new!), Agents, AI in the Fortune 500 (new!), and, for the first time, a dedicated AI Leadership track for VPs of AI, plus 50+ workshops and expo sessions covering every AI engineering topic under the sun. Of course, the most important track is the unlisted one: the hallway track, which we are giving lots of love to but can't describe before it happens.
To celebrate the launch of the World's Fair, we're giving a one-time discount to AI News readers: CLICK HERE and enter AINEWS before EOD Friday :)
If the curation here/on Latent Space has the most cosine similarity with your interests, this conference was made for you. See you in SF June 25-27!
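For the record, cosine similarity between two vectors is just their dot product over the product of their magnitudes. A minimal stdlib-only sketch, with made-up "interest" vectors purely for illustration:

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical topic weights: [RAG, evals, GPUs, agents]
newsletter = [0.9, 0.8, 0.7, 0.9]
reader = [0.8, 0.9, 0.6, 1.0]
print(cosine_similarity(newsletter, reader))  # close to 1.0 = made for you
```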
The Table of Contents and Discord Summaries have been moved to the web version of this email!
AI Twitter Recap
All recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.
Anthropic's Interpretability Research on Claude 3 Sonnet
- Extracting Interpretable Features: @AnthropicAI used dictionary learning to extract millions of interpretable "features" from Claude 3 Sonnet's activations, corresponding to abstract concepts the model has learned. Many features are multilingual and multimodal.
- Feature Steering to Modify Behavior: @AnthropicAI found that intervening on these features during a forward pass ("feature steering") could reliably modify the model's behavior and outputs in interpretable ways related to the meaning of the feature.
- Safety-Relevant Features: @AnthropicAI identified many "safety-relevant" features corresponding to concerning capabilities or behaviors, like unsafe code, bias, dishonesty, power-seeking, and dangerous/criminal content. Activating these features could induce the model to exhibit those behaviors.
- Preliminary Work, More Research Needed: @AnthropicAI notes this work is preliminary, and while the features seem plausibly relevant to safety applications, much more work is needed to establish practical utility.
- Hiring for Interpretability Team: @AnthropicAI is hiring managers, research scientists, and research engineers for their interpretability team to further this work.
Microsoft's Phi-3 Models
- Phi-3 Small and Medium Released: @_philschmid announced Microsoft has released Phi-3 small (7B) and medium (14B) models under the MIT license, with instruct versions up to 128k context.
- Outperforming Mistral, Llama, GPT-3.5: @_philschmid claims Phi-3 small outperforms Mistral 7B and Llama 3 8B on benchmarks, while Phi-3 medium outperforms GPT-3.5 and Cohere Command R+.
- Training Details: @_philschmid notes the models were trained on 4.8 trillion tokens including synthetic and filtered public datasets with multilingual support, fine-tuned with SFT and DPO. No base models were released.
- Phi-3-Vision Model: Microsoft also released Phi-3-vision with 4.2B parameters, which @rohanpaul_ai notes outperforms larger models like Claude-3 Haiku and Gemini 1.0 Pro V on visual reasoning tasks.
- Benchmarks and Fine-Tuning: Many are eager to benchmark the Phi-3 models and potentially fine-tune them for applications, though @abacaj notes fine-tuning over a chat model can sometimes result in worse performance than the base model.
Perplexity AI Partners with TakoViz for Knowledge Search
- Advanced Knowledge Search with TakoViz: @perplexity_ai announced a partnership with TakoViz to bring advanced knowledge search and visualization to Perplexity users, allowing them to search, juxtapose and share authoritative knowledge cards.
- Authoritative Data Providers: @perplexity_ai notes TakoViz sources knowledge from authoritative data providers with a growing index spanning financial, economic and geopolitical data.
- Interactive Knowledge Cards: @AravSrinivas explains users can now prompt Perplexity to compare data like stock prices or lending over specific time periods, returning interactive knowledge cards.
- Expanding Beyond Summaries: @AravSrinivas says this allows Perplexity to go beyond just summaries and enable granular data queries across timelines, which is now possible from a single search bar.
- Passion for the Partnership: @AravSrinivas expresses his love for working with the TakoViz team and participating in their pre-seed round, noting their customer obsession and the value this integration will bring to Perplexity users.
Miscellaneous
- Karina Nguyen Joins OpenAI: @karinanguyen_ announced she has left Anthropic after 2 years to join OpenAI as a researcher, sharing lessons learned about AI progress, culture, and personal growth.
- Suno Raises $125M for AI Music: @suno_ai_ announced raising $125M to build AI that amplifies human creativity in music production, and is hiring music makers, music lovers and technologists.
- Yann LeCun on LLMs vs Next-Gen AI: @ylecun advises students interested in building next-gen AI systems to not work on LLMs, implying he is working on alternative approaches himself.
- Mistral AI Releases New Base and Instruct Models: @_philschmid shared that Mistral AI released new 7B base and instruct models with extended 32k vocab, function calling support, and Apache 2.0 license.
- Cerebras and Neural Magic Enable Sparse LLMs: @slashML shared a paper from Cerebras and Neural Magic on enabling sparse, foundational LLMs for faster and more efficient pretraining and inference.
AI Reddit Recap
Across /r/LocalLlama, /r/machinelearning, /r/openai, /r/stablediffusion, /r/ArtificialInteligence, /r/LLMDevs, and /r/Singularity. Comment crawling works now, but there's still lots to improve!
AI Model Releases and Benchmarks
- Microsoft releases Phi-3 models under MIT license: In /r/LocalLLaMA, Microsoft has released their Phi-3 small (7B) and medium (14B) models under the MIT license on Huggingface, including 128k and 4-8k context versions along with a vision model.
- Phi-3 models integrated into llama.cpp and Ollama: The Phi-3 models have been added to the llama.cpp and Ollama frameworks, with benchmarks showing they outperform other models in the 7-14B parameter range.
- Meta may not open source 400B model: According to a leaker on /r/LocalLLaMA, Meta may go back on previous indications and not open source their 400B model, which would disappoint many.
- Benchmark compares 17 LLMs on NL to SQL: A comprehensive benchmark posted on /r/LocalLLaMA compared 17 LLMs including GPT-4 on natural language to SQL tasks, with GPT-4 leading in accuracy and cost but significant performance variation by hosting platform.
AI Hardware and Compute
- Microsoft introduces NPUs for mainstream PCs: Microsoft announced that neural processing units (NPUs) will become mainstream in PCs for AI workloads, with new Surface laptops having an exclusive 64GB RAM option to support large models.
- Overview of M.2 and PCIe NPU accelerators: An overview on /r/LocalLLaMA looked at the current landscape of M.2 and PCIe NPU accelerator cards, noting most are still limited in memory bandwidth compared to GPUs but the space is evolving rapidly.
AI Concerns and Regulation
- Europe passes AI Act regulating AI development and use: The EU has passed the comprehensive AI Act which will regulate the development and use of AI systems starting in 2026 and likely influence regulation globally.
- California Senate passes AI safety and innovation bill: The California Senate has passed SB1047 to promote AI safety and innovation, with mixed reactions and some concerns it will limit AI progress in the state.
- TED head calls Meta "reckless" for open sourcing AI: Chris Anderson, head of TED, called Meta "reckless" for open sourcing AI models, a stance from an influential figure that concerned AI progress advocates.
AI Assistants and Agents
- Microsoft introduces Copilot AI agent capabilities: Microsoft announced new agent capabilities for Copilot that can act as virtual employees, with early previews showing ability to automate complex workflows.
- Demo showcases real-time multimodal AI game agents: A demo posted on /r/singularity showcased real-time multimodal AI agents assisting in video games by perceiving game state visually and providing strategic guidance.
- Questions raised about Amazon's lack of AI assistant progress: /r/singularity discussed Amazon's apparent lack of progress in AI assistants compared to other tech giants, given their broad consumer reach with Alexa.
Memes and Humor
- Memes highlight rapid AI progress: Memes and jokes circulated about the rapid pace of AI progress, companies making dramatic claims, and concerns about advanced AI systems.
AI Discord Recap
A summary of Summaries of Summaries
- LLM Benchmarking and Performance Optimization:
- Microsoft's Phi-3 Models offer high context lengths and robust performance, stirring discussions on benchmarks and memory usage but uncovering compatibility issues in tools like llama.cpp.
- Various techniques like torch.compile and specific GPU setups were debated for optimizing computation efficiency, shared via insights like those in tensor reshaping examples.
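As a flavor of the tensor-reshaping discussions, here is a stdlib-only sketch of what a contiguous reshape does: it reinterprets the same flat buffer under new dimensions, which is why it is essentially free for contiguous tensors. This is a toy illustration, not the torch implementation:

```python
def reshape(flat, rows, cols):
    """Reinterpret a flat, contiguous buffer as a rows x cols matrix.

    Like a contiguous tensor reshape, this only regroups indices; no
    data is copied, so the total element count must match exactly.
    """
    if len(flat) != rows * cols:
        raise ValueError("cannot reshape: element counts differ")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

buf = list(range(6))       # contiguous buffer: [0, 1, 2, 3, 4, 5]
print(reshape(buf, 2, 3))  # [[0, 1, 2], [3, 4, 5]]
print(reshape(buf, 3, 2))  # [[0, 1], [2, 3], [4, 5]]
```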
- Open-Source AI Tools and Frameworks:
- The Axolotl framework emerged as a go-to for fine-tuning models like Llama and Mistral, with Docker setups facilitating ease of use (quickstart guide).
- LlamaIndex introduced techniques for document parsing and batch inference, integrating GPT-4o's capabilities to enhance complex document manipulation and query accuracy.
- AI Legislation and Community Responses:
- California's SB 1047 bill prompted heated debates on the impact of new regulations on open-source models, with concerns over stifling innovation and favoritism towards major incumbents.
- Discussions on ethical and legal questions arose around AI voice replication, highlighted by OpenAI's controversial mimicking of Scarlett Johansson's voice, leading to its subsequent removal after public backlash.
- Novel AI Model Releases and Analysis:
- Community excitement surrounded new releases such as Mistral-7B v0.3 with extended vocabularies and function calling (details), while Moondream2 updates improved resolution and accuracy in visual question-answering.
- Anthropic's work on interpretable machine learning and the release of Phi-3 Vision spurred deep dives into scaling monosemanticity (research) and practical AI applications.
- Practical AI Implementations and Challenges:
- Members shared practical AI implementations, from PDF extraction with Surya OCR transforming documents into markdown (GitHub repo), to building secure code execution environments on Azure (dynamic sessions).
- The LangChain community highlighted issues with deployment and endpoint consistency, with detailed troubleshooting on the GitHub repo helping streamline deployment processes and enhance chatbot functionalities.
The full channel-by-channel breakdowns have been truncated for email.
If you want the full breakdown, please visit the web version of this email!
If you enjoyed AI News, please share it with a friend! Thanks in advance!