[AINews] >$41B raised today (OpenAI @ 300b, Cursor @ 9.5b, Etched @ 1.5b)
This is AI News! an MVP of a service that goes thru all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜
More money is all you need
AI News for 3/28/2025-3/31/2025. We checked 7 subreddits, 433 Twitters and 30 Discords (230 channels, and 17665 messages) for you. Estimated reading time saved (at 200wpm): 1870 minutes. You can now tag @smol_ai for AINews discussions!
Amazon Nova Act (Adept + Covariant) made a really good run at taking the headline today, but it's not every day that people close the largest startup fundraise in history: OpenAI raised $40B at a $300B valuation, while

Cursor closed $625m at $9.6B and Etched closed $85m at $1.5B.
The Table of Contents and Channel Summaries have been moved to the web version of this email.
AI Twitter Recap
Language Models and Releases
- OpenAI is planning to release a highly capable open language model, their first since GPT-2, and is hosting sessions with global developers to gather feedback and engage directly with the community to ensure they get it right, according to @kevinweil. @sama provided more details, stating the company is excited to release a powerful new open-weight language model with reasoning in the coming months and wants to talk to devs about how to make it maximally useful.
- DeepSeek V3 0324 has ranked #5 on the Arena leaderboard, surpassing DeepSeek-R1 and every other open model, according to @lmarena_ai. It's the #1 open model with an MIT license, 2x cheaper than DeepSeek-R1, and top-5 across all categories.
- @scaling01 believes that only three LLMs were very clearly SOTA step-changes: GPT-4, Sonnet 3.5, and o1, with all other model releases feeling more like nice-to-haves / incremental improvements. @scaling01 also noted that it doesn't feel like Gemini models are ahead, as Google keeps doing "exp" models and hasn't even shipped Gemini 2.0 Pro.
- @iScienceLuvr announced the launch of Sophont, a company building open multimodal foundation models for the future of healthcare.
- @stevenheidel stated that "we're releasing a model this year that you can run on your own hardware."
Gemini 2.5 Pro
- Gemini 2.5 Pro is outperforming other models like Claude 3.7 Sonnet in coding tasks, according to @lepikhin.
- @scaling01 shared notes indicating that the production version of Gemini 2.5 Pro with pricing will come "very soon hopefully," with Flash being the next model to receive the 2.5 series. Gemini 2.5 Pro has dynamic thinking but is not yet where they want it to be, as it overthinks for most questions, and better image generation is also on their shipping list.
- @dzhng finds Gemini 2.5 impressive for coding, as it tells you when it can't do what you asked, whereas Sonnet tends to just power through and give you a wrong solution.
- @raizamrtn announced Gemini Code, a coding assistant in your terminal powered by Gemini 2.5 Pro.
AI Applications, Frameworks, and Tools
- SkyPilot has a new paper accepted to EuroSys 2025 about SkyServe, which intelligently provisions and spreads spot and on-demand instances across regions and clouds, leading to 43% lower costs while maintaining high availability, according to @skypilot_org.
- @Hacubu announced the official launch of AgentEvals, a new open-source package that helps answer the question "Is my agent working?"
- @karpathy discussed smartphone choices and privacy, noting that iPhone has taken user defense and privacy a lot more seriously over time than Android.
- LlamaIndex now supports the OpenAI Responses API with full support for built-in-tools, reasoning, images, manual tool calling, streaming, and async, according to @llama_index.
- @togethercompute announced a new notebook for building a fact-checking agent that can search for documents to verify a claim, using DSPy and Together, with automatic prompt engineering to improve its performance by +20% with help from a larger LLM agent.
- Kevin Frans and colleagues at @UCBerkeley introduced a new way to speed up image generation with diffusion models. Their “shortcut” method trains models to take larger noise-removal steps—the equivalent of multiple smaller ones—without losing output quality.
AI Research and Papers
- VBENCH-2.0 is out on Hugging Face, a next-gen benchmark for evaluating intrinsic faithfulness, with 18 fine-grained dimensions, fully automatic and open-source, and human-aligned via large-scale validation, according to @_akhaliq.
- @TheAITimeline highlighted top AI/ML research papers including the GPT-4o System Card addendum on native image generation, Anthropic's On the Biology of a Large Language Model, the Gemma 3 Technical Report, and the Qwen2.5-Omni Technical Report, among others.
AI Funding and Investment
- @sophiamyang noted a great opportunity with $1M for every early stage startup.
- @demishassabis announced that @IsomorphicLabs has raised $600M to turbocharge their mission to one day solve all disease with the help of AI.
Humor/Memes
- @ID_AA_Carmack quipped, Deep down at the bottom of Hephaestus’ giant forge, a charred arm sticks out of the glowing molten metal with its thumb held high.
- @teortaxesTex joked, «AGI» already has a solution, but you won't like it.
- @nearcyan remarked on how it only took a single model release to mark the end of coherent reality.
AI Reddit Recap
/r/LocalLlama Recap
Theme 1: Qwen 3 Support Merged into Transformers Permalink
- Support for Qwen3 models has been merged into the Hugging Face Transformers library via Pull Request #36878. This update prepares the Transformers ecosystem for upcoming Qwen3 model releases.
Theme 2: Qwen 2.5 Omni Multimodal Model Permalink
- The author finds it strange that Qwen 2.5 Omni, the first open-sourced multimodal model handling voice, image, and text generation, isn't receiving more attention. They perceive its release as a notable development for open-source multimodal systems.
- A member of the Orpheus TTS team compares their architecture to alternatives like Moshi and Sesame, stating their opinion that conceptually Qwen Omni is a far superior architecture for end-to-end speech. They reason this is because Qwen Omni avoids modifying the base LLM, unlike Sesame/Moshi, while retaining potential for emotional expression similar to Orpheus.
Theme 3: OpenDeepSearch Outperforms Proprietary Search Tools Permalink
- The author introduces the OpenDeepSearch repository (GitHub link), an open-source search tool using ReAct, CodeAct, dynamic few-shot prompting, and integrated search/calculator functions. They highlight its reported success over GPT-4o Search and Perplexity Sonar Reasoning Pro on the FRAMES benchmark and note its potential utility in multi-agent workflows.
Theme 4: High-End PC Build for Running Large Models (Deepseek-V3-0324 671b) Permalink
- The author details building a PC with dual EPYC 9355 CPUs and 768GB of 5600MHz RDIMM RAM on a Gigabyte MZ73-LM0 motherboard to run Deepseek-V3-0324:671b-Q8 locally. They report achieving 6-8 tokens per second and describe installing Ubuntu 24.04.2 LTS, ollama, and Open WebUI.
- The author reports that the LM Arena was updated, adding Deepseek v3.1 which scored 1370, reportedly higher than Deepseek R1. They also mention observing models named Nebula (suspected Gemini 2.5), Phantom (recently removed), and Chatbot-anonymous.
- The author issues a warning about a circulating blog post falsely claiming a "Deepseek V3.1" release, hosted on a fake website. They remind users that Deepseek does not operate an official blog for such announcements.
Theme 5: Diminishing Returns of Larger LLMs Permalink
- The author posits that models like Gemma3 27B and QwQ 32B show diminishing returns for large (70B+) LLMs, citing their competitive benchmark performance against models like Llama 3.3 70B. They attribute this trend to improved distillation, architecture, and data quality, suggesting large hardware investments may offer only temporary advantages as 30B-50B models improve.
Other AI Subreddit Recap
/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding
Pipelines still down today but should be fixed by tomorrow.
AI Discord Recap
A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking
Theme 1. Gemini 2.5 Pro: Coding King or Tool-Use Fool?
- Gemini 2.5 Pro Wows at Code, Fumbles with Tools: Users across Cursor, OpenAI, and Manus.im Discords are buzzing about Gemini 2.5 Pro's impressive coding skills, with some praising its prowess in languages like Jax and C++. However, in Cursor Community, users report tool use troubles, suggesting it's not good at actually calling the tools within Cursor, often outputting incorrect or non-functional code, raising suspicions of intentional limitations to push paid options.
- Gemini 2.5 Pro: A Multi-Modal Beta Beast?: In Manus.im and LMArena, Gemini 2.5 Pro is lauded for complex analysis, reasoning, and multi-modal tasks, even outperforming GPT-4.5 in creative coding and physics simulations (see Gemini 2.5 Pro physics simulations in Three.js!). However, it can't execute an entire workflow on its own, and some OpenAI users find it terrible at C++ and WinAPI, citing hallucinations.
- Rate Limits and Quotas Crimp Gemini 2.5 Pro's Style: Despite the hype, rate limits are a recurring concern. In Aider and OpenRouter, users report rate limits hindering practical use, with one OpenRouter user facing a retry delay of 45,906 seconds. OpenRouter clarified that rate limits can originate from both Google and OpenRouter, see rate limits documentation.
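Retry delays like the 45,906-second one above are the server's backoff hint; a generic client-side sketch (not OpenRouter-specific; production clients usually also add random jitter and honor a server-supplied Retry-After) of capped exponential backoff between retries:

```python
def backoff_delays(base: float = 1.0, cap: float = 300.0, retries: int = 6) -> list[float]:
    """Capped exponential backoff: wait base * 2**attempt seconds, never more than cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]

# Delays double each attempt until they hit the cap.
print(backoff_delays())
print(backoff_delays(cap=10.0))
```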
Theme 2. Open Source vs Proprietary Models: The Reasoning Race Heats Up
- OpenAI Teases Open-Weight Reasoning Model: Sam Altman teased a powerful new open-weight language model with reasoning capabilities coming soon, seeking developer feedback on how to make it maximally useful, as announced in this tweet. This sparks debate in Latent Space and Yannick Kilcher discords about its implications and potential capabilities, with some speculating it's part of the GPT-5 system under development.
- DeepSeek V3 Flexes Math Muscles, Instruction Following Fades: Hugging Face's evaluations of DeepSeek V3 0324 reveal impressive gains in math and GPQA, as tweeted here, but with a slight dip in instruction following. Unsloth AI released dynamic quantized versions for local execution and a guide Tutorial: How to Run DeepSeek-V3-0324 Locally.
- Grok's Performance Rollercoaster: Science Star or Log-Off Lagger?: LMArena users debate Grok3's scientific supremacy over Gemini, with claims it outperforms even R1 on arc-agi-1. However, OpenAI and PerplexityAI users report Grok's unstable performance, plagued by frequent log-offs and internal errors, and a non-functional thinking mode. Despite these issues, some users maintain subscriptions alongside ChatGPT Pro.
Theme 3. Cursor vs Alternatives: Context, Cost, and Code Stability Clash
- Cursor Customers Cry 'Context Costly!': Cursor Community members express frustration with Cursor's usage-based pricing, token limits, and reduced model quality upon reaching limits, citing the Cursor Pricing page. Many are exploring alternatives like Cline or Roo Code for full context windows and lower costs.
- Cline and Roo Code Rise as Cursor Challengers: The community debates Cline's stability versus Cursor's features, with many preferring Cline for reliability. Roo Code gains traction for features like boomerang tasks and better context retention, viewed as a step up from Cline, as described in this Reddit thread. However, concerns persist about Roo Code's stability and high Anthropic API token consumption.
- Windsurf Waves as a Wildcard Cursor Competitor: Cursor Community explores Windsurf as a potential alternative to Cursor for its terminal/server task stability and embedded browser, but some users find its context window even smaller and question its value, stating I don't like windsurf at all, the context window seems even smaller.
Theme 4. Quantization Quandaries and Performance Paradoxes
- Quantization Quality Quagmire: Aider and GPU MODE users discuss the impact of quantization on model performance. Converting models from FP16 to Q8 results in a slight quality reduction, while Q4 quantization, common in Ollama, severely degrades it. Users report anything below Q6 is severely impaired, especially for reasoning tasks.
- BFloat16 Breaks RoPE's Positional Promise: GPU MODE highlights a new paper When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training showing BFloat16 introduces numerical errors in RoPE, even when computed in Float32. The paper introduces AnchorAttention as a fix, with code on GitHub.
- Dynamic Quantization Debuts to DeepSeek's Delight: Unsloth AI released dynamic quantized versions of DeepSeek-V3-0324, alongside a guide for local execution. Unsloth's Dynamic Quants improve accuracy over standard bits by selectively quantizing.
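To make those quantization levels concrete, weight memory scales linearly with bits per parameter; a quick sketch of the arithmetic for a hypothetical 70B-parameter dense model (ignoring KV cache and activation overhead):

```python
def weight_gb(params_billion: float, bits: int) -> float:
    """Approximate weight memory in GB: params * (bits / 8) bytes each."""
    return params_billion * 1e9 * bits / 8 / 1e9

# A 70B-parameter model at common precisions:
for label, bits in [("FP16", 16), ("Q8", 8), ("Q6", 6), ("Q4", 4)]:
    print(f"{label}: ~{weight_gb(70, bits):.1f} GB")
```

Halving the bits halves the footprint, which is why Q4 fits on hardware that FP16 never could; the quality cliff below Q6 reported above is the price of that compression.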
Theme 5. MCP Momentum: Protocol Progress and Practical Projects Proliferate
- MCP Spec Drafts OAuth 2.1, Sparks Debate: MCP Discord discusses the latest 2025-03-26 MCP spec draft introducing OAuth 2.1 for authentication, detailed in the MCP spec. However, no client currently supports it for testing. Implementation of HTTP Streamable Transport raises concerns about session resumability and message replay, see MCP spec.
- IDA Pro MCP Server Cracks Reverse Engineering Code: MCP Discord showcases an IDA Pro MCP server automating reverse engineering, with a streamlined installation process via this link. The server is configured with Cline and Roo Code and tested using Claude.
- CATIE Channels MCP Traffic Cleverly: MCP Discord announces CATIE (Context Aware Traffic Ingress Engine), a proxy for routing MCP requests based on tool call, released on GitHub. The tool allows routing to different MCP servers based on tool call parameters and real-time monitoring.
PART 1: High level Discord summaries
Manus.im Discord
- Swirl Glitch Grants Credit Comeback: Users reported a Swirl issue and requested credit refunds; the issue resolution status is pending.
- Members are waiting to see if credits will be reimbursed for disrupted sandbox use.
- Manus Masters Code-First Website Creation: A user asked if Manus AI can assist with WordPress sites given their current reliance on Figma for design.
- Responses highlighted Manus AI's strength in generating Next/React sites ready for deployment on Vercel.
- Deepseek & Claude Duke it out for Credit: A user detailed a credit optimization strategy employing Deepseek R1, Claude Sonnet 3.7, and Manus AI for website development.
- The user emphasized that precise prompting significantly reduces credit consumption.
- Manus AI Beta Sparks Billing Gripes: A user criticized Manus AI's beta charging model, suggesting it should cater to all skill levels.
- Counterarguments stressed the importance of prompt engineering and efficiency, linking to a solution for reducing credit usage here.
- Gemini 2.5 Pro Pilots Complex Problems: Users compared Gemini 2.5 Pro with Manus AI, noting that Gemini excels in complex analysis, reasoning, multi-modal tasks, and coding while being cloud-compatible and cost-effective.
- However, it was noted that Gemini can't execute an entire workflow on its own.
LMArena Discord
- Spider Model Under Scrutiny: Members discussed the Spider model's verbose and creative outputs, questioning whether these traits stem from unique training or parameter size.
- Some users reported inconsistent results when comparing Spider with models like Phoebe, Themis, and Cybele.
- Grok 3 Claims Scientific Supremacy Over Gemini: A member claimed that Grok3 still reigns supreme over Gemini for scientific tasks, allegedly outperforming even R1 on arc-agi-1.
- Others countered that the better model depends on the specific use case, implying a more nuanced comparison is necessary.
- GPT-4o Aces Creative Coding, But...: Users lauded GPT-4o for its creative coding abilities, suggesting it surpasses GPT-4.5, DeepSeek V3-0324, and Claude 3.7 Sonnet in non-thinking mode.
- One user gave GPT-4o a 9.5/10, while acknowledging that Claude 3.7 Sonnet (Thinking) and DeepSeek R1 remain superior overall.
- Sama Teases Open-Weight Reasoning LLM: Sam Altman teased a powerful new open-weight language model with reasoning capabilities set for release in the coming months, detailed in this tweet.
- The new model will undergo preparedness framework testing before being released to the public.
Cursor Community Discord
- Gemini 2.5 Pro's Tool Use Troubles: Users are excited about Gemini 2.5 Pro's performance and cost-effectiveness, but report issues with its tool use within Cursor; for example, code is often incorrect or non-functional.
- Some speculate that Cursor might be intentionally hindering Gemini 2.5 Pro to promote paid options.
- Cline and Cursor Clash Over Code: The community debates Cline's stability versus Cursor's features, with many preferring Cline for reliability and direct model application.
- Users acknowledge Cursor's semantic search and experimentation, but some describe concerns that Roo code will nuke my whole codebase.
- Roo Code Rockets, Raises Eyebrows: Many members are now exploring Roo Code for its features like boomerang tasks and better context retention, viewing it as a step up from Cline, as described in this Reddit thread.
- Concerns persist regarding its stability, rollback capabilities, and high Anthropic API token consumption.
- Windsurf Waves as Cursor Competitor: The community explores Windsurf as a potential alternative to Cursor for its terminal/server task stability and embedded browser, which makes it easier to share element info with AI.
- Concerns arise regarding limited context window, the actions models can make, and value compared to normal plans; one user noted I don't like windsurf at all, the context window seems even smaller.
- Cursor Customers Confront Costly Context: Members express frustration with Cursor's usage-based pricing, token limits, and reduced model quality/efficiency upon reaching limits, as described on the Cursor Pricing page.
- Many are now exploring alternatives like Cline or Roo for their full context windows and lower costs with services like OpenRouter or AI Studio.
Perplexity AI Discord
- Perplexity Pro: Reasoning Gets Sticky: Perplexity is rolling out a new "Pro" tier, which will include existing Pro + Reasoning models with smart routing for balanced speed and reasoning.
- The Pro tier will default to sticky models, instead of "Auto" for follow-ups; and Perplexity is actively soliciting feedback.
- Deep Research Tier Remains Elusive: The "Deep Research High" tier on Perplexity AI is still not available, despite some users believing they are using it.
- One user claimed that Grok offers 5 free deep searches every 2 hours but also noted that Grok rate limits are very strict.
- Structured outputs now available for all!: Perplexity AI announced that structured outputs are now available for all users, regardless of tier level.
- Currently, JSON structured outputs are supported across all models, while both JSON and Regex structured outputs are supported for `sonar` and `sonar-reasoning` models.
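Concretely, a structured-output request adds a response_format block to an otherwise ordinary chat-completions body; a sketch of what that might look like, with the field shapes an assumption based on Perplexity's structured-outputs docs (verify against the current API before relying on them):

```python
import json

# Hypothetical request body asking a sonar model to answer as JSON
# matching a schema; the "response_format" shape is an assumption.
payload = {
    "model": "sonar",
    "messages": [{"role": "user", "content": "Name one EU capital."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "schema": {
                "type": "object",
                "properties": {"capital": {"type": "string"}},
                "required": ["capital"],
            }
        },
    },
}
# This dict would be POSTed to the chat-completions endpoint as JSON.
print(json.dumps(payload, indent=2))
```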
- Sonar API's Speed Bogs Down: Members reported that the newest version of Sonar has a significantly longer response time than the previous version, up to a minute wait time for some users.
- PPLX is aware of the issue and investigating possible improvements.
- Perplexity's Privacy Promise: Zero API Data Retention: A Perplexity team member confirmed they have 0 data retention policy for the API, when asked about prompt and output retention.
- The member clarified that this policy applies on their end, so users are free to use whatever they want.
OpenAI Discord
- Gemini 2.5 Pro's Coding Skills Spark Debate: Users are split on Gemini 2.5 Pro's coding prowess, with some finding it terrible at C++ and WinAPI due to hallucinations, while others praise its ability in languages like Jax and the CoT (Chain of Thought) steps it offers.
- Feedback indicates that the model excels in specific contexts, suggesting its effectiveness may vary based on the programming language and task complexity.
- Grok Plagued by Performance Problems: Reports indicate that Grok suffers from unstable performance, with users experiencing frequent log-offs and internal errors, compounded by a non-functional thinking mode.
- Despite these reliability issues, some users maintain their subscriptions alongside ChatGPT Pro, highlighting Grok's potential value even with its current drawbacks.
- Markdown Use Divides Prompt Engineers: A debate has emerged regarding the use of markdown in prompt engineering, with some arguing that a no markdown rule is just lazy as it limits effective communication and user education.
- Others counter that markdown is not universally understood and that code blocks introduce unnecessary complexity.
- SORA's Copyright Restrictions Frustrate Users: Users are grappling with SORA's TOS restrictions on generating images with copyrighted characters, as attempts to create parodies can risk account bans.
- Some users reported seeing others generating images with copyrighted characters, while others cautioned against the risk of account bans and suggested focusing on original content or legally distinct terms.
- Exploiting First Principles to Enhance O3's Logic: Members found that the incorporation of first principle logical reasoning from an AI's perspective can significantly enhance O3-mini-high's logical reasoning capabilities.
- Applying this approach resulted in improved model performance, allowing users to effectively guide the model to better extrapolate storylines and incorporate foreshadowing in creative tasks.
aider (Paul Gauthier) Discord
- Aider v0.80.0 adds OpenRouter OAuth, Prioritizes Gemini: Aider v0.80.0 introduces OpenRouter OAuth integration, prioritizes Gemini models, and boosts repomap ranking, with Aider writing 87% of its own code.
- This release includes a `Ctrl-X Ctrl-E` keybinding for editing in an external editor, plus other improvements and bug fixes detailed in the release history.
- Gemini 2.5 Sparks Praise and Rate Limit Concerns: Members discuss the merits of Gemini 2.5 versus Sonnet for code tasks, with one user reporting it rewrote their server from node 'http' into express, but others report inconsistent performance.
- Concerns arose regarding rate limits for Gemini 2.5, potentially hindering its practical use despite its capabilities.
- MCP Support Gains Momentum in Aider: There's growing interest in MCP (Model Context Protocol) support within Aider, which could reduce model lock-in and promote OSS tool development, as featured on MCP Marketplace.
- PR #3672 introduces initial support, with some users using `mcpm-aider` as a third-party integration to take advantage of the protocol.
- Quantization Quality Drops Model Performance: Converting models from FP16 to Q8 results in a slight reduction in model quality, while Q4 quantization, the default in Ollama, severely degrades it.
- Users report that anything below Q6 is severely impaired, especially for reasoning tasks, while others argue that some models are natively FP8, so Q8 quantization shouldn't lose any performance.
Unsloth AI (Daniel Han) Discord
- DeepSeek-V3-0324 Dynamic Quantization Debuts: Dynamic quantized versions of DeepSeek-V3-0324 were released, alongside a guide for local execution.
- Unsloth's Dynamic Quants improve accuracy over standard bits by selectively quantizing.
- Google Cloud Spot Instances Show Runpod Who's Boss: Switching to Google Cloud resulted in 2x faster workloads and cheaper costs compared to Runpod.
- Members stated that Google Cloud Spot Instances are up to 60% cheaper and more stable than Runpod, which often breaks after 15 minutes.
- Unsloth to Share Multi-GPU Support with the Masses: Multi-GPU support will soon be available to everyone, though Pro/Enterprise rollout is currently on hold due to capacity issues, says the unsloth team.
- The community consensus was to provide multi-GPU support to all users with Unsloth's current capabilities.
- HF x Unsloth Teach LLMs Reasoning with GRPO: Unsloth and Hugging Face have partnered on this collab to teach users how to fine-tune LLMs with GRPO (Group Relative Policy Optimization).
- The tutorial covers reward functions, GRPO math, and applying RL to real-world use cases, alongside a tutorial.
- Docs Get a Nudge Toward Clarity: A member suggested updating Unsloth documentation to discourage using `--no-deps` during updates, as it causes issues, referencing this link.
- Another member confirmed that the standard updating procedure also includes the `--no-deps` flag, indicating a potential documentation error.
OpenRouter (Alex Atallah) Discord
- Stripe Glitch Bursts Auto Top-Ups: Auto top-up functionality on OpenRouter was temporarily disrupted due to changes in payment metadata causing errors with Stripe.
- The issue has been resolved by rolling back changes and addressing missing credits, with users receiving email notifications; the root cause was a data formatting mismatch from Stripe.
- Image Models Incoming, Gemini Gone?: Members discussed the upcoming integration of output image models like GPT-4o and Gemini into platforms like OpenRouter.
- One member expressed excitement about transitioning to OpenRouter for image generation, potentially moving away from using Gemini.
- OpenRouter Caching Saves Coin: OpenRouter supports prompt caching to reduce inference costs; while most providers enable it automatically, Anthropic requires per-message activation as documented here.
- Savings can be monitored on the Activity page or via the API using the cache_discount field; members should enable the caching to get the cache_discount.
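Enabling Anthropic caching per message means attaching a cache_control breakpoint to the large, reusable part of the prompt; a sketch of the message shape based on OpenRouter's caching docs (treat the exact field names as an assumption and check the current docs):

```python
# Stand-in for many thousands of tokens of static reference text.
big_document = "...long reference document..."

# Multipart message: the static part carries a cache_control breakpoint
# so Anthropic can cache it; the per-request question stays uncached.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": big_document,
                "cache_control": {"type": "ephemeral"},
            },
            {"type": "text", "text": "Summarize the document above."},
        ],
    }
]

cached = [part for part in messages[0]["content"] if "cache_control" in part]
print(len(cached))  # exactly one cached breakpoint in this sketch
```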
- Agent Hustle Hustles Stock Trades: A member detailed their project, Agent Hustle, an LLM-powered stock trading agent that collects small fees on each transaction via a TEE wallet.
- The system executes approximately 12 function calls per trade, illustrated here.
- Rate Limits Rile Users: Users reported encountering rate limits on Google/Gemini-2.5-pro-exp-03-25:free, with errors indicating significant retry delays.
- The OpenRouter team clarified that rate limits can originate from Google or OpenRouter; they also note that specifying providers limits OpenRouter's load balancing capabilities, see rate limits documentation.
LM Studio Discord
- VSCode Gets Autocomplete via LM Studio: Users are connecting LM Studio to VSCode via the Continue.dev VSCode extension to make custom AI code assistants with tab-to-autocomplete and code referencing.
- This integration allows leveraging LM Studio models directly within the IDE for AI-assisted development tasks.
- Epyc Systems Challenge GPUs: Members discussed new Epyc systems with high-frequency 12-channel DDR5 memory that achieve nearly 600 GB/s of memory bandwidth, rivaling consumer-grade GPUs for LLM performance while offering huge memory capacity.
- For an estimated $10-12k budget, an Epyc machine could be built to run huge models without a GPU, allowing reasonable inference speeds and massive context windows.
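A back-of-envelope check on that claim: decode speed on a bandwidth-bound system is capped by how fast the weights stream out of memory, since each generated token reads every active parameter roughly once:

```python
def tokens_per_sec(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper-bound decode rate: each token streams the active weights once."""
    return bandwidth_gb_s / active_weights_gb

# ~600 GB/s of Epyc bandwidth vs. a dense 70B model at 8 bits (~70 GB):
print(round(tokens_per_sec(600, 70), 1))

# MoE models like DeepSeek-V3 only activate a fraction of their 671B total
# parameters per token, so their effective rate is far higher than the
# full model size would suggest.
```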
- Decoding LM Studio API Context Handling: To maintain conversation context when using the LM Studio API with a Telegram bot, the user must store conversation history, because the API itself does not inherently retain context.
- One user stores the conversation history in a variable in JSON format, named with a unique-tg-user-id to maintain conversational flow.
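Because the API is stateless, the bot must resend prior turns on every request; a minimal sketch of the per-user history store the post describes (identifiers hypothetical):

```python
import json

# unique-tg-user-id -> ordered list of chat messages
histories: dict[str, list[dict]] = {}

def record_turn(user_id: str, role: str, content: str) -> list[dict]:
    """Append one turn and return the full history to send on the next request."""
    history = histories.setdefault(user_id, [])
    history.append({"role": role, "content": content})
    return history

record_turn("tg-42", "user", "Hello!")
record_turn("tg-42", "assistant", "Hi there.")
record_turn("tg-42", "user", "What did I just say?")

# The whole list (JSON-serializable, as in the post) is sent as `messages`
# to the local OpenAI-compatible endpoint, e.g. http://localhost:1234/v1/chat/completions.
print(json.dumps(histories["tg-42"]))
```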
- LM Studio API: Your Key to Tool Use: Members are discussing the options for enabling tool use and web search capabilities within LM Studio, and whether the LM Studio application UI can be modified.
- It was clarified that tool use is only available via the LM Studio API, not the ChatUI, leading some to consider modifying Open WebUI as an alternative.
- Orpheus Beats Kokoro for LM Studio TTS: Members inquired about integrating Text-to-Speech (TTS) models with LM Studio, seeking alternatives to OpenAI's speech ability, one user linked hexgrad/Kokoro-82M, a TTS model, as an option.
- However, CanopyAI's Orpheus is the only TTS that works in LM Studio (via API, not in chat), and users are using this repo to run it locally with LM Studio.
Latent Space Discord
- Altman's Alleged Safety Test Lies: The WSJ reported that Sam Altman allegedly lied about safety testing for new releases prior to his firing from the OpenAI board, according to an article.
- It details the real story behind Sam Altman's firing from the OpenAI board.
- OpenAI Teases Open-Weight Reasoning Model: OpenAI plans to release an open-weight language model with reasoning capabilities in the coming months and seeks feedback from developers, detailed in their feedback request.
- The company will host developer events in SF, Europe, and APAC to gather insights and provide early prototypes.
- Etched Enters the ASIC Game: Etched, the first transformer ASIC, closed an unannounced $85M at $1.5B, following two stealth rounds at $500M then $750M, according to a tweet.
- Etched's chip Sohu runs Llama 70B at over 500,000 tokens per second; one 8xSohu server replaces 160 H100s.
- Replit v2 Impresses With Smooth Prototyping: Replit v2 agent is impressive for prototyping and building MVPs, potentially powered by Sonnet 3.7, while offering effortless extraction for use in custom backends.
- Replit's advantage lies in its direct access to logs and configured infrastructure, contrasting with Cursor which is better suited for existing deployments.
- llms.txt Standardizes Website Crawling: The llms.txt project, hosted on GitHub, introduces a file to guide language models in crawling and utilizing website data.
- Serving a purpose similar to robots.txt, it instructs LLMs on effectively accessing and employing website content.
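The proposed format is itself just a Markdown file served at /llms.txt: an H1 title, a blockquote summary, then sections of annotated links. A minimal example following the project's spec (all URLs hypothetical):

```markdown
# Example Project

> One-line summary of what the project is and what its docs cover.

## Docs

- [Quickstart](https://example.com/quickstart.md): install and first run
- [API reference](https://example.com/api.md): endpoints and parameters

## Optional

- [Changelog](https://example.com/changelog.md): release notes
```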
MCP (Glama) Discord
- MCP Spec Drafts OAuth 2.1: The latest 2025-03-26 MCP spec draft introduces new authentication features like OAuth 2.1, as detailed in the MCP spec.
- However, members noted that no client currently supports it for testing purposes.
- HTTP Streamable Transport sparks Resumability Debate: The implementation of HTTP Streamable Transport raises concerns about how sessions are correctly resumed, particularly regarding the server's responsibility to prevent message replay across different streams, as mentioned in the MCP spec.
- The spec states that The server MUST NOT send a JSON-RPC response on the stream unless resuming a stream associated with a previous client request, which some argue contradicts the objective of resumability.
- Speech MCP gets Vocal Demonstration: A user shared a YouTube short demoing the capabilities of Speech MCP.
- Another user then inquired about its compatibility with Claude.
- IDA Pro MCP Server Automates Reversing: An IDA Pro MCP server was created to automate reverse engineering, and a user streamlined the installation process by sharing this link.
- The server is automatically configured with Cline and Roo Code, and was tested using Claude.
- CATIE routes MCP Requests Intelligently: CATIE (Context Aware Traffic Ingress Engine), a proxy for routing MCP requests based on tool call, was released on GitHub.
- The free, open-source tool allows routing to different MCP servers based on tool call parameters, real-time monitoring, backend switching, and simple load distribution.
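The routing idea can be illustrated in a few lines: inspect the tool name in an incoming JSON-RPC `tools/call` request and pick a backend. (The route table and URLs below are made up; CATIE's actual configuration will differ.)

```python
# Hypothetical route table: tool name -> backend MCP server.
ROUTES = {
    "search_web": "http://search-mcp:8080",
    "query_db":   "http://db-mcp:8080",
}
DEFAULT = "http://general-mcp:8080"

def route(request: dict) -> str:
    """Return the backend URL for one JSON-RPC request."""
    if request.get("method") == "tools/call":
        tool = request.get("params", {}).get("name")
        return ROUTES.get(tool, DEFAULT)
    return DEFAULT

req = {"jsonrpc": "2.0", "method": "tools/call",
       "params": {"name": "query_db", "arguments": {}}}
print(route(req))   # http://db-mcp:8080
```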
HuggingFace Discord
- DeepSeek V3 Impresses with Math: Evaluations on DeepSeek V3 0324 show impressive gains in math and GPQA, according to this tweet.
- However, there was a slight hit in instruction following, but more concerning is that AIME25 remains unchanged.
- Gradio Dataframe component gets a Major Overhaul: Gradio released a host of new updates to its `gr.Dataframe` component, closing over 70 issues including bugs, improvements, and enhancements, as detailed in this blog post.
- The `gr.Dataframe` component is popular for leaderboards, dashboards, and interactive visualizations.
- HF Pro Debit Card Charges Spur Refund Requests: A user reported being charged for a Hugging Face Pro subscription with a debit card despite an error message, and inquired about a refund.
- It was suggested this might be a known issue where a debit card payment goes through once, with refunds typically processed within two weeks.
- RepoDump Converts Codebase to Markdown: A developer released `repodump 0.1-alpha`, a CLI tool to extract and format Git repos or directories into Markdown for quick sharing with LLMs, available on GitHub.
- The tool skips binaries, respects `.gitignore`, outputs Markdown or plain text, and estimates tokens using Simon Willison's `ttok`, with a user saying the install process is a bit sus.
- Docker Model Runner Arrives: Docker, Inc. introduced an experimental Model Runner feature that allows users to run Large Language Models (LLMs) locally using Docker CLI commands.
- This solution enables running a larger list of models with private inference, on-demand model loading, and GPU acceleration, working around macOS limitations in accessing host GPU resources by keeping model dependencies containerized.
Yannick Kilcher Discord
- OpenAI Image Generator Gets Neutered: Members suggest OpenAI's image generator quality has decreased, possibly halting Ghibli style prompts and experiencing model limitations.
- Some members believe models have reached a point of diminishing returns, where increased size doesn't guarantee better performance and may even lead to worse outputs.
- Meta's Transfusion Supercharges GPT-4o?: A member speculates that Meta's Transfusion paper could explain GPT-4o's multimodal capabilities, blending autoregressive and diffusion modeling.
- The Transfusion paper introduces a method for training models that seamlessly generate discrete and continuous modalities, outperforming Chameleon in FID and CLIP scores for text-to-image tasks.
- Belief State Transformer Upgrades State Modeling: The Belief State Transformer enhances transformers' ability to model state and condition on the end.
- However, another member argued that it requires an ideal Belief Transformer that has converged to perfectly learning the underlying probability distribution of the data.
- Dynamic RL Bypasses Variational Bound: A member is developing an approach that eliminates the need for an explicit variational bound in diffusion models by using an RL agent.
- Another member noted that most RL methods are also variational methods, suggesting that control theory could also be applied.
- Visual Autoregressive Model Beats Diffusion: The paper Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction, a NeurIPS 2024 Best Paper, demonstrates GPT outperforming diffusion models in image generation.
- A member quipped that people should just buy one of Scam Altman's fictional Fusion Generators, adding it's a trillion dollar industry if you want to invest.
Eleuther Discord
- Malicious AI agent Spoofs RWKV channel: In the RWKV Discord, an AI agent posed as a human researcher, shared a blog post with incorrect math and code from a GitHub repo and DM'd an attached image.
- This sparked discussion about the challenges of dealing with AI-generated content, urging tracking and cryptographic signing for human verification, with some suggesting checking the generated text for watermarks.
- Landlord LLM Schedules Phantom Fun: A member shared a personal experience with a rental company using an LLM for email communication, which resulted in a phantom appointment that staff was unaware of, suggesting potential inefficiencies.
- The member believes they're benefiting from a lower rent due to the LLM's operational failures, estimating the company is potentially losing millions due to the system.
- Meta Learning or Deep Fried RL?: Members debated whether to focus on MAML (Model Agnostic Meta Learning) approaches to solve training limitations, and whether RL is the wrong time to experiment with low precision data types due to potential stack skill issues.
- One member asked about survey papers on semanticscholar for more information on this generic topic, while others related the problems to deep frying.
- Neuronpedia goes Open Source, Eleuther Inside!: Neuronpedia, an interpretability platform, is now MIT open source and uses Eleuther's `Delphi` (previously `sae-auto-interp`) for its auto-interp server.
- The announcement included links to the GitHub repository, public datasets, and a blog post summarizing Neuronpedia's features.
- Harnessing MMLU-pro Evaluation: Members confirmed that the MMLU-pro eval is run using the `test` split, with few-shot examples derived from the `validation` split, as seen in the config file.
- Users can pass additional parameters to the `generate` function via `generation_kwargs` in the task YAML to compress Key/Value (KV) caches and implement contrastive beam search.
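For reference, such overrides live under a `generation_kwargs` key in the lm-evaluation-harness task YAML; a hedged sketch (the specific kwargs below are placeholders, not the exact ones needed for KV-cache compression or contrastive beam search):

```yaml
# Hypothetical task YAML fragment — keys under generation_kwargs are
# forwarded to the model's generate() call.
generation_kwargs:
  do_sample: false
  max_gen_toks: 256
```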
Nous Research AI Discord
- xAI Snaps Up X in Stock Swap!: Elon Musk revealed that xAI acquired X (Twitter) in an all-stock deal, valuing xAI at $80 billion and X at $33 billion, aiming to integrate data, models, compute, distribution, and talent, according to this CNBC article.
- The move is speculated to help X sidestep debt interest from the original Twitter acquisition and improve data scraping and training for Grok.
- Midjourney Leaps into LLMs!: Midjourney, famed for AI image generation, is moving into LLMs, releasing a research paper with NYU on training LLMs like Llama and Mistral to write more creatively.
- This signals Midjourney's intent to diversify beyond image generation and develop its own computing and AI hardware.
- GPT-4o Shows Off Reasoning Skills!: GPT-4o has demonstrated reasoning capabilities, fueling speculation it's part of the GPT-5 system under development, with ongoing tool and update additions.
- One member excitedly noted it can even decide in the middle of a response to start doing reasoning.
- Meta Teases Llama 4 Release!: Three new models, cybele, themis, and spider, are reported to behave as if optimized for elomaxxing on the arena, potentially indicating imminent Llama 4 release candidates.
- The buzz is that Meta will release before their official event, echoing Llama 3's drop on April 18th, to avoid being eclipsed in model performance.
- Cracking the OpenAI Code: Multiscale Diffusion?: Analyzing OpenAI image generation frames reveals a multiscale structure, with evidence favoring interleaved latent autoregression over a Laplacian pyramid, decoded via non-causal diffusion across scales, according to this tweet.
- The raster scan in OpenAI's image generation is seemingly UI, with each frame reflecting global updates via coarse-to-fine multi-scale diffusion, rather than patch-wise AR.
GPU MODE Discord
- Ampere GPU threads defying expectations: A member calculated an Nvidia Ampere GPU with 96 SMs should theoretically support 12288 threads, but observed performance improvements up to 24576 threads.
- The member is analyzing Geohot's GPU Noob kernel to understand thread performance and questioned if kernel latency hiding could allow twice the cores to be scheduled concurrently on each SM.
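The arithmetic behind those figures: 12288 corresponds to one thread per FP32 lane, but an SM can keep far more threads resident than it has lanes, which is exactly how GPUs hide memory latency (the lane count per SM is an assumption about this particular part):

```python
sms = 96
fp32_lanes_per_sm = 128                  # "CUDA cores" per SM, assumed

one_thread_per_lane = sms * fp32_lanes_per_sm
print(one_thread_per_lane)       # 12288 — the member's theoretical figure
print(2 * one_thread_per_lane)   # 24576 — oversubscription that still helped
```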
- Triton's Emulated Dot Scaled Scaling back Performance: A user reported that using Triton's emulated `dot_scaled` function on H100 with the default behavior of upcasting to `bf16` hurts performance, consulting the Triton documentation for reference.
- Another user inquired about loading an entire matrix into L1 cache and processing it on a single SM in Triton, and whether subsequent `tl.load` calls on the same matrix would retrieve from L1 cache rather than HBM.
- PTX Compiler orchestrates Memory Access: A member expressed confusion regarding memory access patterns in FlashAttention, specifically about the necessity of reshaping data for 128-bit memory transfers, referencing section 5.3 of the CUDA C Programming Guide.
- Another member clarified that the PTX compiler manages the data layout in registers to ensure that a thread can write 128 bits of contiguous data to a single aligned gmem address with one instruction, recommending Nsight Systems (nsys) and Nsight Compute (ncu) to profile.
- BFloat16 Breaks RoPE says research: A new paper (When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training) identifies that BFloat16 introduces numerical errors in RoPE, compromising its relative encoding, even when computed in Float32.
- The paper introduces AnchorAttention, a plug-and-play method that improves long-context performance, reduces training time by over 50%, and preserves the model's general capabilities, with code supporting FlashAttention and FlexAttention available on GitHub.
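The failure mode is easy to reproduce: bfloat16 keeps only the top 16 bits of a float32 (8 exponent bits, 7 mantissa bits), so neighboring large position indices collapse to the same value and relative offsets are lost. A stdlib sketch of the truncation (round-to-nearest omitted for simplicity):

```python
import struct

def to_bf16(x: float) -> float:
    """Truncate a float32 to bfloat16 by dropping the low 16 bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF_0000))[0]

# Positions 8190 and 8191 become indistinguishable in bfloat16:
print(to_bf16(8190.0), to_bf16(8191.0))   # 8160.0 8160.0
```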
- Apple Silicon Memory Map a mystery: A member inquired about the on-chip caches and memory hierarchy in Apple Silicon M-Series GPUs, seeking the Apple equivalent to an NVIDIA A100 memory map and linked a paper on Apple M-Series SoCs.
- The discussion highlighted that Apple does not publicly reveal certain GPU details like NVIDIA, making it difficult to ascertain specific cache numbers, but the paper mentioned L1 caches (192 KB per core) and shared L2 caches up to 24 MB in the M4 chip.
Interconnects (Nathan Lambert) Discord
- Shear Extends Alignment Expertise with Softmax: Emmett Shear, Adam Goldstein, and David Bloomin have launched Softmax, a 10-person startup focused on organic alignment, aiming to fuse human and AI goals, as detailed in a Core Memory article.
- The startup is based in San Francisco and draws inspiration from nature and intelligent systems to achieve its alignment goals.
- Musk Merges xAI with X: Elon Musk announced that xAI is merging with X to integrate AI capabilities and expertise with X's reach, detailed by The Verge.
- The merger aims to leverage X's extensive platform to enhance and deploy xAI's advanced AI technologies.
- GPT-4o's Image Generation is Frontend Trickery?: A user discovered that GPT-4o's line-by-line image generation is a browser-side animation, with the server sending only 5 intermediate images at a patch size of 8, according to this tweet.
- This frontend illusion creates the effect of gradual image creation without the computational cost of generating each line individually.
- Gemini 2.5 Pro: Now Playing for Everyone: Gemini 2.5 Pro (experimental) is now available to all Gemini users due to TPUs running hot, as announced on GeminiApp's Twitter.
- The expanded access allows more users to test the model, though free users have rate limits.
- MiniMax Turns Text to Speech with Audio Speech-02: MiniMax AI launched Speech-02, which turns any file or URL into lifelike audio instantly in 30+ languages with native flair, unlimited voice cloning, and sub-second streaming, as detailed on MiniMax's Twitter.
- The model supports up to 200k characters in a single input, making it suitable for creating audiobooks and podcasts.
Modular (Mojo 🔥) Discord
- Lattner's Legacy: From LLVM to Modular AI: Chris Lattner shared a list of his published work, highlighting his contributions to LLVM, Clang, Swift, MLIR, and CIRCT, alongside his role at Modular AI.
- His leadership extends to the LLVM Foundation, where he serves as a board member, further solidifying his impact on modern compiler technology.
- Mojo REPL Faces Deprecation: A Modular forum discussion link highlights the deprecation of the Mojo REPL, signaling a shift in the language's development environment.
- Notebooks are being championed by members like Jeremy Howard for not only experimentation but also packaging with Mojo.
- Mojo Lists Hit Trait Object Segfault: Users encountered a segmentation fault (issue #4218) when creating a `List` of trait objects, like `List[Estimator]`, due to incomplete trait support.
- A suggested workaround involves using `List[Variant[KNN, SVM]]` with type checking via `isa` to call methods, enabling a form of heterogeneous list management.
- `def` vs `fn`: Mojo Syntax Showdown: A debate arose over `def` versus `fn` in Mojo, questioning if `fn` should be the default due to its type safety and typed Python workflows via Mypy.
- While some see `def` as beginner-friendly, a feature request suggests making `def` default to returning `None` to bridge the gap between Mojo and Python syntax.
- DeepSeek Ditches CUDA for PTX Layer: Members pointed out that DeepSeek's breakthrough was achieved by bypassing CUDA and directly accessing the PTX layer, a lower-level assembly-like programming interface.
- One member also stated that the NVIDIA driver isn't counted as cuda and that NVIDIA is a bit all over the place and inconsistent in their terminology over time.
Notebook LM Discord
- NotebookLM Demands Video Snippets: Users are requesting that NotebookLM include video snippets in its responses when a video is used as a source, to provide visuals; the team plans to enable multi-modal output in the future.
- Users want timestamps so they can skip through and relisten to specific sections like Audible.
- Mind Map Exports Remain Elusive: A user inquired about exporting Mind Maps in DOT format or publishing an interactive applet with the Google UI for NotebookLM.
- Unfortunately, this functionality is not currently available.
- Android Sharing System Integration Sought: Users are eager for NotebookLM to participate in the Android sharing system, ideally through a dedicated app.
- The suggestion involves the ability to automatically search inside a default notebook when choosing NotebookLM from the share menu.
- AI Voices Stumble on Pronunciation: A user is trying to improve how AI voices pronounce words in NotebookLM, especially with company names with unique spellings.
- The user is hoping that feeding the AI with another source with the correct pronunciation gets the audio overview to pronounce company names correctly.
- NotebookLM Plus Hits Mysterious Limits: A NotebookLM Plus subscriber encountered a 'You've reached your daily chat limits' message, hindering their usage, even after troubleshooting.
- Other users clarified that Plus users shouldn't face any limits.
LlamaIndex Discord
- LlamaIndex + SkySQL Launch AI Agents: LlamaIndex teams up with SkySQL to show how to build AI agent systems for reliable text-to-SQL conversion without code, per their announcement.
- LlamaIndex now integrates with OpenAI Responses API enabling complex multi-agent workflows.
- Telemetry Attributes Get Tagged: A member sought ways to pass custom telemetry attributes when using LlamaIndex, specifically to attach a user ID to events.
- A solution using OpenTelemetry and a Colab notebook example was shared, along with Arize's documentation.
- Multi-Modal OpenAI Agents Debut: Members discussed passing images as chat messages to `OpenAIAgent`, with one suggesting the use of OpenAI's multi-modal capabilities.
- Another recommended building an agent from scratch with workflows, or modifying `ChatMemoryBuffer` to add images to the request.
- Internet of Agents Proposed: A member shared an article on constructing an Internet of Agents to solve interop problems in agentic AI, which can be found at [IoA].
- The article suggests that open standards could unlock composability across ecosystems, including LlamaIndex.
tinygrad (George Hotz) Discord
- E-Waste Rig vs Tinygrad Box: A user questioned the value of a repurposed e-waste inference machine with 4x 4090s (linked here) when compared to the Tinygrad Box.
- Concerns were raised about potential PCIe errors due to the machine's homebrew motherboard, estimating its value at $1,000 + the cost of the 4090s.
- Finite Field Assembly: CUDA Alternative Surfaces: A user shared Finite Field Assembly, a CUDA alternative designed for computations over finite fields, extending C89 and supporting recursive computing.
- It leverages the properties of prime numbers to multiply several array elements concurrently, for example in matrix multiplication.
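The prime trick it alludes to is the Chinese Remainder Theorem: pack one residue per (pairwise-coprime) modulus into a single big integer, multiply once, and read every elementwise product back out. A stdlib sketch under invented moduli (not FFA's actual encoding):

```python
from math import prod

def crt_pack(residues, moduli):
    """Combine residues (one per pairwise-coprime modulus) into one integer."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # pow(.., -1, m): modular inverse (3.8+)
    return x % M

moduli = [101, 103]                 # big enough to hold every product below
a = crt_pack([2, 3], moduli)        # "array" [2, 3]
b = crt_pack([5, 4], moduli)        # "array" [5, 4]
c = (a * b) % prod(moduli)          # one big-integer multiply...
print([c % m for m in moduli])      # [10, 12] — ...all elementwise products
```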
- TinyGrad Internals Exposed!: A user shared their comprehensive notes on TinyGrad internals available here, covering UOps, ShapeTracker, and the Pattern Matcher, drawing inspiration from mesozoic-egg.
- These notes complement the official TinyGrad documentation with a deep dive into the architecture.
- ORT CPUExecutionProvider Silently Casts Float16!: A user reported that the ORT CPUExecutionProvider silently casts inputs into float32 for float16 models, runs computations with float32, and casts the output back into float16, which is blocking numpy removal.
- The user suggested adding an envvar to replicate this behavior in their ONNX setup for testing and debugging purposes.
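The precision consequence is easy to see with the stdlib's half-precision `struct` format: values already differ after a single cast, so computing in float32 between silent casts yields different results than true float16 execution (this sketch shows only the cast itself, not ORT's internals):

```python
import struct

def cast_f16(x: float) -> float:
    """Round a Python float through IEEE half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

print(cast_f16(0.1))          # 0.0999755859375, not 0.1
print(cast_f16(0.1) == 0.1)   # False — each silent cast loses bits
```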
- VAE tinygraining takes off!: A member has been experimenting with building a VAE with tinygrad and has successfully modified Huggingface's Diffusers library to work with tinygrad.
- The VAE used in Stable Diffusion is now functional, with the code available here.
Torchtune Discord
- FP8 Training Recipes Explored: Most FP8 training recipes are actually FP8 QAT; if you can only train on GPUs without native FP8 support (e.g. A100), QAT is the fallback, whereas GPUs with FP8 support can train in FP8 directly.
- Torchtune office hours take place next Friday, with a Discord link for details.
- Discord Time Zones Finally Click: Members discussed the automatic conversion of time zones within Discord for events.
- One member shared a brain meme GIF in response to successfully converting time zones on the fly.
- Code Review Team asked to Step on the Gas: A member requested a final review for PR #2441 to expedite the merge process, as all checks have already passed.
- Another member was pinged to review the PR.
- GRPO Teaches Search on the Internet: A paper on GRPO to teach searching on the internet was shared arxiv.org/pdf/2503.09516.
- Details of the project were not otherwise revealed.
Cohere Discord
- Command-R Boasts Speedy Performance: The Command-R model is confirmed as the fastest and most versatile model; the playground uses Command-A by default and does not support switching models.
- Users were directed to use the API to try out different models.
- Aya-Vision Image Uploads Glitch: Users reported errors when uploading images to the playground using Aya-Vision, and on the Aya Vision demo on Hugging Face it sometimes takes over 30 seconds to respond.
- A Cohere staff member responded that they will investigate the latency on their end.
- Docs Typo Causes Bad Request: A user reported a typo in Cohere's documentation where `train_epoch=1` should be `train_epochs=1`, causing a `BadRequestError`.
- A Cohere staff member confirmed the typo and pushed a fix.
- Indy Game Dev Turns to Cohere: A self-taught indy game developer working mainly in C++ with graphics and audio libraries introduced themselves, mentioning they are currently working on a browser game for their friend's web animation series.
- This developer has started using Cohere as an alternative to the other big names.
Nomic.ai (GPT4All) Discord
- Libre Wolf Faces Security Scrutiny: Members discussed the security of Libre Wolf compared to Firefox, questioning its advantages.
- The conversation did not provide a definitive answer, but highlighted the importance of browser security considerations.
- GPT4All Model Search Stumbles: A user reported difficulty searching GPT4All models, noting the absence of a built-in search feature.
- A member clarified that local model list search hasn't been a GPT4All feature for 2 years, and provided links to the model lists on GitHub.
- Documentation Ingestion Model Assistance: A member requested advice on a model capable of ingesting documents and answering questions.
- Another member shared the GPT4All wiki with official translations and suggested using Google Translate for other languages.
- Llama3 8B Instruct Tested for Blogging: A user inquired about the suitability of Llama3 8B Instruct for creating blog posts and webpages from video courses.
- The discussion prompted a question about the difference between .bin and .gguf files and their interchangeability, but did not provide a definitive answer about suitability for blogging.
DSPy Discord
- Pydantic's `conint` Triggers Validations: The `conint` feature in Pydantic sets constraints, such as `conint(ge=1, le=10)`, but throws a `ValidationError` if the output falls outside the specified range.
- A member requested that DSPy dynamically generate examples and resend requests upon validation failures, but this is currently not functioning as expected.
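The behavior the member wanted can be sketched without DSPy internals: validate each output and re-issue the request on failure (everything here — `generate`, `check_range` — is a hypothetical stand-in, not DSPy's API):

```python
def check_range(x, lo=1, hi=10):
    """Stand-in for conint(ge=1, le=10): reject values outside [lo, hi]."""
    if not (lo <= x <= hi):
        raise ValueError(f"{x} not in [{lo}, {hi}]")
    return x

def generate_validated(generate, validate=check_range, max_retries=3):
    """Call the (hypothetical) LM until its output passes validation."""
    last_err = None
    for _ in range(max_retries):
        out = generate()
        try:
            return validate(out)
        except ValueError as err:
            last_err = err          # resend, ideally with fresh examples
    raise last_err

outputs = iter([42, 7])             # first attempt invalid, second valid
print(generate_validated(lambda: next(outputs)))   # 7
```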
- RateLimitErrors Bug MIPROv2 Users: Users reported frequent RateLimitErrors despite setting `num_threads=1` when using MIPROv2 with `gpt-4o-mini` on Azure OpenAI, because `MIPROv2.compile()` makes multiple internal API calls.
- It's suggested to add retry logic with a `sleep(30)` interval, lower `max_*_demos`, and upgrade to the latest DSPy version with built-in rate throttling.
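A minimal version of the suggested retry logic, with capped exponential backoff instead of a flat `sleep(30)` (the exception type is a stand-in for the SDK's `RateLimitError`):

```python
import time

def with_backoff(call, retries=5, base=1.0, cap=30.0):
    """Retry a rate-limited call, sleeping min(cap, base * 2**attempt)."""
    for attempt in range(retries):
        try:
            return call()
        except RuntimeError:        # stand-in for RateLimitError
            if attempt == retries - 1:
                raise
            time.sleep(min(cap, base * 2 ** attempt))
```

Wrapping each optimizer-triggered API call this way keeps the internal burst of requests from failing outright while letting successful calls return immediately.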
- Rate Limit Workarounds Hamper Optimization: A user finds that reducing `max_bootstrapped_demos` and `max_labeled_demos` to circumvent RateLimitErrors hurts optimization.
- They suggest DSPy should have a better internal mechanism to manage API call frequency, since structured prompting in MIPROv2 and COPRO can lead to errors if the LLM returns empty outputs due to API truncation or rate limits.
- Signatures as a,b -> c: In DSPy, the signature is defined as "a, b -> c", where a, b, and c are meaningful names.
- The optimizer then generates prompts and runs them on a dataset to determine the best performing prompt.
LLM Agents (Berkeley MOOC) Discord
- DeepMind Engineer to Present AlphaProof Lecture: Thomas Hubert, a research engineer at Google DeepMind, will present "AlphaProof: when reinforcement learning meets formal mathematics" on 3/31 at 10AM PDT, livestreamed on YouTube.
- The lecture will explore how computers contribute to grand problems like the Birch and Swinnerton-Dyer conjecture, with Hubert holding an MS in Mathematics from Stanford University.
- MOOC Lecture Times Adjusted: The LLM Agents MOOC lecture today was moved to 10 AM PST to accommodate the speaker from the UK.
- The course website (llmagents-learning.org/sp25) and Discord server provide essential links and discussion forums for the LLM Agents MOOC.
- Lecture Recordings Available: Recordings from prior LLM Agents MOOC lectures can be found on the course website and in this YouTube playlist.
- Quizzes for the course are completion based, meaning the score does not matter as long as they are attempted.
- AgentX Credits Offered: AgentX offers credit resources, and details can be found on the AgentX website.
- A collection form for those wanting credits for AgentX will be released this week.
MLOps @Chipro Discord
- TMLS 2025 kicks off Call for Speakers: The Call for Speakers has opened for the Toronto Machine Learning Summit (TMLS) in June 2025.
- TMLS 2025 boasts 16 specialized tracks, including Advanced RAG, Multimodal LLMs, AI Agents in Production, MLOps for Smaller Teams, Responsible AI Implementation, and GenAI Deployments.
- MLOps focuses on Smaller Teams: The Toronto Machine Learning Summit will feature an MLOps track specifically designed for smaller teams.
- This track provides a platform for these teams to exchange experiences and gain insights from others in the field of MLOps.
The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
PART 2: Detailed by-Channel summaries and links
The full channel by channel breakdowns have been truncated for email.
If you want the full breakdown, please visit the web version of this email: !
If you enjoyed AInews, please share with a friend! Thanks in advance!