AI News (MOVED TO news.smol.ai!)

Archives
October 30, 2024

[AINews] GitHub Copilot Strikes Back

This is AI News! An MVP of a service that goes through all AI Discords/Twitters/Reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜


GitHub may be all you need for AI-Native coding.

AI News for 10/28/2024-10/29/2024. We checked 7 subreddits, 433 Twitters and 32 Discords (231 channels, and 2681 messages) for you. Estimated reading time saved (at 200wpm): 279 minutes. You can now tag @smol_ai for AINews discussions!

GitHub's tenth annual Universe conference was today:

image

And it brought a raft of notable announcements (full blogpost here), most of them GitHub's takes on popular AI coding tools.

  1. Multi-model Copilot: adding Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5 Pro, and OpenAI’s o1-preview in a new model picker UI. Copilot's base model has progressed through Codex, GPT-3.5, GPT-4, 4o, and 4o-mini, but for the first time developers can choose models from other companies, including Google. This was big enough to reach mainstream media today, and one can't help but tie the story to reports of a "fraying" Microsoft-OpenAI partnership.

image

Cassidy Williams also demoed the new multi-file editing capability of Copilot and a custom instructions file - analogous to the Composer and .cursorrules features of Cursor.

  2. GitHub Spark: "the AI-native tool to build applications entirely in natural language. Sparks are fully functional micro apps that can integrate AI features and external data sources without requiring any management of cloud resources." Basically their v0, bolt.new, and Claude Artifacts competitor, complete with deployment-free hosting, a themable design system, persistent data storage, and integrated model prompting.

"Utilizing a creativity feedback loop, users start with an initial prompt, see live previews of their app as it’s built, easily see options for each of their requests, and automatically save versions of each iteration so they can compare versions as they go."

image

The presenters also covered the latest on GitHub Models (now off the waitlist), last year's big launch Copilot Workspace (adding Brainstorm and Build/Repair agents to the existing Spec/Plan/Implement three, plus a new VSCode extension), Code Reviews, and security Autofix updates.


[This issue brought to you by Weights & Biases]!: Your LLMs aren't just text-only anymore - so why should your observability be?

Weave from Weights & Biases now supports audio tracing alongside text, images and other modalities. With just 3 lines of code, track every input, output and metadata across your multimodal AI stack.

Try it yourself in our interactive Colab notebook!

swyx commentary: this notebook looks short, but IMO the gold is the 19 cells hidden under "Advanced Usage: Realtime Audio API with Weave"! You wouldn't expect a normal LLM Ops product to have updated to support the OpenAI Realtime API so soon, but it looks like the WandB team have been cooking.


The Table of Contents and Channel Summaries have been moved to the web version of this email!


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

AI Development and Industry Trends

  • Tinygrad Optimization: @jxmnop noted that tinygrad is focusing on reducing lines of code compared to PyTorch, resulting in a codebase that's growing horizontally and becoming borderline unreadable to humans.
  • AI Model Capabilities: @fchollet pointed out that the current low adoption rate of GenAI indicates potential for growth, contrary to claims of 40% adoption. @rohanpaul_ai highlighted Gemini Flash-8B's strong price-performance ratio, with $0.0375 per million input tokens and $0.15 per million output tokens.
  • AI Infrastructure: @rohanpaul_ai shared details about xAI's Colossus supercomputer, featuring 100,000 NVIDIA Hopper GPUs and plans to double to 200,000. The system uses NVIDIA Spectrum-X Ethernet platform, supporting 800Gb/s port speeds.
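As a back-of-envelope illustration of that Gemini Flash-8B price-performance claim, per-request cost follows directly from the quoted rates (the token counts in the example below are hypothetical):

```python
# Gemini Flash-8B rates quoted above: $0.0375 per 1M input tokens,
# $0.15 per 1M output tokens.
INPUT_PRICE_PER_M = 0.0375
OUTPUT_PRICE_PER_M = 0.15

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the quoted per-million-token rates."""
    return (input_tokens / 1e6) * INPUT_PRICE_PER_M \
         + (output_tokens / 1e6) * OUTPUT_PRICE_PER_M

# A hypothetical 10k-token prompt with a 1k-token completion:
print(f"${request_cost(10_000, 1_000):.6f}")  # → $0.000525
```

At these rates, even a million such requests per day would cost roughly $525, which is the kind of arithmetic behind the "strong price-performance" framing.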

AI Applications and Tools

  • Perplexity Spaces Update: @perplexity_ai announced improvements including 5 file uploads for free users, enhanced custom instructions, detailed Space overview cards, and support for Markdown files.
  • RAG Developments: @togethercompute shared an open-source implementation of Contextual RAG using Llama models, involving context generation, hybrid search, and reranking. @llama_index introduced advanced RAG systems using MLflow and LlamaIndex Workflows for flexible orchestration and evaluation.
  • AI Agents: @omarsar0 launched a course on AI Agents, covering fundamentals and practical tips for building agentic AI systems. @LangChainAI shared a comprehensive repository for agent development using LangGraph.
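The Contextual RAG recipe mentioned above (context generation, then hybrid search, then reranking) can be sketched end to end. This is a toy illustration, not the actual Together implementation: each LLM step is stubbed out with a trivial heuristic so the pipeline shape is visible.

```python
# Toy sketch of the Contextual RAG pipeline: context generation ->
# hybrid search -> reranking. All model calls are stubbed.

def add_context(doc_title: str, chunk: str) -> str:
    # Real pipeline: an LLM writes a short summary situating the chunk
    # within its document. Stubbed here by prefixing the document title.
    return f"[{doc_title}] {chunk}"

def keyword_score(query: str, text: str) -> float:
    # Fraction of query words appearing in the text (a stand-in for BM25).
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def hybrid_search(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    # Real pipeline: BM25 plus dense-embedding retrieval, merged.
    ranked = sorted(chunks, key=lambda c: keyword_score(query, c), reverse=True)
    return ranked[:top_k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    # Real pipeline: a cross-encoder reranker; we reuse the keyword score.
    return sorted(candidates, key=lambda c: keyword_score(query, c), reverse=True)

chunks = [add_context("Llama 3 report", "The model uses grouped-query attention."),
          add_context("Llama 3 report", "Training used 15T tokens of data.")]
print(rerank("how many training tokens", hybrid_search("how many training tokens", chunks)))
```

In the real pipeline each stub is a model call: a Llama model writes the situating context, BM25 plus an embedding index handle hybrid retrieval, and a reranker model orders the final candidates.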

AI Research and Model Updates

  • Model Comparisons: @ajayj_ reported that Genmo Mochi 1, an open-source video generation model, outperforms Runway, Kling, Luma, and Pika models according to community votes.
  • Optimization Techniques: @giffmana highlighted the effectiveness of sigmoid loss with bias in improving model performance.
  • Context Window Expansion: @rohanpaul_ai mentioned ongoing work on 100M-token context window LLMs and research on 1B-token context windows, potentially impacting the future of RAG.

AI Ethics and Societal Impact

  • AI Adoption Concerns: @ylecun criticized the superiority complex of some tech leaders, warning against treating followers as "low IQ" and expecting blind submission.
  • AI Productivity Impact: @random_walker shared skepticism about claims of significant productivity boosts from AI, noting only a 1% increase despite 3% usage.
  • AI in Education: @svpino cautioned against overestimating AI's capabilities in building SaaS businesses, emphasizing that AI is a tool, not a complete solution.

AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Optimizing LLM Inference on Consumer Hardware

  • What's the best way to run llama on a local GPU (low-end RTX3000)? Interested in both calling it from within Python as well as a GUI. The space evolves so quickly, so I'd love an up-to-date recommendation! Thanks (Score: 39, Comments: 23): For running Llama models on low-end RTX 3000 GPUs, current recommendations include using llama.cpp or text-generation-webui for a GUI interface, and transformers library with bitsandbytes for Python integration. These methods allow for efficient quantization and inference on consumer-grade hardware, though specific performance may vary based on model size and available VRAM.
    • Ollama with Open webui is recommended, with some users running it via Docker container and making HTTP calls for integration. The author suggests Harbor for a comprehensive LLM stack using Docker.
    • Users employ various interfaces: mikupad for writing with llama.cpp, TabbyAPI with LLama-3.1 or 3.0 models integrated into silly tavern, and Lm studio or Aya for GUI and OpenAI API compatibility.
    • Some prefer custom setups, like running llama.cpp in scripts for pure writing, emphasizing the importance of alternative token selection which may be absent in other UI options.
  • AI scores of mobile SoCs by brand, year and segment (Score: 43, Comments: 6): The post analyzes AI performance benchmarks of mobile SoCs from ai-benchmark.com, revealing significant performance gaps between flagship and high-end segments. Notable findings include the Snapdragon 7+ series outperforming its branding, Dimensity's substantial AI performance increase in recent generations, and the four-year-old Snapdragon 8 Gen 1 still surpassing newer Snapdragon 7 series, 8s Gen3, and most Dimensity processors, with the A17 Pro scoring 3428, just below the Snapdragon 8 Gen 3.
    • Users discussed running large language models on phones, with interest in models like 16B deepseek v2 Lite MoE and Llama 3.1 8b. The ZTE Z60 Ultra with up to 24GB RAM was mentioned as capable of running 12B models.
    • Debate arose over the relevance of the benchmark's tested models, with some arguing that TFLOPS, TOPS, and memory bandwidth specs are more informative for real-world AI applications on phones than scores based on models like Inception V3.
    • Interest was expressed in the state of Mediatek chipsets for AI tasks, particularly regarding GPU and NPU functionality. The post highlighted Dimensity's recent improvements in AI performance.
  • Updated with corrected settings for Llama.cpp. Battle of the Inference Engines. Llama.cpp vs MLC LLM vs vLLM. Tests for both Single RTX 3090 and 4 RTX 3090's. (Score: 75, Comments: 51): Llama.cpp, MLC LLM, and vLLM were benchmarked for LLM inference on consumer GPUs, specifically testing with a single RTX 3090 and four RTX 3090s. The post provides updated results with corrected settings for Llama.cpp, comparing the performance of these three inference engines across different GPU configurations.
    • Llama.cpp performance improved significantly after correcting settings, reaching 50-51 tokens/second for single GPU tests and 15 tokens/second for 4x GPU tests. The community suggested adding exllama to future benchmarks and exploring quantized model comparisons.
    • A blog post was shared, detailing benchmarks for multiGPU scaling, concurrent requests, and speculative decoding. Users expressed interest in how MLC-LLM scales across 1-4 GPUs, with one user reporting 25 tokens/second on 1 GPU and 34 tokens/second on 2 GPUs using MI60 cards.
    • Discussions focused on PCIE bandwidth usage, with tests showing surprisingly low utilization (0.1 MB/s) during tensor parallel inference. Users also debated the choice of FP16 for benchmarks, with some preferring Q4 or Q8 quantization for practical use cases.
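A rough rule of thumb behind the VRAM and quantization debates in these threads: weight memory scales with parameter count times bits per weight. The sketch below is a simplification that ignores KV cache, activations, and framework overhead, so treat the numbers as lower bounds:

```python
# Approximate VRAM needed just for model weights, by quantization level.
# Real usage is higher: KV cache, activations, and framework overhead
# all add on top of this.
BITS = {"fp16": 16, "q8": 8, "q4": 4}

def weight_gb(params_billions: float, quant: str) -> float:
    """Estimated weight memory in GB (using 1 GB = 1e9 bytes)."""
    bytes_per_param = BITS[quant] / 8
    # 1e9 params per billion, divided by 1e9 bytes per GB, cancels out:
    return params_billions * bytes_per_param

for quant in ("fp16", "q8", "q4"):
    print(f"8B model @ {quant}: ~{weight_gb(8, quant):.0f} GB")
```

This is why an 8B model at Q4 (~4 GB of weights) fits on low-end cards while the same model at FP16 (~16 GB) does not, and why the FP16-vs-Q4/Q8 choice matters so much in the benchmark discussions above.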

Theme 2. Advancements in Open-Source LLMs for Creative and Uncensored Use Cases

  • Three Llama 3.2 Models enhanced, at 7B each for creative uses - uncensored. (Score: 44, Comments: 19): Three enhanced Llama 3.2 7B models have been released for creative and uncensored use, each expanded to 67 layers and 606 tensors. The models, available on Hugging Face, are rated on a "de-censor" scale from 1-10 and feature improved instruction following, nuance, emotion, and prose depth, with censorship and bias controllable via prompts.
    • Frankenstein models are criticized as often being "lobotomized" and underperforming, with users suggesting using full-size models with adjusted settings instead. The model creator defends his approach, citing 45 examples of improvements and explaining his unique methods for building and testing models.
    • User export_tank_harmful praises the creator's work, particularly mentioning MN-Dark-Planet-TITAN-12B and L3-Dark-Planet-8B models. They suggest including the creator's Hugging Face name in Reddit posts for credibility and express support for continued abliteration efforts.
    • Discussion on model availability for ARM devices, with the creator clarifying that ARM-optimized models have filenames ending in Q4_0_4_8.gguf. Currently, only 3 versions are supported by llamacpp for ARM optimization.
  • LLM Recommendation for Erotic Roleplay (Score: 48, Comments: 61): The post seeks recommendations for Large Language Models (LLMs) specifically for erotic roleplay, listing several options with a focus on DarkForest V2 and backyardai/Midnight-Rose-70B-v2.0.3-GGUF as top contenders. The author also mentions other models like Stheno, Lyra 12B V4, TheSpice-8b, and various others ranging from 8B to 72B parameters, but considers them potentially weaker for this specific use case.
    • ArsNeph recommends newer models, highlighting L3 Stheno 3.2 8B, Magnum V4, UnslopNemo 12B, Mistral Small 22B and its finetunes like Cydonia. For larger models, they suggest Midnight Miqu 1.5 70B, Euryale 2.1 70B, and New Dawn Llama.
    • Several users endorse Midnight Rose and Midnight Miqu as top choices for erotic roleplay. TheLocalDrummer mentions that some users prefer Behemoth v1.1 over Midnight Miqu, while others recommend trying NemoMix-Unleashed-12B and EVA-Qwen2.5-72B-v0.0.
    • Users suggest exploring Gemma-2-27B despite its censorship, and Mistral-Small-22B-ArliA…

Theme 3. Innovations in LLM Tooling and Infrastructure

  • We just Open Sourced Promptwright: Generate large synthetic datasets using a local LLM (Score: 63, Comments: 12): Promptwright, an open-source Python library for generating synthetic datasets using local LLMs via Ollama, has been released. It offers a simple interface for dataset generation, configurable instructions and system prompts, JSONL output format, and direct integration with Hugging Face Hub, allowing users to process thousands of samples locally without API costs or rate limits while maintaining data privacy.
  • Mistral.rs v0.3.2 gets a 26% Metal performance boost and PyPI wheels! (Score: 62, Comments: 16): Mistral.rs v0.3.2 introduces simplified installation via PyPI wheels for various platforms (Metal, CUDA, Apple Accelerate, Intel MKL, and plain CPU) and achieves a 26% performance boost for Metal decoding through optimized MLX attention kernels. The update also includes CUDA improvements with a Marlin GPTQ kernel and FP8 quantization, along with support for models like Llama 3.2 Vision, with links provided to the GitHub repository, Python package documentation, and a UQFF model collection for prequantized models.
  • Retrieval system extending any off-the-shelf LLM to 1B (billion) context on a standard CPU during inference time: (Score: 63, Comments: 6): A new retrieval system has been developed that can extend the context length of any off-the-shelf Large Language Model (LLM) to 1 billion tokens during inference time, using only standard CPUs. This system, detailed in a Zyphra blog post and an arXiv paper, significantly expands the capability of LLMs to process and understand vast amounts of information without requiring specialized hardware.
    • The title's claim of "1B context length" is criticized as clickbait, with users noting it refers to tokens in a vector store, not actual inference. Inference on 1M context for an 8B model would take ~3000s on an A100 GPU.
    • Users humorously extend the concept, suggesting even larger context lengths like 100B tokens or 100 Petabytes (referencing Google's index size) to highlight the arbitrary nature of such claims.
    • There's interest in benchmarks beyond hash chain retrieval and potential applications, such as creating small LMs (e.g., 1B models) that load necessary knowledge via RAG, potentially outputting thousands of tokens per second.

Theme 4. Challenges in AI Document Understanding and Real-World Applications

  • How I used vision models to help me win at Age Of Empires 2. (Score: 327, Comments: 51): The author developed WololoGPT, an AI-based coach for Age of Empires 2, using vision models and LLMs to provide real-time gameplay advice, including resource management and enemy countering strategies. The project, built using Claude 3.5 and Gemini Flash for vision processing, is open-source and available on GitHub, with a video demo and downloadable executable on the official website.
    • Echo9Zulu- suggests developing a system for recording application data, viewing WololoGPT as an opportunity to build valuable training data on model interpretations of game events. They recommend studying model behavior using AoE2 as a template, particularly focusing on how models handle fog of war effects on strategy.
    • The project is praised for its potential to push the state-of-the-art in vision model applications, with the commenter noting that current literature is limited on such use cases. They recommend passively recording data to capitalize on this opportunity.
    • WololoGPT is described as a "cool build" that provides a gameplay boost without feeling like complete cheating. The developer confirms it has improved their gameplay, describing it as a "little boost."
  • Document understanding is very very hard: an illustration (Score: 34, Comments: 26): The post illustrates document understanding difficulties for LLMs using a San Francisco pool schedule example. The author challenges readers to extract recurring lap swim periods from a single-page flyer, with bonus tasks including generating an ical (ics) format and handling holidays, noting that models often miss Mondays and misinterpret Wednesday lap swim times. Despite some impressive capabilities, the author concludes that even advanced LLMs struggle with tasks a six-year-old can accomplish, cautioning against premature deployment of document understanding in production environments.
    • Users critiqued the pool schedule layout, noting its poor design and inconsistencies. One commenter highlighted that such goofy layouts are common in professional settings, citing M&A Diligence Checklists as an example.
    • A user successfully extracted the schedule using Chat 4.0 and created a Python script to generate an ical file in 5 minutes. The script handles recurring events but doesn't account for holidays.
    • Gemini 1.5 Pro in AI Studio correctly extracted most of the schedule, including the tricky Wednesday lap swim times, but added a non-existent Sunday evening slot. Users discussed multi-step reasoning and the challenges of handling different image resolutions with vision models.
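For the ical half of that challenge, recurring lap-swim periods boil down to VEVENTs with weekly RRULEs. A minimal stdlib-only sketch follows; the days and times are illustrative, not the actual SF pool schedule, and holidays are ignored (as in the commenter's script):

```python
# Hedged sketch: build a .ics string with a weekly recurring event.
# Days/times below are made up for illustration; holiday exceptions
# (EXDATE lines) are omitted, matching the limitation noted above.
from datetime import datetime

def lap_swim_event(summary: str, byday: str,
                   start: datetime, end: datetime) -> str:
    """One weekly-recurring VEVENT. byday uses iCal day codes, e.g. 'MO,WE,FR'."""
    fmt = "%Y%m%dT%H%M%S"
    return "\n".join([
        "BEGIN:VEVENT",
        f"SUMMARY:{summary}",
        f"DTSTART:{start.strftime(fmt)}",
        f"DTEND:{end.strftime(fmt)}",
        f"RRULE:FREQ=WEEKLY;BYDAY={byday}",
        "END:VEVENT",
    ])

events = [lap_swim_event("Lap Swim", "MO,WE,FR",
                         datetime(2024, 11, 4, 7, 0),
                         datetime(2024, 11, 4, 9, 0))]
ics = "\n".join(["BEGIN:VCALENDAR", "VERSION:2.0", *events, "END:VCALENDAR"])
print(ics)
```

Getting this far is the easy part; the hard part the post highlights is reliably extracting the correct days and times from the flyer in the first place.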

Other AI Subreddit Recap

/r/machinelearning, /r/openai, /r/stablediffusion, /r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Model Releases and Capabilities

  • Updated Phi-3 Mini with function calling: Rubra AI released an updated Phi-3 Mini model with function calling capabilities, competitive with Mistral-7b v3 (/r/LocalLLaMA).
  • OpenAI's o1 reasoning model: OpenAI CFO Sarah Friar says lawyers report the new o1 reasoning model can do the work of a $2000/hour paralegal (/r/singularity).

AI Applications and Demonstrations

  • AI-assisted multi-arm robot for apple picking: A video demonstrates an AI-assisted multi-arm robot that can identify and pick ripe apples (/r/singularity).
  • Realistic facial animation using Stable Diffusion: A developer is working on a realistic facial animation system for Meta Quest using Stable Diffusion, running at 90fps on the Quest 3 (/r/StableDiffusion).
  • Robot hand with tactile sensing: Robot Era introduced its first-generation XHAND, featuring 12 degrees of freedom and tactile sensing in each finger (/r/singularity).
  • Robots performing beauty services: A video shows robots doing nails and eyelashes in LA, demonstrating automation in previously human-dominated services (/r/singularity).

AI Policy and Infrastructure

  • US government push for AI infrastructure: National Security Advisor Jake Sullivan states the US needs to build 10s or even 100s of gigawatts of energy infrastructure to power AI data centers or risk falling behind competitors (/r/singularity).

AI Impact and Societal Discussion

  • Discussions on AI's impact on employment: Multiple posts discuss the potential impact of AI on jobs, including comparisons to the decrease in horse populations after the invention of cars (/r/singularity).
  • Public perception of AI advancements: A post discusses how people react to being shown ChatGPT, with some becoming very interested and others remaining unimpressed (/r/singularity).

Memes and Humor

  • A post humorously suggests using Stable Diffusion to create disinformation about the history of bathtubs (/r/StableDiffusion).

AI Discord Recap

A summary of Summaries of Summaries by O1-preview

Theme 1: AI Model Releases Shake Up the Scene

  • Stable Diffusion 3.5: Big Power in Medium Size!: Stability.ai unleashes Stable Diffusion 3.5 Medium, a 2.5 billion parameter model that runs on just 9.9 GB of VRAM, democratizing high-quality image generation.
  • Moondream Bets Small Models Can Pack a Punch: Moondream raises $4.5 million to prove that smaller AI models are just as effective, shifting the industry's focus from gigantic architectures.
  • GitHub Copilot Supercharges with Claude and Gemini: GitHub's Copilot integrates Claude 3.5 Sonnet and Google's Gemini 1.5 Pro, giving developers an AI power-up.

Theme 2: AI Tooling Gets a Turbo Boost

  • Unsloth Slays Complexity with Gradio UI: An innovator rolls out a Gradio app that simplifies model training with Unsloth, making AI accessible even for no-code enthusiasts.
  • ThunderKittens Roar with Lightning Speed: The much-awaited ThunderKittens 0.000000002 drops, boasting 6-14x faster linear attentions and outpacing FA3 in attention backward passes.
  • Developers Tinker Triton Kernels for Speed: Engineers discuss optimizing Triton kernels, finding that multiple kernels outperform single ones, and uncover challenges with BF16 casts.

Theme 3: AI Privacy and Security Take Center Stage

  • PAPILLON Flutters In to Protect Privacy: Researchers debut PAPILLON, hitting 85.5% quality with just 7.5% privacy leaks, blending local and cloud LLMs securely.
  • ChatGPT Typo Tantrums Baffle Users: ChatGPT starts spewing typos and gibberish, leaving users scratching their heads about the sudden drop in output quality.
  • Apple Throws Down $1M Hackathon Gauntlet: Apple dares hackers to breach their AI servers with a whopping $1 million bounty, sparking debates on AI security.

Theme 4: AI Jobs Abound on New Platforms

  • Cracked Engineers Breaks Ground in Tech Hiring: The freshly launched Cracked Engineers connects AI talent with top startups, already partnering with Weaviate, UnslothAI, and more.
  • AI Startups on the Hunt for Top Talent: Companies like Unsloth AI, Julius AI, and Jimini AI are actively recruiting, offering amazing opportunities for those ready to dive into cutting-edge AI.
  • Job Seekers Rejoice: Tailored Newsletters Incoming: Cracked Engineers announces a weekly tech jobs newsletter, letting subscribers tailor content with tags like CUDA, MLOps, and Software Engineering.

Theme 5: AI Community Buzzes with Events and Insights

  • LLM Agents Hackathon Hits the Ground Running: With over 1,000 innovators registered, the LLM Agents Hackathon dangles over $200K in prizes across five thrilling tracks.
  • OpenAI's CFO Says "AI Isn't Experimental Anymore!": In a candid interview, OpenAI CFO Sarah Friar proclaims that AI has gone mainstream, infiltrating banks and fintech daily.
  • Meta Sets Sights on Its Own AI Search Engine: Meta's web crawlers hint at a new AI-powered search engine, aiming to cut ties with Google and Microsoft.

PART 1: High level Discord summaries

HuggingFace Discord

  • Clem Introduces Himself to the Community: Clem, co-founder and CEO at Hugging Face, expressed excitement about using Discord to connect with community members, saying "I can't wait to interact with all of you."
    • He also promoted an upcoming live workshop, encouraging members to share ideas on expanding its visibility and participation via this link.
  • Frustrations with TensorFlow: Many members aired frustrations with TensorFlow, citing disabled GPU support on Windows and complex documentation, often preferring to transition to PyTorch for faster development.
    • Shared experiences reflected a common sentiment of dissatisfaction with TensorFlow’s bugs and lack of community support.
  • Exploration of Hemp Nanosheets: Hemp-derived carbon nanosheets show potential as cost-effective alternatives to graphene in energy storage, with feasibility established at $500 per ton by Dr. David Mitlin's research.
    • This sparked discussions on military and aerospace applications, indicating growing interest in alternative materials for high-tech industries.
  • Swin Transformer v2 Discussion: Members explored using Swin Transformer v2 for handling image-like data cubes, discussing how to adapt the architecture for unusual input shapes.
    • One user mentioned using data cubes instead of traditional images, prompting conversations about the necessary architectural adjustments.
  • LangChain SQL Agent Resource Sharing: A GitHub notebook detailing a LLaMA2 SQL chat was shared as a resource for developing context-aware reasoning applications with LangChain SQL Agent.
    • The resource should help users enhance their implementations, illustrating the community's focus on modern tooling for NLP tasks.


Unsloth AI (Daniel Han) Discord

  • Gradio UI Tool Simplifies Model Training: A user created a Gradio app that streamlines training models with Unsloth, making it easier to adjust settings and upload models to Hugging Face.
    • The enhancement aims to assist no-code users, significantly improving the accessibility of AI model training.
  • AI Job Opportunities from Unsloth: Unsloth is spotlighting a hiring campaign through Cracked Engineers, aiming to attract tech talent in AI fields.
    • Community members are encouraged to explore job listings on the platform while using it for job tracking.
  • FP8 Fine-Tuning for Enhanced Training Speed: There's ongoing discussion about adopting FP8 training within Unsloth, suggesting potential speed improvements.
    • The community raised questions about implementation specifics, particularly regarding base weights and LoRA.
  • Frustrations with Educational Systems: Members discussed feeling that time was wasted in school, with one expressing a desire to make a difference instead.
    • The sentiment resonated, as others reflected on how personal experiences shape educational perspectives.
  • Insights on Optimizer CPU Offload: Discussion centered on the potential of optimizer CPU offload to improve efficiency in low-bit training frameworks.
    • By shifting optimizer state and operations to the CPU, models can achieve faster training times and better resource use.


Stability.ai (Stable Diffusion) Discord

  • Stable Diffusion 3.5 Medium Model Launch: The Stable Diffusion 3.5 Medium model, with 2.5 billion parameters, is available for free commercial use and runs on consumer hardware with just 9.9 GB of VRAM.
    • The launch aims to broaden access to AI by ensuring compatibility even with lower-end devices, transforming the landscape for creators.
  • Image Quality Hits New Heights: Users confirmed that Stable Diffusion 3.5 Medium excels at generating images over 1MP, outperforming the 3.5 Large variant in prompt adherence and quality.
    • Once images exceed 2MP, however, the model starts to struggle, indicating limits to its scaling capabilities.
  • GPU Price Wars Rumble On: Current market trends show 3090 GPUs priced similarly to the 7900 XTX, with used 3090s hovering around $690.
    • Discussions compared GPU performance for AI workloads versus gaming, emphasizing the shifting dynamics of hardware affordability.
  • Sana Autoencoder Mixed Reactions: The Sana autoencoder promises efficient training and compression but received mixed feedback on its image quality.
    • Some users remain skeptical, calling for further validation of models built on this technology.
  • Switching UIs for Enhanced User Experience: Users explored switching from A1111 to ComfyUI, with some experimenting with SwarmUI for a streamlined image generation process.
    • Conversations highlighted preferences for different interfaces and tuning settings like steps and cfg to improve prompt adherence.


Nous Research AI Discord

  • AI Newsletters for Developers: A member highlighted the need for technical AI newsletters, moving away from consumer-focused hype, and recommended SemiAnalysis for its GPU insights.
    • This reflects a desire for more substantive resources among engineers seeking serious discussion of AI.
  • Finetuning Hermes 3 for Roleplay Bots: A user explored whether finetuning Hermes 3 could help a roleplaying bot mimic character.ai, while another suggested achieving the same outcome through prompting.
    • The discussion underlines the community's interest in optimizing AI for complex character interactions.
  • Meta Releases Layer Skip Code: Meta launched Layer Skip to improve LLM efficiency, providing the inference code and fine-tuned checkpoints.
    • The release aims to spark new research into AI optimization methods and interpretability.
  • GitHub Copilot Expands Model Choices: Major updates to GitHub Copilot add Claude 3.5 Sonnet and Gemini 1.5 Pro, offering developers a broader model selection.
    • The shift may strengthen Anthropic's position in the competitive AI landscape.
  • Microsoft and OpenAI's Complicated Relationship: Conversations indicated that Microsoft is exploring alternatives to OpenAI over fears of dependency and the risks associated with AGI declarations.
    • Members emphasized the importance of diversifying AI partnerships for strategic stability.


Perplexity AI Discord

  • Join the Curators Program!: The Perplexity team is seeking its first cohort of Curators to contribute to the Discover feed, which reaches millions of users. If you enjoy crafting Pinterest boards or editing Wikipedia pages, you can apply here.
    • Curators will create Pages that inspire and inform users directly within the Perplexity product.
  • Grok 2 Now Available for Pro Users: Perplexity AI announced that Grok 2 is now accessible to Pro users, who can set it as their default model in settings. Some users wonder whether Grok 2 will remain uncensored, though its improvements seem limited.
    • The announcement stirred discussion, with skepticism about any significant advancement over previous iterations.
  • Merchandise Launch Announcement: Perplexity AI is launching a merchandise line called Perplexity Supply, with the first drop set for tomorrow at 9 AM Pacific Time. The tagline bills the brand as 'made for the curious.'
    • Community excitement is palpable, with users anticipating collectibles and apparel tied to the brand.
  • NASA Generates $76B for US Economy: A recent report claims NASA contributed approximately $76 billion to the U.S. economy across its projects and innovations, underscoring its impact beyond space exploration.
    • The data suggests significant returns on public investment, making a compelling case for continued support.
  • Getting Smart on Advancements in Photonic Computing: Discussions highlighted advances in photonic computing and their implications for cybersecurity, with these technologies predicted to transform how data is processed and secured.
    • Members shared fresh insights, indicating growing interest in integrating photonic capabilities into existing frameworks.


Notebook LM Discord

  • BYD Aims for Auto Industry Domination: A video discusses how BYD, the Chinese electric vehicle powerhouse, is poised to disrupt competitors like Tesla through aggressive global expansion and dealership openings.
    • The discussion underscores BYD's innovative strategies intended to significantly impact the automotive market.
  • NotebookLM Enhances Staff Resource Accessibility: A user implemented NotebookLM as a staff resource guide, integrating an employee handbook and FAQs to streamline internal queries, but noted inconsistent URL generation from external links.
    • The feedback suggests a need for further refinement of document integration within the platform.
  • Spanish Podcast Generation Faces Challenges: Users reported difficulties generating Spanish podcasts with NotebookLM after initially succeeding with two episodes, leading to calls for effective solutions.
    • Concerns were raised about underlying language processing issues affecting Spanish text generation, indicating a gap needing improvement.
  • Exploring Open Source Alternatives to NotebookLM: Community members are evaluating NotebookLlama, an open-source alternative built on Meta's technology, though there is skepticism about the site's credibility, as discussed in the Notebook Llama link.
    • Participants debated the benefits of open-source solutions, pointing to possible DNS issues and questions about registration legitimacy.
  • Real-Time Avatars Revolutionize Podcasting: The integration of Simli for real-time avatars in podcasts has sparked interest, using audio diarization to synchronize visuals and enhance viewer engagement.
    • The proof of concept points to exciting potential for dynamic presentation styles in podcasts.


GPU MODE Discord

  • Unsloth Kernels enhance LLM fine-tuning: A member inquired about guides for unsloth kernels, which make fine-tuning Llama 3.2, Mistral, and other models 2-5x faster while using 80% less memory.

    • This has sparked interest in the community for practical implementations in high-performance LLM projects.
    • Triton Kernel insights and optimizations: Performance issues were discussed regarding Triton kernels, where a user noted that single kernel operations decreased speed compared to PyTorch, suggesting multiple kernels for efficiency.
    • Additional points were raised about the challenges related to BF16 operations not improving speed and ongoing issues with nightly builds in Triton.
    • H100 shows impressive speed improvements: A user reported achieving 255 tokens/sec with H100, using configurations such as reduce-overhead, further increasing to 300 tokens/sec with manual tweaks.
    • These techniques provide new frameworks for optimizing GPU utilization in LLM applications.
    • ThunderKittens 0.000000002 is here with enhancements: ThunderKittens 0.000000002 has been released, featuring significant upgrades including 6-14x faster linear attention and a faster attention backward pass than FA3.
    • A paper on kernel performance bottlenecks is also highlighted, questioning the real-world efficacy of custom kernels versus theoretical gains.
    • Cracked Engineers job platform gaining traction: Cracked Engineers has launched, aiming to connect talent with AI/tech startups and already boasting nearly $1000 in MRR pre-launch.
    • The platform offers an AI-assisted job posting process and a newsletter for tech roles, inviting community feedback for continual improvement.
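The 255-300 tokens/sec H100 figures above are in line with a simple memory-bandwidth roofline. A minimal sketch, where every figure is an assumption (nominal H100 SXM bandwidth, a hypothetical 7B fp16 model), not a number from the discussion: single-stream decode must stream every weight from HBM per token, so throughput is roughly bandwidth divided by model size.

```python
# Back-of-envelope roofline for single-stream decode (assumed figures).
# Decoding one token streams every weight from HBM, so throughput is
# roughly memory bandwidth / model size.
hbm_bandwidth_gb_s = 3350   # H100 SXM HBM3, ~3.35 TB/s nominal (assumption)
model_size_gb = 14          # e.g. a 7B-parameter model in fp16 (assumption)

est_tokens_per_s = hbm_bandwidth_gb_s / model_size_gb
print(round(est_tokens_per_s))  # ≈ 239, same ballpark as the reported 255
```

That the reported numbers sit near this estimate is consistent with decode being bandwidth-bound rather than compute-bound.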


LM Studio Discord

  • Token Processing Speeds Favor GPU: Members noted that token processing speeds are approximately 62 tok/sec on GPU and 7.5 tok/sec on CPU.

    • Fewill expressed enthusiasm by saying, 'nice!' while discussing these speeds.
    • Hunting for Local LLM Recommendations: A member sought recommendations for a locally running LLM similar to Phind or ChatGPT focusing on Python and Houdini SideFX.
    • Fabguy suggested researching HumanEval but noted that the niche nature of Houdini might affect response relevance.
    • NGINX Proxy Setup Woes: One user encountered difficulties configuring an NGINX proxy host for the LM Studio server despite activating serve on local network.
    • Others shared troubleshooting steps, underscoring the critical nature of accurate configuration settings.
    • PCIe Bandwidth Debate Ignites: Debate arose on whether PCIe bandwidth impacts inference performance, with suggestions that PCIe Gen 3 suffices since most processing occurs on the GPU.
    • However, users highlighted that bandwidth becomes critical for training models across multiple GPUs where high bandwidth is necessary.
    • Multi-GPU Configuration Queries: Inquiries about using multiple 3090s for large models revealed concerns over performance losses when exceeding a single GPU's memory.
    • It was determined that performance remains stable if the GPUs are identical, and offloading tasks improves overall processing efficiency.
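The PCIe point above can be checked with rough arithmetic (nominal link speeds and a hypothetical per-token activation size, not figures from the discussion): once weights are resident in VRAM, only small activations cross the bus per token, so even Gen 3 is far from the bottleneck for single-GPU inference.

```python
# Rough per-token PCIe transfer times (assumed nominal figures).
pcie_gen3_gb_s = 16     # ~16 GB/s usable on an x16 Gen 3 link (assumption)
pcie_gen4_gb_s = 32     # ~32 GB/s on Gen 4 (assumption)

per_token_mb = 4        # hypothetical activation traffic per token

t_gen3_ms = per_token_mb / 1024 / pcie_gen3_gb_s * 1000
t_gen4_ms = per_token_mb / 1024 / pcie_gen4_gb_s * 1000
print(f"Gen3: {t_gen3_ms:.3f} ms, Gen4: {t_gen4_ms:.3f} ms per token")
```

At 62 tok/sec (roughly 16 ms per token), a fraction of a millisecond of bus time is noise, which is why Gen 3 suffices for inference; multi-GPU training, which shuffles gradients constantly, is a different story.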


aider (Paul Gauthier) Discord

  • Aider Users Report Slowness: Members reported slowness in Aider caused by litellm's get_model_cost_map function, which can be mitigated by setting export LITELLM_LOCAL_MODEL_COST_MAP='True'.

    • One user noted that Aider generally tries to mask litellm's slowness in most cases.
    • Recommendations for Web Scraping: A user suggested using FireCrawl for web scraping, citing its effective extraction capabilities and self-hosting options.
    • Discussions indicated that FireCrawl could overcome challenges faced with social media scraping when configured correctly.
    • Managing Git Repositories with Aider: Several users discussed strategies to maintain clean Git repositories, recommending manual commits over Aider's auto-commit feature.
    • One participant shared a process using git switch and merging squashed commits to keep repositories organized.
    • GitHub Copilot Rivals Aider: A member highlighted that Copilot's integration with OpenAI, Gemini, and Anthropic models could impact its competition with Aider.
    • Another user expressed dissatisfaction with Copilot and mentioned a switch to Supermaven, indicating shifting preferences in coding assistants.
    • Effective Prompt Engineering Insights: Discussions on crafting effective prompts emphasized their necessity for generating accurate AI outputs, focusing on providing ample context.
    • Concerns were raised about AI producing misleading results during debugging, prompting talk on restructuring prompts to improve clarity.
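For reference, the workaround mentioned above is a one-line environment variable; a sketch (the exact effect depends on your litellm and Aider versions):

```shell
# Tell litellm to use its bundled model-cost map instead of fetching it
# over the network at startup, which is the slow path users reported.
export LITELLM_LOCAL_MODEL_COST_MAP='True'
echo "$LITELLM_LOCAL_MODEL_COST_MAP"
```

Set it in the shell (or shell profile) that launches Aider so the variable is visible at import time.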


OpenAI Discord

  • Interest Grows in AI Research Grants: Members inquired about experiences with applying for AI research grants, highlighting increasing interest for funding innovative projects.

    • This reflects a broader trend where financial support is becoming crucial for new AI initiatives.
    • Fascination with Evolving Algorithms: Discussion centered around the evolution of algorithms, noting the differing personas emerging from AI models.
    • They've been pushing boundaries, with members eager to learn how these models manage various inputs.
    • Risks of Anthropomorphizing AI: Conversations revealed concerns that LLMs producing human-like output can lead to misleading assumptions of intention.
    • Members urged the importance of viewing AI as tools, rather than inferring human emotions.
    • Calls for Enhanced Ethical AI Guidelines: Members stressed the need for careful ethical considerations in AI to mitigate future risks.
    • Those developing intelligent systems bear the responsibility of setting clearer guidelines for their applications.
    • Concerning GPT Typo Issues: Members reported persistent typo issues and incoherence when using ChatGPT, raising alarm over output quality.
    • The community expressed confusion, questioning if others encountered similar problems.


Cohere Discord

  • Algorithmic Trading: Lessons Learned: A member with 4 years in algorithmic trading shared insights on the complexities of market interactions, stating that sloppy processes help resist negativity.

    • Understanding what doesn't work requires extensive simulated trades and research.
    • Understanding Media Bias in AI Sentiment: Members agreed that all media is biased, and identifying who benefits from that bias is crucial for accurate assessments.
    • One noted they built a model that starts investigations under the assumption that all media is biased.
    • Garbled AI Output Causing Confusion: Members reported seeing odd garbled text in AI model outputs, raising concerns about its usability.
    • Lowering temperature and top-p parameters was suggested as a potential fix, with experimentation recommended.
    • Insights on Response Lengths: Responses often stop when hitting the natural end based on structured prompts, with typical lengths being 3,000-4,000 characters.
    • A member emphasized personalization significantly influences output length.
    • Generating Medical Notes with LLMs: A demo showcased generating synthetic medical notes using LLMs, allowing users to create detailed notes with minimal input.
    • Check out the demo here to see the tool's capabilities.
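The temperature suggestion above has a simple intuition: sampling temperature rescales token logits before the softmax, so lower values concentrate probability mass on the top token and shrink the chance of sampling the garbled tail. A toy illustration using generic softmax math (not Cohere's API; the logits are made up):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities at a given sampling temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                 # hypothetical token logits
hot = softmax(logits, temperature=1.0)
cold = softmax(logits, temperature=0.3)
print([round(p, 3) for p in hot])        # fairly spread out
print([round(p, 3) for p in cold])       # mass concentrates on the top token
```

Lowering top-p works on the same distribution from the other side, truncating the low-probability tail before sampling.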


OpenRouter (Alex Atallah) Discord

  • Inflection's services back online: The recent billing issues have been resolved, and Inflection is now operational, boosting productivity for all users. More details can be found in the links to Inflection 3 PI and Inflection 3 Productivity.

    • With services restored, users report a return to normal operations, enhancing their tasks previously affected.
    • Recruiting Alpha Testers for macOS Chat App: A developer is actively seeking alpha testers for their new flexible chat app for macOS, sharing screenshots that showcase its features.
    • Interested participants are encouraged to DM the developer to join this important testing phase.
    • OpenRouter API experiencing instability: Users have reported 524 errors affecting the OpenRouter API, causing significant request delays and raising concerns about its readiness for public use.
    • As issues persist, some users are contemplating switching providers due to ongoing instability that hinders multiple request executions.
    • Debate over API key security risks: Concerns arose about potential scraping of API keys, with talks highlighting risks from unauthorized proxies using models like Claude 3.5 Sonnet.
    • Users stressed the significance of safeguarding keys, with worries about how vulnerabilities could result in unintended leaks despite existing precautions.
    • Integration access in high demand: Multiple members have voiced their requests for access to integrations, emphasizing polite pleas such as 'I would like to get access' to this feature.
    • One noteworthy request came from a student researcher, indicating academic interest in exploring the integration functionalities.


Latent Space Discord

  • Moondream secures $4.5M: Moondream raises $4.5M to demonstrate the effectiveness of smaller AI models.

    • Concerns arose about potential limitations and the implications of adopting smaller models in the AI industry.
    • Meta develops its own AI Search Engine: Meta is reportedly working on an AI-powered search engine to reduce dependency on Google and Microsoft.
    • The active web crawlers hint at significant shifts within Meta to enhance their search capabilities.
    • GitHub Copilot adds Gemini and Claude models: GitHub introduces Gemini models and Claude to enhance its Copilot capabilities with new features.
    • This represents an unexpected partnership between Microsoft and Google as they embrace a multi-model approach for developers.
    • Critique of existing Vector Databases: Members critique current vector databases for lacking proper abstraction, endorsing the pgai Vectorizer for more efficient embedding management.
    • This tool promises to simplify syncing and maintenance of embeddings, crucial for boosting AI model performance.
    • OpenAI launches Chat History Search feature: OpenAI rolls out a new feature for ChatGPT allowing users to search through their chat history, enhancing accessibility to past discussions.
    • Members celebrated the convenience of this long-awaited update, emphasizing improved continuity in conversations.


Modular (Mojo 🔥) Discord

  • Modular channel focus clarified: In a query about the channel's focus, it was clarified that the <#1098713601386233997> channel is strictly for Modular products, with general software discussions directed to <#1104620458168553563>.

    • This distinction emphasizes the goal of maintaining focused discussions on Modular's offerings.
    • Mojo proposes memory-safe references revolution: A member released a major proposal on reimagining memory-safe references in Mojo, aiming for a safer yet simpler reference model.
    • Community feedback is sought to ensure the design supports both optimization flexibility and memory safety.
    • FlatBuffers and ProtoBuf comparison breakdown: The team weighed the strengths of FlatBuffers and ProtoBuf, noting FlatBuffers' zero-copy parsing against ProtoBuf's focus on bit packing.
    • As they plan to use ProtoBuf for Serving, a Swift ProtoBuf support example was shared as a development reference.
    • Swapping references in Mojo raises concerns: Members deliberated the potential pitfalls of implementing swapping references in Mojo, drawing comparisons to Rust's mutable references management.
    • Concerns were raised about the added complexity this might bring, especially regarding performance implications.
    • Optimization focus on noalias discussions: The discourse highlighted the significance of using noalias for efficient performance in Mojo, with many advocating it as a default approach.
    • A design supporting unique references was deemed essential, as lapsing here could lead to detrimental performance issues.


Eleuther Discord

  • Hugging Face CEO Generates Buzz: The co-founder & CEO of Hugging Face, Clem, is slated to give an exciting talk, which is stirring anticipation within the community.

    • Details about the talk are still to be revealed, keeping members eager for more information.
    • Hellaswag Training Performance Surpasses Expectations: A new record was set by achieving GPT-2 (1.5B) level performance on Hellaswag for under $200 in 7.3 hours, using 8xH100 hardware.
    • This represents a significant leap in efficiency from the previous benchmark of 24 8xH100-hours.
    • Operational GPT-NeoX on Colab Confirmed: GPT-NeoX is confirmed to work on Colab, with a reference link to a Colab notebook provided.
    • The model in use is compact, showing potential for practical implementations with its 5M parameters.
    • First Sparse Autoencoder Guide Launched: A member released a step by step guide on utilizing a Premade Sparse Autoencoder, marking a fresh initiative in Mechanistic Interpretability.
    • The guide sets the stage for a series aimed at enriching understanding of interpretability techniques.
    • Custom Certificate Support Issues Acknowledged: A member noted the absence of support for custom certificates, but shared a workaround that could help mitigate this limitation.
    • The discussion highlighted community efforts to share solutions that navigate these technical challenges.
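The Hellaswag efficiency claim above works out as follows, reading "24 8xH100-hours" as 24 wall-clock hours on the same 8xH100 machine (an assumption about the prior benchmark's phrasing):

```python
# GPU-hour comparison for the Hellaswag result (assumed reading of the
# prior benchmark as 24 wall-clock hours on 8xH100).
new_gpu_hours = 7.3 * 8     # 7.3 h on 8 H100s = 58.4 GPU-hours
old_gpu_hours = 24 * 8      # prior benchmark: 192 GPU-hours
speedup = old_gpu_hours / new_gpu_hours
print(f"{new_gpu_hours:.1f} vs {old_gpu_hours} GPU-hours, ~{speedup:.1f}x cheaper")
```

Roughly a 3.3x reduction in compute for the same benchmark score.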


Interconnects (Nathan Lambert) Discord

  • OpenAI CFO declares AI is mainstream: In a YouTube video, OpenAI CFO Sarah Friar emphasized that AI isn’t experimental anymore, as banks and fintechs are using it daily.

    • This momentous shift provides more opportunities for widespread implementation in various sectors.
    • SearchGPT Extension Launch: OpenAI is expected to promote their new Chrome extension, allowing users to set SearchGPT as their default search engine alongside its launch.
    • Users can quickly initiate searches directly via the browser URL bar using commands that redirect to Google as required.
    • Introduction of ROCKET-1: ROCKET-1 is designed to enhance creative tasks in Minecraft by utilizing visual-temporal context prompting and is showcased by Team CraftJarvis.
    • This development highlights the evolving capabilities of vision-language models in open-world applications.
    • Anthropic's Hiring Momentum: Anthropic is gaining attention for its strong hiring practices, manifesting interest with the announcement of a new team member joining their ranks.
    • Their recent push reflects the company’s vibrant growth and ambition in the AI sector.
    • Claude's Integration with GitHub Copilot: Claude 3.5 Sonnet is now available to developers using GitHub Copilot in Visual Studio Code, with rollout commencing this week.
    • This integration is expected to enhance coding experiences by providing advanced AI support directly within popular development tools.


OpenInterpreter Discord

  • Open Interpreter needs visual models for full features: For Open Interpreter to function properly with visual capabilities, a multi-modal model is generally required unless using Moondream for basic tasks.

    • Users reported difficulties replicating Sonnet or GPT-4o functionalities with local models such as Llava.
    • Challenges with local models executing actions: Members encountered issues using local models like Llava to perform actions akin to cloud models, such as taking screenshots.
    • There’s a call for improved setup instructions for better integration with the computer API.
    • OpenAI Advanced Voice launched for Free Users: OpenAI announced that Advanced Voice is now available to Free users in the EU, Switzerland, Iceland, Norway, and Liechtenstein.
    • This development significantly improves accessibility for users in these regions.
    • Apple offers $1M for AI server hacks: Apple is prepared to pay up to $1 million for anyone who successfully hacks into their AI servers.
    • This initiative raises concerns about cybersecurity and invites scrutiny into Apple's security measures.
    • ChatGPT introduces chat history search: OpenAI revealed the rollout of a search feature for chat history on ChatGPT web, enhancing usability for users.
    • This update allows users to quickly reference previous chats, improving continued interactions.


Torchtune Discord

  • Quantization Without LoRA Gets Attention: Members debate whether base models can undergo quantization like QLoRA without leveraging LoRA, highlighting configuration challenges in non-LoRA environments.

    • One member noted, "Hmmm I guess the main thing is we don't have a way to configure this in our non-LoRA model builders."
    • FSDP's Simple CPU Offloading Tested: Discussion centered around FSDP, which currently exposes a single flag for CPU offloading covering parameters, gradients, and optimizer states alike, with no finer-grained control.
    • "This approach has more data movements, but potentially faster since optimizer step is on GPU" was suggested as a performance consideration.
    • Skepticism Towards Quantized KV-Caches: Members voiced doubts about the utility of quantized KV-caches using NF4 tensors due to high memory consumption in larger models.
    • One member remarked, "I don't think quantized kv cached in torchao is that useful/powerful yet," indicating a need for further exploration.
    • Quantizing Non-Trainable Weights Gaining Interest: Conversations highlighted that quantizing frozen weights during PPO could help reduce memory use, particularly for non-trainable model components.
    • "Yeah I'd like to do something similar and quantize the non-trainable models during PPO," one member said, showing interest in memory-efficiency strategies.
    • Accuracy Risks Below 8-bit Quantization: Concerns emerged over accuracy when quantizing activations, specifically KV caches, below 8 bits.
    • One member cautioned that "quantizing activations to below 8bit will have pretty severe accuracy issues," urging care with aggressive quantization approaches.
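The sub-8-bit caution above can be illustrated with a toy symmetric fake-quantization round trip (pure Python, not torchao; the activation values are made up): halving the bit width cuts the number of representable levels exponentially, so worst-case rounding error grows sharply.

```python
def fake_quantize(values, bits):
    """Symmetric round-trip quantization: quantize to integers, dequantize back."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for int8, 7 for int4
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) * scale for v in values]

activations = [0.03, -0.70, 0.41, 1.20, -0.09]   # made-up KV-cache values

errors = {}
for bits in (8, 4):
    restored = fake_quantize(activations, bits)
    errors[bits] = max(abs(a - b) for a, b in zip(activations, restored))
    print(f"{bits}-bit max round-trip error: {errors[bits]:.4f}")
```

The 4-bit round trip loses small-magnitude values entirely (they round to zero), which is the kind of degradation that hurts activations far more than it hurts weights.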


DSPy Discord

  • PAPILLON tackles AI privacy concerns: Researchers developed PAPILLON, achieving 85.5% quality with only 7.5% privacy leaks in AI applications.

    • This system effectively allows the integration of local and cloud LLMs, addressing significant privacy challenges in modern AI.
    • PUPA benchmark shines light on privacy issues: The team introduced PUPA, a benchmark assessing user-LLM interactions that contain personally identifiable information (PII).
    • Their findings inform a new method called Privacy-Conscious Delegation, merging API-driven and local model approaches.
    • DSPy simplifies AI programming: An ELI5 explanation of DSPy described it as a programming language allowing AI systems development through normal Python with DSPy signatures.
    • DSPy offers Modules for handling prompting strategies and Optimizers focused on enhancing output quality.
    • MIPROv2 Optimizer boosts quality: Discussions revealed that MIPROv2 optimizer provides a 41% increase in output quality and a 68% decrease in leakage when utilized effectively.
    • Users noted its capability to sample training data and generate instructions based on various properties, optimizing overall performance.
    • MIPROv2 bug fix resolves usage issues: A report surfaced about an error with MIPROv2 when paired with GPT-4o Mini, contrasting its successful runs with GPT-4.
    • Adjusting demo parameters helped resolve the confusion and improved performance with medium configurations.
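PAPILLON's Privacy-Conscious Delegation keeps PII with the local model and forwards only a sanitized request to the cloud. A much simpler toy version of that underlying idea, using regex heuristics rather than anything from the actual PAPILLON pipeline (a real system uses a local LLM for this step):

```python
import re

# Hypothetical minimal scrubber: only emails and US-style phone numbers.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "[PHONE]"),
]

def redact(text: str) -> str:
    """Replace obvious PII spans with placeholders before any cloud call."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

prompt = "Email jane.doe@example.com or call 555-123-4567 about the invoice."
print(redact(prompt))
# Only the redacted prompt would be forwarded to the cloud LLM.
```

The local model would then re-insert the private details into the cloud model's answer, which is what lets PAPILLON keep quality high while leaking little.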


LlamaIndex Discord

  • NVIDIA spotlights user wants in RAG: NVIDIA's latest blog delves into retrieval augmented generation (RAG), revealing that users want extra functionality such as document translation and code writing.

    • Even those focusing on internal data showed interest in web search capabilities, implemented through Perplexity’s search API.
    • Chroma's retrieval algorithm raises eyebrows: Discussion emerged around Chroma's vector store retrieval behavior, particularly when using index = GPTVectorStoreIndex.from_vector_store(vector_store=vector_store).
    • Members highlighted that Chroma's algorithm is approximate, affecting the variability of results even with similar indexed chunks.
    • Web scraping mastery unveiled: A practical YouTube video titled 'This is how I scrape 99% websites via LLM' was shared, showcasing advanced web scraping capabilities for 2024.
    • The video advocates for using AgentQL to scrape websites for free, demonstrating real-world applications of LLMs.
    • Blockchain engineer seeks project collaborations: A blockchain engineer active since 2017 reached out for project opportunities, citing expertise in DeFi, NFT games, and languages like Solidity and Rust.
    • Their background includes work on various projects involving Dex, DAO, and NFT minting and staking.
    • Building advanced RAG systems with MLflow: A guide outlined how to create advanced RAG systems utilizing MLflow and LlamaIndex, allowing for a combination of vector and keyword-based searches.
    • This approach targets event-driven orchestration to enhance workflow management, as illustrated in an example available on GitHub.
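The vector-plus-keyword combination mentioned above is often implemented as a weighted hybrid score. A self-contained toy version with hand-rolled cosine similarity and keyword overlap (not the MLflow or LlamaIndex APIs; embeddings and texts are made up):

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def keyword_overlap(query, doc):
    """Fraction of query terms that appear in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def hybrid_score(q_vec, d_vec, query, doc, alpha=0.5):
    # alpha blends dense (vector) and sparse (keyword) relevance.
    return alpha * cosine(q_vec, d_vec) + (1 - alpha) * keyword_overlap(query, doc)

score = hybrid_score([1.0, 0.0], [0.8, 0.6],
                     "mlflow rag guide", "a guide to rag with mlflow")
print(round(score, 3))  # ≈ 0.9
```

Real hybrid retrievers score candidates from both indexes and fuse the ranked lists, but the blending knob works the same way.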


LLM Agents (Berkeley MOOC) Discord

  • LLM Agents Hackathon Registration Surges: Over 1K+ innovators have signed up for the LLM Agents Hackathon within just a few days, reflecting strong interest. Complete the participant sign up today if you haven’t joined yet!

    • It’s not too late to join us!
    • 8th Lecture Scheduled at 3:00pm PST: The 8th lecture will take place today at 3:00pm PST, with a livestream available here. This session focuses on integrating complex reasoning with Large Language Models, promising valuable insights.
    • Tune in!
    • Formation of a Study Group: A member proposed starting a study group for course discussions, suggesting virtual meetings to engage those who joined late. Expressions of interest followed quickly with several members confirming they wanted to participate.
    • Sounds cool!
    • Request for Subtitles on Live Stream: A member requested to enable Subtitles on the live streaming videos, with confirmation that all lectures are edited afterwards and made available with subtitles. This ensures accessibility, enhancing the viewer experience.
    • We’re working on it!
    • Developing React-based Automation Agent: A member inquired about creating a React-based agent to automate tasks using PyAutoGUI for actions based on current state evaluations. Suggestions for direct inquiries rather than generalized questions were noted.
    • It's easier to ask directly!


LAION Discord

  • Pink Pixel Patches in Latent Diffusion Model Training: While training a class conditioned latent diffusion model, a member reported encountering pink pixel patches during decoding from the VAE, which decrease in frequency with more training.

    • They are considering if more aggressive clipping in DDIM p_sample, currently at 99.95%, will solve the issue of these patches.
    • Misunderstanding Parameters vs Tokens: A member mistakenly read the 100B figure as parameters rather than tokens, a mix-up clarified by another member.
    • Further, they noted the linked model actually has only 8B parameters, gaining validation from peers.
    • Collaborative Exploration of IJEPA Architecture: A member expressed interest in collaborating on an innovative architecture that merges IJEPA with autoregressive image generation without vector quantization.
    • Their enthusiasm for joint efforts to explore this unique architecture signals potential advancements in this space.
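The clipping mentioned above is typically percentile-based dynamic thresholding of the predicted sample inside the DDIM step: find the magnitude below which the chosen fraction of values falls, then clamp everything to that symmetric range. A rough pure-Python sketch of the general technique, not the member's actual code (the latent values and percentile are made up):

```python
def percentile_clip(values, percentile=0.9995):
    """Clip values to the symmetric range covering `percentile` of magnitudes."""
    magnitudes = sorted(abs(v) for v in values)
    idx = min(int(percentile * len(magnitudes)), len(magnitudes) - 1)
    threshold = magnitudes[idx]
    return [max(-threshold, min(threshold, v)) for v in values]

# A latent with a few extreme outliers (made-up numbers):
latent = [0.1, -0.4, 0.3, 9.0, -0.2, -7.5, 0.05, 0.6]
clipped = percentile_clip(latent, percentile=0.75)
print(clipped)
```

Tightening the percentile (e.g. 99.9% instead of 99.95%) clamps outlier pixels harder, which is the knob the member is considering for the pink patches.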


tinygrad (George Hotz) Discord

  • George Hotz Has a Negative Line Day: George Hotz expressed having a negative line day, prompting humorous reactions from the community.

    • This light-hearted exchange reflects the supportive atmosphere among members as they tackle coding challenges.
    • CI Tests Get Faster: Chenyuy reported a 2-minute faster CI test, indicating progress in performance optimization.
    • This improvement in the testing process showcases shared efforts to boost efficiency in the tinygrad project.
    • Uops Readability Challenges Surface: Concerns emerged regarding the readability of Uops with some one-liners being difficult to comprehend.
    • A suggestion for creating a documentation page was mentioned to potentially enhance code clarity for all users.
    • Documentation Maintenance Issues Highlighted: Chenyuy highlighted the maintenance concerns regarding documentation that often becomes outdated quickly.
    • He pointed out that having inaccurate documentation may hinder progress more than having none, reflecting the rapid pace of change in tinygrad.
    • Debate on Premature Optimization: George Hotz proposed the removal of certain code elements to avoid premature optimization pitfalls.
    • This discussion underscores the thoughtful testing underway to balance code efficiency carefully against potential complexities.


LangChain AI Discord

  • RAGAS Enhances LLM Evaluations: A member suggested using RAGAS to improve LLM application evaluations, showcasing its capabilities and methodologies.

    • This tool aims to provide developers with refined methods for evaluating language models effectively.
    • CSV Files Seeking Integration: A discussion arose about integrating CSV files as data sources with open source models like LLAMA3, noting a gap in existing examples.
    • The inquiry specifically mentioned using CSVChain and PandasAgent with non-OpenAI models for better data handling.
    • LangChain-Python Version Queries: Clarification was sought on which Python versions are compatible with LangChain 0.3, reflecting the community's need for setup guidance.
    • Proper environment configuration remains crucial for developers to use LangChain efficiently.
    • LangChain-JS Course Launches: Exciting news! A new LangChain-JS course has been released on Udemy, aimed at beginners.
    • It spans from the basics to building a complete RAG application, with the first 100 students able to enroll for free.
    • Web Scraping Masterclass: A member highlighted a YouTube video titled 'This is how I scrape 99% websites via LLM', which teaches practical web scraping with LLM.
    • It emphasizes the use of AgentQL to scrape websites for free, showcasing innovative techniques.


Gorilla LLM (Berkeley Function Calling) Discord

  • Clarifying 'Multiple' on the Leaderboard: 'Multiple' on the leaderboard indicates the ability to choose the correct function from several options in a single turn, as outlined in this GitHub example. The evaluation of multi-step remains ambiguous in this context.

    • This confusion is notable, especially regarding how multi-step executions differ from multi-turn scenarios, which has led to various discussions among users.
    • Multi-Step vs Multi-Turn Evaluation Methods: A member clarified that 'multiple' relates to functions, while multi-step evaluations fall under the 'multi_turn' category, with no singular multi-step evaluation currently utilized. Understanding these distinctions is crucial for accurate interpretation.
    • The overlap between multi-step and multi-turn evaluations could confuse users, since both fall under the same 'multi_turn' category on the leaderboard.


LLM Finetuning (Hamel + Dan) Discord

  • Cracked Engineers Job Platform Launches!: A member shared an exciting new job platform for technical roles called Cracked Engineers, aiming to be the go-to for top AI/tech startups.

    • With a projected $1000 MRR before the official launch, the platform is already attracting top companies like Weaviate, UnslothAI, and JuliusAI.
    • Insightful Weekly Tech Jobs Newsletter Introduced: The platform is set to release a weekly tech jobs newsletter that will curate positions based on user preferences, starting soon.
    • Users can subscribe to tags that interest them, such as CUDA, MLOps, or Software Engineering through their dashboard.
    • Exciting Job Opportunities at AI Startups: Unsloth AI, Julius AI, and Jimini AI are actively hiring, with the poster describing the open roles as ones they would consider themselves if they weren't a founder.
    • These positions are described as amazing opportunities for anyone looking to work with cutting-edge AI technology.


OpenAccess AI Collective (axolotl) Discord

  • Member Seeks SymNoise Code Implementation: A member is looking for a code implementation of the SymNoise fine-tuning technique, which injects symmetric noise into the embedding process. They expressed difficulty implementing it due to its batch size requirements.

    • This inquiry shows a growing interest in advanced fine-tuning methods within the community, though specific solutions were not provided.
    • SymNoise Boosts LLaMA-2-7B Performance: The SymNoise method improved LLaMA-2-7B performance on AlpacaEval from 29.79% to an impressive 69.04%, surpassing NEFTune. This is a 6.7% relative improvement over NEFTune's score of 64.69%, as noted in the paper's abstract.
    • The results highlight the potential of SymNoise in fine-tuning language models, setting a new benchmark for performance.
    • SymNoise Outshines NEFTune Across Models: Tests reveal that SymNoise consistently yields better results than NEFTune across various models and baseline datasets. This has sparked discussions about the need for further research in this area.
    • Community members emphasized the importance of continuing to explore and validate these fine-tuning methodologies.
    • Call for Research Resources on SymNoise: In the inquiry, a member linked to the arXiv paper detailing the SymNoise method, underscoring its relevance in the field. However, there were no additional code resources or implementations shared to aid in the implementation challenge.
    • This points to a broader need for collaborative efforts in developing practical applications based on recent research findings.


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email!

If you enjoyed AInews, please share with a friend! Thanks in advance!

Don't miss what's next. Subscribe to AI News (MOVED TO news.smol.ai!):
Powered by Buttondown, the easiest way to start and grow your newsletter.