aider (Paul Gauthier) ▷ #questions-and-tips (167 messages🔥🔥):


Eleuther ▷ #announcements (1 message):

Link mentioned: Build software better, together (…ing-to-webassembly-and-webgpu): GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.


Eleuther ▷ #general (379 messages🔥🔥):

Links mentioned:
  • : Good stuff. Pro tip: do the red circles I checked will get you 99% there (but don't scale the head dim). https://blog.eleuther.ai/mutransfer/
  • : Just finished up an exclusive interview going over a new, major AI model upgrade. Can confirm, tomorrow will be a big day for developers. Dropping the full conversation on X the second the embargo...
  • Bad scaling: no description found
  • : At the simplest level, neural networks are trained by iterating the following operation: weights = weights - learning_rate * gradient, where learning_rate is a float and gradient is the gradient of the loss function with respect to the weights...
  • …B-Stheno-v3.3-32K · Hugging Face: no description found
  • Introducing RWKV - An RNN with the advantages of a transformer: no description found
  • Activity | OpenRouter: See how you've been using models on OpenRouter.
  • Session 2A & 2B: Optimization Non Convex: Watch this video with AI-generated Table of Content (ToC), Phrase Cloud and In-video Search here: https://videos.videoken.com/index.php/videos/icml-2018-sessi...
  • Sell Domains | Buy Domains | Park Domains: no description found
  • https://status.anthropic.com/incidents/xts3kyr0nrx1: no description found
  • zeroshampoo/distributed_shampoo.py at main · cloneofsimo/zeroshampoo: Contribute to cloneofsimo/zeroshampoo development by creating an account on GitHub.
  • Anthropic: Claude 3.5 Sonnet | OpenRouter: Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. S...
  • google-research/scalable_shampoo/jax/shampoo.py at master · google-research/google-research: Google Research. Contribute to google-research/google-research development by creating an account on GitHub.
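
One of the link descriptions above spells out the basic training loop in words. A minimal sketch of that update rule, assuming an illustrative least-squares loss, synthetic data, and learning rate (none of these values come from the discussion):

```python
import numpy as np

# Minimal gradient-descent loop: weights = weights - learning_rate * gradient.
# The least-squares objective and data here are illustrative, not from the source.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

weights = np.zeros(3)
learning_rate = 0.1  # "a float", as the linked description says

for step in range(200):
    residual = X @ weights - y
    gradient = X.T @ residual / len(y)  # d(loss)/d(weights) for 0.5 * mean(residual**2)
    weights = weights - learning_rate * gradient

print(weights)  # should approach true_w
```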


    Eleuther ▷ #research (206 messages🔥🔥):

Links mentioned:
  • Examples/Model_Homotopy/LinRebal.ipynb at main · WinVector/Examples: Various examples for different articles. Contribute to WinVector/Examples development by creating an account on GitHub.
  • openai/MMMLU · Datasets at Hugging Face: no description found


    Nous Research AI ▷ #general (211 messages🔥🔥):

    • Inquiry about Private LLM Servers: A member inquired whether others are running private LLM servers themselves or if they are managed by a third party.
      • Out of curiosity, are you running private llm servers yourself?
    • Response to Request on Servers: The conversation opened with a thank you for a request, signaling engagement in an ongoing discussion about LLM server management.
      • The member's response suggested curiosity around the operational aspect of these servers.


    Eleuther ▷ #scaling-laws (10 messages🔥):

    • Irreducible Loss Calculation
    • Chinchilla Optimal Token Size
    • Empirical Estimations
    • Scaling Laws Insights


    Eleuther ▷ #interpretability-general (61 messages🔥🔥):

    • Interpretability at EMNLP2024
    • KV Cache Experiments
    • Model Training Interventions
    • Sparse Feature Circuits
    • SAE and Transformer Interpretability
Links mentioned:
  • mistral-common/src/mistral_common/tokens/tokenizers at main · mistralai/mistral-common: Contribute to mistralai/mistral-common development by creating an account on GitHub.
  • : …focus on building an intuitive understanding of attention. The attention mechanism was introduced in the “Attention Is All You Need” paper. It is the key element in the transformer...
  • GitHub - unslothai/unsloth: Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory - unslothai/unsloth
  • Support KTO Trainer with Unsloth by corbt · Pull Request #1001 · unslothai/unsloth: This patch appears to be both necessary and sufficient to successfully use KTOTrainer with Unsloth!
  • Home: Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory - unslothai/unsloth
  • GitHub - infiniflow/ragflow: RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. - infiniflow/ragflow


    Cohere ▷ #general (38 messages🔥):

    • Cohere Research includes many areas: Cohere works on various topics including language models, efficiency, safety, multilingual capabilities, RL, and AI policy, with resources available on their research papers page (https://cohere.com/research/papers).
    • Performance Issues with Azure SDK: A user reported that their implementation of the Command R+ model using the Azure SDK underperformed significantly compared to using the Cohere SDK, leading to frequent hallucinations in responses.
      • Despite updating the Azure implementation to a lower temperature and removing certain parameters, the issues persisted.
    • Cohere Reranker API is hosted across multiple locations: Cohere's Reranker API endpoint can be hosted on their platform or other cloud providers, as indicated by a team member; a minimal rerank-call sketch follows this section.
      • They clarified that they have servers in multiple locations, rather than being limited to a US-based server.
    • Hackathon Sponsorships Currently Unavailable: A user inquired about potential sponsorship for a hackathon, which prompted a staff member to direct them to a specific contact.
      • However, it was noted that Cohere is not currently accepting sponsorship requests.
    • Connectors Compatibility in APIs: It was mentioned that the current connectors in Cohere's APIs may only be compatible with their native platform.
      • Users were encouraged to explore options like the Brave Search API as an alternative solution.
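
A minimal sketch of the rerank call discussed above, using the Cohere Python SDK; the API key, model name, and documents are placeholders, not values from the channel:

```python
import cohere

# Placeholder key; in practice read it from an environment variable.
co = cohere.Client("YOUR_API_KEY")

docs = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean.",
    "Washington, D.C. is the capital of the United States.",
]

# Rerank the documents against the query; model name is illustrative.
response = co.rerank(
    model="rerank-english-v3.0",
    query="What is the capital of the United States?",
    documents=docs,
    top_n=2,
)

for result in response.results:
    print(result.index, round(result.relevance_score, 3), docs[result.index])
```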


    Unsloth AI (Daniel Han) ▷ #off-topic (24 messages🔥):

    • RAG Application Use
    • Cost Analysis for Document Rating
    • Inference Methods Comparison
    • API Services Discounts
    • Vote Accuracy in Ratings
    • Exploring RAG Applications for Document Structuring: A member suggested using a RAG application to convert unstructured documents into a structured format before conducting analysis.
      • Another member clarified that their task involves L3.1 ratings and is focused on offline inference rather than creating a fine-tuning dataset.
    • Costly Estimates for Document Processing: Discussion revealed that running the analysis on 2.5 million documents with high token counts could cost around $60k before labor.
      • One member calculated that using an API for L3.1 would cost approximately $15k, a significant saving compared to on-prem configurations; a back-of-envelope cost sketch follows this list.
    • Comparing Inference Methods: Members debated the benefits of various inference methods, noting that the throughput of an 8x H100 setup could deliver results faster than anticipated.
      • Testing with 2000-5000 samples was recommended to evaluate cost and accuracy effectively.
    • API Services with Discounts: A member raised the question of whether any API services offer discounts, particularly highlighting OpenAI's previous 50% off on batch inferences.
      • Concerns were shared about the high costs and limitations of using larger models versus the unsatisfactory performance of smaller ones.
    • Three Votes for Enhanced Accuracy: Members discussed obtaining three votes from different models and taking the majority to improve rating accuracy; a minimal voting sketch also follows this list.
      • One member confirmed they will implement this approach in their testing strategy.
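
A back-of-envelope sketch of the cost math above; the tokens-per-document and per-million-token price are assumptions chosen so the total lands near the quoted $15k, not figures from the discussion:

```python
# Back-of-envelope API cost estimate for rating a large document set.
# All unit prices and token counts below are illustrative assumptions.
NUM_DOCS = 2_500_000
TOKENS_PER_DOC = 2_000        # assumed average input tokens per document
PRICE_PER_M_INPUT = 3.00      # assumed USD per 1M input tokens

total_tokens = NUM_DOCS * TOKENS_PER_DOC             # 5.0B tokens
api_cost = total_tokens / 1_000_000 * PRICE_PER_M_INPUT

print(f"total input tokens: {total_tokens:,}")
print(f"estimated API cost: ${api_cost:,.0f}")       # $15,000 under these assumptions
```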
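And a minimal sketch of the three-vote idea, assuming hypothetical model names and labels:

```python
from collections import Counter

def majority_rating(ratings: list[str]) -> str:
    """Return the rating at least two of three voters agree on, else flag for review."""
    winner, count = Counter(ratings).most_common(1)[0]
    return winner if count >= 2 else "needs-review"  # no 2-vote majority -> human check

# Hypothetical votes from three different models on one document.
votes = {"model_a": "relevant", "model_b": "relevant", "model_c": "irrelevant"}
print(majority_rating(list(votes.values())))  # -> "relevant"
```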


    Unsloth AI (Daniel Han) ▷ #help (76 messages🔥🔥):

    • Prediction Loss Only Evaluation
    • Phi 3.5 Tokenization Issues
    • RAG Fine Tuning Best Practices
    • Merged Model Performance Challenges
    • Continued Pre-training with LoRA Adapters
    • Prediction Loss Only for VRAM Efficiency: A user asked about using prediction_loss_only = True in the training loop to keep evaluation VRAM usage from escalating; a hedged config sketch follows this list.
      • Concerns were raised about whether it affects evaluation passes only.
    • Tokenization Concerns with Phi 3.5: A user noted discrepancies in tokenization between the model and tokenizer in Phi 3.5, leading to confusion about padding tokens.
      • Additionally, there were issues with the tokenizer not adding special tokens during encoding, which could impact training.
    • Best Practices for RAG Fine Tuning: One member inquired about templates for fine-tuning RAG models with context, questions, and answers, highlighting the complexity.
      • Suggestions included exploring research papers for guidance, indicating this is a nuanced area.
    • Performance Issues Post Model Merging: Users reported that model performance declined significantly after merging LoRA adapters into the original weights; a merge sketch also follows this list.
      • Concerns were expressed about the effectiveness of 4-bit merges compared to 16-bit merges.
    • Continued Pre-training with LoRA Adapters: A user sought clarity on how continued pre-training would interact with existing LoRA adapters, questioning whether new ones would be created.
      • It was advised to save the merged model for future training flexibility, emphasizing the importance of maintaining a merged state.
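
A hedged config sketch for the prediction_loss_only question above, using Hugging Face TrainingArguments; whether it fully prevents the reported VRAM escalation is exactly what the thread was asking:

```python
from transformers import TrainingArguments

# prediction_loss_only=True makes evaluation return only the loss,
# skipping accumulation of logits/predictions that can balloon VRAM.
# It applies to evaluation/prediction passes, not the training step itself.
args = TrainingArguments(
    output_dir="out",
    per_device_eval_batch_size=4,
    eval_strategy="steps",      # named evaluation_strategy in older transformers versions
    eval_steps=200,
    prediction_loss_only=True,
)
```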
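And a sketch of the standard PEFT merge path for the merging discussion; the model and adapter paths are placeholders, and the 4-bit vs 16-bit debate concerns the dtype of the base weights being merged into:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholders: substitute your own base model and adapter paths.
# Loading the base in bfloat16 gives a 16-bit merge target.
base = AutoModelForCausalLM.from_pretrained("base-model", torch_dtype="bfloat16")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# Fold the LoRA weights into the base weights and drop the adapter wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("merged-model-16bit")
AutoTokenizer.from_pretrained("base-model").save_pretrained("merged-model-16bit")
```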

    Links mentioned:

  • Google Colab: no description found
  • …Do LLMs and Transformers Need It?: no description found


    Unsloth AI (Daniel Han) ▷ #research (3 messages):

    • SetFit v1.1.0 Release
    • Training Classifiers
    • Sentence Transformers Update
    • Python Version Support
    • SetFit v1.1.0 Launches with Improved Training: The release of SetFit v1.1.0 now uses the Sentence Transformers Trainer for efficient classifier training on both CPU and GPU, addressing multiple issues from third-party library updates; a minimal usage sketch follows this list.
      • The new version introduces MultiGPU support and deprecates 'evaluation_strategy' in favor of 'eval_strategy', along with new support for Python 3.11 and 3.12.
    • Two Phases of SetFit Classifier Model Training: Training a SetFit classifier model consists of two main phases: finetuning a Sentence Transformer embedding model, then training a classifier head that maps embeddings to classes.
      • This structured approach enhances performance and efficiency, particularly with the updated support features in version 1.1.0.
    • Key Updates in SetFit's Training Process: Significant improvements have been made in parameters like max_steps and eval_max_steps, which are now enforced as hard limits, ensuring more reliable training outcomes.
      • Changes in training and validation losses were also highlighted, contributing to the overall robustness of the training process.
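
A minimal sketch of the two-phase flow with the v1.1.0 argument names; the checkpoint and toy dataset are illustrative:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Tiny illustrative dataset; real few-shot sets are typically 8-64 examples per class.
train_ds = Dataset.from_dict({
    "text": ["great product", "terrible service", "loved it", "never again"],
    "label": [1, 0, 1, 0],
})

# Phase 1 starts from a Sentence Transformer checkpoint; phase 2 fits a classifier head.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

args = TrainingArguments(
    batch_size=8,
    num_epochs=1,
    eval_strategy="no",  # v1.1.0 name; 'evaluation_strategy' is deprecated
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()          # contrastive embedding finetuning, then classifier fitting

print(model.predict(["awesome", "awful"]))
```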

    Link mentioned: @tomaarsen on Hugging Face: "🎉SetFit v1.1.0 is out! Training efficient classifiers on CPU or GPU now uses…": no description found


    Perplexity AI ▷ #general (506 messages🔥🔥🔥):

    • Perplexity Pro issues
    • Usage of AI models
    • Anthropic model release
    • Perplexity functionality
    • Collaborative opportunities
    • Perplexity Pro Subscription Issues: Several users reported losing their Pro status intermittently, with some experiencing error messages like 'Query rate limit exceeded'. Many noted that logging out and back in sometimes resolved the issue, but concerns about lag and bugs persisted.
      • These problems appear to be system-wide, possibly linked to recent updates and maintenance being performed on the platform.
    • AI Model Comparisons and Use Cases: Users discussed the effectiveness of different AI models, including Perplexity, ChatGPT, and Claude, highlighting their respective strengths in various applications. Insights were shared on how to optimize their usage for tasks like programming, brainstorming, and academic research.
      • Many noted the challenges with certain models, especially regarding hallucinations and the reliability of real-time information retrieval.
    • Potential Launch of New Anthropic Model: The community buzzed about a new Anthropic model potentially dropping soon, based on an exclusive interview shared by a user. This generated excitement about the additional capabilities new AI models may bring.
      • There were skeptical comments regarding whether Perplexity would incorporate any new models soon, hinting at the competitive landscape.
    • Concerns About Perplexity's Shift Towards Ads: Users voiced concerns over recent changes in how products and ads are displayed within the Perplexity interface, finding them distracting. Suggestions were made to place recommendations in a sidebar rather than inline with search results to enhance usability.
      • Users expressed disappointment over perceived shifts towards a more commercial model, which they feared could detract from the unique value Perplexity was originally set to provide.
    • User Experience Enhancements and Collaborations: Discussion about the Complexity extension highlighted benefits that enhance user experience on Perplexity, such as customizable themes and easier navigation. Users shared collaborative opportunities and expressed interest in improving their workflow with AI tools.
      • The importance of community-driven feedback and understanding how to leverage these tools effectively was emphasized as crucial for enhancing the platform.
    Cohere ▷ #api-discussions (2 messages):

    • Cohere API geolocation restrictions confirmed: It's confirmed that Cohere does geolock, which might be causing API access issues when migrating servers to different locations like Finland or Germany.
      • Email support@cohere.com for assistance in resolving these geolocation access permissions.
    • Embedding call requires 'embedding_types' parameter now: A user reported their embedding call started erroring with 'embedding_types parameter is required', despite the documentation previously stating it was optional.
      • This change in behavior was questioned, prompting clarification from the Cohere team.
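
A hedged sketch of an embed call that passes the now-required embedding_types parameter explicitly; the API key, model, and input_type are illustrative:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder; assumes a valid Cohere API key

# Passing embedding_types explicitly satisfies the newly enforced parameter;
# the model name and input_type here are illustrative choices.
response = co.embed(
    texts=["hello world"],
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["float"],
)

print(response)  # embeddings are returned grouped by requested type
```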
  • Don't miss what's next. Subscribe to AI News:
    Twitter Newsletter