[AINews] AI Discords Newsletter 11/29/2023
This is AI News! an MVP of a service that goes thru all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜
- Latent Space Discord Summary
- OpenAI Discord Summary
- LangChain AI Discord Summary
- Nous Research AI Discord Summary
- Alignment Lab AI Discord Summary
- Skunkworks AI Discord Summary
- LLM Perf Enthusiasts AI Discord Summary
- ▷ Channel: general (6 messages):
- ▷ Channel: gpt4 (8 messages):
- ▷ Channel: finetuning (3 messages):
- ▷ Channel: opensource (4 messages):
- ▷ Channel: collaboration (1 messages):
- ▷ Channel: speed (1 messages):
- ▷ Channel: feedback-meta (1 messages):
- ▷ Channel: openai (2 messages):
- ▷ Channel: prompting (8 messages):
- MLOps @Chipro Discord Summary
- Perplexity AI Discord Summary
Latent Space Discord Summary
- The development and discussion of an AI search engine tool that uses pgvector and tree-sitter, shared by @estebanvargas32 for feedback: link to his project.
- Conversation around using GPT for better prompt engineering, highlighted by @jozexotic's interest in a Discord summarizer and @slono's sharing of useful GPT agents, plus @guardiang's creation of a GPT specifically for handling prompts, with a demo link shared for user testing.
- Announcement and commencement of a discussion on RAG context management by @swyxio and @315351812821745669: discussion link.
- Release of a new podcast episode announced by @swyxio, available on Twitter and HN: podcast link.
- @swyxio shared a selection of essential papers recommended by Jason Wei for a thorough foundational understanding of several AI topics.
- A brief note from @swyxio about the Chain of Note paper club notes, though no specific details were provided.
- Questions raised by @yikesawjeez about the context reversal paper and by @zf0 about the list of already covered papers, with both inquiries left unanswered.
Latent Space Channel Summaries
▷ Channel: ai-general-chat (9 messages):
- AI Search Engine Discussion: @eugeneyan, @willow_ghost, and @estebanvargas32 discussed a search engine tool built by @estebanvargas32 using pgvector and tree-sitter. He shared the link to his project for feedback.
- Useful GPT for Better Prompt Engineering: @jozexotic asked about useful GPT tools for better prompt engineering and mentioned his interest in a Discord summarizer.
- Sharing Useful GPT Agents: @slono shared links to the GPT agents he found useful, including an agent that can look things up in its knowledge base on explicit request, though he criticized it for being slow.
- Prompt Handling GPT: @guardiang put together a GPT for handling prompts and shared the link for users to try it out.
▷ Channel: ai-event-announcements (2 messages):
- RAG Context Management: @swyxio announced a discussion starting in 15 minutes, featuring @315351812821745669 on the subject of RAG context management. The meeting could be joined via this link.
- New Podcast Episode: @swyxio shared the news of a new podcast episode live on Twitter and HN. The podcast could be accessed here.
▷ Channel: llm-paper-club (4 messages):
- Essential Papers Selection: @swyxio shared a selection of essential papers recommended by Jason Wei: the GPT-3 paper, chain-of-thought prompting, scaling laws, emergent abilities, and "Language models can follow both flipped labels and semantically-unrelated labels".
- Paper Club Notes: @swyxio mentioned the Chain of Note paper club notes; however, no specific link or details were provided.
- Contextual Reversal Paper Inquiry: @yikesawjeez asked about the title of the context reversal paper, but no response was provided.
- List of Covered Papers: @zf0 asked whether there is a list of already covered papers, implying the existing spreadsheet might be outdated. No response was provided.
OpenAI Discord Summary
- Discrepancies in the performance of ChatGPT/GPT-4 prompted a variety of user reactions. Some users were concerned about perceived declines in ChatGPT's performance, while others defended its capabilities. There were specific discussions about employing GPT for tasks such as converting video interviews to written form or rewriting text in particular tones.
- "I've been trying for several hours now to get my bot to handle coding tasks but it keeps timing out." – @hankliss (ai-discussions)
- "GPT-4 Turbo is an enhanced version of the original GPT-4, which handles large data volumes better." – @lumirix (openai-chatter)
- Discussions arose about the differences between GPT-4 and GPT-4 Turbo, with a few users citing low-quality output from GPT-4 Turbo. Users also discussed GPT Plus subscriptions and their usage limits, alongside the potential for academic misconduct when GPT is used in academic settings.
- "I feel like GPT-3.5 was better than GPT-4 in solving coding issues." – @loschess (ai-discussions)
- "I've had slower response times on all three browsers I've tried…would love some insight." – @sl0vius (openai-chatter)
- Reported issues with the ChatGPT platform covered file processing, saving errors, and problems with VPNs affecting ChatGPT usage. There were also queries about integrating GPT with platforms like Instagram and WhatsApp, and about uploading documents to ChatGPT.
- "I'm trying to save my new GPTs but keep getting an error." – @jacquelynyakira (openai-questions)
- "Why can't I use ChatGPT on my desktop browsers, but I can on mobile?" – @apelambo (openai-questions)
- Users shared their experiences customizing ChatGPT, and some offered their custom GPTs for feedback and review. They also discussed the limitations of language models, especially when handling complex symbols.
- "My GPT doesn't seem to refer to the source documents I provide and instead uses its default knowledge." – @pietman (gpt-4-discussions)
- "Understanding how GPT-3 interprets symbols like {}, [], and () could solve a lot of issues." – @mysticmarks1 (prompt-engineering)
- Discussions revolved around the use of APIs and prompts with GPT, including the potential of creating a GPT for prompt generation, dealing with prompt issues, and how to identify the GPT-4 model version.
- "I was wondering if I could use my GPT in my own personal projects through an API." – @ionknowu (gpt-4-discussions)
- "Has anyone figured out how we can identify and sort GPT model versions?" – @hoodiewoody (prompt-engineering)
OpenAI Channel Summaries
▷ Channel: ai-discussions (109 messages):
- Perceived Degradation of ChatGPT's Performance: Several users, including @jaade77, @johnpringle, @loschess, @simple_chad, and @hankliss, expressed concern over a perceived decrease in ChatGPT's performance and capabilities. Some cited difficulties with the bot's handling of coding tasks, timeouts after a certain number of messages, and issues with its tone-adjustment capabilities.
- Defending ChatGPT's Performance: Users like @n8programs and @themandalorian defended ChatGPT, stating that its performance has not deteriorated; rather, people are discovering more of its limitations or perhaps not using the tool correctly.
- Comparison Between GPT-4 and GPT-3.5: @loschess shared an experience where GPT-3.5 reportedly performed better than GPT-4 in solving a coding issue.
- Employment of GPT for Specific Tasks: Various discussions about employing GPT for tasks such as converting a video interview to written form (@jaade77), rewriting for tone (@jaade77), and writing code to combine text files (@themandalorian).
- Queries for Assistance and Interest in AI: Several users, such as @martinr_33972, @healer9071, and @ankur1900, posted links to bots they've created, sought assistance with the huggingface.co platform, or called for collaboration on an efficient solution to a coding challenge.
▷ Channel: openai-chatter (211 messages🔥):
- GPT-4 vs. GPT-4 Turbo: Users discussed the difference between GPT-4 and GPT-4 Turbo. @lumirix explained that GPT-4 Turbo is the enhanced version of GPT-4, while @captainsonic and @n8programs affirmed Turbo's improved performance and cost-effectiveness. There was a debate about whether the latest GPT-4 Turbo version was a downgrade, with users like @emilybear, @nexor7, and @g0lo noticing poor output. @.pythagoras pointed out that the Turbo version trades off detailed comprehension of extensive prompts in favor of managing large volumes of information due to hardware constraints.
- GPT Plus Subscriptions and User Accounts: @hahalol and @iamsivart checked on the status of GPT Plus subscriptions, with others confirming the current waitlist. @xannaeh raised a concern about using a company credit card to pay for multiple employee accounts, to which @elektronisade responded by sharing an OpenAI help article about declined credit card issues. It was further mentioned that company-wide use of the same account is not feasible, nor is self-hosting of the ChatGPT system.
- Usage Limits and Loading Speeds: Users complained about slower loading speeds and the system's performance during peak hours. @lk_jinxed found GPT-3.5 to be laggy, while @sl0vius experienced slow response times on three different web browsers. @7877 responded that this delay is normal during peak load periods and recommended waiting for smoother performance.
- Platform's Future and GPT Store: Users are waiting for updates on the GPT store, with @dydzio specifically asking for new information. Although no new information is available yet, @eskcanta and @solbus speculated on how the store's launch might affect current usage and the management of the Plus subscription waitlist.
- AI-driven Writing and Use in Academia: @lefoudurex sought help with a university essay but was cautioned by @dabuscusman about the academic-misconduct risk of using AI writing assistance in an academic setting. @offline recommended a bit-by-bit approach when using AI to rephrase large documents.
▷ Channel: openai-questions (111 messages):
- ChatGPT Issues: Many users reported various issues with the ChatGPT platform. For instance, @mehmettrkmen experienced an issue with analysing files, @enryurii encountered issues with image processing, and @jacquelynyakira couldn't save her custom GPTs. @solbus and @satanhashtag tried to assist with solutions and workarounds for these issues.
- Integrating GPT with Other Platforms: @nurik7086 inquired about how to integrate GPT with Instagram and WhatsApp. No solutions were offered in this message rundown.
- GPT Discussions: Various discussions on the workings of GPTs took place. @ankur1900 sought help matching user-input sentences against a dataset using the Assistant API. Meanwhile, @quacky.games explored the possibility of creating chats with multiple roles, and @wilhaim questioned GPTs' benefits.
- VPN and Browser Issues: @apelambo reported issues with using GPT on desktop browsers but not on mobile, which resulted in a discussion of VPNs and their potential impact on ChatGPT use. It turned out that even an inactive VPN (@apelambo uses NordVPN) could cause issues.
- Uploading Documents to ChatGPT: @dracount, @samuleshuges, and @bazzingabalcony discussed the possibility and process of uploading documents to ChatGPT for rewriting or input analysis.
▷ Channel: gpt-4-discussions (86 messages):
- ChatGPT Customizations: Many users shared their experiences customizing ChatGPT, including the use of symbolic placeholders (@docwobble), APIs and Google Docs integration (@loschess & @woodenrobot), and instructing the model to provide summaries of varying lengths (@duckbow).
- GPT-4 Issues: Several users expressed frustrations with GPT-4. Discussion ranged from inconsistent behavior, such as not referring to provided sources over its default knowledge (@pietman), to network errors and unresponsive states across different versions (@helloaang, @XXMINECRAFTGODXX, @strangeknoll, @pietman, @mirel).
- Sharing Custom GPTs: Users shared their custom GPTs for feedback and review. @martinr_33972 invited potential investors and software developers to check out his GPTs, @drkuberansrinivasan_02977 was trying to find how to view comments on his shared GPTs, and @ajlbs unveiled GPTs based on spirituality and the wisdom of the Bhagavad Gita.
- Seeking Help: Various users asked for advice or troubleshooting help, including @jrvra looking to make a GPT for reviewing journal papers, @_vincent32 wanting a professional email-marketing GPT, @mom_h8_my_guns__ needing help fine-tuning a personal GPT, and @wilhaime curious about the benefits of GPTs.
- API and Plug-ins: @ionknowu asked if they could use their custom GPT for a personal project through an API, but @elektronisade clarified that OpenAI ChatGPT is isolated from API access and the Assistant API must be used instead. @Lo Mein was interested in integrating plugin functions into custom GPTs, and @woodenrobot provided a link to the documentation and confirmed there are useful posts on the forums for further assistance.
▷ Channel: prompt-engineering (22 messages):
- Understanding Complex Symbols: @mysticmarks1 highlighted the utility of understanding how GPT-3 interprets symbols such as {}, [], and () in different contexts like programming, mathematics, and textual roleplay. This understanding can facilitate better mixed-code/natural-language system formats.
- Stable-Diffusion Prompt Crafter: @korner83 built a GPT for prompt generation named "Stable-Diffusion Prompt Crafter". The tool also supports weighted prompts and wildcards. The Reddit thread can be found here.
- Discussion on Sharing Links: @madame_architect asked if sharing external links, such as to arXiv or Reddit, is now acceptable. @eskcanta responded that while it can still be considered risky, linking to key articles and discussing the main points can invite more interaction.
- Challenges with Interpretation of Transcripts: @greenysmac is facing issues with prompts run against a pair of transcripts with three speakers each, which they've fed to GPT-3 and ChatGPT; the AI is inconsistent about the number of speakers it detects. @solbus suggested this could be due to LLMs' limited ability to perform simple counting tasks without external tools.
- Identifying GPT-4 Model Version: @hoodiewoody inquired about how to identify the version of their GPT-4 model. @fran9000 suggested using the search box in a specific channel to filter the list of GPTs.
▷ Channel: api-discussions (22 messages):
- Understanding Symbols in Different Contexts: @mysticmarks1 shared a detailed explanation of how GPT-3 interprets the symbols [], {}, and () in various contexts – programming, mathematics, and text annotation or roleplay. Channel members found this extremely useful.
- Creating a GPT for Prompt Generation: @korner83 made a GPT for prompt generation that supports weighted prompts and wildcards. He shared the link to the GPT and invited others to give it a try.
- Issues with Comments Visibility on Shared GPTs: @drkuberansrinivasan_02977 expressed concern about not being able to see comments on the GPTs he shared publicly.
- Sharing arXiv Links: @madame_architect asked if it's allowed to share arXiv links on the channel. @eskcanta replied that although it's still uncertain, significant segments from such articles can be shared for discussion, and stressed the importance of seeking review and help from the moderators if hit hard by AutoMod.
- Prompt Testing Issues: @greenysmac shared issues experienced when testing prompts using ChatGPT in the Playground on current GPT-4, Poe on GPT-4, and a GPT build, noting inconsistencies in the model recognizing and recalling the number of speakers in a given transcript. @solbus speculated that this could be due to language models' inability to perform mathematical tasks, like counting, reliably.
LangChain AI Discord Summary
- Discussion centered around Chatbot Troubleshooting, including code and environment variables for a vector database tool and agent execution. User a404.eth had issues beyond basic greetings with their chatbot, sharing Python code and environment configuration for peer advice.
- Query on integrating OpenAI GPT-4 with RAG was raised by jupiter.io, specifically on modifying the weighting affecting RAG's data selection. haste171 recommended modifying search_k in the vector retriever's query kwargs as a possible solution.
- wolfwood1862 encountered missing-import issues, unable to import ApifyWrapper from langchain. To resolve this, haste171 suggested referring to the API reference or updating the LangChain package.
- Inquiry on AI Chatbot Data Handling for multiple input types (text, image, etc.) was raised by sid.pocketmail, including interest in a GPT-4-vision API for a chatbot and handling image retention for related user questions.
- A series of miscellaneous questions were raised, including running "OpenAIAssistantRunnable" as an agent (katerinaptrv), acquiring the client_id for Confluence (._.nobody._._), and resolving 503 responses with Azure OpenAI (_pabloe).
- Miscellaneous notes on AI and project updates from claragrey, jupiter.io, and fran.abenza.
- @veryboldbagel discussed LangChain Experimental features, including agents for handling pandas, CSV, and SQL data. @steve675 expressed a data-storage concern, as their data resides in a different database, via the Kusto client.
- Tips on input-keys setup for better functioning by @liminalstvte, and subsequent technical issues related to qaTemplate and the use of prompt inside qaChainOptions by @menny9762.
- A shared project named Seesaurus, a 3D visualization tool for English words, showcased by @recurshawn. Seesaurus allows users to visualize word relations in 3D space: Seesaurus
- A quote from Sam Altman on the perspective of technological growth shared by @rpall_67097: "When you are standing on the exponential curve of technology, it looks flat behind you and vertical in front of you, but it's just a curve."
LangChain AI Channel Summaries
▷ Channel: general (53 messages):
- Chatbot Troubleshooting: a404.eth shared an issue with their chatbot: it worked fine with simple greetings but ran into problems when trying to use a specific "tool". They shared a large block of Python code, along with its relevant imports, defining functions for creating a vector database tool, parsing source documents, and executing the agent, plus additional information about the environment variables used to configure the chatbot.
- OpenAI GPT-4 Integration with RAG: jupiter.io asked about the possibility of controlling a weighting that affects how much RAG draws from its original data compared to the custom data. haste171 suggested that search_k in the vector retriever's query kwargs can be modified, but was not completely sure. They further clarified how setting a different temperature value impacts the generativity of the AI.
- Missing Import Issues: wolfwood1862 encountered errors when trying to import certain modules (specifically, ApifyWrapper) from langchain. haste171 advised referring to the API reference or updating the LangChain package to resolve these issues.
- AI Chatbot Data Handling: sid.pocketmail inquired about an open-source project similar to ChatGPT that handles multiple inputs like text, image, etc., with agents in production. In a similar thread, he expressed interest in implementing the GPT-4-vision API in his chatbot and asked how to retain an image in memory so that users could pose related questions.
- Miscellaneous Questions: Several other specific queries were made, including katerinaptrv's question about running "OpenAIAssistantRunnable" as an agent using OpenAI's "code_interpreter" tool, ._.nobody._._'s question about getting the client_id for Confluence, and _pabloe's concern about encountering 503 responses while using Azure OpenAI.
- Miscellaneous Notes: claragrey noted that they have deployed an endpoint using an LLM and LangChain's ConversationChain, jupiter.io was interested in AIs that write by themselves, and fran.abenza expressed frustration at being on a waiting list for months.
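The search_k advice above is store-specific (the exact kwarg name varies between vector stores), but the underlying idea is simply a retrieval-depth parameter: how many documents get pulled into the RAG context. Here is a minimal, self-contained sketch of that idea with a toy in-memory retriever; all class and variable names are illustrative, not LangChain API:

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class TinyRetriever:
    """Toy in-memory vector retriever; `k` plays the role of search_k."""
    def __init__(self, docs):
        # docs: list of (embedding, text) pairs
        self.docs = docs

    def retrieve(self, query_emb, k=2):
        # Rank all documents by similarity, keep only the top k.
        ranked = sorted(self.docs, key=lambda d: cosine(query_emb, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

docs = [
    ([1.0, 0.0], "doc about topic A"),
    ([0.9, 0.1], "another doc about topic A"),
    ([0.0, 1.0], "doc about topic B"),
]
retriever = TinyRetriever(docs)
print(retriever.retrieve([1.0, 0.0], k=1))  # only the closest doc
print(retriever.retrieve([1.0, 0.0], k=3))  # broader context, more docs
```

Raising k widens the context handed to the model (more of the custom data), at the cost of more tokens and more noise, which is the trade-off the weighting question was getting at.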
▷ Channel: langchain-templates (7 messages):
- LangChain Experimental Features: @veryboldbagel mentioned that LangChain Experimental offers agents for handling pandas, CSV, and SQL data.
- Data Storage Concern: @steve675 indicated that their data is stored in a different database, accessed via the Kusto client.
- Input Keys: @liminalstvte advised setting input keys in the prompt for better functioning.
- Technical Issues: @menny9762 reported a problem that was not resolved even after setting input keys. They later discovered that qaTemplate is deprecated in the JavaScript library and switched to passing prompt inside qaChainOptions, which resolved the issue.
▷ Channel: share-your-work (1 messages):
- Seesaurus - A 3D Visualization of English Words: @recurshawn shared a project they developed called Seesaurus, a 3D visualization of English words where users can see how words relate to each other in 3D space, add words, and form clusters. The project was made with the intention of seeing how language models perceive our words. It supports all devices, but a large screen is recommended.
- Seesaurus link: https://seesaurus.com/ - Hacker News launch link: https://news.ycombinator.com/shownew
▷ Channel: tutorials (1 messages):
- Perspective on Technological Growth: @rpall_67097 shared a quote from Sam Altman about the exponential curve of technology: "When you are standing on the exponential curve of technology, it looks flat behind you and vertical in front of you, but it's just a curve."
Nous Research AI Discord Summary
- Extensive discussion on the projected size and training process of GPT-4. Users guessed that OpenAI likely had 5T or 6T unique tokens, with best results achieved with 2 epochs for text and 4 epochs for code-based text.
- Hypotheses regarding the structure of GPT-3.5 Turbo and GPT-4 Turbo, suggesting they may be smaller than expected and might use sparser attention mechanisms. Figures below 80 billion and even 50 billion parameters were discussed for Turbo.
- Comparative test results shared for evaluating the performance limitations of Claude and GPT-4 Turbo, with a link to Greg Kamradt's Twitter with the test results.
- The differences between the data precision types fp32 and bf16 were probed, with a link shared to a LinkedIn article for further understanding.
- Mention of the Chinese TigerBot model from TigerResearch available on Hugging Face; detailed information was unclear due to language constraints.
- Memory management and initial delays in loading models were explored. It was clarified that mmap doesn't necessarily increase loading speed but decreases initial delays before prompting.
- Limits and challenges associated with GPT model size, generation speed, and vRAM usage were discussed, particularly in relation to the HuggingFace implementation.
- Announcements and discussions about new large language models, especially Chinese LLMs (Yuan 2.0), its unusual sizes (2b, 51b, 102b), and license terms.
- Questions raised about Yuan 2.0's benchmarking methods, specifically the translating and rephrasing of questions, with suspicions of potential information leaks.
- Release of Perplexity.ai's online LLMs and their pricing details shared by @atgctg, with links to the official blog post and pricing page.
- Estimation of AI training costs, comparing hourly H100 rates across platforms like RunPod, AWS, and GCP, and considering SXM5 integration.
- Sharing of resources: the Oasis Corpus dataset, a GitHub repository for prompt lookup decoding, and interesting tweets on experimentation and knowledge distillation.
- Announcement of OpenHermes 2.5's inclusion in LMSys' ChatBot Arena for testing and comparison, with an invitation to test various models blind via chat.lmsys.org.
Nous Research AI Channel Summaries
▷ Channel: benchmarks-log (1 messages):
Since only a single partial message from user @qnguyen3 is provided with no substantive content, there isn't enough context to provide a summary. No relevant topics, user interactions, or discussions were identified in the message history.
▷ Channel: interesting-links (7 messages):
- Cost of AI Training: @nonameusr expressed surprise at the $2 hourly rate for the H100. @coffeebean6887 explained that this is likely a starting rate that increases for shorter reservations, but considered it competitive, especially given the SXM5 integration, comparing it to higher rates found on other platforms like RunPod, AWS, and GCP.
- Tweet on Experimentation: @metaldragon01 shared a tweet from @SebastienBubeck on experimentation.
- NLP Dataset: @euclaise shared a link to the Oasis Corpus dataset on Hugging Face.
- Tweet on Knowledge Distillation: @atgctg shared a tweet from @eugeneyan discussing knowledge distillation.
- Prompt Lookup Decoding: @Fynn shared a link to a repository that implements speculative decoding by string-matching parts of the prompt, which could be effective for input-grounded tasks.
- LLM Synthetic Data Blog: @atgctg shared a blog post discussing large language models (LLMs) and synthetic data generation.
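The prompt-lookup trick mentioned above rests on a simple observation: in input-grounded tasks (summarization, Q&A over a document), the model's output often re-uses spans of the prompt verbatim, so the tokens that followed a matching span in the prompt make cheap draft candidates for speculative decoding. A minimal sketch of the matching step, assuming word-level "tokens" (function and parameter names are mine; a real implementation verifies the draft with the full model in one forward pass):

```python
def prompt_lookup_draft(prompt_tokens, generated_tokens, ngram=3, max_draft=5):
    """Propose draft tokens by matching the tail of the generated text
    against the prompt (the core trick of prompt lookup decoding).

    Returns a list of candidate continuation tokens, or [] if no match.
    """
    if len(generated_tokens) < ngram:
        return []
    tail = generated_tokens[-ngram:]
    # Scan the prompt for the most recent occurrence of the tail n-gram.
    for start in range(len(prompt_tokens) - ngram, -1, -1):
        if prompt_tokens[start:start + ngram] == tail:
            cont = start + ngram
            return prompt_tokens[cont:cont + max_draft]
    return []

# Toy example with words standing in for tokens: the recent output
# re-uses a span of the prompt, so the prompt supplies the draft.
prompt = "the quick brown fox jumps over the lazy dog".split()
generated = "we saw the quick brown".split()
print(prompt_lookup_draft(prompt, generated))  # ['fox', 'jumps', 'over', 'the', 'lazy']
```

The drafted tokens are then checked against the target model's own predictions; accepted prefixes are kept, so correctness is unchanged and only speed improves when matches are common.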
▷ Channel: announcements (1 messages):
- OpenHermes 2.5 Testing on ChatBot Arena: @teknium announced that OpenHermes 2.5 has been included in LMSys' ChatBot Arena for testing and comparison. Users are invited to "[g]o and test out several models and compare them blind to determine who is the best!" The arena can be visited at https://chat.lmsys.org/.
▷ Channel: general (165 messages🔥):
- GPT-4 Training Observations: @ldj and @giftedgummybee had an insightful discussion about the training process and size of GPT-4. They deduced that OpenAI likely had 5T or 6T unique tokens, achieving best results with 2 epochs for normal text and 4 epochs for code text. The exact parameters of recent models weren't clear, leaving only estimations and assumptions about size and training tactics.
- Exploring Turbo Models: @ldj and @giftedgummybee hypothesized that GPT-3.5 Turbo could be under 80B or even below 50B parameters. They considered Turbo a toy product offering cheap hosting and a tool to counter open-source projects, and speculated that the attention mechanism might be sparser in Turbo models.
- Evaluation of Claude & GPT-4 Turbo: @coffeebean6887 posted test results from Greg Kamradt comparing the limitations of Claude and GPT-4 Turbo, demonstrating differing performance and limitations regarding token count and recall behavior: Test Results, Claude vs GPT-4.
- Differences Between Data Precision Types: @papr_airplane asked about the differences between fp32 and bf16 in fine-tuning models. @yorth_night provided a helpful LinkedIn article explaining these differences.
- TigerBot Discussion: Users mentioned the TigerBot model from TigerResearch on Hugging Face; however, the responses varied and detailed information wasn't easily accessible due to language constraints.
▷ Channel: ask-about-llms (77 messages):
- Discussion on Memory Management and Loading Models: @variav3030, @giftedgummybee, @coffeebean6887, @teknium, and @russselm discussed memory management and model loading. In particular, @russselm clarified that mmap doesn't necessarily speed up model loading as a whole; rather, it loads data on demand, which decreases the initial delay before the model can be prompted.
- Concerns about GPT Model Size: @coffeebean6887, @teknium, and @russselm discussed theoretical limits and challenges related to GPT model sizes and generation speed, particularly in the context of the HuggingFace implementation and vRAM usage.
- Release of Chinese LLMs (Yuan 2.0): @.benxh, @yorth_night, @coffeebean6887, and @teknium discussed the recent release of Chinese-language large language models, specifically Yuan 2.0. The model's unusual sizes (2b, 51b, and 102b) and unique license terms were noted as interesting aspects.
- Discussion on Yuan 2.0's Benchmarks: @coffeebean6887 and @yorth_night questioned some methods used in Yuan 2.0's benchmarking, specifically the translating and rephrasing of questions. It was suggested that the translation process might leak information and could potentially invalidate the test.
- Release and Pricing of Perplexity.ai Offering: @atgctg shared links to Perplexity.ai's latest blog post introducing their online LLMs and their pricing, but realized it might have been shared in the wrong channel.
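The mmap point above can be made concrete: mapping a file returns almost immediately because pages are read from disk only when first touched, whereas an eager read pays the full I/O cost up front. A small self-contained illustration using Python's standard mmap module and a 1 MiB stand-in file (not an actual model):

```python
import mmap
import os
import tempfile

# Write a stand-in "model file" (here just 1 MiB of zeros).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"\x00" * (1 << 20))

# Eager load: the whole file is read into memory up front.
with open(path, "rb") as f:
    eager = f.read()          # pays the full I/O cost now

# mmap: the call returns immediately; pages are faulted in on access.
with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first_byte = mapped[0]    # this access is what actually reads a page
    mapped.close()

os.remove(path)
print(len(eager), first_byte)
```

This is why mmap shrinks the delay before the first prompt without changing total I/O: weights still get read, just lazily, spread across the first accesses instead of concentrated at load time.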
Alignment Lab AI Discord Summary
- Discussion in the ai-and-ml-discussion channel revolved around RAG paragraphs and hybrid search. @adi_kmt referenced the common standard for RAG, where paragraphs are chunked by max tokens with an appended subheading for additional context, and noted that semantic and keyword searches are often combined in a technique called hybrid search, which frequently involves re-ranking results with tools like BM25 or SPLADE. Resource links shared for further reading: Pinecone Hybrid Search, Production RAG.
- The general-chat channel saw diverse discussions: @magusartstudios reported progress on an undisclosed project; @entropi shared a DeepMind blog post detailing the use of deep learning to discover millions of new materials; @altryne announced a live stream featuring LDJ focused on using WandB, intended for ML beginners, with a link made available.
- In the oo channel, @lightningralf provided a link to the Voltage Park site, which features a 24,000-H100 cluster that could interest community members.
Alignment Lab AI Channel Summaries
▷ Channel: ai-and-ml-discussion (1 messages):
- RAG Paragraphs and Hybrid Search: @adi_kmt discussed the common standard for RAG, noting that paragraphs are generally chunked by max tokens with a subheading/heading appended for additional context. They also mentioned the common method of combining semantic (embedding) search with keyword search in a technique called hybrid search, which often involves re-ranking results with tools like BM25 or SPLADE.
- Links shared for further reference: Pinecone Hybrid Search, Production RAG.
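The hybrid-search idea above boils down to blending a dense (embedding) score with a sparse (keyword) score and ranking by the fusion. A toy sketch under loose assumptions: cosine similarity for the dense side, and a crude term-overlap score standing in for a real sparse scorer like BM25 or SPLADE; all names and the alpha weighting are illustrative:

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    # Crude stand-in for BM25: fraction of query terms present in the text.
    terms = query.lower().split()
    hits = sum(1 for t in terms if t in text.lower().split())
    return hits / len(terms) if terms else 0.0

def hybrid_rank(query, query_emb, docs, alpha=0.5):
    """Blend dense and sparse scores: alpha weights the embedding side."""
    scored = []
    for emb, text in docs:
        score = alpha * cosine(query_emb, emb) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    return [t for _, t in sorted(scored, reverse=True)]

docs = [
    ([0.9, 0.1], "vector databases store embeddings"),
    ([0.1, 0.9], "bm25 ranks documents by keyword overlap"),
]
print(hybrid_rank("keyword overlap", [0.1, 0.9], docs, alpha=0.5))
```

In practice the two signals come from separate indexes (a vector store and an inverted index) and are fused or re-ranked afterward; the alpha knob trades off semantic recall against exact-term precision.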
▷ Channel: general-chat (3 messages):
- Work on a Project: @magusartstudios mentioned working on an unspecified project, sharing that they had made some progress.
- DeepMind Discovery: @entropi shared a link to a DeepMind blog post on the discovery of millions of new materials using deep learning.
- Live Stream on Machine Learning Tools: @altryne announced a live stream featuring LDJ on how to use WandB. The live stream, aimed at those new to machine learning, is scheduled to air on multiple platforms; a link was provided. Additionally, @altryne encouraged TTRPG Discord users to share the live stream in the Discord and expressed interest in learning from users with their own WandB setups.
▷ Channel: oo (1 messages):
- Voltage Park Cluster: @lightningralf shared a link to Voltage Park, which features a 24,000-H100 cluster that could be of interest to channel members.
Skunkworks AI Discord Summary
- @benxh initiated a discussion on investigating Yuan 2.0 in the guild.
- A technical question was raised by @papr_airplane on the difference between using fp32 and bf16 when fine-tuning.
- Project progress update in the ablateit-wandb-alerts channel: the Weights & Biases run named autumn-pyramid-122 of the project huggingface by user @ablateit has completed. Full run details are accessible on the wandb.ai website.
Skunkworks AI Channel Summaries
▷ Channel: general (2 messages):
- Yuan 2.0 Inquiry: User @benxh asked if anyone was investigating Yuan 2.0.
- Finetuning with fp32 vs bf16: User @papr_airplane queried about the difference between using fp32 and bf16 when finetuning.
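For readers wondering what is behind the fp32-vs-bf16 question: bf16 keeps fp32's 8 exponent bits (so the same dynamic range) but only 7 mantissa bits, halving memory at the cost of precision on small updates. The snippet below is a toy numeric illustration, not finetuning code; `to_bf16` simulates bfloat16 rounding in pure Python.

```python
# Toy illustration of the fp32 vs bf16 tradeoff: bf16 keeps fp32's
# 8 exponent bits (same dynamic range) but only 7 mantissa bits, so a
# small gradient update can round away entirely. to_bf16 simulates
# bfloat16 by rounding an fp32 value to its top 16 bits.
import struct

def to_bf16(x: float) -> float:
    """Round an fp32 value to bfloat16 (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

weight, update = 1.0, 1e-3  # a typical small optimizer step

fp32_result = weight + update                    # update survives
bf16_result = to_bf16(to_bf16(weight) + update)  # update rounds away

print(fp32_result)  # 1.001
print(bf16_result)  # 1.0 -- the step was lost to the 7-bit mantissa
```

This precision loss is why mixed-precision recipes commonly keep an fp32 master copy of the weights (or fp32 optimizer state) even when compute runs in bf16.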
▷ Channel: ablateit-wandb-alerts (1 messages):
- Weights & Biases Run Completion: The Weights & Biases run named autumn-pyramid-122 of the project huggingface by user @ablateit has completed. The run details can be found on the wandb.ai website.
LLM Perf Enthusiasts AI Discord Summary
- Discussions on the limitations and capabilities of Dedicated Instances were addressed, with @ampdot asking about the context limits and @res6969 clarifying the constraints as 8k/32k/128k.
- Introduction and feedback on Perplexity.AI's PPLX-Online-LLMs: @robhaisfield shared a blog post about the launch of models such as Mistral, codellama 34b, llama 70b, and search-powered models. @thebaghdaddy affirmed the tool's utility in topic exploration compared to Google search, and elaborated upon the process to @justahvee.
- Natural Language Systems in production were discussed: @res6969 expressed concerns about prompt injection, @nosa_ brought up experiences in PII redaction, and @pantsforbirds suggested a system of chaining two models for retrieval and redaction. @nosa_ revealed successful fine-tuning of models for PII redaction, while @pantsforbirds provided opinions on GPT-4's performance in PII removal.
- Questions were raised regarding fine-tuning the new tools and tool_calling features introduced by OpenAI, and a query about the maximum sample size for 3.5 fine-tuning was clarified by @robertchung referencing the github openai-cookbook.
- Recommendations for an open-source image-to-text model were requested by @jeffreyw128, with @__polarbear suggesting the use of native Mac OS OCR. Increasing the token limit on llama2 was discussed, with progressive summarization shared as a potential method by @joshcho_.
- Job opportunities with cash + equity compensation were mentioned by @blakeandersonw in the channel.
- An inquiry regarding a suitable LLM inference framework for CPU-only AWS instances was posed by @nosa_.
- Creation of a new channel in the performance category was reported by @jeffreyw128.
- Performance issues: @kev.o. reported slower performance and issues while using Azure.
- Strategies and challenges in AI prompting: @pantsforbirds shared a GitHub link for generating system prompts for Chat-GPT and asked about handling 'null' values in GPT-4's JSON responses. @joshcho_ sought advice on controlling response length. Research and methods on enhancing GPT-4 performance for medical benchmarks were also shared: Research Tweet.
LLM Perf Enthusiasts AI Channel Summaries
▷ Channel: general (6 messages):
- Context Limits with Dedicated Instances: User @ampdot inquired about the context limits with dedicated instances and received clarification from @res6969 that these limits stay the same: 8k/32k/128k.
- Introduction of PPLX-Online-LLMs by Perplexity.AI: @robhaisfield shared a link to the blog post about the introduction of PPLX-Online-LLMs by Perplexity.AI. Offering models such as Mistral, codellama 34b, llama 70b, and search-powered models, the API is reported to have a super-fast response time.
- Comparison to Google Search: @thebaghdaddy compared Perplexity's search tool favorably to Google search for topic exploration. According to him, Google now serves mostly for finding restaurant phone numbers.
- Use of Perplexity for Topic Exploration: When @justahvee asked for clarification, @thebaghdaddy further expounded on topic exploration with Perplexity on the free plan, explaining the process of asking increasingly specific questions and profiting from the referenced links provided by Perplexity to gain a deeper understanding of unfamiliar fields, like gene therapy.
▷ Channel: gpt4 (8 messages):
- Use of Natural Language Systems in Production: @res6969 expressed skepticism about using a Natural Language System in production due to susceptibility to prompt injection, while @nosa_ shared experiences of using local models for PII redaction at a privacy startup, with potential improvement through serious fine-tuning.
- Chaining Two Models for PII Redaction: @pantsforbirds suggested chaining two models, the first for retrieval and the second for redacting PII with a hard-coded prompt.
- Fine-Tuning Models for PII Redaction: @nosa_ clarified that they have successfully fine-tuned models for PII redaction, and proposed that a well-finetuned model could generalize on a notion of "private info" broader than static PII, despite current poor results.
- GPT-4 and PII Redaction: @pantsforbirds stated that GPT-4 does a decent job at removing PII, but achieving only 85-90% isn't satisfactory. The user speculated about the possibility of seeing chained model queries to navigate prompt-engineering attempts.
- GPT-4 Performance: @potrock shared a Twitter link about hopeful comments from OAI staff on lazy GPT-4. @.psychickoala noticed a slowdown in Azure OpenAI calls.
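The two-model chain discussed above can be sketched roughly as follows. Both stages are hypothetical stand-ins here: keyword matching replaces the retrieval model, and a pair of regexes replaces the fine-tuned redaction model with its hard-coded prompt; in practice each stage would be an LLM call.

```python
# Rough sketch of a two-stage chain: stage 1 retrieves a relevant
# passage, stage 2 redacts PII from it. Both model calls are stubbed
# with simple stand-ins; plug in your actual LLM client.
import re

REDACTION_PROMPT = (  # the hard-coded prompt a stage-2 model would receive
    "Remove all personally identifiable information from the text. "
    "Replace each item with the tag [REDACTED]."
)

def retrieve(query: str, corpus: list[str]) -> str:
    """Stage 1 stand-in: keyword overlap instead of an LLM/vector store."""
    terms = set(query.lower().split())
    return max(corpus, key=lambda doc: len(terms & set(doc.lower().split())))

def redact(text: str) -> str:
    """Stage 2 stand-in: regexes where a fine-tuned model would go.
    Only catches emails and phone-like numbers; a model generalizes further."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[REDACTED]", text)
    text = re.sub(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b", "[REDACTED]", text)
    return text

corpus = [
    "billing policy: invoices are due in 30 days",
    "contact alice at alice@example.com or 555-867-5309 about billing",
]

passage = retrieve("billing contact", corpus)
print(redact(passage))  # contact alice at [REDACTED] or [REDACTED] about billing
```

Separating retrieval from redaction means the second stage never sees the user's raw query, which is part of why chaining was floated as a defense against prompt-engineering attempts.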
▷ Channel: finetuning (3 messages):
- Fine-tuning for new tools and tool_calling: @robertchung asked if anyone has done fine-tuning for the new tools and tool_calling aspects of OpenAI yet.
- Long Context Samples in 3.5 Fine-tuning: @hassantsyed asked for clarification about the maximum sample size for 3.5 fine-tuning, referencing the OpenAI docs and noting a discrepancy between the stated 4k max sample size and a later statement supporting up to 16k context examples. He provided a link to the token-limits guide.
- Answer to Long Context Samples Confusion: @robertchung responded to @hassantsyed's question, indicating that he found the answer on the github openai-cookbook page.
▷ Channel: opensource (4 messages):
- Best Opensource Image to Text Model: @jeffreyw128 asked for recommendations for the best open-source image-to-text model because of issues with gpt4-vision. @__polarbear recommended using native Mac OS OCR.
- Increasing Token Limit on llama2: @dongdong enquired about increasing the token limit on llama2. A potential workaround suggested by @joshcho_ is to use progressive summarization.
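The progressive-summarization workaround can be sketched as a loop: chunk the text to fit the context window, summarize each chunk, join the summaries, and repeat until the whole thing fits. Here `summarize` is a hypothetical stand-in (crude truncation) for an actual LLM summarization call, and the 200-character "window" stands in for the model's token limit.

```python
# Sketch of progressive summarization for texts longer than the model's
# context window: chunk, summarize each chunk, join the summaries, and
# repeat until the result fits. summarize() is a hypothetical stand-in
# for a real LLM call.
def summarize(chunk_text: str) -> str:
    """Stand-in for an LLM call; a real one would compress meaning,
    not just truncate to the first ~60 characters."""
    if len(chunk_text) <= 60:
        return chunk_text
    return chunk_text[:60].rsplit(" ", 1)[0] + " ..."

def chunks(text: str, max_chars: int) -> list[str]:
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def progressive_summary(text: str, max_chars: int = 200) -> str:
    # Each pass shrinks the text, so the loop terminates.
    while len(text) > max_chars:
        text = " ".join(summarize(c) for c in chunks(text, max_chars))
    return text

long_text = "the quick brown fox jumps over the lazy dog " * 40
print(len(progressive_summary(long_text)))
```

The tradeoff is lossiness: every extra summarization pass discards detail, so this trades fidelity for fitting inside a fixed context window rather than actually extending it.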
▷ Channel: collaboration (1 messages):
- Job Opportunities: User @blakeandersonw mentioned opportunities for part/full-time work with cash + equity compensation.
▷ Channel: speed (1 messages):
- LLM Inference Framework Query: User @nosa_. asked for recommendations on a favorite LLM inference framework specifically for CPU-only AWS instances.
▷ Channel: feedback-meta (1 messages):
- New Channel Creation: User @jeffreyw128 reported the creation of a new channel with the ID #1179271229593624677 in the performance category.
▷ Channel: openai (2 messages):
- Slower Performance: @kev.o. reported that they are still experiencing slower performance than usual.
- Issue on Azure: @kev.o. also reported further difficulties while using Azure.
▷ Channel: prompting (8 messages):
- System Prompt Generation for Chat-GPT: @pantsforbirds shared a link to a GitHub resource that illustrates an approach to generating system prompts for Chat-GPT, available at LouisShark/chatgpt_system_prompt.
- Issue with null Values in GPT-4's JSON Responses: @pantsforbirds asked for advice on managing the issue of 'null' strings appearing in the JSON responses from GPT-4.
- Controlling Response Length in Conversational Agents: @joshcho_ sought advice on how to prevent characters from providing overly long responses when asked for information. @dongdong suggested incorporating a character or word count limit in the prompts.
- Prompting for Medical Benchmarks: @pantsforbirds discovered and shared research on advanced prompting techniques that improve the performance of GPT-4 on medical benchmarks. The result mentioned in the tweet is particularly significant: Research Tweet.
- Interpretation of Medical Task Method: @thebaghdaddy offered insight on the medical task method, stating it relies on text embedding. They appreciated the shared information and indicated plans to conduct tests based on it.
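One generic way to cope with the 'null'-string issue raised above is post-processing: parse the model's JSON, then recursively normalize any literal "null" strings to real None values before consuming the result. This is a general-purpose sketch, not an OpenAI-documented fix.

```python
# Normalize the literal string "null" (any case) to a real None in a
# parsed JSON structure, so downstream code sees proper null values
# instead of the string a model sometimes emits.
import json

def normalize_nulls(obj):
    """Recursively map the string 'null' to None in dicts/lists/scalars."""
    if isinstance(obj, dict):
        return {k: normalize_nulls(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [normalize_nulls(v) for v in obj]
    if isinstance(obj, str) and obj.strip().lower() == "null":
        return None
    return obj

raw = '{"name": "Ada", "email": "null", "tags": ["null", "dev"]}'
clean = normalize_nulls(json.loads(raw))
print(clean)  # {'name': 'Ada', 'email': None, 'tags': [None, 'dev']}
```

The obvious caveat: this also rewrites fields where "null" was a legitimate string value, so it should be applied only to schemas where that can't occur.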
MLOps @Chipro Discord Summary
- Shared community events on platforms: @jaskirat posted a link to a Luma event in the event channel.
- Discussion on anti-spam measures: User @gramaras referenced numerous anti-spam measures in the context of a conversation within the general ML channel.
- Knowledge sharing on SIMD Base64: @mattrixoperations shared an informative article on SIMD Base64 in the general ML channel.
- Employment opportunities: @wangx123, the CEO of a start-up, extended an invitation to anyone interested to join their company in the general ML channel.
MLOps @Chipro Channel Summaries
▷ Channel: events (1 messages):
- Event Link Posted: @jaskirat shared a link to an event on the platform Luma.
▷ Channel: general-ml (3 messages):
- Anti-spam Measures: User @gramaras mentioned the existence of numerous anti-spam measures in the context of a conversation.
- SIMD Base64: @mattrixoperations shared an informative article on SIMD Base64.
- Job Opportunity at a Start-Up: @wangx123, the CEO of a start-up, invited those interested to consider joining their company.
The Ontocord (MDEL discord) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The AI Engineer Foundation Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
Perplexity AI Discord Summary
Only 1 channel had activity, so no need to summarize...
Perplexity AI Channel Summaries
▷ Channel: announcements (1 messages):
- Announcement of pplx-api coming out of beta and move to usage-based pricing: User @ok.alex informed that Perplexity's pplx-api is now out of beta and will be moving to usage-based pricing. This includes the live LLM APIs that are grounded with web search data and have no knowledge cutoff. More details at http://pplx.ai/online-llms.
- Introduction of the "online" models: The new models, pplx-7b-online and pplx-70b-online, have been trained in-house by building on top of Mistral and Llama 2, and fine-tuned for accuracy and helpfulness. @ok.alex stated these models are believed to surpass GPT-3.5 and Llama 2 in answering questions with search grounding.
- Provision of pplx-api: With pplx-api coming out of beta, users can now access pplx-online, pplx-chat, and open-source LLMs like Mistral and Llama 2 on Perplexity's in-house infrastructure, with usage-based pricing. Perplexity 👁 users are entitled to a $5 monthly credit. Inquiries should be directed to api@perplexity.ai.
- Announcement about AWS re:Invent keynote speech: Perplexity's team will be on the keynote stage at the Amazon Web Services (AWS) re:Invent to shed more light on this. The keynote can be viewed at https://go.aws/3GbYl4U.
The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.