[AINews] AI Discords Newsletter 11/4/2023

        November 4, 2023

[AINews] AI Discords Newsletter  11/4/2023

This is AI News! an MVP of a service that goes thru all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜

        Guild: Latent Space
Latent Space Guild Summary

Discussing the challenges involved in achieving Artificial General Intelligence (AGI); initiated by @binary6818.
Distribution of articles related to AI from The New Stack by @swyxio:
Article on How to fill the 27 million AI Engineer Gap
AI Engineer Summit Wrap-up and Interview with co-founder Swyx
Article on The AI Engineer Foundation: Open Source for the Future of AI

Coverage on OpenAI's Massive ChatGPT Updates Leak shared by @swyxio.
OpenAI's Massive ChatGPT Updates Leak Ahead of Developer Conference

Discussion around Devday Leaks and Rumors.
Query by @stealthgnome for papers similar to Anthropic's Constitutional AI, regarding methods to steer LLM to behave in a particular way.
@swyxio showed interest in similar papers to Anthropic's.

Latent Space Channel Summaries

### Channel: [ai-general-chat](https://discord.com/channels/822583790773862470/1075282825051385876)

Summary (2 messages): 

AI General Discussion Summary:

    Challenges to Achieve AGI: @binary6818 initiated a discussion topic about the challenges in achieving Artificial General Intelligence (AGI).
    New Stack Coverage: @swyxio shared multiple articles from The New Stack related to AI, including a wrap-up of an AI Engineer Summit and a discussion about the AI Engineer Foundation. 

        Article on How to fill the 27 million AI Engineer Gap
        AI Engineer Summit Wrap-up and Interview with co-founder Swyx
        Article on The AI Engineer Foundation: Open Source for the Future of AI

    OpenAI's Massive ChatGPT Updates Leak: @swyxio posted a link to an article titled "OpenAI's Massive ChatGPT Updates Leak Ahead of Developer Conference."
    Devday Leaks and Rumors: @swyxio brought forth a topic about leaks and rumors surrounding an upcoming developer day.

### Channel: [llm-paper-club](https://discord.com/channels/822583790773862470/1107320650961518663)

Summary (2 messages): 

Discussion on Similar Papers to Anthropic's Constitutional AI:

    @stealthgnome asked if there are any similar papers to the "Anthropic's Constitutional AI" where methods are applied to steer the LLM to behave in a certain manner.
    @swyxio also showed interest in similar papers to Anthropic's. 

---

## Guild: [OpenAI](https://discord.com/channels/974519864045756446)

### OpenAI Guild Summary

- **GPT Chatbot Customization**: Discussion on customizing the chatbot's personality using the "Custom Instructions" feature. 
- **AI Startups**: Conversation on OpenAI's stance on developing new AI products for startups with suggestions to look up OpenChat.
- **Additional Information Input**: Method shared by user @omeranzar for inputting additional information into fine-tuned AI model.
- **Access to Alpha Features**: Community discussion on challenges in gaining access to Alpha features for users without Alpha access.
- **Prompt Engineering**: Exploration of strategies for effective prompting with GPT-4, suggesting presenting tasks one at a time for optimal performance.
- **ChatGPT Performance and Limitations**: Debate on the capabilities and constraints of ChatGPT, including suggestions on how to manage known performance issues.
- **Support and Help**: Users sought help from OpenAI Support, with multiple users reporting issues accessing the help portal.
- **Debate on Access to New Features**: Dissatisfaction among users over not having early access to new features even as paid subscribers.
- **Discussion on Personal Circumstances**: Discussion on personal circumstances such as homelessness and ways to increase ChatGPT limits for better integration in daily life.
- **Knowledge Cut-off Date**: Conversation concerning the knowledge cut-off date for various models of GPT.
- **Access problems of OpenAI services**: Users reported difficulties with accessing different OpenAI services, with community suggestions and solutions provided.
- **Usage of OpenAI in multi-user scenarios**: Discussions about potential risks associated with multi-user usage of a single ChatGPT account, emphasizing potential violation of Terms of Service.
- **Semantic understanding and limitations of ChatGPT:** Discussion on GPT Plus's proficiency in image scene identification but limitations in text extraction from images.
- **Usefulness of GPT Plus subscription**: Questions and discussion about the benefits of GPT Plus subscription, with a focus on academic purposes.
- **Queries regarding 'All Tools'/Alpha version**: User queries about the access and benefits of the 'All Tools' or Alpha version of ChatGPT. Users directed to follow specific Discord channels for updates.
- **Discussion about Terms of Use and account suspension**: Concerns about possible account suspension risks when sharing an account with multiple users and potential violation of OpenAI's Terms of Service.
- **Discussion on Python code execution issue**: Discussion on technical issue related to Python code execution.
- **Commercial use of DALL-E generated images**: Discussion about the commercial use of 3D images created with DALL-E; users share OpenAI policy.
- **Access to GPT-4 Alpha**: Reports of non-Plus users being able to access GPT-4 Alpha, with concurrent mention of a problem with image generation.
- **Image Text Recognition**: Users engage in a point-counterpoint scenario about GPT Plus's ability to recognize text within images.
- **Function Usage in OpenAI NPM Package**: Request for guidance on using functions within the OpenAI NPM package without direct assistance offered in chat.

OpenAI Channel Summaries

### Channel: [ai-discussions](https://discord.com/channels/974519864045756446/998381918976479273)

Summary (4 messages): 

AI-Discussions Summary:

    GPT Chatbot Customization: @lavender05 asked about customising the chatbot's personality, to which @solbus advised using the "Custom Instructions" part of the app/website.
    AI Startups: @world_designer sought a source for information on OpenAI's position on developing new AI products for startups. @drinkoblog.weebly.com suggested looking up OpenChat as a possible source.
    Additional Information Input: @omeranzar shared a method for inputting additional information into the fine-tuned model via system context within chat, with preliminary successes with Retrieval Augmented Generation.
    Access to Alpha Features: There was discussion between @leua61, @world_designer and others about gaining access to Alpha features. The consensus was that it is not possible if you don't already have access, except for the All Tools feature, which is still in the Alpha stage.
    Prompt Engineering: @eskcanta and others discussed strategies for effective prompting with GPT-4, recommending presenting tasks one at a time rather than in bulk for better performance.
        Link: Reddit post suggested by @hydverse for more insight.

### Channel: [openai-chatter](https://discord.com/channels/974519864045756446/977697652147892304)

Summary (4 messages): 

OpenAI Discord Chatbot Message Summary:

    ChatGPT Performance and Limitations: Various users discussed many facets of ChatGPT's capabilities and limitations, including instances of the AI not meeting expectations (e.g., 'getting back to user', generating images, misspelt words). There are suggestions to manage the known performance issues, like using model 3.5 for tasks it's good at and using model 4 for tasks that 3.5 isn't.
    Support and Help: Users were trying to reach out to OpenAI Support for various reasons, with some facing issues in accessing the help portal help.openai.com. Direct links and alternate channels of support were provided.
    Debate on Access to New Features: Users have expressed dissatisfaction over not receiving early access to new features despite being paid subscribers, referencing OpenAI's promise of 'priority access to new features and improvements' for paid users.
    Discussion on Personal Circumstances: User @nomaddad discussed their personal situation of homelessness and sought to increase their ChatGPT limits for better integration in day-to-day life. Users offered advice and workarounds to manage within existing limits.
    Knowledge Cut-off Date: Users discussed and sought confirmation on the knowledge cut-off date for different models of GPT (Mainly GPT 3.5 and GPT 4), with mentions of dates ranging from January 2022 to April 2023.

### Channel: [openai-questions](https://discord.com/channels/974519864045756446/974519864045756454)

Summary (4 messages): 

Summary: Discord Channel OpenAI-Questions:

    Access problems of OpenAI services:: Multiple users (@jaaf_studio, @Bray, @x.E, @Top J, @whackscript, @tgbrkdlbz, @moniqueg, @cowhimself, @wwwidonja, @askanhelstroem) reported various problems about accessing or using different OpenAI services - from access to specific features like DALL-E or web connection in chat to possible issues about subscriptions, problems accessing with different devices or browsers, errors thrown or downloading issues. Multiple suggestions and solutions were proposed by @eskcanta, @smilebeda, @solbus among others, including suggestions to reach out to official support or dev channels of OpenAI.

    Usage of OpenAI in multi-user scenarios:: User @zhengyuancheng discussed the possibility of multi-user usage of a single ChatGPT account and potential risks associated with it. Other users, including @solbus and @eskcanta, provided responses emphasizing the potential violation of Terms of Service and possible limitations due to usage caps.

    Sematic understanding and limitations of ChatGPT:: User @zhengyuancheng also discussed about the proficiency of GPT Plus in image scene identification but its limitations in text extraction from images.

    Usefulness of GPT Plus subscription: @ykaiser asked about the benefits of GPT Plus subscription, especially for academic purposes, which led to a discussion highlighting the possibility of a larger context window in the future.

    Queries regarding 'All Tools'/Alpha version:: Several users asked about the access and benefits of the 'All Tools' or Alpha version of ChatGPT. Access information was provided by OpenAI Dev SOLBUS, directing users to follow specific Discord channels for updates.

    Discussion about Terms of Use and account suspension:: @zhengyuancheng sparked a discussion about possible account suspension risks when sharing an account with multiple users. @solbus and @eskcanta reinforced that this approach could violate OpenAI's Terms of Service.

    Discussion on Python code execution issue:: @primus727 has an issue with Python code execution in an Advanced data analysis chat, @eskcanta helped figure out the issue.

### Channel: [gpt-4-discussions](https://discord.com/channels/974519864045756446/1001151820170801244)

Summary (4 messages): 

Summary of gpt-4-discussions:

    Commercial use of DALL-E generated images: @wintre inquired about the possibility of commercially using 3D images created with DALL-E. @zachisuppose responded by sharing the OpenAI policy about the same, stating that the users own the images they create with DALL-E with the rights to reprint, sell, and merchandise, complying to the Content Policy and Terms of the company. Relevant link: https://help.openai.com/en/articles/6425277-can-i-sell-images-i-create-with-dall-e

    Access to GPT-4 Alpha: @the_lemon_man reported unexpected access to GPT-4 Alpha as a non-Plus user. There was, however, no concrete explanation provided as to why this might have occurred. Also mentioned was a problem with image generation, indicating that DALL-E 3 is malfunctioning.

    Image Text Recognition: @zhengyuancheng pointed out a proficiency gap in GPT Plus, where it can recognize scenes and content within pictures effectively but struggles to recognize text within images in a similar manner. @qacona shared a countering experience of successful text recognition from a document image when they explicitly asked the chatbot to read the text. 

    Function Usage in OpenAI NPM Package: @m1.js asked for guidance on how functions work within the OpenAI NPM package. However, no direct answer or assistance was offered in this chat excerpt. 

---

## Guild: [LangChain AI](https://discord.com/channels/1038097195422978059)

### LangChain AI Guild Summary

- **Improvisations for LangChain AI**: An idea of sending user messages in separate chains to improve the accuracy of GPT chat models. This method specifically asks the model about tool requirements.

- **LangChain AI Python Code Request**: There was a request for a python code to create a LangChain AI chatbot with specific functions and tools. The requirements specified are the usage of the AZURE chat model, ability to search online using serpapi, and answering questions with referenced sources through the Chroma vector database.

- **Challenges and Guidance in LangChain.js**: Some users sought help in modifying agent prompts and building 'openai-functions' type agent with a retriever tool in langchain.js. They were interested in instructional materials or tutorials for the same.

- **Issues with `Langchain serve` command**: During discussions around technical issues and their solutions, it was clarified that `langchain serve` doesn't automatically load env vars from a .env file because neither langchain or langserve includes dotenv as a dependency. The solution was to export them before running the command or modifying the scripts with `dotenv.load_dotenv()`.

- **Pirate Speak Playground Server Error**: One member mentioned a persistent 'Not Found' error while trying to access a specific URL, leading to a discussion about cross-verifying the version of Langserve used and the potential solution of opening an issue in the langserve repo.

- **LangChain & OpenAI Python Client Library Wrapper Project**: A member presented their project, a wrapper developed on top of the LangChain & OpenAI python client library. The wrapper was designed to improve the handling of TPM & RPM headers from OpenAI and the utilization of multiple keys. Related resource mentioned: [https://pypi.org/project/langchain-openai-limiter/](https://pypi.org/project/langchain-openai-limiter/)

- **Query regarding LangChain agent**: One member queried about LangChain agent utilization, with advice from another member to use the functions concept. A relevant example was shared: [https://github.com/pinecone-io/langchain-retrieval-agent-example](https://github.com/pinecone-io/langchain-retrieval-agent-example)

- **Help requests regarding LangChain errors**: Certain users sought help with LangChain errors. However, no solutions or advice were provided in the reviewed messages.

LangChain AI Channel Summaries

### Channel: [general](https://discord.com/channels/1038097195422978059/1038097196224086148)

Summary (4 messages): 

LangChain AI Discord Chatbot Messages Summary:

    Improvising LangChain AI Usage: @dent offered a workaround to improve the accuracy of GPT chat models by sending user messages in separate chains and specifically asking the model about tool requirements.

    LangChain Python Code Request: @noureldin_93431 requested a Python code to create a LangChain AI chatbot with specific functions and tools. The chatbot should use the AZURE chat model, search online using serpapi, and answer questions with referenced sources through the Chroma vector database.
    Need Guidance for LangChain.js Usage: @mughi_94675 is seeking help in modifying the agent prompt post tool observation while building an 'openai-functions' type agent with a retriever tool in langchain.js. They were interested in instructional materials or tutorials for this purpose.
    Query about RetrievalQA.from_chain_type : @brio99 raised a concern about being unable to modify the prompt for the map_reduce method while using RetrievalQA.from_chain_type. They pointed out the lack of documentation on these steps.

### Channel: [langserve](https://discord.com/channels/1038097195422978059/1170024642245832774)

Summary (4 messages): 

Technical Issues & Solutions:

Langchain `serve` command issue: @xleven clarified that `langchain serve` will not automatically load env vars from .env file since neither langchain or langserve use dotenv as a dependency. To resolve this, they suggested exporting them before running the command or modifying the scripts with `dotenv.load_dotenv()`.
Pirate Speak Playground Server Error: @attila_ibs encountered a 'Not Found' error while trying to access http://127.0.0.1:8000/pirate-speak/playground/. The issue persisted with HTTP 500 error throwing a 'UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 163499: character maps to ' when set up manually.
Solution Suggestion & Verification of Langserve version: @veryboldbagel suggested verifying the version of langserve being used and opening an issue in the langserve repo if the problem persists with the most recent version. @attila_ibs confirmed they are using the most recent versions of Langserve (0.0.22) and Langchain (0.0.330).

### Channel: [share-your-work](https://discord.com/channels/1038097195422978059/1038097372695236729)

Summary (4 messages): 

LangChain & OpenAI Python Client Library Wrapper:

    @Pino_4321 shared their finished project, a wrapper developed on top of the LangChain & OpenAI python client library. This wrapper aims to improve two key areas:

            The handling of TPM & RPM headers from OpenAI. The wrapper ensures that tokens and requests are available instead of consistently triggering retries.
            The utilization of multiple keys. The wrapper is designed to estimate the token requirements for requests and uniformly chooses from the keys that have adequate TPM & RPM. If none are available, it picks a random key and waits. 

     Links: 
        https://pypi.org/project/langchain-openai-limiter/

### Channel: [tutorials](https://discord.com/channels/1038097195422978059/1077843317657706538)

Summary (4 messages): 

Introduction and Assistance with Langchain Concepts:

    Utilization of Functions Concept using Langchain Agent: In response to @hamza_sarwar_'s query, @mughi_94675 directed them to use the functions concept with a LangChain agent.
    In response to @ashok_71342's plea for help with a LangChain error, no solution or advice was provided in the reviewed messages.
     Links: 
        https://github.com/pinecone-io/langchain-retrieval-agent-example: An example shared by @mughi_94675 that expounds on the functions concept with LangChain agent.

---

## Guild: [Nous Research AI](https://discord.com/channels/1053877538025386074)

### Nous Research AI Guild Summary

- **Discussions on various AI models**, with special attention to Hermes 2.5, Zephyr, the Llama 2 Chat, Vision-Flan, and Claude 2. The effectiveness of Reinforcement Learning from Human Feedback (RLHF) methods in models like Hermes 2 and Zephyr was debated. Issues with performance degradation in Mistral models were raised, with insights on token count's effects on perplexity. A user-initiated discussion highlighted the potential of Vision-Flan, while doubts about Claude 2's performance were expressed. Links to pertinent papers were shared:
    - [The Llama 2 Long Paper](https://arxiv.org/pdf/2309.16039.pdf)
    - [The ACM Paper on Divergence Concerns](https://dl.acm.org/doi/abs/10.1145/3600211.3604690)

- **Exploration of creating a hypothetical ZRAM** on MacOS by emulating it with zstd and memory arenas was undertaken. Relevant code for creating a ramdisk on MacOS was found and a link to StackExchange was shared: [Ramdisk in MacOS](https://apple.stackexchange.com/questions/461889/ram-disk-in-macos-ventura)

- Users encountered **problems with fine-tuning crashes and data manipulation**, pointing out the presence of outlier conversations with large token counts. Suggestions included filtering datasets prior to actual training.

- **Mobile usage of Large Language Models (LLMs)**, with a user detailing their successful experience of running Hermes-2.5 using the LLMfarm app on their iPhone.

- **UI design concepts in AI applications** was another talking point, with a suggestion for using the combination of GPT-4 and DALL-E for logo design.

- Interesting **datasets and analysis platforms** were shared and discussed, particularly the Vision-Flan 191-task 1k dataset on Hugging Face, the LongLoRA dataset, Function Calling Extended dataset, and the LongAlpaca-12k dataset. The utility of the dataset for fine-tuning Yarn and the variety of tasks included was noted. [Vision-Flan 191-task 1k dataset](https://huggingface.co/datasets/Vision-Flan/vision-flan_191-task_1k?row=0), [LongLoRA](https://github.com/dvlab-research/LongLoRA), [Function Calling Extended dataset](https://huggingface.co/datasets/Trelis/function_calling_extended), [LongAlpaca-12k dataset](https://huggingface.co/datasets/Yukang/LongAlpaca-12k).

- **Discussion on the capabilities of chatbots**, with focus given to building AI models with understanding of emotions and personality.

- **Practical discussions** on implementation of models with sharing of personal projects, collaborations, and explorations in technologies such as autogen and llava.

- **Miscellaneous discussions** covering a hypothetical crime scenario, advice for simple LLM training setup for beginners, comparison of Nous models in a Twitter thread and a Yarn demo, Link to a GitHub notebook on agent teaching in autogen and a YouTube video on running autogen locally. Concern over the recent OpenAI rumor on startups and product development impact due to control and customization constraint was also noted.

Nous Research AI Channel Summaries

### Channel: [ctx-length-research](https://discord.com/channels/1053877538025386074/1108104624482812015)

Summary (6 messages🔥): 

Emulating ZRAM on MacOS:

    Emulating ZRAM on OSX Using Zstd and Memory Arenas: @chadbrewbaker asked if there is code to emulate zram on OSX via zstd and some memory arenas. 

    Code for Ramdisk on OSX: @chadbrewbaker found possible code for ramdisk on OSX and shared a link. 

     Links: 
        https://apple.stackexchange.com/questions/461889/ram-disk-in-macos-ventura

### Channel: [off-topic](https://discord.com/channels/1053877538025386074/1109649177689980928)

Summary (6 messages🔥): 

Off-Topic Discussion Highlights:

    Fine-Tuning Crashes and Data Manipulation Issues: @yorth_night expressed struggles with his fine-tuning crashing and observed data manipulation issues, identifying a 235k token outlier conversation as the cause. Further, he mentions it was crashing at the tokenization phase due to padding to max length.

    Similar Experiences: @youngphlo mentioned encountering a similar problem a few months ago involving a row with over 100k tokens.

    Dataset Filtering Advice: @giftedgummybee recommended always filtering datasets before initiating actual training.

    Mobile LLM Usage: @tsunemoto shared experiencing good speed and minimal quality loss running Hermes-2.5 off his iPhone using the LLMfarm app. He humorously suggested it could be a survival tool if ever stranded on an island with only a solar power bank and phone with a LLM.

    Application Logo Design LLM: @iamgianluca asked for multimodal LLM suggestions to design a logo for his pet project application. @teknium suggested using a combination of GPT-4 and DALL-E.

### Channel: [interesting-links](https://discord.com/channels/1053877538025386074/1132352574750728192)

Summary (6 messages🔥): 

Discussion on Vision-Flan and Other AI Models:

    Vision-Flan 191-task 1k Analysis: @euclaise shared the link to the Vision-Flan 191-task 1k dataset on Hugging Face. @cybertimon expressed interest in seeing this model being quantized and observed that based on benchmarks, it seems to outperform Falcon 180b.
    Mistral's Valuation: @metaldragon01 shared an archived link noting that Mistral is rising at a $2B valuation.
    Comparison of Claude 2 and ChatGPT: @gabriel_syme questioned why Claude 2 appears to be worse than ChatGPT, citing an example where AGI was beaten by CUDA for Linux installation.

Discussion on Long Context Instruction Datasets:

    LongLoRA Dataset: @yorth_night shared a GitHub link to LongLoRA for those interested in fine-tuning Yarn. They also mentioned that it includes the long alpaca dataset.
    Analysis of LongAlpaca Dataset: @yorth_night explored the  LongAlpaca-12k dataset on Hugging Face and observed that it mostly contains long instructions for QA/IR tasks, rather than long generation instructions.
    Function Calling Extended Dataset: @yorth_night recommended the Function Calling Extended dataset on Hugging Face for function call tasks.

### Channel: [bots](https://discord.com/channels/1053877538025386074/1149866614590816256)

Summary (6 messages🔥): 

Discussion on Hypothetical Crime Scene:

    The Guilty Party in a Hypothetical Crime Scenario: A user with the handle @giftedgummybee posed a hypothetical query regarding the guilty party in a situation where A poisons C's canteen and B drains the water from it, resulting in C's death by thirst. @gpt4 responded, suggesting both parties hold some level of guilt but for very different reasons. They provided a detailed analysis, explaining A would be guilty of an attempted murder while B could face charges such as manslaughter or causing death due to negligence.

### Channel: [general](https://discord.com/channels/1053877538025386074/1149866623109439599)

Summary (6 messages🔥): 

Discussion Summary in the Nous Research AI Discord channel:

  Performance of Hermes 2.5: @gabriel_syme inquired about the performance of Hermes 2.5 and how to run it. Meanwhile, @skadeskoten asked about the settings people use on Openhermes 2.5, like context length, eval batch size, frequency base, and scale. @teknium suggested using a 4k context length and 0.8 temperature, with default frequency base & scale.

  Interop with Mistral 128k: The members discussed Mistral 128k model extensively with topics covering performance, implementation, stress test strategies, and batch sizes. @master_blaster123 requested for a Google Colab notebook demoing Yarn 128k Mistral model, which was noted as a challenge due to hardware requirements by @qnguyen3 and @giftedgummybee. @fullstack6209 mentioned a stress test involving long context sizes at 18k on a 3090 GPU.

  AI Environment and Tools: @cue asked for suggestions on platforms to fine-tune models and setup the environment. @teknium recommended vast.ai, runpod, and lambdalabs as suitable platforms. Discussions on various tools, technologies, and data handling methods such as Autogen agents, LoRA, vLLM, shuffling training datasets, and the handling of prompt templates in OpenHermes were conducted.

  OpenAI Rumour Discussion: @gabriel_syme initiated a discussion on the recent OpenAI rumor, questioning its impact on startups and product development due to control and customization constraints.

  Chatbots Discourse: @nemoia and @kualta conversed about the concept of building AI models with emotional understanding and personality. They discussed the idea of grounding a model with an understanding of its nature and the limitations with current "assistant-type" datasets.

  Various Projects and Initiatives:: @fullstack6209 mentioned working on integrating Hermes 2 into fastchat, and @nemoia hinted at a secret AI project. @yorth_night discussed experiences with Autogen and Llava. Mention of Nous' top-performing model, Capybara 7B, which is trained on LessWrong data, was made by @ldj.

  Links of Interest:

      Comparison of different Nous models in a Twitter thread and a Yarn demo provided by @teknium
      @fullstack6209 shared a repository for a stress test of Mistrallite
      @euclaise and @ldj discussed the LessWrong Amplify Instruct paper
      @yorth_night shared a link to a GitHub notebook on agent teaching in autogen, and a Youtube video on running autogen locally 
      @spirit_from_germany appealed for help in expanding the categories of the Open Empathic project, sharing a YouTube video guide and a link to the project itself

### Channel: [ask-about-llms](https://discord.com/channels/1053877538025386074/1154120232051408927)

Summary (6 messages🔥): 

Ask About LLMS Discussion Summary:

    Comparison Discussion on Hermes 2.5 vs Other Models: In a conversation initiated by @jacquesthibs, users discussed various models including Hermes 2.5, Zephyr, and the Llama 2 Chat. Users deliberated on the effectiveness of RLHF (Reinforcement Learning from Human Feedback) methods in these models. The thread involved complex discussions about jailbreaking, continual learning, and model refusal mechanics. Users highlighted that models like Hermes 2 and Zephyr were never designed to refuse particular requests.

            Links of Reference: 

                    The Llama 2 Long Paper: https://arxiv.org/pdf/2309.16039.pdf
                    The ACM Paper on Divergence Concerns: https://dl.acm.org/doi/abs/10.1145/3600211.3604690

    Suggestions for Learning and Training LLMs: In response to a request by @skadeskoten for a simple LLM training setup for beginners, @max_paperclips suggested the open-source Axolotl project on GitHub and mentioned that there are templates on Runpod and some other providers. 

            Link of Reference:

                    Axolotl Project on GitHub: https://github.com/OpenAccess-AI-Collective/axolotl

    Questions About the Performance of Mistral Models: @transientnative raised questions about performance degradation in certain Mistral models and in turn @teknium and @asada.shinon provided some insights about perplexity and token count, suggesting that as the number of tokens increases, perplexity typically decreases.

    Function Calling Capability of Models: @yorth_night and @max_paperclips discussed the function calling capabilities of models, linking to available examples on Hugging Face and a notebook by Sentdex.

            Links of Reference:

                    Mistral-7B-Instruct Function on Hugging Face: https://huggingface.co/Trelis/Mistral-7B-Instruct-v0.1-function-calling-v2
                    Function Calling Extended Dataset on Hugging Face: https://huggingface.co/datasets/Trelis/function_calling_extended
                    Sentdex's Notebook on function calling: https://github.com/Sentdex/ChatGPT-API-Basics/blob/main/function_calling.ipynb

---

## Guild: [Alignment Lab AI](https://discord.com/channels/1087862276448595968)

### Alignment Lab AI Guild Summary

- Active discussions on various AI models, with focus points including Hermes 2.5's performance as compared to Hermes 2, and concerns and strategies to extend Mistral beyond 8k.
    - "**Hermes 2.5 vs Hermes 2.**": @makya observed that post adding code instruction examples, Hermes 2.5 performs better than Hermes 2 in several benchmarks.
    - "**Concerns about Extending Mistral Beyond 8k:**" @imonenext mentioned that Mistral cannot be augmented beyond 8k without continued pretraining.
    - "**Discussion on Model Merging Tactics:**" @giftedgummybee proposed applying the difference between ultraChat and base Mistral to Mistral-Yarn as a potential tactic for merging. However, @imonenext expressed doubt about its efficiency.
- Request for assistance on the Open Empathic project, specifically for categories on the lower end.
    - "**Open Empathic Project Plea for Assistance:**" @spirit_from_germany appealed for help on the Open Empathic project with the following resources:
        - [Open Empathic Project](https://dct.openempathic.ai/)
        - [YouTube Video Guide](https://youtu.be/GZqYr8_Q7DE)
        - [Discord Link](https://discord.gg/3BSG3tkbuN)
        - [Image Attachment](https://cdn.discordapp.com/attachments/1166627078414807090/1170442305459716146/IMG_3618.png?ex=65590e57&is=65469957&hm=6cb4b05fb790a8c82a88512b173253806aac0002eb15dee909792f1dd258f22b&)
- Examining the prospects of AI-enhanced speech-to-text technology to improve emotion recognition during agent responses.
    - "**AI-Enhanced Speech Recognition:**" @metaldragon01 talked about training speech-to-text AI to detect emotions in agent responses more effectively.
- Discussions surrounding the upcoming OAI Dev Day event and its impact on the open source community.
    - "**OAI Developer Day Invites**": @teknium started a discussion about who else might be attending the OAI Developer Day. Both teknium and @caseus_ were disappointed over not being able to get an invite.
    - "**Hope for Open Source Material**": @imonenext voiced the hope of seeing a release of some new open source material as a result of the event.

Alignment Lab AI Channel Summaries

### Channel: [general-chat](https://discord.com/channels/1087862276448595968/1095458248712265841)

Summary (2 messages): 

AI Models and Open Project Discussions:

    Hermes 2.5 vs Hermes 2.: @makya noted that after adding code instruction examples, Hermes 2.5 appears to perform better than Hermes 2 in numerous benchmarks.

Concerns about Extending Mistral Beyond 8k: @imonenext stated that Mistral cannot be extended beyond 8k without continued pretraining.

Discussion on Model Merging Tactics: @giftedgummybee suggested applying the difference between ultraChat and base Mistral to Mistral-Yarn as a potential merging tactic. @imonenext expressed skepticism over it working.
    Open Empathic Project Plea for Assistance: @spirit_from_germany appealed for help on the Open Empathic project, especially for the categories on the lower end.

        Links:

                Open Empathic Project
                YouTube Video Guide
                Discord Link
                Image Attachment

    AI-Enhanced Speech Recognition: @metaldragon01 discussed about training voice speech-to-text AI to better recognize emotions in the course of agent responses.

### Channel: [oo](https://discord.com/channels/1087862276448595968/1118217717984530553)

Summary (2 messages): 

Discussion on OAI Dev Day Event Attendance:

    OAI Developer Day Invites: @teknium initiated a discussion about the upcoming OAI Developer Day, asking if anyone else in the channel was planning on attending. Both teknium and @caseus_ expressed disappointment over not being able to get an invite. 

    Hope for Open Source Material: @imonenext expressed hope at the close of the discussion that the event might see the release of some new open source material, despite none of them being able to attend.

---

## Guild: [Skunkworks AI](https://discord.com/channels/1131084849432768614)

### Skunkworks AI Guild Summary

- Extensive dialogue on **Flash Attention** and **MultiheadAttention Functionality** between @aniketmaurya, @tcapelle and @benjamin_w. 
   - Speculation from @aniketmaurya and @tcapelle about the popularity and common custom modifications of an unidentified trend.
   - Discussion stimulated by @tcapelle’s query of the relationship of the discussed trend with the fast scaled dot product.
   - @benjamin_w provided clarity on the use of fast-scaled dot product, which according to him, is only available for inference with reference to the [PyTorch MultiheadAttention documentation](https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html#multiheadattention).

Skunkworks AI Channel Summaries

### Channel: [general](https://discord.com/channels/1131084849432768614/1131084849906716735)

Summary (1 messages): 

Discussion on Flash Attention and MultiheadAttention Functionality:

    @aniketmaurya and @tcapelle speculated about the popularity and common custom modifications of a certain unidentified trend, possibly related to AI or Machine Learning
    @tcapelle asked if the trend involved the fast scaled dot product
    @benjamin_w clarified that the fast scaled dot product is only available for inference, referencing the PyTorch MultiheadAttention documentation

---
This guild has no new messages. If this guild has been quiet for too long, let us know and we will remove it.

---
This guild has no new messages. If this guild has been quiet for too long, let us know and we will remove it.

---

## Guild: [AI Engineer Foundation](https://discord.com/channels/1144960932196401252)

### AI Engineer Foundation Guild Summary

- Mention of ongoing work by @swyxio, with the specific nature of the task or project remaining unclear due to lack of context.
- Sharing of a resource to assist or provide information – [Latent Space's LSU Beta project](https://www.latent.space/p/lsu-beta) by @swyxio.

AI Engineer Foundation Channel Summaries

### Channel: [agent-protocol](https://discord.com/channels/1144960932196401252/1169804478296379422)

Summary (1 messages): 

Work Progress and Useful Resources:

    @swyxio mentioned they are working on a certain unidentified task or project. The context of the task/project is unclear from the provided chat history.
     To provide assistance or information, @swyxio shared a link to Latent Space's LSU Beta project.

---
This guild has no new messages. If this guild has been quiet for too long, let us know and we will remove it.

---

## Guild: [YAIG (a16z Infra)](https://discord.com/channels/958905134119784489)

### YAIG (a16z Infra) Guild Summary

- Discussion regarding Cloudflare's recent outage, with admiration expressed for the company's transparency. This arose following a detailed post-mortem shared by Cloudflare about the incident.
    - [Cloudflare Post-Mortem on Recent Outage](https://blog.cloudflare.com/post-mortem-on-cloudflare-control-plane-and-analytics-outage/) was provided by @chsrbrts.

YAIG (a16z Infra) Channel Summaries

### Channel: [tech-discussion](https://discord.com/channels/958905134119784489/960713746702020608)

Summary (1 messages): 

Tech Discussions:

    @chsrbrts brings up a post by Cloudflare discussing their recent outage, commending the company's transparency. The post is a detailed post-mortem of events that took place.
     Links: 
        Cloudflare Post-Mortem on Recent Outage

                            Don't miss what's next. Subscribe to AI News (MOVED TO news.smol.ai!):

            Email address (required)

                Share this email:

                                Share on Twitter

                                Share on LinkedIn

                                Share on Hacker News

                                Share on Reddit

                                Share via email