[AINews] AI Discords Newsletter 11/4/2023
This is AI News! an MVP of a service that goes thru all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜
Guild: Latent Space
Latent Space Guild Summary
- Discussion took place on the major challenges in accomplishing Artificial General Intelligence (AGI), with user binary6818 bringing attention to the topic.
- User swyxio shared several articles from The New Stack on the state of AI development, including the projected AI engineer shortage, experiences from the AI Engineer Summit, and the role of open source in the future of AI.
- There was mention of OpenAI's ChatGPT interface updates and leaks courtesy of user swyxio, who referenced the article OpenAI's Massive ChatGPT Updates Leak Ahead of Developer Conference.
- Conversations in the LLM Paper Club centered on steering Large Language Models (LLMs) to exhibit particular behaviours. Both stealthgnome and swyxio showed interest in exploring literature similar to Anthropic's Constitutional AI paper. No links or resources were provided in these discussions.
### Latent Space Channel Summaries
### Channel: [ai-general-chat](https://discord.com/channels/822583790773862470/1075282825051385876) Summary (2 messages):
- **Topic 1: Challenges to Achieve AGI** - User binary6818 highlighted the broad topic of "Challenges to Achieve AGI".
- **Topic 2: AI Coverage by The New Stack** - User swyxio shared multiple articles from The New Stack discussing aspects of AI development:
  - [Tech Works: How to Fill the 27 Million AI Engineer Gap](https://thenewstack.io/tech-works-how-to-fill-the-27-million-ai-engineer-gap/)
  - [AI Engineer Summit: Wrap Up and Interview with Co-founder Swyx](https://thenewstack.io/ai-engineer-summit-wrap-up-and-interview-with-co-founder-swyx/)
  - [The AI Engineer Foundation: Open Source for the Future of AI](https://thenewstack.io/the-ai-engineer-foundation-open-source-for-the-future-of-ai/)
- **Topic 3: OpenAI's updates and leaks** - User swyxio discussed leaks and rumors around OpenAI's ChatGPT interface, linking to an article titled [OpenAI's Massive ChatGPT Updates Leak Ahead of Developer Conference](https://the-decoder.com/openais-massive-chatgpt-updates-leak-ahead-of-developer-conference/).
### Channel: [llm-paper-club](https://discord.com/channels/822583790773862470/1107320650961518663) Summary (2 messages):
- Discussion Topic: Steering **LLMs** to behave in a specific manner.
- stealthgnome asked about papers similar to Anthropic's Constitutional AI paper, interested in methods used to steer **Large Language Models (LLMs)** toward specific behaviors.
- swyxio asked the same question about papers similar to **Anthropic's Constitutional AI**.
- No links or blog posts were provided in the messages.
Guild: OpenAI
OpenAI Guild Summary
Summarized Discussions:
- Users discussed the customization of GPT-3's personality, account sharing with ChatGPT Plus, fine-tuned models, and the retrieval-augmented generation technique (discussed in ai-discussions).
- Conversation regarding ChatGPT's capacity, various issues and solutions related to platforms and the API, and speculation about the future of AI and OpenAI's releases (discussed in openai-chatter).
- Broached issues about GPT-4 and DALL-E access, potential glitches with accounts, advice about multi-user access and feature usage, and guidance provided through reference to OpenAI official guidelines (taken up in openai-questions).
- Discussion about unexpected GPT-4 alpha access and subsequent queries about the commercial use of DALL-E images, doubts about the OpenAI npm package, and concerns about GPT Plus's capability to recognize text within images (gpt-4-discussions channel).
Relevant Links:
- [Reddit Post](https://www.reddit.com/r/OpenAI/s/2liU0FCcQF/)
- [OpenAI's Article on Commercial Use of Dall-E Images](https://help.openai.com/en/articles/6425277-can-i-sell-images-i-create-with-dall-e)
- [Possible Related Link](https://discord.com/channels/974519864045756446/998381918976479273/1170047679775133796)
- [Link 1](<#1037561178286739466>)
- [Link 2](<#1037561751362863144>)
Key Quotes:
- "Conversations with 4 count for your 50/3 hrs limit, and regens or sending new messages, or editing your messages count as a message with 4." - eskcanta
- "you must have the authority to accept the Terms on their behalf. You must provide accurate and complete information to register for an account. You may not make your access credentials or account available to others outside your organization, and you are responsible for all activities that occur using your credentials" - eskcanta
- "Ever since humans invented counting, rumors about The Next One have run wild." - solbus
- "Images created with DALL·E can be used commercially in accordance with the Content Policy and Terms." - Zachisuppose
OpenAI Channel Summaries
Channel: ai-discussions
Summary (4 messages):
- Customizing GPT-3 Personality: lavender05 asked for advice on customizing the personality of a GPT-3 chatbot; solbus suggested using the "Custom Instructions" section of the respective application or website.
- AI Product for Startups: world_designer vaguely recalled news about OpenAI and a completely new AI product for AI startups, but was unable to find a source.
- Character Customization: eskcanta recommended CharacterAI, notably the "Slice Pizzaflush" character, for individuals seeking a specific type of humor.
- AI Integration: anon9999 speculated on Elon Musk integrating "Grok" AI technology into company 'X'.
- Account Sharing: zhengyuancheng brought up account sharing with ChatGPT Plus, questioning whether simultaneous use by multiple individuals would be problematic or result in account suspension.
- Fine-Tuned Model and Retrieval-Augmented Generation: omeranzar reported finding a method to feed extra information into a fine-tuned model, implying successful retrieval-augmented generation.
- Accessing the Beta Version: A user named leua61 asked about access to the beta or alpha version of ChatGPT Plus. world_designer explained it was necessary to join the ChatGPT Plus plan and wait for All Tools access.
- GPT Responses: world_designer shared an observation of GPT responding twice. eskcanta proposed the plugin might be capable of sending further inputs to the model.
- Access to the Alpha Version: gprapcapt3l asked about accessing the alpha version of ChatGPT; elektronisade responded that without previous access it is currently unobtainable.
- ChatGPT as a Language Model: mk______ asked for advice on feeding a significant amount of data (a 1.5k-line code file) into a language model. eskcanta recommended breaking the problem down into smaller steps to avoid overwhelming the AI.
- Links: hydverse shared a link to a Reddit post for additional context.
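eskcanta's advice about splitting a large code file can be sketched roughly as follows. This is a hypothetical helper, not anything from the chat: it cuts a file into overlapping windows of lines so each prompt stays well under the model's context limit, with the overlap preserving enough surrounding code for the model to stay oriented.

```python
# Hypothetical sketch: split a large code file into smaller chunks so each
# prompt to the model stays manageable. Overlap keeps local context intact.
def chunk_lines(lines, max_lines=200, overlap=20):
    """Yield overlapping windows of `max_lines` lines each."""
    step = max_lines - overlap
    for start in range(0, len(lines), step):
        yield lines[start:start + max_lines]
        if start + max_lines >= len(lines):
            break

# Stands in for the 1.5k-line code file from the discussion.
file_lines = [f"line {i}" for i in range(1500)]
chunks = list(chunk_lines(file_lines))
```

Each chunk can then be sent as its own message, asking the model to work on one step at a time rather than the whole file at once.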
Channel: openai-chatter
Summary (4 messages):
- Discussion about ChatGPT's capacity and usage:
  - Users discussed methods for making the most of their allocated usage, blending GPT-3.5 and GPT-4 utilization, and using the voice capabilities of 3.5 to manage the limited usage of GPT-4. Advice was given on how to optimally time GPT-4 usage. eskcanta gave details on how to use voice with 3.5 and pointed out that the usage limit on 3.5 is based on tokens, not the number of messages. The discussion also touched on the different knowledge cut-off dates of the GPT models. Quotes include nomaddad saying: "Time is the one thing I will have," and eskcanta explaining: "Conversations with 4 count for your 50/3 hrs limit, and regens or sending new messages, or editing your messages count as a message with 4."
- Conversation about user support and contact:
  - Chat expressed frustrations and advice on reaching out to OpenAI staff, both on Discord and via email. It was shared that replies from OpenAI are not always helpful and at times automated. solbus suggested the help.openai.com site and OpenAI's official support email support@openai.com. Quotes include smilebeda saying: "You probably won’t succeed," solbus suggesting: "try help.openai.com if you haven't already," and dabuscusman advising: "contact support."
- Discussion about issues with platforms and the API:
  - Multiple users reported issues such as the Android app crashing when generating images, contact-related issues with help.openai.com, and mystery around the 'All Tools' feature. Users shared their experiences and offered advice. Possible fixes and alternatives were suggested, such as trying a different device, browser, or network, and disabling VPN or firewall. solbus provided assistance with troubleshooting.
- Speculation about the future of AI and OpenAI releases:
  - Speculation included the release of ChatGPT5 and a potential increase in GPT-4 usage limits. Lumirix hinted at a possible news update. solbus humorously stated: "Ever since humans invented counting, rumors about The Next One have run wild."
- Several off-topic and disjointed discussions:
  - Users asked for advice on a free note-taking AI, a recurring issue of ChatGPT asking users to verify as human, an unspecified error, and the inability to use web browsing. A user asked how to add APIs to their account. The model's restriction on certain words for DALL-E image generation, due to violation of OpenAI terms of service, was brought up by lokee86.
Channel: openai-questions
Summary (4 messages):
- Access and Usage Issues:
  - Users reported issues with their access to GPT-4 and DALL-E features, including switching to GPT-3.5 during interactions ("Here's a conversation where it switches to 3.5 after a couple of messages." - FilmChaos) and issues with subscription renewal and account lockout ("Can you access GPT-4 Plus? I renewed my subscription, but it is locked" - tgbrkdlbz).
  - Users reported issues with account creation and deletion. A user stated, "I can't sign up again with the same email because the account apparently 'already exists' but I also can't log in because the account has apparently also been 'deleted or deactivated'" (Top J).
  - Discussions around browser-based usage and troubleshooting, with multiple browsers and extensions mentioned. solbus provided several insights and suggestions for resolving these issues.
- Feature Queries and Usage:
  - Queries about multi-user access, shared access, and enterprise plans for organizations were discussed, with responses noting no official method for multi-user ChatGPT accounts but suggesting reaching out to OpenAI's sales team for information on ChatGPT Enterprise.
  - Probing into the image recognition and text extraction capabilities of GPT Plus led to observations such as a successful extraction of text from handwritten post-it notes, as noted by inleation.
  - Questions and discussion about new features in alpha testing, including 'All Tools' and the larger 32k context window. No official release date was given, but world_designer hinted at rumors.
  - Questions and guidance offered relating to file types and downloads possible within the chatbot, including pietman's report of download link issues.
- Community Guidance and OpenAI Official Information:
  - The community provided several references to OpenAI's official guidelines, including direct quotes, to guide users: "you must have the authority to accept the Terms on their behalf. You must provide accurate and complete information to register for an account. You may not make your access credentials or account available to others outside your organization, and you are responsible for all activities that occur using your credentials" (eskcanta).
  - Links and chats from official OpenAI sources were given to guide users to the appropriate help, troubleshooting, or development information.
- Known Glitches and Anomalies:
  - Several glitches were reported, such as suspicious activity warnings and '.bin' files downloading instead of the site loading, leading to discussions about potential causes and troubleshooting steps.
Channel: gpt-4-discussions
Summary (4 messages):
- GPT-4 Alpha Access Discussions
  - Wintre received unexpected access to GPT-4 Alpha and asked about the commercial use of images generated with DALL-E. Zachisuppose confirmed that images created with DALL·E can be used commercially under OpenAI's Content Policy and Terms (see OpenAI's Article on Commercial Use of Dall-E Images above).
  - The_lemon_man also had unexpected access to GPT-4 Alpha as a non-Plus user. The reason for such access was left unclear.
  - The_lemon_man reported an issue with the image generation capability of DALL-E 3. solbus shared a possibly related link but didn't provide explicit details.
- Image Text Recognition Discussions
  - zhengyuancheng expressed doubts about GPT Plus's proficiency in extracting and recognizing text within images. Qacona recommended asking the bot to explicitly read the text.
- Doubts about Using the OpenAI npm package
Guild: LangChain AI
LangChain AI Guild Summary
- Discussion and requests for assistance in implementing and modifying LangChain Tools and Functions, such as strategies to handle chat models' confusion, writing Python code using multiple APIs, and modifying the agent prompt - see dent, noureldin_93431, mughi_94675, and brio99’s comments.
- Langchain Script Modifications and issue identification, including issues with environment variables, "Not Found" errors, port inconsistencies, and UnicodeDecodeError. Recommendations to manually export variables, modify scripts and use the most recent versions of Langserve and Langchain were shared.
- Introduction of the LangChain & OpenAI Python Client Wrapper Library developed by user pino_4321 which manages API request availability using OpenAI's TPM & RPM headers and supports key distribution. Link to PyPI for the python client wrapper shared:
- LangChain & OpenAI Python Client Wrapper on PyPI
- Guidance for using LangChain Agent provided in response to user queries with a direction to explore the concept of functions and a reference to a GitHub example.
- An introduction post made by madeinChinos joining the LangChain community.
LangChain AI Channel Summaries
Channel: general
Summary (4 messages):
- Topic: Utilizing and Modifying LangChain Tools and Functions
- dent suggests a workaround for chat models confusion with context by asking the model about the required tool separately and then resending the request specifying the desired tool.
- noureldin_93431 requests assistance in writing Python code for a chatbot utilizing LangChain, SerpAPI, Chroma vector database, and an Azure chat model.
- mughi_94675 seeks guidance or materials on how to modify the agent prompt after the 'retrieve' function using LangChain.js, aiming to specifically call the LLM with a new prompt after the tool runs.
- brio99 asks for an explanation of why they can modify the prompt for one method (`stuff`) but not for another (`map_reduce`) when using `RetrievalQA.from_chain_type`, pointing out that the documentation lacks details on this aspect.
- madeinChinos introduces themselves as a new member of the LangChain community.
Channel: langserve
Summary (4 messages):
- Langchain Script Modifications and Issue Identification:
- Users struggling with `langchain serve` not automatically loading environment variables from a .env file, as neither langchain nor langserve uses dotenv as a dependency. Workarounds of exporting them manually or modifying the scripts were proposed.
- Problems observed with the server responding with "Not Found" errors when attempting to access certain endpoints such as http://127.0.0.1:8000/pirate-speak/playground/.
- Despite port 8000 being defined, log messages refer to different ports, raising questions about potential inconsistencies.
- UnicodeDecodeError encountered after manual setup of pirate-speak.
- Call to open an issue in the Langserve repo if problems persist with the most recent version.
- User confirms usage of the most recent versions of Langserve (0.0.22) and Langchain (0.0.330).
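A minimal workaround for the missing dotenv support might look like this. This is a hypothetical stdlib-only stand-in for python-dotenv; in practice you could simply `pip install python-dotenv` and call its `load_dotenv` before starting `langchain serve`.

```python
# Hypothetical stand-in for python-dotenv: parse KEY=VALUE lines from a
# .env file and export them, since (per the discussion) neither langchain
# nor langserve loads .env files automatically.
import os

def load_dotenv(path=".env"):
    loaded = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # skip blank lines, comments, and lines without an assignment
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip().strip('"')
    os.environ.update(loaded)
    return loaded
```

Calling `load_dotenv()` at the top of the server script (or exporting the variables in the shell) sidesteps the missing-dependency issue.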
Channel: share-your-work
Summary (4 messages):
- Topic: LangChain & OpenAI Python Client Wrapper Library
  - Discussion point: User pino_4321 shared their work on a wrapper for the LangChain and OpenAI Python client library. They provided an overview of the functionalities it offers, including:
    - The utilization of TPM & RPM headers from OpenAI to determine request availability, thereby addressing issues encountered with sequential retries and pre-configured limits.
    - Supporting multiple keys to distribute requests uniformly across keys that have sufficient TPM & RPM.
  - Link: LangChain & OpenAI Python Client Wrapper on PyPI, shared by pino_4321.
Keywords: LangChain, OpenAI Python Client Library, Wrapper, TPM, RPM, Multiple Keys.
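The core idea can be sketched like this. This is a hypothetical illustration of the approach, not the library's actual API: OpenAI responses do expose `x-ratelimit-remaining-requests` and `x-ratelimit-remaining-tokens` headers, and a wrapper can track them per key to pick a key with headroom instead of retrying blindly.

```python
# Hypothetical sketch of header-based key selection (not pino_4321's API):
# track OpenAI's x-ratelimit-remaining-* response headers per key and pick
# a key that still has TPM/RPM budget for the next call.
class KeyPool:
    def __init__(self, keys):
        # assume full budget until the first response tells us otherwise
        self.remaining = {k: {"tokens": float("inf"), "requests": float("inf")}
                          for k in keys}

    def update_from_headers(self, key, headers):
        """Record remaining budget for `key` from an API response's headers."""
        r = self.remaining[key]
        r["tokens"] = int(headers.get("x-ratelimit-remaining-tokens", 0))
        r["requests"] = int(headers.get("x-ratelimit-remaining-requests", 0))

    def pick(self, est_tokens):
        """Return the key with the most token headroom that can serve the call."""
        candidates = [k for k, r in self.remaining.items()
                      if r["tokens"] >= est_tokens and r["requests"] >= 1]
        if not candidates:
            return None  # every key is exhausted; caller should back off
        return max(candidates, key=lambda k: self.remaining[k]["tokens"])
```

In use, each response's headers would be fed back via `update_from_headers`, and `pick` would be called before each request with a rough token estimate.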
Channel: tutorials
Summary (4 messages):
- Topic: Using LangChain Agent and troubleshooting
- mughi_94675 advised @1030089231118389288 to utilize the concept of functions with LangChain agent.
- In response to hamza_sarwar_'s request for elaboration, mughi_94675 suggested checking an example on GitHub.
- ashok_71342 requested assistance with an unspecified error related to LangChain.
Guild: Nous Research AI
Nous Research AI Guild Summary
- Discussed how to emulate ZRAM on macOS using zstd and memory arenas, with a StackExchange link shared by chadbrewbaker for a ramdisk on macOS.
- Talk on the challenges encountered in finetuning AI models, notably the problem of a token outlier triggering system crashes. Users suggested filtering the dataset before training to prevent such issues.
- Recommended the LLMfarm app for running Hermes-2.5 on iPhones, an effective solution for users in emergency situations.
- Explored the role of multimodal LLMs in app logo design, with the suggestion of employing GPT-4 and DALL-E for the task.
- Shared AI model links and datasets, including Vision-Flan and LongAlpaca. Discussion extended to AI company valuations, performance of AI models, and datasets for function calls.
- Hypothetically analyzed a complex legal scenario involving two perpetrators and one victim, with contributions from both community members and an AI chatbot.
- Focused on the requirements and implications of running OpenHermes and customizing AI models, looking into possibilities like Paged Attention for enhanced model and hardware utilization.
- Shared insights on the performance and testing of different AI models like Yarn 128k Mistral, AWQ, and vLLM, considering factors like context sizes, tokens processed per second, and GPU usage.
- Dialogues on utilizing AI assistants, the cost of the GPT-4 API, and the LessWrong dataset and its associated costs for training AI models. Several projects were discussed, including an AI anime robot and the upcoming "Nous 7B" model.
- Explored fine-tuning AI models and methods to prevent instructional deviations. Users suggested examining active learning models and shared a link to an ACM paper.
- Discussed the specifics of different AI models, like Llama-2 and Mistral, along with their training setups and capabilities. Furthering the discussion were guidelines for small LLM training setups and fine-tuned models for philosophical tasks.
- Questioned the performance of long-context AI models, such as the Yarn-Mistral-7b models. Suggested reasons included the quality of the training dataset and how humans comprehend information relative to token count.
Nous Research AI Channel Summaries
Channel: ctx-length-research
Summary (6 messages 🔥):
- Topic: Emulating ZRAM on MacOS
- Discussion point: User chadbrewbaker queried about code to emulate ZRAM on MacOS using zstd and memory arenas.
- Link shared: chadbrewbaker shared a link to StackExchange where code for ramdisk on MacOS is discussed.
Channel: off-topic
Summary (6 messages 🔥):
- Topic 1: Issues with Tokens in Finetune
  - yorth_night encountered technical issues while finetuning, including system crashes and data manipulation issues, suspected to be linked to a 235k-token outlier conversation.
  - youngphlo shared past experience dealing with a similar issue where one row exceeded 100k tokens.
  - giftedgummybee recommended filtering datasets before training on them, a preventative measure against such issues.
  - yorth_night specified the issue occurred at the tokenization phase while padding to max length, finding the situation somewhat humorous.
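The filtering suggestion can be sketched as follows. This is a hypothetical helper, and the token counts are a rough whitespace approximation; a real pipeline would count with the model's tokenizer before dropping outliers like the 235k-token conversation described above.

```python
# Hypothetical sketch: drop dataset rows whose (approximate) token count
# exceeds a threshold, so a single outlier can't blow up padding/tokenization.
def filter_outliers(rows, max_tokens=32_768):
    kept, dropped = [], []
    for row in rows:
        # crude proxy: whitespace-split word count; swap in a real tokenizer
        approx_tokens = len(row["text"].split())
        (kept if approx_tokens <= max_tokens else dropped).append(row)
    return kept, dropped

rows = [{"text": "a short example conversation"},
        {"text": "word " * 40_000}]  # stands in for the 235k-token outlier
kept, dropped = filter_outliers(rows)
```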
- Topic 2: Mobile Deployment of AI Models
  - tsunemoto recommended the LLMfarm app, which allows users to run Hermes-2.5 effectively on their iPhones with minimal quality loss and good speeds. The user found this a beneficial possibility in emergency situations, provided they have access to a power source for their phone.
- Topic 3: Multimodal LLM for App Logo Design
  - iamgianluca sought advice on a suitable multimodal LLM for logo design for their pet project application.
  - teknium suggested the combination of GPT-4 and DALL-E for such a task.
Channel: interesting-links
Summary (6 messages 🔥):
- Topic: AI Models and Datasets
  - Euclaise shared a model link from huggingface.co for Vision-Flan, and Cybertimon expressed hope for the model to be quantized.
  - Cybertimon remarked on the model's promising benchmarks, commenting that it's better than Falcon 180B! Link
  - Yorth_night provided a link for finetuning Yarn, specifically mentioning the LongAlpaca dataset. Link
  - Yorth_night analyzed the LongAlpaca dataset from Hugging Face, pointing out that it was mostly composed of QA/IR, with long instructions for information retrieval, but lacked long-generation instructions. Link
- Topic: AI Companies and Valuation
  - Metaldragon01 shared an article about Mistral raising funds at a 2 billion valuation, eliciting surprise among members. Link
- Topic: AI Model Performance
  - Gabriel_Syme questioned the performance of Claude 2, expressing surprise that it was outperformed on a CUDA-for-Linux installation task.
- Topic: Dataset for Function Calls
  - Yorth_night recommended a dataset from Hugging Face for function calls. Link
Channel: bots
Summary (6 messages 🔥):
Topic: Hypothetical Legal Dilemma
- Discussed a theoretical legal situation where two individuals, A and B, independently decide to murder another individual, C. Various scenarios of the event were proposed.
  - Quote from Makya: "At a desert oasis, A and B decide independently to murder C. A poisons C’s canteen, and later B punches a hole in it. C dies of thirst. Who killed him? A argues that C never drank the poison. B claims that he only deprived C of poisoned water. They’re both right, but still C is dead. Who’s guilty?"
- The AI chatbot, GPT4, provided an analysis of the scenario from hypothetical legal perspectives.
  - Quote from GPT4: "Both A and B are guilty but for different reasons. A is guilty of attempted murder because he intended to poison C. B is guilty as he contributed to C's death by depriving him of safe water. His actions might not have directly lead to C's death, but he undoubtedly played an important role in it." and noting "this is a hypothetical answer to a hypothetical situation and wouldn't necessarily be accurate in a real court of law, as that would depend heavily on the specific laws in place and actual circumstances surrounding the event."
Channel: general
Summary (6 messages 🔥):
Running OpenHermes and AI Model Customization:
- gabriel_syme discussed the need for clear instructions for running OpenHermes 2.5 and expressed concerns about control and customization with AI models, commenting: "The main problem with every offering like that is control and customization".
- emrgnt_cmplxty suggested using OpenAI's quickstart docs for handling model configurations.
- teknium shared a possible confusion between chatml and llm during model integration.
Integration of AI Models and Performance:
- fullstack6209 shared a GitHub repository containing a stress test for mistrallite-vllm, and discussed the potential of improved model and hardware utilization, including the potential of running GPT-3.5 on standard CPUs through new techniques such as Paged Attention.
- skadeskoten inquired about the current context length of OpenHermes 2.5.
- alstroemeria313 asked why models aren't released as safetensors, which giftedgummybee attributed to Axolotl's default setup.
Testing and Experimentation with AI Models:
- Users engaged in technical discussions about the performance and testing of various models like Yarn 128k Mistral, AWQ, and vLLM, exploring factors like context sizes, tokens processed per second, and GPU usage.
AI Assistants and API Use:
- yorth_night and gabriel_syme discussed the usage of "agents" and the SOP (Standard Operation Process) abstraction in simplifying AI tasks, as well as the need for "Unit tests for llms."
- tsunemoto raised concerns about the cost of the GPT-4 API.
LLM and Datasets:
- ldj discussed the LessWrong dataset and the costs associated with using it in training AI models such as "Capybara 7B" and the multimodal model "Obsidian."
Project Development:
- euclaise, nemoia, and kualta discussed project development, with nemoia commenting on her upcoming project focused on a highly sophisticated AI anime robot girl.
- max_paperclips inquired about the SOP (Standard Operation Process) abstraction.
- ldj mentioned working on the "Nous 7B" model and referenced his Hugging Face collections and the announcements channel for updates.
URLs/Links shared:
- Users shared several Twitter links demonstrating AI models.
- nemoia referenced Microsoft's Autogen.
- ldj referenced his Hugging Face collections.
- fullstack6209 shared their stress-test repo for the mistrallite-vllm model.
- nemoia shared Matthew Berman's YouTube video about MemGPT.
- rishi.mg posted the paper link https://arxiv.org/pdf/2310.08560.pdf
- spirit_from_germany shared links related to the Open Empathic project and a YouTube tutorial.
Channel: ask-about-llms
Summary (6 messages 🔥):
- Topic 1: Fine-Tuning Methods for AI Models
  - Discussion points and quotes:
    - A debate regarding fine-tuning methods for AI models, specifically the susceptibility of instruct-tuned models versus RLHF models to going 'crazy' in a continual learning context. teknium noted that partially injecting model responses can bypass any refusal, RLHFed or not.
    - jacquesthibs expressed interest in studying fine-tuning methods and active learning to better understand how to prevent a model from deviating.
    - teknium argued that RLHF might not be the most viable solution for continual learning without diverging the model.
  - Links:
    - An ACM paper was referenced in the discussion about making AI models harder to 'go crazy' or fine-tuning them for specific capabilities.
- Topic 2: Specific AI Models and Training Setups
  - Discussion points and quotes:
    - Questions about various AI models and the details of their training, with a focus on Llama-2 and Mistral. jacquesthibs and teknium discussed the instruction tuning and refusal capabilities of several AI models.
    - A query about fine-tuned models for philosophical tasks from Kualta, though no specific answer was given.
    - Skadeskoten inquired about recommendations for a small LLM training setup for beginners, with max_paperclips suggesting axolotl and pointing to templates on Runpod and other providers.
- Topic 3: Performance of Long-Context Models
  - Discussion points and quotes:
    - Transientnative questioned the reasons behind the degraded performance of the Yarn-Mistral-7b-128k and Yarn-Mistral-7b-64k models. teknium suggested it was due to the quality of the training dataset used.
    - Asada.shinon explained that perplexity decreases as the token count increases, because one tends to get less confused as more information is provided.
    - Bloc97 mentioned three potential reasons for the performance difference between their model and the LLAMA-2 long model: training token count, data quality, and extension size.
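Asada.shinon's point lines up with the standard definition: perplexity is the exponentiated average negative log-likelihood per token, so as later tokens are conditioned on more preceding context, the average surprise (and hence the perplexity) tends to fall.

```latex
\mathrm{PPL}(x_{1:N}) = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p\left(x_i \mid x_{<i}\right)\right)
```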
Guild: Alignment Lab AI
Alignment Lab AI Guild Summary
- Members compared the performance of Hermes 2.5 and Hermes 2, noting that after adding code instruction examples, Hermes 2.5 appears to perform better than Hermes 2 in various benchmarks.
- A concern was raised over the limitations of extending Mistral beyond 8k without continued pretraining, as pointed out by Imonenext.
- An ongoing discussion about model merging was highlighted, specifically the proposal by Giftedgummybee of applying the difference between UltraChat and base Mistral to Mistral-Yarn. A mixture of skepticism and optimism was noted in the channel.
- Spirit_from_germany made a community-wide appeal for help in contributing to the Open Empathic project, especially in expanding its lower-end categories. They shared a tutorial video and a direct link to the project for interested members.
- Members' attendance at the upcoming OAI Dev Day was discussed, despite some members not having received invitations. Expectations for potential open-source contributions from the event were also highlighted.
- An isolated reference to an unrelated Twitter post by Apple's Jimmy was noted and considered non-informative. The Twitter link was shared in the chat.
Alignment Lab AI Channel Summaries
Channel: general-chat
Summary (2 messages):
- Hermes 2.5 vs Hermes 2 Performance:
- Makya noted that after adding code instruction examples, Hermes 2.5 appears to perform better than Hermes 2 in various benchmarks.
- Concerns about Extending Mistral Beyond 8k:
- Imonenext stated that Mistral cannot be extended beyond 8k without continued pretraining.
- Discussion on Model Merging Tactics:
- Giftedgummybee suggested applying the difference between UltraChat and base Mistral to Mistral-Yarn as a potential merging tactic. Imonenext expressed skepticism, but giftedgummybee remained optimistic, citing successful past attempts at what they termed "cursed model merging".
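The proposed tactic is essentially task-vector arithmetic: take the weight delta between a fine-tune (UltraChat) and its base (Mistral) and add it onto another derivative (Mistral-Yarn). A toy sketch under that assumption follows; real merges operate per-tensor on model state dicts, and plain lists stand in for weights here.

```python
# Hypothetical sketch of the "apply the diff" merge giftedgummybee described:
# merged = target + (finetuned - base), applied elementwise to the weights.
def apply_diff(base, finetuned, target):
    assert len(base) == len(finetuned) == len(target)
    return [t + (f - b) for b, f, t in zip(base, finetuned, target)]

base      = [0.0, 1.0, 2.0]   # toy weights for base Mistral
finetuned = [0.5, 1.0, 1.5]   # toy weights for the UltraChat fine-tune
target    = [0.1, 1.1, 2.1]   # toy weights for Mistral-Yarn
merged = apply_diff(base, finetuned, target)
```

Whether such "cursed model merging" preserves both the fine-tune's behavior and the target's long-context extension is exactly what the channel was skeptical or optimistic about.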
- Open Empathic Project Plea for Assistance:
- Spirit_from_germany appealed for help in expanding the categories of the Open Empathic project, particularly at the lower end. They shared a YouTube video that guides users to contribute their preferred movie scenes from YouTube videos, as well as a link to the project itself.
Channel: oo
Summary (2 messages):
- Topic: OAI Dev Day Attendance
- Discussion points and quotes:
- teknium asked if anyone is going to OAI dev day.
- caseus_ and teknium shared that they did not receive an invitation.
- imonenext inquired about the date, which teknium provided as the coming Monday.
- imonenext expressed curiosity about potential open source contributions from the event.
- Reference to an unrelated Twitter post by Apple's Jimmy, which teknium pointed out didn't convey any meaningful information.
Guild: Skunkworks AI
Skunkworks AI Guild Summary
- The conversation revolved around AI trends and technical discussion.
- User aniketmaurya highlighted the prominence of a specific trend in AI predating the emergence of flash attention.
- User tcapelle raised a query regarding the implementation of custom modifications in AI programs.
- tcapelle also asked if the said program incorporates a fast scaled dot product.
Skunkworks AI Channel Summaries
Channel: general
Summary (1 message):
- Topic: AI Trends and Technical Discussion
- Discussion Points:
- aniketmaurya pointed out the popularity of a certain trend in AI even before the rise of flash attention.
- tcapelle brought up the point of implementing custom modifications within AI programs.
- tcapelle inquired if the mentioned program implements a fast scaled dot product.
This guild has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
Guild: AI Engineer Foundation
AI Engineer Foundation Guild Summary
- Initial mention by swyxio of their work-in-progress project; however, no further details were provided.
- Relevant link shared by swyxio leading to a page about LSU Beta on latent.space: https://www.latent.space/p/lsu-beta.
AI Engineer Foundation Channel Summaries
Channel: agent-protocol
Summary (1 message):
- Topic 1: Introduction to a Work-In-Progress Project
- swyxio mentions they are working on a project. No details about the project were given.
- Links:
- Swyxio shared a link pointing to a page about LSU Beta on latent.space: https://www.latent.space/p/lsu-beta
This guild has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
Guild: YAIG (a16z Infra)
YAIG (a16z Infra) Guild Summary
- Ongoing conversation about transparency in tech companies, prompted by Cloudflare's Post-Mortem Report linked by chsrbrts:
  - The Cloudflare Post-Mortem Report was discussed and praised for its exemplary, open approach to handling incidents.
YAIG (a16z Infra) Channel Summaries
Channel: tech-discussion
Summary (1 message):
- Discussion on Cloudflare's Post-Mortem Report
- chsrbrts shared a link to a post mortem report by Cloudflare, indicating respect for their transparency.