[AINews] AI Discords Newsletter 11/28/2023
This is AI News! an MVP of a service that goes thru all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜
Latent Space Discord Summary
- Discussion on Multi-modal Language Models, particularly GPT4+vision, with @philltornroth seeking references to understand the tokenization process and the interaction between vision and language components.
- Inquiry about Context Management for Agents / RAG, with @slono looking for resources that explain this concept from a mathematical or category-theoretical perspective, highlighting the modeling of shared context or memory among agents.
- Suggestions for LLM Paper Club: @yikesawjeez proposed reviewing two papers regarding prompt improvements and RAG context management.
- Final decision to review a specific RAG context management paper as per @eugeneyan's advice: https://arxiv.org/abs/2311.09210
- Proposal by @slono to review a blog post on Lookahead Decoding for future paper club discussions: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
Latent Space Channel Summaries
▷ Channel: ai-general-chat (7 messages):
- Understanding Multi-modal Language Models: @philltornroth sought recommendations for references that elaborate on the vision components of multi-modal language models like GPT4+vision, particularly the tokenization process and the interaction between vision and language within the model's context.
- Resources on Context Management for Agents / RAG: @slono asked for resources that approach context management for agents / Retrieval-Augmented Generation (RAG) from a mathematical or category-theoretical perspective, with emphasis on modeling shared context/memory among agents, including the ability to "edit" history or summarize it.
▷ Channel: llm-paper-club (9 messages):
- Upcoming Paper for Discussion: @yikesawjeez suggested a paper on prompt improvements (https://arxiv.org/abs/2311.09277) and another as an alternate option (https://arxiv.org/abs/2311.05997).
- Finalized Paper for Study: @yikesawjeez finalized a paper related to RAG context management (https://arxiv.org/abs/2311.09210) as per @eugeneyan's advice.
- Future Paper Discussion Proposal: @slono proposed reviewing a blog post on lookahead decoding at some point: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
OpenAI Discord Summary
- Discussion on OpenAI Whisper API and Speaker Diarization: Users discussed using the OpenAI Whisper API for audio transcription and the lack of a speaker diarization feature. A GitHub link to the Whisper API and the pyannote/speaker-diarization-3.1 package were shared, but concerns were raised about the lack of Node.js support for this Python package.
- Observations on ChatGPT Performance and Potential Downgrade: Numerous users expressed concerns about perceived performance degradation of GPT-4, attributing it to possible downgrades after high demand or as new data was added. Discussions focused on the need to refine prompts more often and on encountering quota limits.
- Extensive Debate on Potential Bias in Large Language Models (LLMs): Users delved into the bias towards English in LLMs such as GPT-3 and GPT-4, discussing potential reasons for the bias and its implications for the democratization and universality of AI technology.
- Experiences and Issues with ChatGPT: Users voiced frustrations about hitting quota limits and perceived issues with customer support; other users offered possible solutions and mitigation recommendations.
- Deliberations on ChatGPT's Response Speed, Errors, and Usage Limits: Slow responses and login errors were reported, with troubleshooting suggestions provided. Some users ran into message limits, and others eager to access GPT-4 were asked to join waitlists.
- Users explored ChatGPT as a Development Tool and Embedding and Querying Data with GPT, and voiced their Experiences and Appreciation for ChatGPT.
- Examination of Custom GPTs, Technical Issues and Troubleshooting: Usage cap differences between custom GPTs and regular GPT-4 were brought up. Several technical troubleshooting tips were shared, ranging from logging out and back in, to disabling AVG's Anti-Track feature, to testing with Firefox. Furthermore, users sought guidance on cleaning data for GPT knowledge bases and on file size/context limits.
- User questions on Conversation/Plugin mistakes and Improvements, and specific API queries resulted in community-driven suggestions and advice.
- In-depth discussion on Teaching GPT about Morals, Advanced Math, and GPT-4 Usage Limits yielded promising results and insights about the backend processes and technical requirements. A user-generated GPT "Great Sage," trained in morals and math, was shared: Great Sage.
- Speculation on the Launching of the OpenAI GPT Store and deliberations on using Custom GPTs for Enterprise Applications brought forth community expectations, considerations and practical advice on how these processes work.
- Detailed discussions regarding Interactive Fiction GPT Development and help sought for Writing Article Prompts led to community engagement and assistance.
- Speculative thread on OpenAI Models & Boston Dynamics' Robots, and a debate on OpenAI Model’s Self-awareness brought forth intriguing concepts and divided opinions.
- A User-specific problem with DALL·E's Spelling Issues was shared, with advice sought from the community.
OpenAI Channel Summaries
▷ Channel: ai-discussions (122 messages):
- Discussing OpenAI Whisper API and Speaker Diarization: @healer9071 and @elektronisade discussed the use of the OpenAI Whisper API for audio transcription. A GitHub link to Whisper was shared. They further discussed the lack of a speaker diarization feature in the Whisper API, with suggestions to look up "whisper API diarization" and a recommendation of the pyannote/speaker-diarization-3.1 Python package. Notably, there were concerns about the lack of Node.js support for this Python package.
- ChatGPT Performance and Potential Downgrade: Users @cr7462, @ahlatt, @xyza1594, @strategam, @johnpringle, and @dogdroid, among others, discussed the perceived decrease in performance of GPT-4, speculating that it might have been "nerfed" after high demand or as new data was added. They discussed the need to refine prompts more often and limitations like hitting quota limits.
- Potential Bias in Large Language Models (LLMs): In an extensive discussion, @posina.venkata.rayudu and @.dooz considered the bias towards English in LLMs such as GPT-3 and GPT-4, specifically in relation to an article discussing the "AI language problem". They discussed potential reasons for such a bias and the implications for the democratization and universality of AI technology.
- Introducing New User: @maxiisbored requested assistance in creating an introduction message for themselves. A detailed introductory message was crafted covering who Maxi is, where they're from, why they're interested in ChatGPT, and a call for connection.
- User Frustration with ChatGPT: @NastyTim expressed dissatisfaction with the perceived performance of ChatGPT, citing issues like hitting quota limits and unresponsive customer support. Other users, including @.dooz, @kesku, and @satanhashtag, responded with possible solutions and recommendations to mitigate these concerns.
▷ Channel: openai-chatter (380 messages🔥):
- ChatGPT Response Speed and Errors: Users reported slow response speeds on both GPT-3.5 and GPT-4, attributed by some to high server loads. @openheroes suggested using the thumbs-up/down buttons as feedback. Some users hit login errors, which were resolved by logging out and back in.
- GPT Model Usage and Access to GPT-4: Some users noted limitations in the number of messages, with @aesir99 observing a change from 50 to 40 messages. @m_12091 enquired about upgrading to GPT-4 and received advice from other users about signing up for the waitlist. @kryptoflow shared the sentiment that GPT-4 is highly anticipated.
- ChatGPT as a Development Tool: @xyza1594 discussed the utility of GPT as a tool for coding and document updates. @lugui expressed a preference for GitHub Copilot due to its IDE integration and auto-completion features.
- Embedding and Querying Data with GPT: @mungooo expressed interest in uploading a CSV or JSON file to query data using GPT.
- Experiences and Appreciation for ChatGPT: Various users expressed their appreciation for ChatGPT, with @quanta1933 enjoying how GPT-3.5 livens up their active imagination and @kryptoflow expressing amazement at GPT-3.5.
▷ Channel: openai-questions (235 messages🔥):
- Dealing with Custom GPTs and Usage Cap: @solbus explained the structure and capabilities of custom GPTs and suggested effective strategies for using them, such as focusing on the 'Configure' tab after the initial setup in 'Create' to avoid high usage. Discussions revealed that the usage cap for custom GPTs seems to be different from, and perhaps lower than, regular GPT-4's: possibly closer to 25 uses per 3-hour period, as opposed to GPT-4's 40 uses.
- Technical Issues and Troubleshooting: Several users reported issues with OpenAI's services. @sakhalinsk2 and @spinnercruz experienced persistent errors when trying to use GPT-4; logging out and in, and disabling AVG's Anti-Track feature, solved the latter's problem. @mr_sgtx asked about recovering a lost conversation, @tuxmaster couldn't open or delete a GPT transcript from his history, @0xaking had issues getting their API actions to work, and @satanhashtag suggested testing with Firefox.
- Data Considerations for GPTs: @bywilliaml asked for guidance on cleaning data for uploading to GPT knowledge bases and on file size/context limits. @lugui highlighted the cost involved with larger files and the large token context limit of AI models.
- Conversation/Plugin Goof-ups and Improvements: @crerigan chuckled over having manually translated a 500-page PDF when an existing translation plugin could have done it. @halalarax sought help enabling plugins and the Advanced Data Analysis feature after upgrading to a GPT Plus subscription.
- API Queries: @mr_baconhat asked how to use GPT-4 in API requests, while @mungooo struggled to get the assistant playground to access a CSV file. @0xaking was attempting to use a streaming TTS (Text-to-Speech) API with Node.js, while @quantaraum was looking for the same thing but was stymied by the lack of a JS streaming example in the OpenAI docs. @captivating_courgette_19070, on the other hand, inquired about the sequence for uploading training and validation set files and starting the fine-tuning process.
▷ Channel: gpt-4-discussions (77 messages):
- Teaching GPT about Morals and Advanced Math: @mikepaixao created a GPT named "Great Sage" trained on morals and advanced math, and shared a link to it: Great Sage.
- Issues with GPT-4 Usage Limit: @dotails discussed increasing GPT-4 usage flexibility, specifically the ability to switch from GPT-3.5 to GPT-4 when the wait time is over. @eskcanta confirmed that the current cap for GPT-4 is 40 messages per 3 hours and that the suggested flexibility may be difficult to implement.
- Training GPTs Built Using GPT Builder: @haribala sought advice on how to train GPTs built using the GPT builder. @pietman explained that GPTs are already-trained models; users can connect the model to an external API, provide it with instructions, and give it a knowledge base to reference.
- Launching of the OpenAI GPT Store: @.xiaoayi asked about the launch date of the official OpenAI GPT store. @rjkmelb and @kyleschullerdev_51255 both noted there have been no updates yet; @kyleschullerdev_51255 speculated that the launch might be delayed due to security flaws.
- Using Custom GPTs for Enterprise Applications: @phild66740 discussed using custom GPTs for prototyping internal applications before fully integrating them into enterprise systems. @solbus explained that a custom GPT's behavior is entirely dictated by the "Configure" screen and that nothing happens behind the scenes beyond what's defined there. @kyleschullerdev_51255 further clarified that the GPTs created are available through the Assistants API, which integrates them into applications or websites.
▷ Channel: prompt-engineering (31 messages):
- Interactive Fiction GPT Development: @stealth2077 shared examples of their work on interactive fiction using a GPT model, aiming for a consistent style and format. @solbus showed interest and asked for further information about the GPT's "Configure" page and whether it uses knowledge files or instructions fields.
- Writing Article Prompts: @komal0887 asked for help creating prompts to generate an article without evaluative sentences using "gpt-3.5-turbo-instruct".
- OpenAI and Boston Dynamics Cooperation: @davidvon asked whether OpenAI's large models might cooperate with Boston Dynamics' robots and suggested that AI should have a physical body. @eskcanta responded humorously that maybe some already do.
- DALL·E's Performance: @kh.drogon expressed difficulty getting DALL·E to spell a word correctly in an image.
- Discussion on GPT Models and AI: A brief exchange among @davidvon, @jaynammodi, and @.pythagoras addressed whether OpenAI's large models have free will and self-awareness, and the state of AI in general. There was considerable disagreement and speculation on this topic.
▷ Channel: api-discussions (31 messages):
- Interactive Fiction GPT Development: @stealth2077 shared an example of the consistent style and format they hope their interactive fiction GPT will produce. @solbus asked about the organization of the Configure page and whether @stealth2077 used any knowledge files or instructions.
- Generating Non-Evaluative Articles: @komal0887 sought assistance creating a prompt that generates articles without evaluative sentences using the "gpt-3.5-turbo-instruct" model.
- OpenAI Models & Boston Dynamics' Robots: @davidvon speculated about the potential cooperation of OpenAI's large models with Boston Dynamics' robots, implying AI having a physical body.
- OpenAI Model's Self-awareness Discussion: @davidvon, @jaynammodi, and @.pythagoras debated whether OpenAI's large model possesses self-awareness and consciousness. @jaynammodi suggested it has a certain degree of consciousness due to simulated control over actuators and sensors, while @.pythagoras argued that the robot doesn't possess consciousness and is merely voice-operated by a GPT model.
- DALL·E's Spelling Issues: @kh.drogon shared a problem with DALL·E consistently misspelling words when creating logos, asking for advice.
LangChain AI Discord Summary
- Dialogue about a possible slim version of LangChain due to space consumption issues with AWS Lambda layers featuring LangChain, initiated by @dannyhabibs.
- Requests for Langsmith access, typically granted via the Langchain website, as mentioned by @seththunder.
- Discussions on agent routing methods for intent classification via Llama2, with problems around AgentType.ZERO_SHOT_REACT_DESCRIPTION noted by @doogan1211.
- Inquiries about missing modules, specifically RunnableBranch and RunnableSequence, in the Langchain schema, as @eminence couldn't find the Runnable module.
- Search for an alternative to GPT3 Turbo 16k, i.e. an LLM that can handle at least 6k tokens, initiated by @jokerssd.
- SSL verification issue when using pandas agents/tiktoken through Langchain, as reported by @varshinirk_16759.
- Talks about variables in ConversationalRetrievalQAChain objects, with @menny9762 searching for a way to pass {lang}.
- Assistance request from @l0st__ for fine-tuning a LangChain agent that works as a sales assistant and connects to a SQL database.
- Question from @gloria.mart.lu on whether the Langchain SQLDatabase agent & toolkit is compatible only with OpenAI, as tests with other LLMs had unsuccessful results.
- Diverse views on ConversationRetrievalQA, RetrievalQA, and RunnableSequence and their functionality, explained by @seththunder.
- Comments on the chatbot interface for agents, raising the need for improved UIs to present information from agents, discussed by @moooot.
- Suggestions by @rpall_67097 that @l0st__ fine-tune an OpenAI model for better results.
- A reported problem with the new Assistants API where the assistant returns the question instead of the answer, by @sheldada.
- Consideration of using OpenAI's vector stores without Pinecone, discussed by @jupiter.io.
- Two major high-level approaches, RAG (Retrieval-Augmented Generation) and fine-tuning, for different chat application types, explained by @rpall_67097.
- Calls for Django support in LangChain by @aliikhatami94, and the scope for enabling callback event sending explained by @veryboldbagel.
- Mention of instances where type inference can get strange, with a suggestion to use with_types to define expected inputs, by @veryboldbagel.
- A guidance snippet on setting up a chat widget, along with a warning of a possible breaking change to the output key, by @veryboldbagel.
- Coding difficulties setting {lang} parameters in LangChain templates, addressed by @menny9762, and the need for Q&A features on tabular data highlighted by @steve675.
- Announcement of a simple Docker setup for LangChain's research assistant template, with setup instructions shared on GitHub by @joshuasundance.
- Notice of BERTopic's latest support for LCEL runnables and its potential applications, suggested by @joshuasundance.
- A Medium article shared by @andysingal demonstrating LangChain+LlamaIndex's potential with semi-structured data.
- A tutorial on deploying Langchain on Cloud Functions using Vertex AI models for scalability, shared by @kulaone here.
LangChain AI Channel Summaries
▷ Channel: general (51 messages):
- Langchain Slim Version Request: @dannyhabibs encountered a problem where multiple AWS Lambda layers including LangChain take up too much space. They are exploring options to slim down the LangChain layer.
- Access to Langsmith: @sampson7786 requested an invite code for Langsmith. @seththunder suggested that it's typically granted via the Langchain website.
- Agent Routing with Llama2: @doogan1211 is seeking advice on building an intent classification method for routing agents to various tools. They had issues with AgentType.ZERO_SHOT_REACT_DESCRIPTION, which is currently supported.
- Missing Runnable Module in Schema: @eminence questioned the absence of the RunnableBranch and RunnableSequence modules in the schema, as they can't find the runnable module.
- GPT3 Turbo 16k Alternative: @jokerssd asked for an alternative to GPT3 Turbo 16k, i.e. an LLM that can handle at least 6k tokens.
- SSL Verification Issue: @varshinirk_16759 is experiencing an SSLCertVerification error when using pandas agents/tiktoken through Langchain.
- Passing Variables in ConversationalRetrievalQAChain: In a conversation between @menny9762 and @seththunder, the former was looking for a way to pass the variable {lang} to a ConversationalRetrievalQAChain object.
- Seeking Help for LangChain Agent Fine-tuning: @l0st__ is searching for someone to fine-tune their LangChain agent, which connects to a SQL database and acts as a sales assistant.
- Langchain SQLDatabase Agent & Toolkit Compatibility: @gloria.mart.lu questioned whether the Langchain SQLDatabase agent & toolkit is exclusively compatible with OpenAI; they reported that tests with other LLMs did not yield good results.
- RetrievalQA vs ConversationalRetrievalQA: @menny9762 wondered about the difference between ConversationRetrievalQA, RetrievalQA, and RunnableSequence. @seththunder commented that RetrievalQA does not use memory whereas ConversationRetrievalQA does, and that the latter rephrases the question as a standalone question.
- Agent Interface: @moooot raised a question about the current chatbot interface for agents, wondering why there aren't better UIs for presenting information delivered by agents.
- Fine-tuning OpenAI Model: @rpall_67097 suggested that @l0st__ fine-tune an OpenAI model for better results compared to working with a foundation model via a LangChain agent.
- Issue with New Assistants API: @sheldada reported an issue with the new Assistants API where the assistant returned the question instead of an answer.
- OpenAI Vector Stores Usage: @jupiter.io inquired about a way to use OpenAI's vector stores without going through Pinecone.
- RAG and Fine-Tuning: @rpall_67097 outlined two main high-level approaches, RAG (Retrieval-Augmented Generation) and fine-tuning, sharing their usefulness for different chat application types.
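Stripped of any particular framework, the RAG approach described above boils down to two steps: score your documents for relevance to the query, then stuff the best match into the prompt sent to the LLM. The sketch below is a minimal, library-free illustration — the toy corpus and the crude word-overlap scorer are invented for this example; real RAG systems use embedding similarity and a vector store instead:

```python
def overlap(query: str, doc: str) -> float:
    """Fraction of query words that also appear in the document.

    A deliberately crude relevance score; production RAG systems
    would use embedding similarity here instead.
    """
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Retrieve the best-matching document and stuff it into the prompt."""
    context = max(docs, key=lambda d: overlap(query, d))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "Refunds are processed within 5 business days of the return arriving.",
    "Standard shipping is free on orders over $50.",
]
print(build_rag_prompt("How long do refunds take?", docs))
```

The prompt-stuffing step is the whole trick: the model never needs fine-tuning because the relevant knowledge arrives in the context window at query time, which is why RAG suits frequently changing knowledge bases better than fine-tuning does.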
▷ Channel: langserve (5 messages):
- Potential Django Support: @aliikhatami94 expressed a desire for Django support in LangChain. @veryboldbagel responded that Django support is not currently available but expressed interest in feedback about the level of demand.
- Enabling Callback Event Sending: @veryboldbagel pointed out to @aliikhatami94 that they can enable callback event sending using a certain parameter, found here, stating it's not well tested and currently only supports invoke and batch.
- Type Inference Issue: @veryboldbagel indicated that there are instances where type inference can get a bit strange, suggesting that with_types might help define expected inputs. An example was given here.
- Setup of Chat Widget: @veryboldbagel shared a link here detailing how to set up a chat widget, but warned of a likely impending breaking change to the behavior of the output key.
▷ Channel: langchain-templates (4 messages):
- Code Implementation Help: @menny9762 sought help with code where they were struggling to set {lang} parameters, mentioning that even when they manually set the language, the responses are often in English.
- LangChain for Q&A on Tabular Data: @steve675 asked whether LangChain can create a question-and-answer assistant for tabular data, similar to a chatbot created for documents. They explained the need for a chatbot that can handle multiple tables with different columns.
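Whatever the exact chain API involved, the pattern behind a {lang} slot like the one discussed above is plain template substitution: the language instruction must itself be a placeholder that gets filled on every call, otherwise the model falls back to English. A minimal sketch in plain Python (the template text is invented for illustration; LangChain's prompt templates use the same brace-placeholder convention):

```python
# Hypothetical prompt template with a {lang} slot. The language instruction
# is part of the template, so it is re-filled on every invocation.
TEMPLATE = (
    "Use the following context to answer the question.\n"
    "Respond only in {lang}.\n\n"
    "Context: {context}\nQuestion: {question}\nAnswer:"
)


def render(lang: str, context: str, question: str) -> str:
    """Fill every placeholder, including the language instruction."""
    return TEMPLATE.format(lang=lang, context=context, question=question)


print(render("French", "The sky is blue.", "What colour is the sky?"))
```

The common failure mode is hard-coding the language into a system message once and then re-using a template that no longer mentions it; keeping {lang} inside the per-call template avoids that.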
▷ Channel: share-your-work (6 messages):
- Docker Setup for Research Assistant Template: @joshuasundance created a simple Docker setup for users interested in experimenting with the research assistant template using langserve. The setup instructions have been shared on this GitHub link, and the updated v1.1.0, incorporating tavily, is available on Docker Hub.
- BERTopic Support for LCEL Runnables: @joshuasundance highlighted that the latest version of BERTopic now supports LCEL runnables, sharing excitement for potential applications and pointing to the GitHub release notes.
- Request for Sales Assistant LangChain Agent Fine-tuning: @l0st__ seeks assistance fine-tuning a LangChain agent with SQL database access, designed to function as a sales assistant and facilitate complex sales interactions.
- Utilizing Multimodal LLM for Extracting Tables and Images: @andysingal shared a Medium article demonstrating the potential of LangChain+LlamaIndex with semi-structured data.
▷ Channel: tutorials (2 messages):
- Tutorial on Langchain Deployment: @kulaone shared a new tutorial on deploying Langchain on Cloud Functions using Vertex AI models for scalability.
- Fine-tuning LangChain Agent Accessing SQL Database: @l0st__ is looking for someone to fine-tune a LangChain agent that can interact with a SQL database and act as a sales assistant. They provided a sample conversation for clarity.
Nous Research AI Discord Summary
- A discussion on Hermes formatting issues, initiated by @teknium, noting that the formatting seemed off.
- Query from @teknium regarding the correct spelling of automata, with a specific mention of @Oll_ai.
- Issue with a scraping pipeline reported by @tsunemoto as part of a project they're working on.
- Proposal from @imchrismayfield about creating Nous Research laptop stickers, with confirmation from @teknium that such stickers are available from a past event.
- @tsunemoto shared a link to a Midjourney dataset dump and asked @181893595177943040 and @957871507881734184 if this was sufficient or if more was needed.
- Sharing of various links of potential interest from users such as @metaldragon01, @mihai4256, @papr_airplane, @if_a, and @oozyi; most of the links were related to AI subjects but came with no additional commentary or context.
- Mention from @yorth_night of a YouTube video featuring George Hotz working with OpenHermes on a project called "Q"; they noted that tinygrad is getting more appealing.
- Discussion centered on AI techniques like DPO and IPO for improving models, with @danielpgonzalez and @tokenbender sharing research papers on IPO and cDPO respectively.
- @variav3030 initiated a dialogue on AllenAI TULU models, specifically praising the 70b DPO model and declaring it one of the best local models they've tried.
- @papr_airplane asked for a repository containing training scripts for Open Hermes; @teknium suggested Axolotl and shared a link to Axolotl's DeepSpeed config.
- Training difficulties were addressed, especially Out-Of-Memory (OOM) errors, with @teknium suggesting DeepSpeed over FSDP, and @besiktas questioning the use of Fully Sharded Data Parallelism.
- Conversation about the performance of GPT-4 compared to smaller models; @TokenBender mentioned their new 1B model, evolvedSeeker 1.3, and its noteworthy performance on coding problems.
- @teknium shared a positive experience with the default arguments in LM Studio and reported satisfactory token rates on certain AI models. However, gguf did not work for him with 2 GPUs and negatively affected the speed.
- Discussion around building an AI workstation, centering on whether a Threadripper CPU is needed; members concluded that for an inference box with 1 or 2 GPUs, consumer-grade CPUs are fine.
- @asada.shinon asked how to compare sentence distance, to which @philpax recommended embedding the sentences and computing cosine similarity, sharing the SBERT documentation as a resource.
- Comparing the performance of quantized models, @variav3030 found @teknium's results superior, which @giftedgummybee attributed to exllama's different method of quantization.
Nous Research AI Channel Summaries
▷ Channel: off-topic (22 messages):
- Hermes Formatting Issues: @teknium raised a concern that the formatting of Hermes seemed off.
- Automata Spelling: @teknium asked about the correct spelling of automata, mentioning @Oll_ai.
- Scraping Pipeline Issue: @tsunemoto mentioned a problem with a scraping pipeline they are working on.
- Nous Research Stickers: @imchrismayfield proposed the idea of laptop stickers, and @teknium confirmed the availability of Nous Research stickers from a previous event, which they will bring to the next Ollama event.
- Midjourney Dataset Dump: @tsunemoto shared a link to a Midjourney dataset dump of a million rows they added to a repo and asked @181893595177943040 and @957871507881734184 if this was enough or more was required. The upload faced throttling issues, causing slow speeds.
▷ Channel: interesting-links (8 messages):
- @metaldragon01 shared a link to a tweet of potential interest.
- @mihai4256 shared a YouTube video with the group without commenting on its contents or relevance.
- @papr_airplane posted a Twitter URL without further details or remarks.
- @if_a also posted a Twitter link without comment.
- @oozyi shared a link to an arXiv paper with no additional context.
- @yorth_night mentioned watching a YouTube video of George Hotz working with OpenHermes on a project called "Q". They also noted that tinygrad is getting more appealing.
- In response, @vatsadev agreed that things are getting better every day.
▷ Channel: general (247 messages🔥):
- Discussion on Techniques like DPO and IPO: Community members discussed different techniques for improving models. @Casper_ai mentioned that DPO (Direct Preference Optimization) hadn't been proven the way RLHF (Reinforcement Learning from Human Feedback) has. @teknium added that the team tried every possible technique and used the one that showed the best results. @danielpgonzalez shared a link to the paper on IPO, a new variant of DPO, and @tokenbender shared a link to cDPO.
- Discussion on AllenAI TULU Models: @variav3030 initiated a discussion on AllenAI TULU models and shared a link to the 70b DPO model, calling it very good and arguably the best local model they've tried.
- Queries on Training Scripts for Open Hermes: @papr_airplane asked about a repository containing training scripts for Open Hermes. @teknium suggested using Axolotl, adding that Hermes 2 was trained by @257999024458563585, and also shared a link to Axolotl's DeepSpeed config.
- Training Difficulties and Solutions: There were several discussions about troubles faced during model training. @besiktas questioned the use of Fully Sharded Data Parallelism and reported Out-Of-Memory (OOM) errors. @teknium suggested using DeepSpeed instead of FSDP and shared a link to his GitHub issue regarding FSDP.
- Discussion on GPT-4 vs Other Models: @TokenBender mentioned creating a new 1B model, evolvedSeeker 1.3, and its impressive performance on coding problems. @danielpgonzalez expressed surprise at 1B models closing in on ChatGPT performance. There was a broader discussion about the performance decrease in GPT-4's newer versions and its potential causes; many users expressed dissatisfaction with GPT-4's recent performance.
▷ Channel: ask-about-llms (57 messages):
- Using LM Studio and AI Models: @teknium shared a satisfactory experience with the default arguments in LM Studio, a GUI that uses llama.cpp as a backend. His customized arguments (temp 0.8, rep penalty 1.1, top p 0.95, and top k 40) achieved satisfactory token rates on the gptq exllama-1 and OpenHermes-2.5-Mistral-7B-exl2 models with two 4090 GPUs, even though one runs at x8. However, gguf doesn't work for him with 2 GPUs and ruins the speed.
- Considerations When Building an AI Workstation: Users discussed the need (or lack thereof) for a Threadripper CPU. Most concluded that for an inference box with 1 or 2 GPUs, consumer-grade CPUs are fine. @coffeebean6887 warned against buying a Threadripper to "future proof" the system, as consumer hardware is much cheaper and holds its resale value well.
- Resource Use for LLMs: @night_w0lf indicated that having enough system RAM to fit the larger models into memory is the most important factor when working with LLMs, especially for quantization or model merging.
- Comparison of Sentences: @asada.shinon asked how to compare sentence distance. @philpax recommended embedding the sentences and then computing cosine similarity, linking to the SBERT documentation as a potential resource.
- Performance of Quantized Models: @variav3030 compared his performance metrics with @teknium's and found that @teknium's were much better. @giftedgummybee explained that exllama uses a different method of quantization that is computationally expensive but preserves as much performance as possible. Running openhermes 2.5 at 8-bit quant on gguf in LM Studio, @variav3030 achieved 47 tok/s.
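The embed-then-compare approach recommended for sentence distance comes down to one formula: encode each sentence as a vector and take the cosine of the angle between the vectors. The similarity math below is concrete; the example vectors are invented stand-ins, since a real pipeline would get them from an SBERT model (e.g. via the sentence-transformers library) rather than by hand:

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two embedding vectors.

    Returns 1.0 for identical direction, 0.0 for orthogonal vectors.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# In practice u and v would come from a sentence encoder, e.g.
#   model = SentenceTransformer("all-MiniLM-L6-v2"); u = model.encode(s1)
u = [0.1, 0.9, 0.2]  # illustrative embedding of sentence 1
v = [0.1, 0.8, 0.3]  # illustrative embedding of sentence 2
print(cosine_similarity(u, v))
```

Because sentence embeddings place semantically similar sentences in similar directions, a higher cosine means closer meaning; 1 minus this value is the "distance" the original question was after.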
Alignment Lab AI Discord Summary
- @magusartstudios shared a YouTube link across multiple channels (general-chat and fasteval-dev) without providing any contextual details.
- The shared content triggered a reaction from @ldj in the general-chat channel, who described it as "cursed".
Alignment Lab AI Channel Summaries
▷ Channel: general-chat (2 messages):
- Shared Link: @magusartstudios shared a YouTube link.
- User Reaction: The shared link received a reaction from @ldj, who described it as "cursed".
▷ Channel: fasteval-dev (1 message):
- @magusartstudios shared a YouTube link without providing any additional context.
Skunkworks AI Discord Summary
Only 1 channel had activity, so no need to summarize...
Skunkworks AI Channel Summaries
▷ Channel: off-topic (1 message):
- User @pradeep1148 shared a YouTube link: https://www.youtube.com/watch?v=oqMWrDjbkFI.
LLM Perf Enthusiasts AI Discord Summary
- Discussion of GPT-4's code quality and performance issues, with participants expressing dissatisfaction about partially completed tasks and slower inference. Dialogue around the API's latency and maintainability, and suggestions for keeping response latency consistent.
- Notable quote from @potrock: "GPT-4 has become poor at implementing functions and often leaves placeholder comments."
- @robotums proposed a technique to obtain consistent latency: "submitting all OpenAI requests at the same time to ensure they get sent to the same batch in the queue."
- Conversation about Q* technology, featuring a shared Google Doc filled with speculation, plus chatter evaluating the speculative content. Also discussed: an image-generation behavior in which the AI tends to produce psychedelic space images when asked to make an image "more x."
- GDoc: Q* Speculation
- Quote from @pantsforbirds: "When prompted to make an image 'more x', the AI tends to converge to some sort of 'psychedelic space image'."
- Conversation about an AI Dungeon Master for the gaming platform Fables, including the challenges, examples of failures, solutions such as generating a high-level plot line, and discussion of handling user deviations.
- Fables: Fables.gg
- Suggestions for faster alternatives to Serpapi, with Metaphor mentioned and lauded for its faster return times, neural vector-based search, and capability to return HTML contents.
- Serpapi: Serpapi.com
- Detailed discussion about the cost structure of a PDF-to-RAG/LLM pipeline, focusing on how the cost distribution changed with the new GPT-4 pricing. Conversations about the choice of OCR software, shared experiences with Azure OCR and AWS Textract, and the challenges of processing multicolumn data.
- Shared post: Amazon Textract's new layout feature
- Job opportunity at Synthflow.ai for an AI Engineer role responsible for leading product development, research, and technical architecture.
- Job posting: AI Engineer
- Call by @thebaghdaddy to meet with anyone located in Vermont.
LLM Perf Enthusiasts AI Channel Summaries
▷ Channel: gpt4 (24 messages):
- Concerns over GPT-4's code quality: @potrock expressed dissatisfaction with GPT-4, stating it has become poor at implementing functions and often leaves placeholder comments, making it harder to work with. @pantsforbirds also found it frustrating that GPT-4-Turbo often only partially completes tasks and leaves the rest to the user.
- GPT-4's speed and performance: @pantsforbirds and @res6969 mentioned that the inference speed of GPT-4-Turbo via ChatGPT has become slower, especially during high-traffic periods. @evanwechsler speculated that the decrease in performance might be an intentional constraint due to scaling challenges until hardware improves.
- Impacts of scaling issues on GPT-4: @nosa_. hypothesized that the reduction in GPT-4 performance may have been an attempt to optimize latency and cost, as internal evaluations might not have been representative of everyone's use cases. @pantsforbirds agreed that scaling was the likely cause but wasn't sure whether this was a deliberate move.
- API performance discussion: @res6969 felt the API's latency has worsened but that quality has not been affected. Meanwhile, using fixed seeds for queries, @pantsforbirds noticed more consistent performance.
- Consistency and response latency: @robotums proposed a technique to obtain consistent latency: submitting all OpenAI requests at the same time to ensure they get sent to the same batch in the queue.
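@robotums's trick boils down to firing all requests concurrently rather than sequentially. A minimal asyncio sketch, where fetch is a stand-in for a real OpenAI API call (no actual API is used here):

```python
import asyncio

async def fetch(prompt):
    """Stand-in for an async OpenAI chat-completion call."""
    await asyncio.sleep(0.01)  # simulated network latency
    return f"response to {prompt!r}"

async def run_batch(prompts):
    # gather() submits every request at (nearly) the same instant, so they
    # are more likely to land in the same server-side batch and therefore
    # see similar latency, instead of queueing one after another.
    return await asyncio.gather(*(fetch(p) for p in prompts))

results = asyncio.run(run_batch(["a", "b", "c"]))
print(results)
```

With the real client you would swap fetch for the async completion call and keep the gather() structure unchanged.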
▷ Channel: offtopic (9 messages):
- Q* Speculation: @res6969 shared a Google Document containing speculation about Q* technology. He posted it for fun, acknowledging the information is likely to be 'unhinged and untrue'.
- Evaluation of the Speculation: @pantsforbirds and @justahvee expressed skepticism about the speculation, with @justahvee finding something curious about how the email suggests evaluating an AI system.
- Image Generation Discussion: @pantsforbirds initiated a discussion on image generation. He noted that when prompted to make an image 'more x', the AI tends to converge to some sort of 'psychedelic space image', and asked for other members' thoughts on this behavior.
▷ Channel: collaboration (11 messages):
- AI Dungeon Master Discussion: @thisisnotawill shared the challenge of creating an AI dungeon master for the platform Fables, with issues in maintaining a cohesive narrative and incorporating user choices into story progression. This generated a discussion on potential solutions for adaptive plot creation.
- Examples of Failures: @justahvee asked for examples of failures in the current model implementation. @thisisnotawill responded that while they had examples, they were not easily sharable due to lengthy chat histories. The main complaint from users revolved around non-progressive or unrelated narrative direction from the AI.
- Plot Creation Suggestions: @justahvee suggested having the AI generate a high-level plot line that unfolds incrementally, but noted that this approach still requires strategies for handling dynamic plots and elements of improv.
- Handling User Deviations: Addressing concerns that players might want to deviate from a pre-planned arc, @thisisnotawill highlighted the necessity of balancing planning with narrative flexibility. @justahvee advised focusing on a narrower problem first, such as asserting the AI's own plot with a user, before dealing with more complex issues.
- Another AI Dungeon Master Project: @pantsforbirds shared that they are also working on a similar DnD tool that acts as a real-time teleprompter for the DM, providing possible details when a user asks about a room, NPC, etc. @thisisnotawill expressed interest in the project.
▷ Channel: speed (4 messages):
- Faster Alternative to Serpapi: @23goat asked for a faster alternative to Serpapi, which takes around 2 to 3 seconds to retrieve the top 5 links for a query. @potrock suggested Metaphor instead of Google SERP, mentioning that queries would need to be redone.
- Metaphor for Quick Searches: @jeffreyw128 recommended Metaphor, highlighting its faster return times, neural vector-based search, and capability to return HTML contents.
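Neural, vector-based search of the kind credited to Metaphor boils down to ranking documents by embedding similarity. A toy sketch with made-up 2-D embeddings (a real system would use a learned encoder and an approximate-nearest-neighbor index):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query, highest first."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q  # cosine similarity of each doc to the query
    return np.argsort(scores)[::-1][:k]

# Toy document embeddings; row 0 points almost exactly along the query.
docs = np.array([[0.9, 0.1], [0.1, 0.9], [0.7, 0.6]])
query = np.array([1.0, 0.0])
print(top_k(query, docs))  # doc 0 first, then doc 2
```

Unlike keyword search, nothing here depends on shared terms: two texts rank close if their embeddings point the same way, which is what makes the search "neural".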
▷ Channel: cost (13 messages):
- Cost Structure of Running a PDF-to-RAG/LLM Pipeline: @res6969 discussed the costs involved in running a PDF-to-RAG/LLM pipeline, stating that OCR through Azure is currently responsible for 52% of their per-document cost, with the new GPT-4 pricing accounting for the remaining 48%. Over time, the proportion attributed to OCR is expected to increase (message).
- The Previous Cost Distribution: The cost distribution used to be 27% OCR and 73% OpenAI according to @res6969, implying a substantial change with the introduction of the new GPT-4 pricing (message).
- OCR Software Choices: Azure OCR and AWS Textract were compared, with @res6969 explaining that they moved from Textract to Azure due to better support and higher rate limits at the same cost. @pantsforbirds was considering a move to AWS Textract and mentioned encountering issues with another OCR program, Nougat (messages).
- Multicolumn Data Processing: @degtrdg and @pantsforbirds discussed the challenge of processing multicolumn data with OCR, particularly academic papers with two-column formats. AWS Textract was mentioned as having some multicolumn support, though the term "nice" was used reservedly by @pantsforbirds (messages).
- Reference: @pantsforbirds shared a blog post about Amazon Textract's features for AI document processing tasks, particularly with respect to multicolumn data: Amazon Textract's new layout feature (message).
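Treating the per-document OCR cost as fixed, the reported shift from a 27/73 split to 52/48 implies the OpenAI portion of the bill fell by roughly 2.9x, consistent with a large price cut on the model side. A quick arithmetic check:

```python
# Cost shares before and after the GPT-4 pricing change, as reported.
ocr_old, openai_old = 0.27, 0.73
ocr_new, openai_new = 0.52, 0.48

# If the OCR cost per document is unchanged, the OpenAI cost shrank by the
# ratio of the old and new openai-to-ocr proportions.
reduction = (openai_old / ocr_old) / (openai_new / ocr_new)
print(round(reduction, 1))  # ~2.9x cheaper on the OpenAI side
```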
▷ Channel: jobs (1 message):
- Job Opportunity at Synthflow.ai: User @rabiat shared a hiring notice from Synthflow.ai for an AI Engineer role. The job involves leading product development, research, and technical architecture for Synthflow's new AI-powered platform. You can check out the job posting here.
▷ Channel: irl (1 message):
- Connecting Users in Vermont: User @thebaghdaddy asked to connect with anyone located in Vermont.
MLOps @Chipro Discord Summary
Only 1 channel had activity, so no need to summarize...
MLOps @Chipro Channel Summaries
▷ Channel: general-ml (2 messages):
- Shared Link: User @wangx123 shared a YouTube link in the general-ml channel.
- User Reaction: User @c.s.ale commented on @wangx123's post, likely remarking on the link being distributed across multiple Discord channels.
The Ontocord (MDEL discord) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
AI Engineer Foundation Discord Summary
- Discussion on Git practices within the guild, including @hackgoofer's mention of @pwuts's "idea" without specified details.
- Debate about direct commits to the master branch and the use of Git pre-commit hooks. @kasparpeterson raised concerns about linting errors and CI failures in the AI-Engineer-Foundation repository, suggesting adherence to a Git PR flow. @._z admitted their setup was incorrect and acknowledged the linting issues.
- Suggestions by @kasparpeterson to enforce the existing contributing guidelines rather than revise them, with a key proposal to incorporate GitHub Actions for automatic CI jobs on each push to improve code quality.
- Announcements and reminders related to the AI Engineer Foundation (AIEF) weekly meeting from @hackgoofer and @._z, with a follow-up link to the event.
- Detailed discussion about authentication integration with the core protocol. @kasparpeterson described his viewpoint, sharing it via a GitHub issue. The idea was supported by @ntindle and @juanreds, with pushback from @hackgoofer, who suggested using plugins instead of adjusting the main protocol.
AI Engineer Foundation Channel Summaries
▷ Channel: general (10 messages):
- Discussion on Git Practice: @hackgoofer initiated a discussion involving @pwuts and their idea, noting it was discussed in a meeting. The details of the "idea" weren't specified.
- Commits to the Master Branch and Git Pre-commit Hooks: @kasparpeterson raised concerns regarding 6 commits made directly to the master branch of AI-Engineer-Foundation. They specifically tagged @._z, asked about Git pre-commit hooks, pointed out multiple Prettier and ESLint errors and the CI failure, and suggested following the Git PR flow.
- Response to Git Hooks and CI Errors: @._z explained why the PR flow wasn't followed and acknowledged the lint errors pointed out by @kasparpeterson. They further indicated that their setup was likely incorrect and invited further discussion on revising the contributing guidelines.
- Enhanced Validation in Git Practices: @kasparpeterson proposed enforcing the existing contributing guidelines instead of revising them, suggesting GitHub Actions CI jobs on every push to lint and validate the schemas as a way to ensure code quality.
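A workflow along the lines @kasparpeterson proposed might look like the following sketch; the npm script names (lint, validate-schemas) are hypothetical and would need to match the repository's actual package.json.

```yaml
# .github/workflows/ci.yml — run lint and schema validation on every push/PR
name: CI
on: [push, pull_request]
jobs:
  lint-and-validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run lint              # prettier + eslint checks
      - run: npm run validate-schemas  # hypothetical schema-validation script
```

Running this on push (not just PRs) also catches direct commits to master, which was the failure mode under discussion.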
▷ Channel: events (10 messages):
- AIEF Weekly Meeting Announcement: @hackgoofer shared a link to the AI Engineer Foundation (AIEF) weekly meeting and invited members to join.
- Discussion on Auth Integration in the Core Protocol: @kasparpeterson expressed his viewpoint that auth should be part of the core protocol, sharing his detailed reasoning in a GitHub issue. @hackgoofer raised concerns about changing the main protocol and suggested considering plugins as an alternative.
- Community Response to the Auth Topic: @ntindle and @juanreds agreed with integrating auth into the main protocol for enhanced interoperability, scalability, and maintainability.
- Meeting Reminder and Start: @._z shared a reminder notifying members that the meeting was about to start.
The Perplexity AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
YAIG (a16z Infra) Discord Summary
Only 1 channel had activity, so no need to summarize...
YAIG (a16z Infra) Channel Summaries
▷ Channel: ai-ml (2 messages):
- Latest AI Research on YouTube: @nickw80 recommended a YouTube video for its coverage of the latest AI research, noting that "about 50% of the way through it's getting into the latest research from the past couple of weeks."