[AINews] AI Discords Newsletter 11/16/2023
This is AI News! an MVP of a service that goes thru all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜
Guild: Latent Space
Latent Space Guild Summary
- Discussion initiated by @coffeebean6887 on model routing companies like Martian, Open Router, and Pulze, with a request for advice and personal experiences related to these services.
- A conversation started by @mitch3x3 regarding potential copyright issues within AI, proposing the likelihood of various licensing models akin to those seen in the coding domain.
- @swyxio shared an insightful Hacker News post talking about fine-tuning results in the realm of AI.
- An ongoing debate around the Yi 01 model, raised by @swyxio, with a link to a corresponding discussion on Hugging Face.
- User @mockapapella discussed their attempts to generate descriptions for a vast number of part numbers using local models like Mistral, Zephyr, and Llama 2, and the issues encountered, including memory and runtime problems. They also tried the Llama 2 C++ implementation and GPT-3.5 Turbo (1106).
- Discussion revolving around inventive terminologies to describe generated and tweaked codes in programming, initiated by @slono, introducing terms like "Generacoded" and "Prograsummoned."
- Dialogue between @tiagoefreitas and @swyxio about the utility of the tool codegen.com for code generation.
- @swyxio posted a link to a blog post examining the expenses and market demand concerning AI products, citing the company elevenlabs as a successful example.
- A conversation in the LLM Paper Club led by @youngphlo regarding a specific research paper potentially being the reason for a surge in interest or development in AI.
- @swyxio's query about whether a certain 'code fusion' paper had been discussed, though the exact paper remains unspecified.
Latent Space Channel Summaries
Latent Space Channel: ai-general-chat (2 messages):
(Discussion on AI, Model Routing and Finetuning):
- Model Routing Companies: @coffeebean6887 started a discussion about companies doing model routing such as Martian, Open Router, and Pulze, seeking advice and personal preferences from others.
- AI and Copyright Issues: @mitch3x3 discussed the potential implications of copyright lawsuits on AI, suggesting the future adoption of different types of licenses similar to those in the coding industry (e.g., MIT, GNU, Apache 2.0).
- Finetuning Results: @swyxio posted a link to an interesting Hacker News post about finetuning results.
- Discussion on the Yi 01 Model: Conversation continued with @swyxio drawing attention to a debate around the Yi 01 model, sharing a link to a discussion on Hugging Face about the issue.
- AI for Description Generation: User @mockapapella discussed their attempts to generate descriptions for a large list of part numbers using local models, facing memory and runtime issues. They tried different models including Mistral, Zephyr, and Llama 2, as well as the Llama 2 C++ implementation and GPT-3.5 Turbo (1106).
- AI in Coding: The chat moved on to terms for describing generated and tweaked code in programming, suggested by @slono, with terms like "Generacoded" and "Prograsummoned".
- Tools for Code Generation: Users @tiagoefreitas and @swyxio discussed the use of codegen.com, with no further additions to the discussion.
- AI in the Market: @swyxio shared a link to a blog post that analyzes the costs and market demand related to AI products, citing elevenlabs as a successful example.
Latent Space Channel: llm-paper-club (2 messages):
Discussion on Recent AI Publications:
- @youngphlo highlighted a specific research paper as a possible reason for a recent surge in interest or development in the AI field.
- @swyxio inquired if a certain 'code fusion' paper had been discussed, although it's unclear which paper is referred to.
Guild: OpenAI
OpenAI Guild Summary
- In the ai-discussions channel, users addressed issues including concerns over GPT's learning capacity, user interface changes in platform.openai.com, the minimum amount required to access GPT-4, how to limit chatbot conversation using the Assistant playground, integrating a GPT model with an existing API, and unresolved problems with the threads endpoint in OpenAI.
- In the openai-chatter channel, various issues were discussed, including GPT-4 performance, visibility of 'Threads' and 'Messages' in the OpenAI Platform UI, several bugs and functionality issues in GPT-4, confusion over the availability of ChatGPT Plus and GPT-4 for free users, issues encountered on ChatGPT, and API- and embedding-related topics.
- The openai-questions channel was a space for users to report problems accessing the "My Plan" section, difficulties using ChatGPT, a tendency of ChatGPT to stop while writing code, and other issues such as ChatGPT being unable to perform complex queries. A suggestion was made to move any persistent issues to the bug-reports channel.
- Over in the gpt-4-discussions channel, there was a lively discussion around GPT-4's performance, updates, issues, its reduced message cap, saving issues, fine-tuning, embedding custom GPTs on websites, and problems with loading more GPTs and accessing chat session history.
- prompt-engineering channel discussions focused on various aspects of using prompts, including using Markdown for AI UIs, strategies for formulating instructions, and the limitations of the AI. Links to AI Writing GPTs were shared, and a discussion broke out about whether to refer to GPT as 'You' or 'The GPT' when writing instructions for custom models.
- User queries were prominent in the api-discussions channel, with questions around using Markdown for AI user interface simulations, optimizing GPT usage, and handling date formatting. Various writing GPTs were recommended and shared in the channel, with discussions on GPTs created through the ChatGPT web UI, embedding custom GPTs on websites, and understanding APIs. A discussion also unfolded on whether to use "You" or "The GPT" when referring to the AI in prompts.
OpenAI Channel Summaries
OpenAI Channel: ai-discussions (6 messages🔥):
Topic 1: Training GPTs Agent:
- User @tilanthi shared a concern about GPTs agents not learning from additional information provided after their initial training. @solbus cleared this misunderstanding, explaining that uploaded files are saved as "knowledge" files for the agent to reference when required, but they do not continually modify the agent's base knowledge.
Topic 2: User Interface Changes on Platform:
- @zahmb and @foxabilo had a discussion about changes in the sidebars of platform.openai.com. @zahmb reported that two icons, one for threads and another one for messages, disappeared from the sidebar.
Topic 3: Accessing and Using GPT-4:
- @ishaka02 and @elektronisade clarified the minimum amount ($1) required to access GPT-4. They also discussed preferences for parameters and whether the GPT-4 model offers any significant advantage over the GPT-4-1106 model.
Topic 4: Assistant API Functionality:
- @crash.tech sought ways to restrict chatbot conversation to only their predefined functions using the new Assistant playground. Some users suggested using fine-tuning, while others mentioned adding specific instructions.
Topic 5: Building API Functionality Using GPT Models:
- User @kyper is looking for a way to integrate a GPT model with their company's existing API. They plan to capture specific command responses from the GPT model using middleware and assume a custom fine-tuned model can help achieve this.
Topic 6: Trouble with Open AI Threads Endpoint:
- User @criptobroh reported problems with the response from the AI after running a POST THREAD command to the Open AI Threads Endpoint. Other users including @elektronisade and @_ciphercode engaged in the discussion but weren't able to resolve the issue.
OpenAI Channel: openai-chatter (6 messages🔥):
- Discussion on various GPT-4 issues: Users including @becausereasons, @satanhashtag, and @Aris | Something.Host discussed problems they have been encountering with GPT-4 performance. Issues included lack of basic functionality and reduced message limits.
- Query about visibility of 'Threads' and 'Messages' in OpenAI Platform UI: @zahmb and @foxabilo discussed the visibility of 'Threads' and 'Messages' in the navigation pane of the OpenAI Platform UI. @zahmb asked if these elements were visible to everyone, having briefly noticed them before the features disappeared.
- Possible bugs and functionality issues in GPT-4: Users @stunspot, @hackxit, @dexter.js, and @ciphercode discussed various bugs and problems they have faced. They mentioned issues with loading 'My GPTs', possible violation of 'fair-use' or 'terms-of-use', inability to download conversation history in the OpenAI Playground, and issues with accessing 'My Plan' page.
- Confusion over availability of ChatGPT Plus and GPT-4 for free users: @hihi110, @elijah.ig, @aedoesthings, and @york6699 had a discussion on the availability of ChatGPT Plus and GPT-4 for free users. Overall confusion was evident due to free user limitations.
- Various issues encountered on ChatGPT: Users including @rewire, @.pythagoras, @rainy08, and @zahmb reported different issues they encountered while navigating through ChatGPT, including missing chat history, an unknown cap on message count, disappearance of 'Threads' and 'Messages' from UI, and page loading issues.
- GPT-3.5-turbo and GPT-4 model discussions: @alice5, @libertadthesecond, and @ciphercode discussed various aspects of GPT-3.5-turbo and GPT-4 models. Topics included open-sourcing of models, local hosting, model requirements, and model performance.
- Issues with 'My Plan' accessibility and 'Oops' error: Various users expressed issues related to accessing the 'My Plan' page and encountering an 'Oops' error. Affected users included @antony0000, @milou4dev, and @libertadthesecond among others. Discussants speculated about possible causes, including capacity issues, but no concrete resolution was identified.
OpenAI Channel: openai-questions (6 messages🔥):
ChatGPT Query Issues & Feedback:
- Users reported problems accessing the "My Plan" section in their account settings; the page doesn't load on different browsers and devices. Reported by @mrbr2023, @thesocraticbeard, @sumo_dump, @fujikatsumo, @highsquash, and several others. No resolution was identified.
- @danielbixo is experiencing issues with accessing ChatGPT on their PC while it works well on Android. @xh suggested clearing cookies and signing in again, trying a different server/tunnel if using a VPN, and trying a different browser.
- A trend of ChatGPT stopping while writing code was reported by @playbit and confirmed by @thesocraticbeard who suggested using a different method of reading the CSV or converting the CSVs to JSON as a last resort.
- @vdrizzle_ experienced issues with ChatGPT and asked for feedback. @xh suggested they could clear their cookies and sign in again, or if @vdrizzle_ is using ublock origin, to refresh the filters.
- Discussion about getting GPT to 'dumbify' its outputs to plain text. @solbus and @syndicate47 helped @xh resolve the problem by suggesting they ask the model not to output LaTeX and only output plaintext.
- OpenAI users @katthwren and @dsmagics reported being locked out of their Plus subscriptions despite showing they have Plus, noting the issues seemed to begin after an update to the ChatGPT app. Issues were reported to the bug-reports channel but no resolution was identified.
- @xzz3300 requested assistance on how to connect the ChatGPT API with Discord. @foxabilo redirected them to the chatbot-tutorial channel for guidelines.
- Several users reported their publicly published GPT models not appearing under "My GPTs". The issue was shared by @patriciocomplex, but no solution was provided.
- An issue about ChatGPT not being able to answer more complex queries was shared by @jobodro. It was suggested this might be because of a reported bug that allowed free users to access GPT-4, potentially causing system overloads.
- @justabeastfromthelittleeast inquired if @thesocraticbeard was an OpenAI moderator; @baugrems confirmed he technically was but preferred not to moderate discussions.
- @solbus suggested that users experiencing persistent issues should post their experiences and details in the bug-reports channel where OpenAI staff frequently appear.
OpenAI Channel: gpt-4-discussions (6 messages🔥):
GPT-4 Performance, Updates, and Issues:
- Users reported a 25-message cap in GPT-4, which seems to have been reduced from a previous cap of 50 messages every 3 hours (@cybector, @mustard1978).
- Users reported saving issues with their custom GPTs, for instance when they add or edit instructions (@thesocraticbeard, @lukehunt).
- Discussion arose around fine-tuning and forcing GPTs to follow instructions more consistently. Users shared experiences and approaches to making their GPTs reference the source material more consistently and follow instructions more accurately (@stealth2077, @notcredibleyet, @gubok).
- @amarcerro asked about ways to check the performance and usage of individual GPTs. It was suggested that a rough estimate can be made by subtracting the number of one's own conversations with the bot from the total number of conversations indicated next to its draft (@mekollik2).
- There were reports of GPTs not working as expected, either not executing expected actions or behaving inconsistently (@citizenscientist).
- Issues with the "load more" functionality for viewing all GPTs were reported (@loschess, @artofvisual).
- There were also reported issues with accessing chat session history (@howardlovy).
API and Embedding:
- @gubok inquired about the difference between using GPTs through the API and the Assistant API, and whether GPTs created through the ChatGPT web UI could be accessed using API calls.
- @gordon.freeman.hl2 voiced a wish for embedding custom GPTs on websites, pointing out a high demand for such a feature.
- @hhf0363 asked for an explanation of what an API is for non-developers. @Derpgore explained it as the language applications use to communicate with each other.
Miscellaneous Discussions:
- A few more specific applications, glitches, and functionalities of GPT were discussed, such as generating a Spotify playlist (@Foufou), creating a custom GPT to run Stockfish (@soapchan), managing API actions and auth (@bacaxnot, @amarnro), and whether GPT-4 Vision can accurately identify a font (@charl0sk), among others.
- @chromeio shared a trick to use the GPT-4 model on the ChatGPT platform without needing a ChatGPT Plus subscription.
- @captainstarbuck advised the community to set proper expectations for what GPTs can do, explaining that they aren't technically different from the standard ChatGPT (beta). They shared a useful resource, a YouTube video by The AI Breakdown.
OpenAI Channel: prompt-engineering (6 messages🔥):
Prompt Engineering Discussions and Questions:
- Using Markdown for AI User Interfaces: @mnjiman proposed using a Markdown table to simulate an 'option menu' within an AI conversation, which could help convey particular 'contexts'.
- Customizing GPT Parameters in UI: @no.iq inquired where to modify parameters such as 'temperature' in the new UI.
- Generating Poems from Arguments: @ertagon provided an example of a chat where GPT was asked to turn a regular argument statement into a poem.
- Retrieving Multiple Instructions from a Website: @goldmember777 asked about minimizing the number of times GPT looks for online resources when extracting step-by-step instructions from a website.
- AI Writing GPTs: @.kalle97 shared their AI Writing GPTs, which produce SEO-optimized long articles, providing links to the models here, here, and here.
- Referring to GPT in Instructions: @jungle_jo sparked a discussion on whether to refer to GPT as 'You' or 'The GPT' when writing instructions for custom models.
- Extracting and Formatting Dates from Text: @alishank_53783 shared their struggle with prompt engineering for extracting and formatting dates from unstructured email text. @eskcanta provided solutions and examples, highlighting the importance of being clear and specific in instructions to avoid confusion.
- Do GPTs Overvalue Concepts in Their Last Phrase?: @mnjiman asked about how GPT sums up its own response, which can sometimes seem like a distraction.
- Chatbot-Style AI vs API: @eskcanta brought up the different use cases and skills required for chatbot-style AI (which they are familiar with) versus API-oriented tasks.
- Noted AI Limitations: @eskcanta also highlighted a limitation of GPT: dates in a conversation don't carry much contextual meaning for the model.
- Positive vs Negative Prompts: @jungle_jo suggested that it's better to always use positive prompts and avoid negative ones.
- Simplicity in Prompts for Image Description: @jungle_jo raised the topic of prompt simplicity when describing images. @eskcanta stressed the importance of clear and specific instructions, warning that vague or error-ridden instructions lead the model to make more guesses, and thus potentially incorrect outputs.
Overall, the prompt-engineering channel had quite a diverse discussion, ranging from the philosophy of using positive prompts to specific problems like adjusting parameters in the new UI, handling instructions from websites, and extracting and formatting dates from text. There was a strong emphasis throughout on the importance of clear and specific instructions.
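The date-formatting struggle discussed above can also be attacked deterministically after extraction, rather than relying entirely on the prompt. A minimal sketch (the format list is an assumption for illustration; extend it for the actual email formats) that normalizes parsed date strings to YYYY-MM-DD:

```python
from datetime import datetime
from typing import Optional

# Candidate formats to try, in order; an assumed list — extend as needed.
CANDIDATE_FORMATS = [
    "%Y-%m-%d",      # 2023-11-16
    "%m/%d/%Y",      # 11/16/2023
    "%d %B %Y",      # 16 November 2023
    "%B %d, %Y",     # November 16, 2023
]

def normalize_date(raw: str) -> Optional[str]:
    """Return the date in YYYY-MM-DD form, or None if no format matches."""
    raw = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

print(normalize_date("November 16, 2023"))  # 2023-11-16
```

Pairing a loose extraction prompt with strict post-processing like this keeps the model's job small and makes the output format verifiable.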
OpenAI Channel: api-discussions (6 messages🔥):
"Using GPT to Facilitate Tasks":
- @mnjiman suggested using simulation and markdown table presentations to facilitate certain tasks in discussions, admitting the solution may be a bit messy but potentially useful. The discussion was not developed further.

"Parameters and UI Changes Concern":
- @no.iq asked where to find and tweak parameters like 'temperature' in the new UI, noting that these options are not visible on the right in the Playground.

"Prompting Strategies and Instruction Formatting":
- @ertagon provided a step-wise example of shifting prompt formats, moving from a single-paragraph argument about tomatoes not existing to transforming that argument into a poem.
- @goldmember777 sought advice on how to retrieve multiple instructions from a website but share them with the user one at a time, aiming to optimize GPT usage and prevent repeated resource lookups.
- @jungle_jo raised a question about preferences for referring to the AI in prompts, specifically whether to use "You" or to clarify it as "The GPT".

"Creating and Using GPTs for Writing Purposes":
- @kalle97 recommended their AI writing GPTs, claiming they are SEO-optimized and capable of producing long articles, and shared links to several versions: GPT-1, GPT-2, GPT-3.
- @syndicate47 likewise encouraged testing out their prompt engineer GPT: Prompt Engineer GPT.

"Handling Date Formatting Using AI":
- @alishank_53783 encountered difficulty prompting for date formatting while parsing emails and sought advice on ensuring a 'YYYY-MM-DD' date format.
- @eskcanta suggested specifying "'YYYY-MM-DD', which is year-month-day", and offered to work on a shared example solution for formatting dates.

"General Discussion on AI Usage and Understanding":
- @mnjiman raised concerns over ChatGPT's tendency to summarize its own responses in the last sentence/paragraph, fearing an overemphasis on last-phrase concepts.
- @eskcanta viewed this as more of a stylistic trait of the AI and encouraged trying different reply formats, giving examples of how to do so.
- @jungle_jo brought up using positive prompts and straightforward descriptions when prompting for images.
- @eskcanta advised that the clarity and specificity needed in instructions depend on the goal and the model's comprehension, recommending specificity especially for reproducible results.
Guild: LangChain AI
LangChain AI Guild Summary
- Discussions about LangChain's utility and features, with users like @ethereon_, @jinwolf2, and @abhi578 expressing concerns and seeking clarifications on topics such as chaining multiple chains in LCEL, error handling, and LangChain embedding generation. Similarly, @quantumqueenxox posed queries about LangChain's web loaders, while @beffy22 and @endo9001 discussed issues with VectorStore and saving ConversationBufferMemory, respectively.
- A significant discussion pertained to LangChain's role in web applications, initiated by users like @alimal and @0xtogo, who explored its use in tabular data question-answering tasks and using an existing assistant without creating a new one, respectively.
- Conversations also arose on the technicalities of LangChain's operation. @syntactic__, @seththunder, @tonyaichamp, and @eyueldk invested time in understanding the nuances of chaining multiple chains, differentiating between conversation methods, commenting on LangChain's speed, and accessing the sources or references used by a LangChain agent.
- @andriusem and @attila_ibs discussed handling .env file loading in Python scripts.
- @veryboldbagel offered several valuable insights, such as upgrading to the latest version of LangChain for JSON encoding, and setting up RAG over user-uploaded files by creating endpoints for ingestion along with using RedisStore as the docstore.
- Different projects and applications were mentioned in the Share Your Work channel, including @taranjeetio's YouTube GPT, @appstormer_25583's Appstorm.ai with examples of GPTs created using their service, @.broodstar's Sophists app sale, @kingkookri's Pantheon platform update, and @agenda_shaper's agenda-shaping platform.
- There was also speculation about the implications of the recent collaboration between LangChain and Microsoft. @juan_87589 raised concerns about whether this would tilt LangChain towards Azure over Google Cloud or AWS.
- Finally, @maverick5493 asked about uploading files into an OpenAI 'retrieval' assistant using LangChain, in order to replicate OpenAI's file creation process.
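The "chaining multiple chains" question above is about LCEL's pipe-style composition. To illustrate the idea without depending on LangChain itself, here is a hypothetical minimal sketch; the `Runnable` class below is a made-up stand-in for the concept, not LangChain's actual API:

```python
class Runnable:
    """Made-up minimal stand-in illustrating LCEL-style pipe composition."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other: "Runnable") -> "Runnable":
        # (a | b).invoke(x) == b.invoke(a.invoke(x))
        return Runnable(lambda x: other.invoke(self.invoke(x)))

# Three tiny stages standing in for prompt, model, and post-processing.
prompt = Runnable(lambda topic: f"Tell me about {topic}")
fake_llm = Runnable(lambda p: p.upper())
shorten = Runnable(lambda s: s[:12])

# A chain is itself a Runnable, so chains compose into larger chains.
chain = prompt | fake_llm
bigger = chain | shorten

print(bigger.invoke("LCEL"))  # TELL ME ABOU
```

The key point for the question in the thread: because composition yields the same interface, chaining a chain into another chain is no different from chaining individual steps.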
LangChain AI Channel Summaries
LangChain AI Channel: general (4 messages):
LangChain AI Discord Chat Summary:
- LCEL Chain Construction: @ethereon_ inquired about the possibility of chaining multiple chains together in LCEL syntax, and @syntactic__ suggested it could be possible.
- Error Handling: User @jinwolf2 asked for help regarding an error faced during response conversion in their Python code.
- Langchain and APIs for Tabular Data QA: @alimal initiated a discussion on how to effectively implement Langchain and APIs for tabular data question-answering tasks. @brio99 asked for more clarification on the topic.
- Issue with _aget_relevant_documents(): Siddhi Jain encountered an error while implementing a ConversationalRetrievalChain due to a missing 'run_manager' argument.
- Embedding Generation for Context-Query-Response: @abhi578 sought help regarding their use case involving contextual query responses and requested resources or explanations for Langchain embedding generation.
- Use Existing Assistant: @0xtogo posed a query about using an existing assistant through Langchain without needing to create a new one.
- Implication of Microsoft Collaboration: @juan_87589 raised queries regarding the recent collaboration with Microsoft, curious whether it would tilt Langchain's favor towards Azure over Google Cloud or AWS.
- Difference between conversation.run/apply/invoke/batch(): @seththunder asked for an explanation distinguishing the different conversation methods in Langchain.
- Speed of Langchain: @tonyaichamp commented on the slow speed of Langchain on a particular day.
- Source/Reference Access: @eyueldk wanted to know whether there was a way to access the sources or references used by a Langchain agent.
- Saving ConversationBufferMemory: @endo9001 sought a solution for saving ConversationBufferMemory into a JSON file for future reuse.
- Problem with VectorStore: @beffy22 faced issues with vectorStore.asRetriever() always returning documents.
- Web Loader Scope: @quantumqueenxox asked for guidance on which Langchain web loader to use to load an entire website, not just the first page.
- Uploading Files: User @maverick5493 needed guidance on uploading files into an OpenAI 'retrieval' assistant using Langchain, essentially trying to replicate OpenAI's file creation process in code.
- Links:
  - Langchain Expands Collaboration with Microsoft - information on the recent collaboration with Microsoft.
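On the ConversationBufferMemory question, a minimal sketch of round-tripping a chat history through a JSON file; the message structure here is an assumption for illustration (LangChain's own classes ship their own serialization helpers, which may differ):

```python
import json

# Hypothetical message history, shaped like a buffer memory's chat log.
history = [
    {"role": "human", "content": "What is LCEL?"},
    {"role": "ai", "content": "A syntax for composing chains."},
]

def save_history(messages, path):
    """Write the message list out as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(messages, f, ensure_ascii=False, indent=2)

def load_history(path):
    """Read the message list back for reuse in a new session."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

save_history(history, "memory.json")
restored = load_history("memory.json")
assert restored == history
```

The restored list can then be replayed into whatever memory object the new session uses.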
LangChain AI Channel: langserve (4 messages):
Handling .env file Loading in Python Scripts:
- @andriusem queried about properly loading .env files in Python, specifically whether to do it in the `chain.py` or `server.py` script.
- @attila_ibs advised that .env files should be handled in `server.py` by importing and loading `dotenv` as follows:

```python
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read local .env file
```

- @andriusem expressed gratitude for the response.
LangChain AI Channel: langchain-templates (4 messages):
(LangChain Updates and Utilization):
- Upgrading and JSON Encoding in LangChain: @veryboldbagel recommends upgrading to the latest version of LangChain, which uses orjson and should serialize an ndarray. They also remind users to ensure that whatever is sent over the wire can easily be serialized as JSON and decoded on the other side.
- Setting up RAG over User-Uploaded Files: @veryboldbagel suggests creating extra endpoints for file ingestion in order to set up RAG over files uploaded by users. The docstore can use Redis, which also provides flexibility for implementing the user's choice of persistence.
- Links:
  - RedisStore Documentation - detailed guide on using Redis as a persistence option in LangChain.
LangChain AI Channel: share-your-work (4 messages):
"LangChain AI Share Your Work Updates":
- YouTube GPT Project Announcement: @taranjeetio shared a tweet about creating a YouTube GPT that keeps users updated on the latest videos.
- Appstorm.ai Introduction: @appstormer_25583 introduced Appstorm.ai, a platform that allows users to build custom GPTs for free. They shared several examples of GPTs they've created with simple prompts.
- Sophists App Sale: @.broodstar announced they're selling a specialized text messaging app, Sophists, which allows users to save and share long text message conversations. The app is integrated with a GPT chatbot and is fully deployed on the App Store and Google Play. The seller shared a link to a demo of the chatbot in the app.
- Pantheon Update: @kingkookri posted an update on Pantheon, a platform designed to provide highly technical answers specific to user documents. They're still looking for more people to try it, especially in STEM fields; test users get 50 free queries, and feedback can be given on their Discord.
- Agenda Shaper's Platform: @agenda_shaper briefly mentioned a platform equipped with an algorithm that creates posts.
Guild: Nous Research AI
Nous Research AI Guild Summary
- Discussion on massive image data hosting: Members of the channel explore different options like Amazon S3, local storage, and Hugging Face for hosting TBs of image data from midjourney. The group suggests using Hugging Face due to its free storage and high file size limit but acknowledges the risk of a single point of failure. A relevant YouTube video and a discussion post from Hugging Face were shared for more insights.
- An engaging dialogue took place concerning AI and music transformation, with reference to Google's project, followed by disappointment that such cutting-edge AI technology is not open source. AI's potential in game playing, especially at the pixel level, was also touched upon through the sharing of an old Python project.
- Notable references include a "Skeleton of thought" paper and a similar "Tree of thought" concept for dataset generation. A link was provided for more understanding.
- New text-to-video research from Meta was discussed by members, with comparisons to 'animate-diff', and the Synthia-v1.3 dataset available on Huggingface was shared.
- An exciting announcement was made that OpenHermes 2.5 achieved a high ranking on the HF Leaderboard, securing second place in the 7B models category.
- Interesting probability questions were posed as challenges, and satisfactory feedback was provided for Claude v2.
- Members of the general channel covered a variety of topics, ranging from AI music generation, fine-tuning and training difficulties, and the operations of Nous Research, to AI model training resources (pointing towards Axolotl) and a call for contributions to "gptslop" on GitHub.
- The release of a new Capybara 34B API and playground was excitedly introduced.
- The origin and purpose of LLMs were discussed among users, touching on topics such as Rust code analysis, full finetuning versus continued pre-training, training on non-Roman languages, and application to fiction text data. The role of chaos in AI was humorously mentioned.
- A shared link to Hugging Face's NLP course offered guidance on how to train a tokenizer.
- Reactions were shared to a post titled "Artificial Intelligence can't deal with Chaos" from Y Combinator News.
Nous Research AI Channel Summaries
Nous Research AI Channel: off-topic (7 messages🔥):
Massive Image Data Hosting Discussion:
- Members @yorth_night, @.wooser, @benxh, @crainmaker, and @tsunemoto discuss options to store and handle terabytes of image data from midjourney. Initial options mentioned are Amazon S3 and local storage, but they conclude that costs and capacity are roadblocks.
- @yorth_night indicates the data amount could potentially be dozens of terabytes, consisting of ten million images with their prompts.
- @benxh suggests using Hugging Face as a platform, as it allows datasets to be streamed, eliminating the need for disk space.
- @crainmaker confirms that Hugging Face offers a per-file limit of 50GB with no overall limit, according to a discussion post on the Hugging Face forum.
- @tsunemoto and @crainmaker discuss creating smaller batches of parquet, considering MD5 hash instead of full image data, but acknowledge that storage for a large image dataset is still required.
- @benxh recommends using the huggingfacehub Python implementation to push each parquet file as soon as it's saved.
- The participants ultimately decide to go with Hugging Face due to its free storage and high file size limit. However, @crainmaker notes that relying on one platform can pose a risk of a single point of failure.
- @pradeep1148 shares a YouTube video link, but its relevance or context is not discussed.
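The MD5 idea raised in the thread (hash image bytes so duplicate payloads need not be stored twice) can be sketched as follows; the file names and batch layout here are made up for illustration:

```python
import hashlib

def md5_of(data: bytes) -> str:
    """Content hash of an image payload."""
    return hashlib.md5(data).hexdigest()

def dedupe_batch(images: dict) -> dict:
    """Map each image name to its MD5; identical payloads share a hash,
    so only one copy per hash needs to be stored."""
    index = {}  # name -> md5
    for name, data in images.items():
        index[name] = md5_of(data)
    return index

batch = {
    "a.png": b"\x89PNG...",
    "b.png": b"\x89PNG...",   # duplicate bytes of a.png
    "c.png": b"\x89PNGxyz",
}
index = dedupe_batch(batch)
print(len(set(index.values())))  # 2 unique payloads to store
```

As noted in the discussion, this only shrinks duplicates; the unique image bytes themselves still need bulk storage somewhere.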
Nous Research AI Channel: interesting-links (7 messages🔥):
AI and Music Transformation:
- **Transforming Singing into Orchestral**: @yorth_night and @.wooser discussed the potential of AI in music creation and transformation, particularly inspired by Google's project. A link to the [related blog post](https://deepmind.google/discover/blog/transforming-the-future-of-music-creation/) was shared. @ldj shared a [YouTube video](https://youtu.be/rrk1t_h2iSQ?si=njkk-ajNonaiTum4) showcasing the technology.
- **Dreams of Open Source**: Both @yorth_night and @.wooser expressed disappointment that this cutting-edge technology isn't open source currently and hope to see it in the future.
Skeleton of Thought and Tree of Thought:
- **Skeleton and Tree of Thought**: @georgejrjrjr referenced the "Skeleton of Thought" paper. @yorth_night shared a link about a similar concept called the [Tree of thought for dataset generation](https://vxtwitter.com/migtissera/status/1725028677124235288).
AI Game Playing:
- **Pixel-Level AI Game Playing**: @f3l1p3_lv presented an [old Python project](https://www.youtube.com/watch?v=eQC1JGMIxU0) where a neural network is used to play a game by seeing every pixel of the game window.
Text-to-Video model from Meta:
- **Meta's New Text-to-Video Model**: @tsunemoto posted links to new research from Meta on text-to-video models. @teknium and @qasb discussed the work further, comparing it to the 'animate-diff' approach.
Datasets Link:
- **Sinthia-v1.3 Dataset**: @teknium shared a link to the [Sinthia-v1.3 dataset on Huggingface](https://huggingface.co/datasets/migtissera/Synthia-v1.3).
Nous Research AI Channel: announcements (7 messages🔥):
OpenHermes 2.5 Ranking on HF Leaderboard:
- @teknium excitedly announced that OpenHermes 2.5 has achieved a high ranking on the HF Leaderboard, securing a second-place position in the 7B models category.
Nous Research AI Channel: bots (7 messages🔥):
Probabilistic Challenges and Feedback on Claude v2:
- Probability Questions: @f3l1p3_lv posed several probability challenges:
- In a classroom with a certain number of students split by gender and eyeglasses usage, what's the probability that a random student will be a woman who doesn't wear glasses? The options given were A) 40%, B) 12%, C) 60% and D) 16%.
- Two non-biased dice are thrown, and both numbers are odd. What's the probability their sum will be 8? The options given were A) 2/36, B) 1/6, C) 2/9, D) 1/4, and E) 2/18.
- In a group of multilingual men and women, what's the probability that a randomly chosen French speaker will be a man? The options given were A) 47/99, B) 35/68, C) 92/193, and D) 52/99.
- If a dice is thrown twice, what's the probability that the first throw will result in a 3, given that the sum of the two throws equals 7? The options given were A) 2/6, B) 1/6, C) 1/2, and D) 3/7.
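The two dice questions above can be checked by brute-force enumeration of the 36 equally likely outcomes (a quick sanity check added here, not part of the original discussion; the first and third questions omit the counts needed to verify them):

```python
# Enumerate all (die1, die2) outcomes and count favorable cases
# with exact fractions.
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # all 36 ordered pairs

# Q: both numbers are odd; probability their sum is 8.
both_odd = [(a, b) for a, b in outcomes if a % 2 == 1 and b % 2 == 1]
p_sum8 = Fraction(sum(1 for a, b in both_odd if a + b == 8), len(both_odd))
print(p_sum8)  # 2/9 -> option C

# Q: probability the first throw is 3, given the sum is 7.
sum7 = [(a, b) for a, b in outcomes if a + b == 7]
p_first3 = Fraction(sum(1 for a, b in sum7 if a == 3), len(sum7))
print(p_first3)  # 1/6 -> option B
```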
- Claude v2 Feedback: @f3l1p3_lv commended the Claude v2 model, expressing that it was "the best".
Nous Research AI Channel: general (7 messages🔥):
AI Music Generation Discussion:
- @yorth_night shared Suno.ai's music generation capabilities, noting that even a seasoned musician friend couldn't tell the music was AI-generated. He also praised Suno's instrumentals as near-perfect, with occasional minor issues in the vocals.
- @wooser asked how Suno.ai creates its music, with @yorth_night suggesting Bark could be the base model for voice generation.
Fine-tuning and Training Models:
- @cue asked for help to fully finetune LLama 2 70B on 8x A100 GPUs. He mentioned trying with axolotl, Huggingface scripts, accelerate & deepspeed but experienced Out of Memory (OOM) errors. He also mentioned a blog post on Huggingface about this (https://huggingface.co/blog/ram-efficient-pytorch-fsdp) but was unsuccessful in replicating the methods.
- @wooser responded, suggesting he check with the Axolotl discord, mention the specific OOM errors, and look into possible issues caused by Docker setups.
- @teknium suggested that a full fine-tune of a 70B model requires 16x 80GB GPUs across multiple nodes, while @tokenbender joked that finetuning with deepspeed would take forever.
- @euclaise mentioned that LoRA is equivalent to full finetuning if a sufficiently large rank is selected and the embedding layers are also finetuned.
- Despite this, @teknium mentioned observations by another user that a high rank was doing worse than both full finetuning and a low rank, pointing out that full rank LoRA might not be equivalent to full finetuning after all.
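As a rough sanity check on the hardware claim above, the memory footprint of a full mixed-precision Adam finetune can be estimated with back-of-the-envelope arithmetic (a sketch added here, not from the discussion; real usage also includes activations, gradient buckets, and framework overhead, which is roughly where the gap to 16 GPUs goes):

```python
# Back-of-the-envelope memory estimate for a full finetune of a 70B
# model with mixed-precision Adam: fp16 weights and gradients, plus
# fp32 master weights and two fp32 optimizer moments. Activations
# and overhead are excluded.
params = 70e9

bytes_per_param = (
    2      # fp16 weights
    + 2    # fp16 gradients
    + 4    # fp32 master weights
    + 4    # fp32 Adam first moment
    + 4    # fp32 Adam second moment
)
total_gb = params * bytes_per_param / 1e9
gpus_80gb = total_gb / 80
print(f"{total_gb:.0f} GB -> ~{gpus_80gb:.0f}x 80GB GPUs before activations")
# 1120 GB -> ~14x 80GB GPUs before activations
```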
Nous Capybara 34B API and Playground:
- @alexatallah introduced a new Capybara 34B API and playground (https://openrouter.ai/models/nousresearch/nous-capybara-34b).
- @gabriel_syme asked about the number of models expected to be found in the new API and playground.
Organization of Nous Research:
- @f3l1p3_lv asked about the leadership structure of Nous Research, which @teknium clarified by stating that there are four founders of the organization, including himself.
AI Models Training Resources:
- @00brad requested guidance on training models to optimize for structured output. @teknium advised him to look into Axolotl.
GPT Slops:
- @alpindale urged others to contribute to "gptslop" on his GitHub (https://github.com/AlpinDale/gptslop).
- @giftedgummybee suggested that @alpindale refer to Eric's GPT4-filtering dataset list to find the slops, and request more data from @teknium if necessary. However, @teknium clarified that his dataset was mostly about alignment slops, not prose slops.
Nous Research AI Channel: ask-about-llms (7 messages🔥):
AI Development and Use-Cases on LLMs and Non-Roman Language Training:
- LLMs and Rust Code Analysis: @ac1dbyte shared their experience using Hermes AI to conduct meticulous, vulnerability-focused reviews of Rust code, and discussed their efforts to semi-normalize the generated reports into JSON for better integration with tools such as MongoDB. They also shared their base prompt for 'Bytecode AI'.
- Continued Pre-training vs. Full Finetuning: @.wooser and @teknium discussed the differences between continued pre-training and a full finetune of LLMs, essentially concluding that the difference lies in the size of the dataset. @.wooser humorously described the process as "changing the numbers by showing it new stuff".
- Training LLMs on Non-Roman Languages: The users also discussed training LLMs on non-Roman languages, with emphasis on Japanese. @.wooser mentioned that a minimum of 40B tokens and a full finetune are generally required, and that a new tokenizer might also be needed. They also shared a link to Hugging Face's NLP course, which provides a guide on how to train a tokenizer.
- Application of LLMs to Fiction Text Data: @.wooser discussed their aim of using a large corpus of Japanese fiction to train an LLM to resemble NovelAI or older versions of AI Dungeon. They presented their current approach (repeating the instruction 'Finish the following section of text' in the prompt), asked whether this could negatively affect the model's behavior, and sought better approaches to achieve their aim.
- Perceptions and Fear of AI: @teknium and @.wooser touched on the widespread fear of and misunderstandings about AI, both recounting experiences of people scared by risks to jobs and issues like deepfakes.
- Other LLM Models: @f3l1p3_lv asked whether LLMs could be built on architectures other than the Transformer.
Relevant Links: - Hugging Face's NLP course on how to train a tokenizer.
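The tokenizer-training step @.wooser mentions is what the linked NLP course covers; a minimal offline sketch with the `tokenizers` library might look like the following. The two-sentence corpus and the vocab size are stand-ins for illustration only:

```python
# Sketch of training a new BPE tokenizer on a custom corpus, in the
# spirit of the Hugging Face NLP course chapter on tokenizer training.
# A real run would stream tens of billions of tokens, not two sentences.
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

corpus = ["吾輩は猫である。", "名前はまだ無い。"]  # stand-in Japanese text

tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel()
trainer = trainers.BpeTrainer(vocab_size=32000, special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer=trainer)

enc = tokenizer.encode("吾輩は猫である。")
print(enc.tokens)
```

The course also shows `AutoTokenizer.train_new_from_iterator`, which reuses an existing tokenizer's algorithm and normalization while relearning its vocabulary; either route yields a tokenizer that segments the new language far more efficiently than a Roman-script-trained one.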
Nous Research AI Channel: memes (7 messages🔥):
AI and Memes Discussion:
- @.wooser reacted to a link shared by @_automagic from Hacker News titled "Artificial Intelligence can't deal with Chaos", quipping that chaos is a perfect field for AI.
Guild: Alignment Lab AI
Alignment Lab AI Guild Summary
- In-depth discussion on Secure Learning Technologies such as Federated Learning, Homomorphic Encryption, and Trusted Execution Environments (TEEs), shedding light on their limitations and potential vulnerabilities, with @rusch sharing insightful observations on each.
- Quote: "@rusch: FL and homomorphic encryption both have weird performance problems, so I haven’t really seen them exceed outside of niche use cases."
- Quote: "@rusch: Meanwhile, TEEs are like a 10% overhead for basically using the same code and approach. Granted more vulnerable to side channels, so don’t use that with Mossad or the NSA as your counterparty."
- Commencement of a conversation regarding the development of non-English large language models (LLMs), initiated by @nanobitz.
- Deep dive into evaluation challenges and performance improvements for chatbot models like OpenChat and Codellama: examination of overfitting in fine-tunes, exploration of hallucinations in GPT4, and efforts to filter common phrases out of training data using a shared GitHub repository.
- Quotes:
- "@imonenext: That might be why fine-tunes seem overfitted"
- "@imonenext: BTW it's hallucinations 😅 <@748528982034612226> Your GPT4 is hallucinating a lot"
- "@alpindale: Created this repository to gather all the common phrases generated by GPT and Claude models, so we can easily filter them out from training datasets."
- Link: @alpindale's github repository
- Ongoing dialogue on the categorization of different models, with the acknowledgment of Codellama as a base model, presenting potential implications for its application and optimization in the future.
- Quotes:
- "@teknium: Codellama considered base?"
- "@imonenext: Yea"
Alignment Lab AI Channel Summaries
Alignment Lab AI Channel: ai-and-ml-discussion (3 messages):
Discussion on Secure Learning Technologies:
- Federated Learning (FL) and Homomorphic Encryption: @rusch stated, "FL and homomorphic encryption both have weird performance problems, so I haven’t really seen them exceed outside of niche use cases."
- Secure Enclaves and Trusted Execution Environment (TEE): @rusch showed favor toward secure enclaves like H100's TEE support for private learning due to their efficiency, "Meanwhile, TEEs are like a 10% overhead for basically using the same code and approach." However, they also warned about potential vulnerability to side-channel attacks, "Granted more vulnerable to side channels, so don’t use that with Mossad or the NSA as your counterparty."
Alignment Lab AI Channel: looking-for-collabs (3 messages):
Non-English LLM Discussions:
- @nanobitz initiated a discussion regarding non-English large language models (LLMs). No further details, quotes, or links on this topic were provided.
Alignment Lab AI Channel: oo (3 messages):
Chatbot Model Evaluation and Development:
- OpenChat 3.5 Hungarian Exam: @imonenext asked if anyone was interested in hand-grading the openchat 3.5 Hungarian exam, which was generated by a few-shot template and attained a 100% score with GPT4 grading.
- Quote: "@imonenext: Anyone interested in hand-grading the openchat 3.5 hungarian exam?"
- Quote: "@imonenext: <@748528982034612226> generated it by few-shot template"
- Quote: "@imonenext: and got 100% score with GPT4 grading"
- Fine-tunes Overfitting: @imonenext mentioned that fine-tunes might seem overfitted because they were evaluated 0-shot while base models were evaluated 5-shot in the original repo.
- Quote: "@imonenext: That might be why fine-tunes seem overfitted"
- Hallucinations in GPT4: @imonenext pointed out that "GPT4 is hallucinating a lot, even worse than chat mode".
- Quote: "@imonenext: BTW it's hallucinations 😅 <@748528982034612226> Your GPT4 is hallucinating a lot"
- Quote: "@imonenext: Worse than chat mode"
- GPT and Claude Models Phrases Repository: @alpindale shared a GitHub repository created to gather common phrases generated by GPT and Claude models so they can be filtered out of training datasets.
- Quote: "@alpindale: Created this repository to gather all the common phrases generated by GPT and Claude models, so we can easily filter them out from training datasets."
- Link: @alpindale's github repository
- Codellama Considered as Base?: @teknium questioned whether Codellama is considered a base model, with @imonenext confirming.
- Quote: "@teknium: Codellama considered base?"
- Quote: "@imonenext: Yea"
Guild: Skunkworks AI
Skunkworks AI Guild Summary
- Discussion on GPT4V's multiple image support in the bakklava-1 channel:
- Query on how GPT4V supports multiple images, with speculation that the model might compare projections like vectors as suggested by @occupying_mars.
- Idea proposed by @far_el that GPT4V likely gathers embeddings from each image which are then used by the model.
- Further questions raised by @occupying_mars on if and how transformers, like GPT4V, can understand state changes between two images.
- Link to a YouTube video shared by @pradeep1148 in the off-topic channel, with no specific context or discussion supplied.
Skunkworks AI Channel Summaries
Skunkworks AI Channel: general (3 messages):
Unfortunately, there is no substantial content in the provided messages for a meaningful summary. The user @oleegg simply greeted the Skunkworks community with "gm" (short for good morning).
Skunkworks AI Channel: off-topic (3 messages):
Video Link Shared by Pradeep1148:
- @pradeep1148 shared a link to this YouTube video. No specific context or discussion was provided around this post.
Skunkworks AI Channel: bakklava-1 (3 messages):
Discussion on GPT4V's Multiple Image Support
- How GPT4V Supports Multiple Images: @occupying_mars questioned how GPT4V supports multiple images and speculated that the model might compare the projections like vectors.
- The Role of Embeddings: @far_el suggested that GPT4V likely captures embeddings from each image and passes them to the model.
- Understanding State Change between Two Images: @occupying_mars pointed out that this approach doesn't explain the coherent logic that GPT4V seems to apply between two images and speculated further on the capabilities of transformers in understanding state changes.
Guild: [MLOps @Chipro](https://discord.com/channels/814557108065534033)
MLOps @Chipro Guild Summary
Only 1 channel had activity, so no need to summarize...
MLOps @Chipro Channel Summaries
MLOps @Chipro Channel: [events](https://discord.com/channels/814557108065534033/869270934773727272) (1 message):
TechBio Mixer Events by Valence Labs:
- @jonnyhsu announced two TechBio mixer events hosted by Valence Labs on November 22nd at Oxford and November 23rd at Cambridge. The Oxford event is co-hosted with Michael Bronstein.
- Links: Oxford Event: https://lu.ma/7yzlzoi8 | Cambridge Event: https://lu.ma/4k9s0rsa
Crypto Fintech and Real-Time Data Infrastructure:
- @jovana0450 shared an online event focusing on real-time data infrastructure in crypto fintech applications, set for Nov 16, 2023, featuring speakers from Goldsky, Alchemy, Superchain Network, and RisingWave Labs.
- Link to register: https://www.meetup.com/streaming-stories/events/297191788/
The Ontocord (MDEL discord) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The AI Engineer Foundation Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The Perplexity AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
Guild: YAIG (a16z Infra)
YAIG (a16z Infra) Guild Summary
- In-depth discussion on the concept of a Data Intelligence Platform, based on a shared link to a Databricks blog post by @stevekamman.
- Conversation surrounding the recent Discord Outage centered on a blog post detailing the event, shared by @zorkian.
YAIG (a16z Infra) Channel Summaries
YAIG (a16z Infra) Channel: ai-ml (2 messages):
Databricks Data Intelligence Platform Discussion:
- Data Intelligence Platform Overview: @stevekamman shared a link to a Databricks blog post that gives an in-depth explanation of what a data intelligence platform is.
- Links:
- Databricks Blog Post
YAIG (a16z Infra) Channel: tech-discussion (2 messages):
Discord Outage Report:
- Blog Post on Discord's Outage: @zorkian shared a blog post detailing Discord's nearly hour-long outage.