[AINews] AI Discords Newsletter 11/29/2023
This is AI News! an MVP of a service that goes thru all AI discords/Twitters/reddits and summarizes what people are talking about, so that you can keep up without the fatigue. Signing up here opts you in to the real thing when we launch it 🔜
- Latent Space Discord Summary
- OpenAI Discord Summary
- LangChain AI Discord Summary
- Nous Research AI Discord Summary
- Alignment Lab AI Discord Summary
- Skunkworks AI Discord Summary
- LLM Perf Enthusiasts AI Discord Summary
- ▷ Channel: general (6 messages):
- ▷ Channel: gpt4 (8 messages):
- ▷ Channel: finetuning (3 messages):
- ▷ Channel: opensource (4 messages):
- ▷ Channel: collaboration (1 messages):
- ▷ Channel: speed (1 messages):
- ▷ Channel: feedback-meta (1 messages):
- ▷ Channel: openai (2 messages):
- ▷ Channel: prompting (8 messages):
- MLOps @Chipro Discord Summary
- Perplexity AI Discord Summary
Latent Space Discord Summary
- The development and discussion of an AI search engine tool that uses pgvector and tree-sitter, shared by @estebanvargas32 for feedback: link to his project.
- Conversation around using GPT for better prompt engineering, highlighted by @jozexotic's interest in a Discord summarizer and @slono's sharing of useful GPT agents, plus @guardiang's creation of a GPT specifically for handling prompts, with a demo link shared for user testing.
- Announcement and commencement of a discussion on RAG context management by @swyxio and @315351812821745669: discussion link.
- Release of a new podcast episode announced by @swyxio, available on Twitter and HN: podcast link.
- @swyxio shared a selection of essential papers recommended by Jason Wei for a thorough foundational understanding of several AI topics.
- A brief note from @swyxio about the Chain of Note paper club notes, though no specific details were provided.
- Questions raised by @yikesawjeez about the context reversal paper and by @zf0 about the list of already covered papers, with both inquiries left unanswered.
Latent Space Channel Summaries
▷ Channel: ai-general-chat (9 messages):
- AI Search Engine Discussion: @eugeneyan, @willow_ghost, and @estebanvargas32 discussed a search engine tool built by @estebanvargas32 using pgvector and tree-sitter. He shared the link to his project for feedback.
- Useful GPT for Better Prompt Engineering: @jozexotic asked about useful GPT tools for better prompt engineering and mentioned his interest in a Discord summarizer.
- Sharing Useful GPT Agents: @slono shared links to the GPT agents he found useful, including an agent that can look things up in its knowledge base on explicit request, though he criticized it for being slow.
- Prompt Handling GPT: @guardiang put together a GPT for handling prompts and shared the link for users to try it out.
▷ Channel: ai-event-announcements (2 messages):
- RAG Context Management: @swyxio announced a discussion starting in 15 minutes, featuring @315351812821745669 on the subject of RAG context management. The meeting could be joined via this link.
- New Podcast Episode: @swyxio shared the news of a new podcast episode live on Twitter and HN. The podcast could be accessed here.
▷ Channel: llm-paper-club (4 messages):
- Essential Papers Selection: @swyxio shared a selection of essential papers recommended by Jason Wei: the GPT-3 paper, chain-of-thought prompting, scaling laws, emergent abilities, and "Language models can follow both flipped labels and semantically-unrelated labels".
- Paper Club Notes: @swyxio mentioned the Chain of Note paper club notes; however, no specific link or details were provided.
- Contextual Reversal Paper Inquiry: @yikesawjeez asked about the title of the context reversal paper, but no response was provided.
- List of Covered Papers: @zf0 asked whether there is a list of already covered papers, implying the existing spreadsheet might be outdated. No response was provided.
OpenAI Discord Summary
- Discrepancies in the performance of ChatGPT/GPT-4 prompted a variety of user reactions. Some users were concerned about perceived declines in ChatGPT's performance, while others defended its capabilities. There were specific discussions about employing GPT for tasks such as converting video interviews to written form or rewriting text in particular tones.
- "I've been trying for several hours now to get my bot to handle coding tasks but it keeps timing out." – @hankliss (ai-discussions)
- "GPT-4 Turbo is an enhanced version of the original GPT-4, which handles large data volumes better." – @lumirix (openai-chatter)
- Discussions arose about the differences between GPT-4 and GPT-4 Turbo, with a few users citing low-quality output from GPT-4 Turbo. Users also discussed GPT Plus subscriptions and their usage limits, alongside the potential for academic misconduct when GPT is used in academic settings.
- "I feel like GPT-3.5 was better than GPT-4 in solving coding issues." – @loschess (ai-discussions)
- "I've had slower response times on all three browsers I've tried…would love some insight." – @sl0vius (openai-chatter)
- Reported issues with the ChatGPT platform covered file processing, saving errors, and problems with VPNs affecting ChatGPT usage. There were also queries about integrating GPT with platforms like Instagram and WhatsApp, and about uploading documents to ChatGPT.
- "I'm trying to save my new GPTs but keep getting an error." – @jacquelynyakira (openai-questions)
- "Why can't I use ChatGPT on my desktop browsers, but I can on mobile?" – @apelambo (openai-questions)
- Users shared their experiences customizing ChatGPT, and some offered their custom GPTs for feedback and review. They also discussed the limitations of language models, especially when handling complex symbols.
- "My GPT doesn't seem to refer to the source documents I provide and instead uses its default knowledge." – @pietman (gpt-4-discussions)
- "Understanding how GPT-3 interprets symbols like {}, [], and () could solve a lot of issues." – @mysticmarks1 (prompt-engineering)
- Discussions revolved around the use of APIs and prompts with GPT, including the potential of creating a GPT for prompt generation, dealing with prompt issues, and how to identify the GPT-4 model version.
- "I was wondering if I could use my GPT in my own personal projects through an API." – @ionknowu (gpt-4-discussions)
- "Has anyone figured out how we can identify and sort GPT model versions?" – @hoodiewoody (prompt-engineering)
OpenAI Channel Summaries
▷ Channel: ai-discussions (109 messages):
- Perceived Degradation of ChatGPT's Performance: Several users, including @jaade77, @johnpringle, @loschess, @simple_chad, and @hankliss, expressed concern over a perceived decrease in ChatGPT's performance and capabilities. Some cited difficulties with the bot's handling of coding tasks, timeouts after a certain number of messages, and issues with its tone-adjustment capabilities.
- Defending ChatGPT's Performance: Users like @n8programs and @themandalorian defended ChatGPT, stating that its performance has not deteriorated; rather, people are discovering more of its limitations or perhaps not using the tool correctly.
- Comparison Between GPT-4 and GPT-3.5: @loschess shared an experience where GPT-3.5 reportedly performed better than GPT-4 in solving a coding issue.
- Employment of GPT for Specific Tasks: Various discussions about employing GPT for tasks such as converting a video interview to written form (@jaade77), rewriting for tone (@jaade77), and writing code to combine text files (@themandalorian).
- Queries for Assistance and Interest in AI: Several users, such as @martinr_33972, @healer9071, and @ankur1900, posted links to bots they've created, sought assistance with the huggingface.co platform, or called for collaboration on an efficient solution to a coding challenge.
▷ Channel: openai-chatter (211 messages🔥):
- GPT-4 vs. GPT-4 Turbo: Users discussed the difference between GPT-4 and GPT-4 Turbo. @lumirix explained that GPT-4 Turbo is the enhanced version of GPT-4, while @captainsonic and @n8programs affirmed Turbo's improved performance and cost-effectiveness. There was a debate about whether the latest GPT-4 Turbo version was a downgrade, with users like @emilybear, @nexor7, and @g0lo noticing poor output. @.pythagoras pointed out that the Turbo version trades off detailed comprehension of extensive prompts in favor of managing large volumes of information due to hardware constraints.
- GPT Plus Subscriptions and User Accounts: @hahalol and @iamsivart checked on the status of GPT Plus subscriptions, with others confirming the current waitlist. @xannaeh raised a concern about using a company credit card to pay for multiple employee accounts, to which @elektronisade responded by sharing an OpenAI help article about declined credit card issues. It was further mentioned that company-wide use of the same account is not feasible, nor is self-hosting of the ChatGPT system.
- Usage Limits and Loading Speeds: Users complained about slower loading speeds and the system's performance during peak hours. @lk_jinxed found GPT-3.5 to be laggy, while @sl0vius experienced slow response times on three different web browsers. @7877 responded that this delay is normal during peak load periods and recommended waiting for smoother performance.
- Platform's Future and GPT Store: Users are waiting for updates on the GPT store, with @dydzio specifically asking for new information. Although no new information is available yet, @eskcanta and @solbus speculated on how the store's launch might affect current usage and the management of the Plus subscription waitlist.
- AI-driven Writing and Use in Academia: @lefoudurex sought help with a university essay but was cautioned by @dabuscusman about the academic-misconduct risk of using AI writing assistance in an academic setting. @offline recommended a bit-by-bit approach when using AI to rephrase large documents.
▷ Channel: openai-questions (111 messages):
- ChatGPT Issues: Many users reported various issues with the ChatGPT platform. For instance, @mehmettrkmen experienced an issue with analysing files, @enryurii encountered issues with image processing, and @jacquelynyakira couldn't save her custom GPTs. @solbus and @satanhashtag tried to assist with solutions and workarounds for these issues.
- Integrating GPT with Other Platforms: @nurik7086 inquired about how to integrate GPT with Instagram and WhatsApp. No solutions were offered in this message rundown.
- GPT Discussions: Various discussions on the workings of GPTs took place. @ankur1900 sought help matching user-input sentences against a dataset using the Assistant API. Meanwhile, @quacky.games explored the possibility of creating chats with multiple roles, and @wilhaim questioned GPTs' benefits.
- VPN and Browser Issues: @apelambo reported issues with using GPT on desktop browsers but not on mobile, which resulted in a discussion of VPNs and their potential impact on ChatGPT use. It turned out that even an inactive VPN (@apelambo uses NordVPN) could cause issues.
- Uploading Documents to ChatGPT: @dracount, @samuleshuges, and @bazzingabalcony discussed the possibility and process of uploading documents to ChatGPT for rewriting or input analysis.
▷ Channel: gpt-4-discussions (86 messages):
- ChatGPT Customizations: Many users shared their experiences customizing ChatGPT, including the use of symbolic placeholders (@docwobble), APIs and Google Docs integration (@loschess & @woodenrobot), and instructing the model to provide summaries of varying lengths (@duckbow).
- GPT-4 Issues: Several users expressed frustrations with GPT-4. Discussion ranged from inconsistent behavior, such as not referring to provided sources over its default knowledge (@pietman), to network errors and unresponsive states across different versions (@helloaang, @XXMINECRAFTGODXX, @strangeknoll, @pietman, @mirel).
- Sharing Custom GPTs: Users shared their custom GPTs for feedback and review. @martinr_33972 invited potential investors and software developers to check out his GPTs, @drkuberansrinivasan_02977 was trying to find how to view comments on his shared GPTs, and @ajlbs unveiled GPTs based on spirituality and the wisdom of the Bhagavad Gita.
- Seeking Help: Various users asked for advice or troubleshooting help, including @jrvra looking to make a GPT for reviewing journal papers, @_vincent32 wanting a professional email-marketing GPT, @mom_h8_my_guns__ needing help fine-tuning a personal GPT, and @wilhaime curious about the benefits of GPTs.
- API and Plug-ins: @ionknowu asked if they could use their custom GPT for a personal project through an API, but @elektronisade clarified that OpenAI ChatGPT is isolated from API access and the Assistant API must be used instead. @Lo Mein was interested in integrating plugin functions into custom GPTs, and @woodenrobot provided a link to the documentation and confirmed there are useful posts on the forums for further assistance.
▷ Channel: prompt-engineering (22 messages):
- Understanding Complex Symbols: @mysticmarks1 highlighted the utility of understanding how GPT-3 interprets symbols such as {}, [], and () in different contexts like programming, mathematics, and textual roleplay. This understanding can facilitate better mixed-code/natural-language system formats.
- Stable-Diffusion Prompt Crafter: @korner83 built a GPT for prompt generation named "Stable-Diffusion Prompt Crafter". The tool also supports weighted prompts and wildcards. The Reddit thread can be found here.
- Discussion on Sharing Links: @madame_architect asked if sharing external links, such as to arXiv or Reddit, is now acceptable. @eskcanta responded that while it can still be considered risky, linking to key articles and discussing the main points can invite more interaction.
- Challenges with Interpretation of Transcripts: @greenysmac is facing issues with prompts run against a pair of transcripts with three speakers each, which they've fed to GPT-3 and ChatGPT; the AI is inconsistent about the number of speakers it detects. @solbus suggested this could be due to LLMs' limited ability to perform simple counting tasks without external tools.
- Identifying GPT-4 Model Version: @hoodiewoody inquired about how to identify the version of their GPT-4 model. @fran9000 suggested using the search box in a specific channel to filter the list of GPTs.
▷ Channel: api-discussions (22 messages):
- Understanding Symbols in Different Contexts: @mysticmarks1 shared a detailed explanation of how GPT-3 interprets the symbols [], {}, and () in various contexts – programming, mathematics, and text annotation or roleplay. Channel members found this extremely useful.
- Creating a GPT for Prompt Generation: @korner83 made a GPT for prompt generation that supports weighted prompts and wildcards. He shared the link to the GPT and invited others to give it a try.
- Issues with Comments Visibility on Shared GPTs: @drkuberansrinivasan_02977 expressed concern about not being able to see comments on the GPTs he shared publicly.
- Sharing arXiv Links: @madame_architect asked if it's allowed to share arXiv links on the channel. @eskcanta replied that although it's still uncertain, significant segments from such articles can be shared for discussion, and stressed the importance of seeking review and help from the moderators if hit hard by AutoMod.
- Prompt Testing Issues: @greenysmac shared issues experienced when testing prompts using ChatGPT in the Playground on current GPT-4, Poe on GPT-4, and a GPT build, noting inconsistencies in the model recognizing and recalling the number of speakers in a given transcript. @solbus speculated that this could be due to language models' inability to perform mathematical tasks, like counting, reliably.
LangChain AI Discord Summary
- Discussion centered around Chatbot Troubleshooting, including code and environment variables for a vector database tool and agent execution. User a404.eth had issues beyond basic greetings with their chatbot, sharing Python code and environment configuration for peer advice.
- Query on integrating OpenAI GPT-4 with RAG was raised by jupiter.io, specifically on modifying the weighting affecting RAG's data selection. haste171 recommended modifying search_k in the vector retriever's query kwargs as a possible solution.
- wolfwood1862 encountered missing-import issues, unable to import ApifyWrapper from langchain. To resolve this, haste171 suggested referring to the API reference or updating the LangChain package.
- Inquiry on AI Chatbot Data Handling for multiple input types (text, image, etc.) was raised by sid.pocketmail, including interest in a GPT-4-vision API for a chatbot and handling image retention for related user questions.
- A series of miscellaneous questions were raised, including running "OpenAIAssistantRunnable" as an agent (katerinaptrv), acquiring the client_id for Confluence (._.nobody._._), and resolving 503 responses with Azure OpenAI (_pabloe).
- Miscellaneous notes on AI and project updates from claragrey, jupiter.io, and fran.abenza.
- @veryboldbagel discussed LangChain Experimental features, including agents for handling pandas, CSV, and SQL data. @steve675 expressed a data-storage concern, as their data resides in a different database, via the Kusto client.
- Tips on input-keys setup for better functioning by @liminalstvte, and subsequent technical issues related to qaTemplate and the use of prompt inside qaChainOptions by @menny9762.
- A shared project named Seesaurus, a 3D visualization tool for English words, showcased by @recurshawn. Seesaurus allows users to visualize word relations in 3D space: Seesaurus
- A quote from Sam Altman on the perspective of technological growth shared by @rpall_67097: "When you are standing on the exponential curve of technology, it looks flat behind you and vertical in front of you, but it's just a curve."
LangChain AI Channel Summaries
▷ Channel: general (53 messages):
- Chatbot Troubleshooting: a404.eth shared an issue with their chatbot: it worked fine with simple greetings but ran into problems when trying to use a specific "tool". They shared a large block of Python code, along with its relevant imports, defining functions for creating a vector database tool, parsing source documents, and executing the agent, plus additional information about the environment variables used to configure the chatbot.
- OpenAI GPT-4 Integration with RAG: jupiter.io asked about the possibility of controlling a weighting that affects how much RAG draws from its original data compared to the custom data. haste171 suggested that search_k in the vector retriever's query kwargs can be modified, but was not completely sure. They further clarified how setting a different temperature value impacts the generativity of the AI.
- Missing Import Issues: wolfwood1862 encountered errors when trying to import certain modules (specifically, ApifyWrapper) from langchain. haste171 advised referring to the API reference or updating the LangChain package to resolve these issues.
- AI Chatbot Data Handling: sid.pocketmail inquired about an open-source project similar to ChatGPT that handles multiple inputs like text, image, etc., with agents in production. In a similar thread, he expressed interest in implementing the GPT-4-vision API in his chatbot and asked how to retain an image in memory so that users could pose related questions.
- Miscellaneous Questions: Several other specific queries were made, including katerinaptrv's question about running "OpenAIAssistantRunnable" as an agent using OpenAI's "code_interpreter" tool, ._.nobody._._'s question about getting the client_id for Confluence, and _pabloe's concern about encountering 503 responses while using Azure OpenAI.
- Miscellaneous Notes: claragrey noted that they have deployed an endpoint using an LLM and LangChain's ConversationChain, jupiter.io was interested in AIs that write by themselves, and fran.abenza expressed frustration at being on a waiting list for months.
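The search_k advice above is store-specific (the exact kwarg name varies between vector stores), but the underlying idea is simply a retrieval-depth parameter: how many documents get pulled into the RAG context. Here is a minimal, self-contained sketch of that idea with a toy in-memory retriever; all class and variable names are illustrative, not LangChain API:

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class TinyRetriever:
    """Toy in-memory vector retriever; `k` plays the role of search_k."""
    def __init__(self, docs):
        # docs: list of (embedding, text) pairs
        self.docs = docs

    def retrieve(self, query_emb, k=2):
        # Rank all documents by similarity, keep only the top k.
        ranked = sorted(self.docs, key=lambda d: cosine(query_emb, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

docs = [
    ([1.0, 0.0], "doc about topic A"),
    ([0.9, 0.1], "another doc about topic A"),
    ([0.0, 1.0], "doc about topic B"),
]
retriever = TinyRetriever(docs)
print(retriever.retrieve([1.0, 0.0], k=1))  # only the closest doc
print(retriever.retrieve([1.0, 0.0], k=3))  # broader context, more docs
```

Raising k widens the context handed to the model (more of the custom data), at the cost of more tokens and more noise, which is the trade-off the weighting question was getting at.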
▷ Channel: langchain-templates (7 messages):
- LangChain Experimental Features: @veryboldbagel mentioned that LangChain Experimental offers agents for handling pandas, CSV, and SQL data.
- Data Storage Concern: @steve675 indicated that their data is stored in a different database, accessed via the Kusto client.
- Input Keys: @liminalstvte advised setting input keys in the prompt for better functioning.
- Technical Issues: @menny9762 reported a problem that was not resolved even after setting input keys. They later discovered that qaTemplate is deprecated in the JavaScript library and switched to passing prompt inside qaChainOptions, which resolved the issue.
▷ Channel: share-your-work (1 messages):
- Seesaurus - A 3D Visualization of English Words: @recurshawn shared a project they developed called Seesaurus, a 3D visualization of English words where users can see how words relate to each other in 3D space, add words, and form clusters. The project was made with the intention of seeing how language models perceive our words. It supports all devices, but a large screen is recommended.
- Seesaurus link: https://seesaurus.com/ - Hacker News launch link: https://news.ycombinator.com/shownew
▷ Channel: tutorials (1 messages):
- Perspective on Technological Growth: @rpall_67097 shared a quote from Sam Altman about the exponential curve of technology: "When you are standing on the exponential curve of technology, it looks flat behind you and vertical in front of you, but it's just a curve."
Nous Research AI Discord Summary
- Extensive discussion on the projected size and training process of GPT-4. Users guessed that OpenAI likely had 5T or 6T unique tokens, with best results achieved with 2 epochs for text and 4 epochs for code-based text.
- Hypotheses regarding the structure of GPT-3.5 Turbo and GPT-4 Turbo, suggesting they may be smaller than expected and might use sparser attention mechanisms. Figures below 80 billion and even 50 billion parameters were discussed for Turbo.
- Comparative test results shared for evaluating the performance limitations of Claude and GPT-4 Turbo, with a link to Greg Kamradt's Twitter with the test results.
- The differences between the data precision types fp32 and bf16 were probed, with a link shared to a LinkedIn article for further understanding.
- Mention of the Chinese TigerBot model from TigerResearch available on Hugging Face; detailed information was unclear due to language constraints.
- Memory management and initial delays in loading models were explored. It was clarified that mmap doesn't necessarily increase loading speed but decreases initial delays before prompting.
- Limits and challenges associated with GPT model size, generation speed, and vRAM usage were discussed, particularly in relation to the HuggingFace implementation.
- Announcements and discussions about new large language models, especially Chinese LLMs (Yuan 2.0), its unusual sizes (2b, 51b, 102b), and license terms.
- Questions raised about Yuan 2.0's benchmarking methods, specifically the translating and rephrasing of questions, with suspicions of potential information leaks.
- Release of Perplexity.ai's online LLMs and their pricing details shared by @atgctg, with links to the official blog post and pricing page.
- Estimation of AI training costs, comparing hourly H100 rates across platforms like RunPod, AWS, and GCP, and considering SXM5 integration.
- Sharing of resources: the Oasis Corpus dataset, a GitHub repository for prompt lookup decoding, and interesting tweets on experimentation and knowledge distillation.
- Announcement of OpenHermes 2.5's inclusion in LMSys' ChatBot Arena for testing and comparison, with an invitation to test various models blind via chat.lmsys.org.
Nous Research AI Channel Summaries
▷ Channel: benchmarks-log (1 messages):
Since only a single partial message from user @qnguyen3 is provided with no substantive content, there isn't enough context to provide a summary. No relevant topics, user interactions, or discussions were identified in the message history.
▷ Channel: interesting-links (7 messages):
- Cost of AI Training: @nonameusr expressed surprise at the $2 hourly rate for the H100. @coffeebean6887 explained that this is likely a starting rate that increases for shorter reservations, but considered it competitive, especially given the SXM5 integration, comparing it to higher rates found on other platforms like RunPod, AWS, and GCP.
- Tweet on Experimentation: @metaldragon01 shared a tweet from @SebastienBubeck on experimentation.
- NLP Dataset: @euclaise shared a link to the Oasis Corpus dataset on Hugging Face.
- Tweet on Knowledge Distillation: @atgctg shared a tweet from @eugeneyan discussing knowledge distillation.
- Prompt Lookup Decoding: @Fynn shared a link to a repository that implements speculative decoding by string-matching parts of the prompt, which could be effective for input-grounded tasks.
- LLM Synthetic Data Blog: @atgctg shared a blog post discussing large language models (LLMs) and synthetic data generation.
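The prompt-lookup trick mentioned above rests on a simple observation: in input-grounded tasks (summarization, Q&A over a document), the model's output often re-uses spans of the prompt verbatim, so the tokens that followed a matching span in the prompt make cheap draft candidates for speculative decoding. A minimal sketch of the matching step, assuming word-level "tokens" (function and parameter names are mine; a real implementation verifies the draft with the full model in one forward pass):

```python
def prompt_lookup_draft(prompt_tokens, generated_tokens, ngram=3, max_draft=5):
    """Propose draft tokens by matching the tail of the generated text
    against the prompt (the core trick of prompt lookup decoding).

    Returns a list of candidate continuation tokens, or [] if no match.
    """
    if len(generated_tokens) < ngram:
        return []
    tail = generated_tokens[-ngram:]
    # Scan the prompt for the most recent occurrence of the tail n-gram.
    for start in range(len(prompt_tokens) - ngram, -1, -1):
        if prompt_tokens[start:start + ngram] == tail:
            cont = start + ngram
            return prompt_tokens[cont:cont + max_draft]
    return []

# Toy example with words standing in for tokens: the recent output
# re-uses a span of the prompt, so the prompt supplies the draft.
prompt = "the quick brown fox jumps over the lazy dog".split()
generated = "we saw the quick brown".split()
print(prompt_lookup_draft(prompt, generated))  # ['fox', 'jumps', 'over', 'the', 'lazy']
```

The drafted tokens are then checked against the target model's own predictions; accepted prefixes are kept, so correctness is unchanged and only speed improves when matches are common.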
▷ Channel: announcements (1 messages):
- OpenHermes 2.5 Testing on ChatBot Arena: @teknium announced that OpenHermes 2.5 has been included in LMSys' ChatBot Arena for testing and comparison. Users are invited to "[g]o and test out several models and compare them blind to determine who is the best!" The arena can be visited at https://chat.lmsys.org/.
▷ Channel: general (165 messages🔥):
- GPT-4 Training Observations: @ldj and @giftedgummybee had an insightful discussion about the training process and size of GPT-4. They deduced that OpenAI likely had 5T or 6T unique tokens, achieving best results with 2 epochs for normal text and 4 epochs for code text. The exact parameters of recent models weren't clear, leaving only estimations and assumptions about size and training tactics.
- Exploring Turbo Models: @ldj and @giftedgummybee hypothesized that GPT-3.5 Turbo could be under 80B or even below 50B parameters. They considered Turbo a toy product offering cheap hosting and a tool to counter open-source projects, and speculated that the attention mechanism might be sparser in Turbo models.
- Evaluation of Claude & GPT-4 Turbo: @coffeebean6887 posted test results from Greg Kamradt comparing the limitations of Claude and GPT-4 Turbo, demonstrating differing performance and limitations regarding token count and recall behavior: Test Results, Claude vs GPT-4.
- Differences Between Data Precision Types: @papr_airplane asked about the differences between fp32 and bf16 in fine-tuning models. @yorth_night provided a helpful LinkedIn article explaining these differences.
- TigerBot Discussion: Users mentioned the TigerBot model from TigerResearch on Hugging Face; however, the responses varied and detailed information wasn't easily accessible due to language constraints.
▷ Channel: ask-about-llms (77 messages):
- Discussion on Memory Management and Loading Models: @variav3030, @giftedgummybee, @coffeebean6887, @teknium, and @russselm discussed memory management and model loading. In particular, @russselm clarified that mmap doesn't necessarily speed up model loading as a whole; rather, it loads data on demand, which decreases the initial delay before the model can be prompted.
- Concerns about GPT Model Size: @coffeebean6887, @teknium, and @russselm discussed theoretical limits and challenges related to GPT model sizes and generation speed, particularly in the context of the HuggingFace implementation and vRAM usage.
- Release of Chinese LLMs (Yuan 2.0): @.benxh, @yorth_night, @coffeebean6887, and @teknium discussed the recent release of Chinese-language large language models, specifically Yuan 2.0. The model's unusual sizes (2b, 51b, and 102b) and unique license terms were noted as interesting aspects.
- Discussion on Yuan 2.0's Benchmarks: @coffeebean6887 and @yorth_night questioned some methods used in Yuan 2.0's benchmarking, specifically the translating and rephrasing of questions. It was suggested that the translation process might leak information and could potentially invalidate the test.
- Release and Pricing of Perplexity.ai Offering: @atgctg shared links to Perplexity.ai's latest blog post introducing their online LLMs and their pricing, but realized it might have been shared in the wrong channel.
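The mmap point above can be made concrete: mapping a file returns almost immediately because pages are read from disk only when first touched, whereas an eager read pays the full I/O cost up front. A small self-contained illustration using Python's standard mmap module and a 1 MiB stand-in file (not an actual model):

```python
import mmap
import os
import tempfile

# Write a stand-in "model file" (here just 1 MiB of zeros).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"\x00" * (1 << 20))

# Eager load: the whole file is read into memory up front.
with open(path, "rb") as f:
    eager = f.read()          # pays the full I/O cost now

# mmap: the call returns immediately; pages are faulted in on access.
with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first_byte = mapped[0]    # this access is what actually reads a page
    mapped.close()

os.remove(path)
print(len(eager), first_byte)
```

This is why mmap shrinks the delay before the first prompt without changing total I/O: weights still get read, just lazily, spread across the first accesses instead of concentrated at load time.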
Alignment Lab AI Discord Summary
- Discussion in the ai-and-ml-discussion channel revolved around RAG paragraphs and hybrid search. @adi_kmt referenced the common standard for RAG, where paragraphs are chunked by max tokens with an appended subheading for additional context, and noted that semantic and keyword searches are often combined in a technique called hybrid search, which frequently involves re-ranking results with tools like BM25 or SPLADE. Resource links shared for further reading: Pinecone Hybrid Search, Production RAG.
- The general-chat channel saw diverse discussions: @magusartstudios reported progress on an undisclosed project; @entropi shared a DeepMind blog post detailing the use of deep learning to discover millions of new materials; @altryne announced a live stream featuring LDJ focused on using WandB, intended for ML beginners, with a link made available.
- In the oo channel, @lightningralf provided a link to the Voltage Park site, which features a 24,000-H100 cluster that could interest community members.
Alignment Lab AI Channel Summaries
▷ Channel: ai-and-ml-discussion (1 messages):
- RAG Paragraphs and Hybrid Search: @adi_kmt discussed the common standard for RAG, noting that paragraphs are generally chunked by max tokens with a subheading/heading appended for additional context. They also mentioned the common method of combining semantic (embedding) search with keyword search in a technique called hybrid search, which often involves re-ranking results with tools like BM25 or SPLADE.
- Links shared for further reference: Pinecone Hybrid Search, Production RAG.
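The hybrid-search idea above boils down to blending a dense (embedding) score with a sparse (keyword) score and ranking by the fusion. A toy sketch under loose assumptions: cosine similarity for the dense side, and a crude term-overlap score standing in for a real sparse scorer like BM25 or SPLADE; all names and the alpha weighting are illustrative:

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    # Crude stand-in for BM25: fraction of query terms present in the text.
    terms = query.lower().split()
    hits = sum(1 for t in terms if t in text.lower().split())
    return hits / len(terms) if terms else 0.0

def hybrid_rank(query, query_emb, docs, alpha=0.5):
    """Blend dense and sparse scores: alpha weights the embedding side."""
    scored = []
    for emb, text in docs:
        score = alpha * cosine(query_emb, emb) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    return [t for _, t in sorted(scored, reverse=True)]

docs = [
    ([0.9, 0.1], "vector databases store embeddings"),
    ([0.1, 0.9], "bm25 ranks documents by keyword overlap"),
]
print(hybrid_rank("keyword overlap", [0.1, 0.9], docs, alpha=0.5))
```

In practice the two signals come from separate indexes (a vector store and an inverted index) and are fused or re-ranked afterward; the alpha knob trades off semantic recall against exact-term precision.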
▷ Channel: general-chat (3 messages):
- Work on a Project: @magusartstudios mentioned working on an unspecified project, sharing that they had made some progress.
- DeepMind Discovery: @entropi shared a link to a DeepMind blog post on the discovery of millions of new materials using deep learning.
- Live Stream on Machine Learning Tools: @altryne announced a live stream featuring LDJ on how to use WandB. The live stream, aimed at those new to machine learning, is scheduled to air on multiple platforms; a link was provided. Additionally, @altryne encouraged TTRPG Discord users to share the live stream in the Discord and expressed interest in learning from users with their own WandB setups.
▷ Channel: oo (1 messages):
- Voltage Park Cluster: @lightningralf shared a link to Voltage Park, which features a 24,000-H100 cluster that could be of interest to channel members.
Skunkworks AI Discord Summary
- @benxh initiated a discussion on investigating Yuan 2.0 in the guild.
- A technical question was raised by @papr_airplane on the difference between using fp32 and bf16 when fine-tuning.
- Project progress update in the ablateit-wandb-alerts channel: the Weights & Biases run named autumn-pyramid-122 of the project huggingface by user @ablateit has completed. Full run details are accessible on the wandb.ai website.
Skunkworks AI Channel Summaries
▷ Channel: general (2 messages):
- Yuan 2.0 Inquiry: User @benxh asked if anyone was investigating Yuan 2.0.
- Finetuning with fp32 vs bf16: User @papr_airplane queried about the difference between using fp32 and bf16 when finetuning.
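For readers wondering what is behind the fp32-vs-bf16 question: bf16 keeps fp32's 8 exponent bits (so the same dynamic range) but only 7 mantissa bits, halving memory at the cost of precision on small updates. The snippet below is a toy numeric illustration, not finetuning code; `to_bf16` simulates bfloat16 rounding in pure Python.

```python
# Toy illustration of the fp32 vs bf16 tradeoff: bf16 keeps fp32's
# 8 exponent bits (same dynamic range) but only 7 mantissa bits, so a
# small gradient update can round away entirely. to_bf16 simulates
# bfloat16 by rounding an fp32 value to its top 16 bits.
import struct

def to_bf16(x: float) -> float:
    """Round an fp32 value to bfloat16 (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

weight, update = 1.0, 1e-3  # a typical small optimizer step

fp32_result = weight + update                    # update survives
bf16_result = to_bf16(to_bf16(weight) + update)  # update rounds away

print(fp32_result)  # 1.001
print(bf16_result)  # 1.0 -- the step was lost to the 7-bit mantissa
```

This precision loss is why mixed-precision recipes commonly keep an fp32 master copy of the weights (or fp32 optimizer state) even when compute runs in bf16.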
▷ Channel: ablateit-wandb-alerts (1 messages):
- Weights & Biases Run Completion: The Weights & Biases run named autumn-pyramid-122 of the project huggingface by user @ablateit has completed. The run details can be found on the wandb.ai website.
LLM Perf Enthusiasts AI Discord Summary
- Discussions on the limitations and capabilities of Dedicated Instances were addressed, with @ampdot asking about the context limits and @res6969 clarifying the constraints as 8k/32k/128k.
- Introduction and feedback on Perplexity.AI's PPLX-Online-LLMs: @robhaisfield shared a blog post about the launch of models such as Mistral, codellama 34b, llama 70b, and search-powered models. @thebaghdaddy affirmed the tool's utility in topic exploration compared to Google search, and elaborated upon the process to @justahvee.
- Natural Language Systems in production were discussed: @res6969 expressed concerns about prompt injection, @nosa_ brought up experiences in PII redaction, and @pantsforbirds suggested a system of chaining two models for retrieval and redaction. @nosa_ revealed successful fine-tuning of models for PII redaction, while @pantsforbirds provided opinions on GPT-4's performance in PII removal.
- Questions were raised regarding fine-tuning the new tools and tool_calling features introduced by OpenAI, and a query about the maximum sample size for 3.5 fine-tuning was clarified by @robertchung referencing the github openai-cookbook.
- Recommendations for an open-source image-to-text model were requested by @jeffreyw128, with @__polarbear suggesting the use of native Mac OS OCR. Increasing the token limit on llama2 was discussed, with progressive summarization shared as a potential method by @joshcho_.
- Job opportunities with cash + equity compensation were mentioned by @blakeandersonw in the channel.
- An inquiry regarding a suitable LLM inference framework for CPU-only AWS instances was posed by @nosa_.
- Creation of a new channel in the performance category was reported by @jeffreyw128.
- Performance issues: @kev.o. reported slower performance and issues while using Azure.
- Strategies and challenges in AI prompting: @pantsforbirds shared a GitHub link for generating system prompts for Chat-GPT and asked about handling 'null' values in GPT-4's JSON responses. @joshcho_ sought advice on controlling response length. Research and methods on enhancing GPT-4 performance for medical benchmarks were also shared: Research Tweet.
LLM Perf Enthusiasts AI Channel Summaries
▷ Channel: general (6 messages):
- Context Limits with Dedicated Instances: User @ampdot inquired about the context limits with dedicated instances and received clarification from @res6969 that these limits stay the same: 8k/32k/128k.
- Introduction of PPLX-Online-LLMs by Perplexity.AI: @robhaisfield shared a link to the blog post about the introduction of PPLX-Online-LLMs by Perplexity.AI. Offering models such as Mistral, codellama 34b, llama 70b, and search-powered models, the API is reported to have a super-fast response time.
- Comparison to Google Search: @thebaghdaddy compared Perplexity's search tool favorably to Google search for topic exploration. According to him, Google now serves mostly for finding restaurant phone numbers.
- Use of Perplexity for Topic Exploration: When @justahvee asked for clarification, @thebaghdaddy further expounded on topic exploration with Perplexity on the free plan, explaining the process of asking increasingly specific questions and profiting from the referenced links provided by Perplexity to gain a deeper understanding of unfamiliar fields, like gene therapy.
▷ Channel: gpt4 (8 messages):
- Use of Natural Language Systems in Production: @res6969 expressed skepticism about using a Natural Language System in production due to susceptibility to prompt injection, while @nosa_ shared experiences of using local models for PII redaction at a privacy startup, with potential improvement through serious fine-tuning.
- Chaining Two Models for PII Redaction: @pantsforbirds suggested chaining two models, the first for retrieval and the second for redacting PII with a hard-coded prompt.
- Fine-Tuning Models for PII Redaction: @nosa_ clarified that they have successfully fine-tuned models for PII redaction, and proposed that a well-finetuned model could generalize on a notion of "private info" broader than static PII, despite current poor results.
- GPT-4 and PII Redaction: @pantsforbirds stated that GPT-4 does a decent job at removing PII, but achieving only 85-90% isn't satisfactory. The user speculated about the possibility of seeing chained model queries to navigate prompt-engineering attempts.
- GPT-4 Performance: @potrock shared a Twitter link about hopeful comments from OAI staff on lazy GPT-4. @.psychickoala noticed a slowdown in Azure OpenAI calls.
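The two-model chain discussed above can be sketched roughly as follows. Both stages are hypothetical stand-ins here: keyword matching replaces the retrieval model, and a pair of regexes replaces the fine-tuned redaction model with its hard-coded prompt; in practice each stage would be an LLM call.

```python
# Rough sketch of a two-stage chain: stage 1 retrieves a relevant
# passage, stage 2 redacts PII from it. Both model calls are stubbed
# with simple stand-ins; plug in your actual LLM client.
import re

REDACTION_PROMPT = (  # the hard-coded prompt a stage-2 model would receive
    "Remove all personally identifiable information from the text. "
    "Replace each item with the tag [REDACTED]."
)

def retrieve(query: str, corpus: list[str]) -> str:
    """Stage 1 stand-in: keyword overlap instead of an LLM/vector store."""
    terms = set(query.lower().split())
    return max(corpus, key=lambda doc: len(terms & set(doc.lower().split())))

def redact(text: str) -> str:
    """Stage 2 stand-in: regexes where a fine-tuned model would go.
    Only catches emails and phone-like numbers; a model generalizes further."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[REDACTED]", text)
    text = re.sub(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b", "[REDACTED]", text)
    return text

corpus = [
    "billing policy: invoices are due in 30 days",
    "contact alice at alice@example.com or 555-867-5309 about billing",
]

passage = retrieve("billing contact", corpus)
print(redact(passage))  # contact alice at [REDACTED] or [REDACTED] about billing
```

Separating retrieval from redaction means the second stage never sees the user's raw query, which is part of why chaining was floated as a defense against prompt-engineering attempts.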
▷ Channel: finetuning (3 messages):
- Fine-tuning for new tools and tool_calling: @robertchung asked if anyone has done fine-tuning for the new tools and tool_calling aspects of OpenAI yet.
- Long Context Samples in 3.5 Fine-tuning: @hassantsyed asked for clarification about the maximum sample size for 3.5 fine-tuning, referencing the OpenAI docs and noting a discrepancy between the stated 4k max sample size and a later statement supporting up to 16k context examples. He provided a link to the token-limits guide.
- Answer to Long Context Samples Confusion: @robertchung responded to @hassantsyed's question, indicating that he found the answer on the github openai-cookbook page.
▷ Channel: opensource (4 messages):
- Best Opensource Image to Text Model: @jeffreyw128 asked for recommendations for the best open-source image-to-text model because of issues with gpt4-vision. @__polarbear recommended using native Mac OS OCR.
- Increasing Token Limit on llama2: @dongdong enquired about increasing the token limit on llama2. A potential workaround suggested by @joshcho_ is to use progressive summarization.
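The progressive-summarization workaround can be sketched as a loop: chunk the text to fit the context window, summarize each chunk, join the summaries, and repeat until the whole thing fits. Here `summarize` is a hypothetical stand-in (crude truncation) for an actual LLM summarization call, and the 200-character "window" stands in for the model's token limit.

```python
# Sketch of progressive summarization for texts longer than the model's
# context window: chunk, summarize each chunk, join the summaries, and
# repeat until the result fits. summarize() is a hypothetical stand-in
# for a real LLM call.
def summarize(chunk_text: str) -> str:
    """Stand-in for an LLM call; a real one would compress meaning,
    not just truncate to the first ~60 characters."""
    if len(chunk_text) <= 60:
        return chunk_text
    return chunk_text[:60].rsplit(" ", 1)[0] + " ..."

def chunks(text: str, max_chars: int) -> list[str]:
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def progressive_summary(text: str, max_chars: int = 200) -> str:
    # Each pass shrinks the text, so the loop terminates.
    while len(text) > max_chars:
        text = " ".join(summarize(c) for c in chunks(text, max_chars))
    return text

long_text = "the quick brown fox jumps over the lazy dog " * 40
print(len(progressive_summary(long_text)))
```

The tradeoff is lossiness: every extra summarization pass discards detail, so this trades fidelity for fitting inside a fixed context window rather than actually extending it.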
▷ Channel: collaboration (1 messages):
- Job Opportunities: User @blakeandersonw mentioned opportunities for part/full-time work with cash + equity compensation.
▷ Channel: speed (1 messages):
- LLM Inference Framework Query: User @nosa_. asked for recommendations on a favorite LLM inference framework specifically for CPU-only AWS instances.
▷ Channel: feedback-meta (1 messages):
- New Channel Creation: User @jeffreyw128 reported the creation of a new channel with the ID #1179271229593624677 in the performance category.
▷ Channel: openai (2 messages):
- Slower Performance: @kev.o. reported that they are still experiencing slower performance than usual.
- Issue on Azure: @kev.o. also reported further difficulties while using Azure.
▷ Channel: prompting (8 messages):
- System Prompt Generation for Chat-GPT: @pantsforbirds shared a link to a GitHub resource that illustrates an approach to generating system prompts for Chat-GPT, available at LouisShark/chatgpt_system_prompt.
- Issue with null Values in GPT-4's JSON Responses: @pantsforbirds asked for advice on managing the issue of 'null' strings appearing in the JSON responses from GPT-4.
- Controlling Response Length in Conversational Agents: @joshcho_ sought advice on how to prevent characters from providing overly long responses when asked for information. @dongdong suggested incorporating a character or word count limit in the prompts.
- Prompting for Medical Benchmarks: @pantsforbirds discovered and shared research on advanced prompting techniques that improve the performance of GPT-4 on medical benchmarks. The result mentioned in the tweet is particularly significant: Research Tweet.
- Interpretation of Medical Task Method: @thebaghdaddy offered insight on the medical task method, stating it relies on text embedding. They appreciated the shared information and indicated plans to conduct tests based on it.
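One generic way to cope with the 'null'-string issue raised above is post-processing: parse the model's JSON, then recursively normalize any literal "null" strings to real None values before consuming the result. This is a general-purpose sketch, not an OpenAI-documented fix.

```python
# Normalize the literal string "null" (any case) to a real None in a
# parsed JSON structure, so downstream code sees proper null values
# instead of the string a model sometimes emits.
import json

def normalize_nulls(obj):
    """Recursively map the string 'null' to None in dicts/lists/scalars."""
    if isinstance(obj, dict):
        return {k: normalize_nulls(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [normalize_nulls(v) for v in obj]
    if isinstance(obj, str) and obj.strip().lower() == "null":
        return None
    return obj

raw = '{"name": "Ada", "email": "null", "tags": ["null", "dev"]}'
clean = normalize_nulls(json.loads(raw))
print(clean)  # {'name': 'Ada', 'email': None, 'tags': [None, 'dev']}
```

The obvious caveat: this also rewrites fields where "null" was a legitimate string value, so it should be applied only to schemas where that can't occur.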
MLOps @Chipro Discord Summary
- Shared community events on platforms: @jaskirat posted a link to a Luma event in the event channel.
- Discussion on anti-spam measures: User @gramaras referenced numerous anti-spam measures in the context of a conversation within the general ML channel.
- Knowledge sharing on SIMD Base64: @mattrixoperations shared an informative article on SIMD Base64 in the general ML channel.
- Employment opportunities: @wangx123, the CEO of a start-up, extended an invitation to anyone interested to join their company in the general ML channel.
MLOps @Chipro Channel Summaries
▷ Channel: events (1 messages):
- Event Link Posted: @jaskirat shared a link to an event on the platform Luma.
▷ Channel: general-ml (3 messages):
- Anti-spam Measures: User @gramaras mentioned the existence of numerous anti-spam measures in the context of a conversation.
- SIMD Base64: @mattrixoperations shared an informative article on SIMD Base64.
- Job Opportunity at a Start-Up: @wangx123, the CEO of a start-up, invited those interested to consider joining their company.
The Ontocord (MDEL discord) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
The AI Engineer Foundation Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
Perplexity AI Discord Summary
Only 1 channel had activity, so no need to summarize...
Perplexity AI Channel Summaries
▷ Channel: announcements (1 messages):
- Announcement of pplx-api coming out of beta and move to usage-based pricing: User @ok.alex informed that Perplexity's pplx-api is now out of beta and will be moving to usage-based pricing. This includes the live LLM APIs that are grounded with web search data and have no knowledge cutoff. More details at http://pplx.ai/online-llms.
- Introduction of the "online" models: The new models, pplx-7b-online and pplx-70b-online, have been trained in-house by building on top of Mistral and Llama 2, and fine-tuned for accuracy and helpfulness. @ok.alex stated these models are believed to surpass GPT-3.5 and Llama 2 in answering questions with search grounding.
- Provision of pplx-api: With pplx-api coming out of beta, users can now access pplx-online, pplx-chat, and open-source LLMs like Mistral and Llama 2 on Perplexity's in-house infrastructure, with usage-based pricing. Perplexity 👁 users are entitled to a $5 monthly credit. Inquiries should be directed to api@perplexity.ai.
- Announcement about AWS re:Invent keynote speech: Perplexity's team will be on the keynote stage at the Amazon Web Services (AWS) re:Invent to shed more light on this. The keynote can be viewed at https://go.aws/3GbYl4U.
The YAIG (a16z Infra) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.