SAIL: Understanding LLMs, Learning as Struggle
Welcome to Sensemaking, AI, and Learning (SAIL). I explore AI trends that impact higher education.
AI and Learning
Deep Dive into LLMs Like ChatGPT. If you want a rapid (well, 3½-hour) explanation for a general audience of how LLMs work, Andrej Karpathy's video is amazing. Every human being should watch this. It should be foundational knowledge. It distills the last two years of LLM advancement into a single video (yes, he touches on DeepSeek and reasoning models).
You need to struggle to learn: “By ‘struggle’, we mean the effort students put into understanding a concept and working through challenges to uncover solutions that are not immediately obvious. While we may not enjoy the struggles of learning, struggle teaches persistence and deepens understanding. Our worry is that with AI, we may develop a habit of avoiding struggle, and that habit risks eroding the depth of our knowledge”. Karpathy calls this the “mental equivalent of sweating”.
Here’s an interesting tool that gives power to the learner: Alice. It lets students upload learning resources and turns them into personalized lessons, assignments, exercises, and summaries.
CalState is positioning itself as an AI-centric university: “The CSU’s unprecedented adoption of AI technologies will make trainings, learning, and teaching tools—including ChatGPT—available across all 23 CSU universities, ensuring that the system’s more than 460,000 students and 63,000 faculty…”. With announcements like this, which students and faculty will no doubt welcome simply for the access to technology, I’m always looking for the changes to existing practices (at a systems level) that the investment will drive. It’s not clear from this release. They have also created an AI Acceleration Board.
AI Development
Hugging Face is the largest AI community. Here is a good interview with the CEO of Hugging Face on the role of open source AI. They also have what is likely the largest AI app “store”: a good place to explore tools and models, from image generation to text-to-video, speech, code, and 3D modeling.
AI Engineering: a podcast with Chip Huyen. Huyen released AI Engineering earlier this month. She also wrote this excellent piece on AI agents.
LLMs incur significant expense at two stages. The first is pretraining (this is where hundreds of millions of dollars in compute are needed); the second is deployment (or inference), where the model responds to users’ requests. DeepSeek’s recent announcement called into question the existing expectation that both stages are prohibitively expensive for small companies. Once a model has been trained and launched, it can improve its performance by using tools such as web search (ChatGPT’s knowledge cutoff at launch trailed the present by many months) or by engaging in reasoning. Reasoning is what has given us o1/o3 and R1. What is reasoning? Here’s a terrific overview of reasoning models. And a fantastic lecture on how DeepSeek changes the LLM story.
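The two-stage cost picture can be made concrete with the standard back-of-envelope approximations (training ≈ 6·N·D FLOPs for N parameters and D training tokens; inference ≈ 2·N FLOPs per generated token). This is a rough sketch; the model size and token counts below are illustrative assumptions, not any specific model's real figures:

```python
# Back-of-envelope: why one pretraining run dwarfs one inference call.
# Approximations: training ~ 6*N*D FLOPs, inference ~ 2*N FLOPs/token.
# N and D below are illustrative assumptions, not a real model's numbers.

N = 70e9   # hypothetical 70B-parameter model
D = 15e12  # hypothetical 15T training tokens

train_flops = 6 * N * D        # one full pretraining run
flops_per_token = 2 * N        # one forward pass per generated token
response_flops = flops_per_token * 500  # a 500-token response

print(f"Pretraining:        {train_flops:.2e} FLOPs")
print(f"500-token response: {response_flops:.2e} FLOPs")
print(f"Responses per training-run's worth of compute: "
      f"{train_flops / response_flops:.1e}")
```

The asymmetry is the point: training cost is paid once, while inference cost scales with usage, which is why reasoning models (which generate many more tokens per answer) shift spending toward the inference stage.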
Google dropped a range of updated Gemini models. I use Gemini mainly for analyzing large reports and YouTube videos. Their reasoning model has been less useful than R1 or OpenAI’s. Their Deep Research model is good overall, but it pales in comparison with OpenAI’s Deep Research (nice work on the naming, everyone). Gemini has the best performance-to-price ratio.
Speaking of Google, they, like OpenAI, are back in the military AI game.
Deepfakes are about to get insane. ByteDance (of TikTok fame) has a new paper on generating realistic human videos from a single picture.
AI Agent Index: “we introduce the AI Agent Index, the first public database to document information about currently deployed agentic AI systems.”
Benchmarks allow for performance comparisons of LLMs. There is a growing number of them, and this resource captures them.