SAIL: Agents, Agents, Agents
Welcome to Sensemaking, AI, and Learning (SAIL).
Likely no trend in AI will be more substantive this year than agents. After significant advancement in the capabilities of foundation models, application of those models as part of a value ecosystem (including approaches like LLMs as an operating system) is the next logical step.
I humbly recommend educators/faculty/admin/designers get comfortable with agents as both a concept and a technology.
Agents have a long history in AI, but are best explained in Russell & Norvig’s seminal text Artificial Intelligence: A Modern Approach. They defined agents as being able to “operate autonomously, perceive their environment, persist over a prolonged time period, adapt to change, and create and pursue goals”. Andrew Ng somewhat aligns with this, but focuses less on autonomy and on agents selecting their own goals. Most of what we see described as agents is generally just a tool wrapped around GPT/Gemini/Claude models and would more accurately be described as prompts.
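To make the Russell & Norvig properties a little more concrete, here is a minimal sketch of an agent loop in Python. It is a toy under stated assumptions, not anyone's actual implementation: the ThermostatAgent class, the dictionary environment, and the 0.5 degree tolerance are all hypothetical names and values chosen for illustration. It shows perceiving, persisting state across steps, and adapting actions toward a goal. It does not show an agent creating its own goals, which is the part of the definition most current "agents" lack.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Toy agent (hypothetical): pursues a target temperature."""
    target: float
    memory: list = field(default_factory=list)  # state that persists across steps

    def perceive(self, environment: dict) -> float:
        # Read the environment and remember what was seen.
        reading = environment["temperature"]
        self.memory.append(reading)
        return reading

    def decide(self, reading: float) -> str:
        # Adapt the action to the current reading (0.5 tolerance is an assumption).
        if reading < self.target - 0.5:
            return "heat"
        if reading > self.target + 0.5:
            return "cool"
        return "idle"

    def step(self, environment: dict) -> str:
        return self.decide(self.perceive(environment))


def apply_action(environment: dict, action: str) -> None:
    # The environment changes in response to the agent's action.
    delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
    environment["temperature"] += delta


def run(agent: ThermostatAgent, environment: dict, steps: int) -> list:
    # The loop is what separates an agent from a single prompt/response call.
    actions = []
    for _ in range(steps):
        action = agent.step(environment)
        apply_action(environment, action)
        actions.append(action)
    return actions


if __name__ == "__main__":
    env = {"temperature": 17.0}
    print(run(ThermostatAgent(target=20.0), env, steps=5))
```

A single prompt to a model corresponds roughly to one call to decide. The loop, the memory, and the feedback from the environment are what the definition adds.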
Before diving into agents, here is an existential thread worth thinking about. Highlights include: “The world isn't grappling enough with the seriousness of AI and how it will upend or negate a lot of the assumptions many seemingly-robust equilibria are based upon…Then it will force changes in philosophy. What are we here for? Why do we do the things we do? If everything we care about is automatable, what is our role in the world?”
Here are a few resources for a deep dive on agents so you’re ready for what’s coming our way in higher education.
Google has a whitepaper from Sept 2024 on agents. If you read one resource, make it this one. They detail tools, orchestration, models vs agents, etc. The section on data stores will be particularly helpful for education (i.e. curriculum and learner data).
The Agent Stack. “The agent software ecosystem has developed significantly in the past few months with progress in memory, tool usage, secure execution, and deployment, so we decided it was time to share our own “agent stack” based on our own learnings from working on open source AI for over a year and AI research for 7+ years.”
The corporate sector is all in on agents. Salesforce has launched Agentforce. Microsoft says you’ll have a team of agents working for you by this time next year.
And what kind of tasks can real-world agents do? Here’s a detailed list. The authors offer a benchmark for real-world tasks and track how well agents achieve them. Interesting to note two areas of common agent failure: social skills and common sense. Success rate at tasks is somewhat low - only 24%.
Princeton led a workshop on agents earlier this year. The recording is here. The discussion on infrastructure for agent development is particularly important.
Microsoft has launched Magentic-One, a multi-agent orchestration framework. It offers a good visual of how agents (coder, websurfer, filesurfer) are called and orchestrated.
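For readers who want a feel for what "called and orchestrated" means, here is a rough Python sketch of the general pattern: an orchestrator takes a plan, routes each subtask to a specialist worker, and keeps a ledger of results. This is not Microsoft's API or code. Every name here (coder, web_surfer, file_surfer, route, orchestrate) is hypothetical, the workers just return strings, and the keyword routing is a stand-in for the decision a language model would make in a real system.

```python
def coder(task: str) -> str:
    # Hypothetical worker: a real one would generate and run code.
    return f"[coder] handled: {task}"


def web_surfer(task: str) -> str:
    # Hypothetical worker: a real one would drive a browser.
    return f"[web_surfer] handled: {task}"


def file_surfer(task: str) -> str:
    # Hypothetical worker: a real one would open and read local files.
    return f"[file_surfer] handled: {task}"


WORKERS = {"code": coder, "web": web_surfer, "file": file_surfer}


def route(subtask: str) -> str:
    """Pick a worker by keyword (assumption: stands in for an LLM's choice)."""
    text = subtask.lower()
    if "search" in text or "browse" in text:
        return "web"
    if "read" in text or "file" in text:
        return "file"
    return "code"


def orchestrate(plan: list) -> list:
    """Run each subtask through its worker and record (worker, result) pairs."""
    ledger = []
    for subtask in plan:
        name = route(subtask)
        ledger.append((name, WORKERS[name](subtask)))
    return ledger


if __name__ == "__main__":
    plan = [
        "Search for the course syllabus",
        "Read the downloaded file",
        "Write a script to summarize it",
    ]
    for name, result in orchestrate(plan):
        print(name, "->", result)
```

In a real framework the orchestrator would also check progress and re-plan when a worker fails. The sketch leaves that out to keep the routing idea visible.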
In higher education, how will we design for agents? A few thoughts, written together with my colleague Mihnea Moldoveanu: Interactionalism: Re-Designing Higher Learning for the Large Language Agent Era
When worlds collide. A MOOC on LLMs. Scroll down for an excellent series of lectures. Several of the weeks are specifically focused on agents and agentic frameworks.
Automated design of agentic systems. Excellent. Great overview visuals early in the doc. Practical search agent example.
AI Agents that Matter. Authors posit that agents require different evaluations/benchmarks from existing LLM evaluations.
smolagents. Hugging Face offers a library to build agents with only a few lines of code.
Devin was a heavily touted coding agent in early 2024. When it was released toward the end of the year, it dropped with a $500/mo price. OpenHands is an open-source response.
State of AI Agents. Accessible overview of the state of agents in organizations today.
General AI News
AI will lead to abundance. So says Ray Kurzweil.
The investment in AI from big tech and VCs remains completely insane. Microsoft will drop $80b in 2025 on data centers. There are only a few companies and countries that have the capacity to invest at this level. And they will own the future.
Building large language models. Short Stanford lecture. Nice overview.
How might LLMs store facts. This is excellent. 20-minute video.
The Prompt Report. A survey of prompting techniques. A good section on non-English prompting as well.
Random point: if you’re on Twitter, Grok is one of the best LLM implementations that I have seen in adding value to an existing platform. By contrast, Meta cancelled their AI character program. There is a lesson here for higher education: adding AI to an existing platform risks user pushback.
AI Engineer 2025 Reading list. This is gold.
Altman says OpenAI knows the path to AGI. “We are now confident we know how to build AGI as we have traditionally understood it.”