SAIL: Things we know, knowledge graphs, and agents
July 13, 2025
Welcome to Sensemaking, AI, and Learning (SAIL). I focus on AI and higher education.
There is a tension emerging between the breathless hype around AI’s potential and the practical reality on the ground of creating learning experiences and software. (There is a third element as well: a critical and ethical stance on how AI impacts humanity, enables or ignores the needs of underrepresented learners, and affects the environment over the longer term.) Certain things are becoming clearer, and these include:
Technology creates problems that only more technology can solve
AI is capable of performing basic cognitive tasks (such as writing, coding, and image generation) that intersect with human cognitive processes and move the fulcrum of some knowledge work up at least one abstraction layer.
AI has significant performance limitations that make it, for now, unreliable for prime time without supports.
The value of today’s AI will be defined by the systems-level infrastructure that universities build around it.
The agent stack (LLM, memory, discrete tasks) is the highest-priority area for universities to figure out if they want to add value for learners (a minimal sketch follows this list).
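To make that stack concrete, here is a toy Python sketch of the three pieces named above: an LLM call, a memory store, and a loop over discrete tasks. The call_llm method is a hypothetical stand-in for whatever model API a campus adopts, not a real client.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    memory: list[str] = field(default_factory=list)

    def call_llm(self, prompt: str) -> str:
        # Hypothetical stand-in: swap in a real client (OpenAI, Anthropic, etc.).
        return f"<model response to: {prompt[:40]}...>"

    def run_task(self, task: str) -> str:
        context = "\n".join(self.memory[-5:])  # short-term memory window
        answer = self.call_llm(f"Context:\n{context}\n\nTask: {task}")
        self.memory.append(f"{task} -> {answer}")  # persist for later tasks
        return answer

agent = Agent()
for task in ["summarize week 3 readings", "draft a quiz question"]:
    print(agent.run_task(task))
```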
AI and Learning
There are simply stunning innovation opportunities in the education space as laggard systems begin to hire AI leaders (Director of AI, VP of AI, Chief of AI, etc.). Times of transition produce new leaders (i.e., on-premise Blackboard to cloud Instructure). Similarly, the AI landscape for startups in education has many avenues to improve parts of the university system, from admissions to tutoring to counseling to assessment. I’m expecting that many of these functions will be taken on by external providers (hello, new OPM-ish companies), largely because universities are having challenges getting their legs under them in this race. However, there are many in-house capabilities that future-oriented universities will develop. A large part of that comes from AI engineers or similar roles: technical staff who, on the ground and in collaboration with faculty, identify and solve practical problems. A short sampling of skills they’ll need: working with RAG, conducting evals, managing guardrails, fine-tuning, agentic frameworks, etc.
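Since RAG tops that skills list, here is a minimal sketch of the pattern: embed documents, retrieve the most similar ones for a query, and pass them to the model as context. The embed function is a fake, deterministic stand-in for a real embedding model, and the final model call is left as a comment.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Fake embedding for illustration; replace with a real embedding model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

docs = ["course policy on late work", "week 2: intro to derivatives"]
doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 1) -> list[str]:
    # Cosine similarity between the query and each document vector.
    q = embed(query)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(-sims)[:k]]

context = retrieve("can I submit the assignment late?")
prompt = f"Answer using this context: {context}"
# call_llm(prompt)  # hypothetical model call, as in the agent sketch above
```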
Personalization to each learner is a key educational goal. In small classrooms, faculty manage personalization through in-class teaching. When we scale learning to hundreds or thousands of learners, that becomes economically infeasible. LLMs offer a whole new range of options, both in personalizing to the attributes and needs of a learner and in personalizing curriculum to specific learner knowledge needs. Knowledge graphs are a key part of that process.
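As a toy illustration of how a knowledge graph supports personalization, here is a sketch that uses a small, invented prerequisite graph to pick the concepts a learner is ready for next. The concepts and edges are made up for the example.

```python
# Invented prerequisite graph: concept -> concepts it depends on.
prerequisites = {
    "fractions": [],
    "ratios": ["fractions"],
    "proportions": ["ratios"],
    "linear_equations": ["fractions"],
}

def next_concepts(mastered: set[str]) -> list[str]:
    """Concepts the learner has not mastered but is ready for,
    i.e. all prerequisites are already mastered."""
    return [c for c, pre in prerequisites.items()
            if c not in mastered and all(p in mastered for p in pre)]

print(next_concepts({"fractions"}))  # -> ['ratios', 'linear_equations']
```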
The social concerns around AI’s impact on work are being felt in big tech. The latest is Amazon. As I shared a few months ago: “We will need fewer people doing some of the jobs that are being done today, and more people doing other types of jobs.”
Building multi-agent systems. If you’ve tracked Anthropic, OpenAI, Devin, and others on this, there is nothing much new here, but it is a nice summary. PwC lists agents as a key technology in their 2025 Business Predictions, suggesting that they will allow organizations to double their workforce.
What gets measured, AI will automate. “If you can shoehorn a phenomenon into numbers, AI will learn it and reproduce it back at scale—and the tech keeps slashing the cost of that conversion, so measurement gets cheaper, faster, and quietly woven into everything we touch.” Yes.
I’ve been reflecting on this statement from the paper A Collectivist, Economic Perspective on AI: “we’re neglecting the fact that humans are social animals and that their intelligence is partly social in origin. We’re also treating the social consequences of technology as an afterthought which is unacceptable given the massive effect that current technology is poised to have on society”. It’s worth a read, not because it’s a great paper (it’s not; it flirts with social notions but doesn’t get much past computational thinking) but because it raises questions about how we want to be as humans in this emerging world, especially since many of the top thinkers seem to carry non-humanistic lenses and language.
AI’s impact on productivity for software developers: “we find that when developers use AI tools, they take 19% longer than without—AI makes them slower.” Ouch.
General AI Technology
The most important news in AI this week was Grok, both good (very good) and bad. If you spend time on Twitter, you’ll notice both Perplexity and Grok are active in conversations, providing clarifying responses when a user calls on them. Grok went off the rails this week, and the concerns and unease it raised are profound. Grok has provided an update on what happened behind the scenes. This is one of the more important threads I’ve read lately because it clearly communicates how fragile these systems are; if we are having trouble containing or preventing undesirable behavior with today’s models, the future looks alarming. The previous Grok incident resulted in its system prompts being published. Some users have noted that Grok consults Musk’s tweets when replying, at least on controversial topics.
The generative AI stack. There will be many of these diagrams, but this one is needlessly complex. I prefer Andrew Ng’s version (in his “building faster with AI” talk, at the 4:30 mark).
The AI community is abuzz with the launch of Kimi K2. It is to agentic AI what DeepSeek was to reasoning: Chinese companies are driving innovations and new algorithms. In Kimi’s case, the big focus is on Muon, a neural network optimizer. Here’s a quick (but technical) dive. Btw, I’m increasingly finding value in conversations with Claude, ChatGPT, or Gemini for unpacking and understanding more technical papers.
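For the curious, here is a rough numpy sketch of the Muon idea: accumulate momentum, then orthogonalize the update direction with a few Newton-Schulz iterations. The coefficients and shape scaling are the commonly cited ones from the Muon write-ups; treat this as an illustration, not Kimi’s actual implementation.

```python
import numpy as np

def newton_schulz(G: np.ndarray, steps: int = 5) -> np.ndarray:
    """Approximately orthogonalize G (flatten its singular-value spread)
    with the quintic Newton-Schulz iteration from the Muon write-ups."""
    a, b, c = 3.4445, -4.7750, 2.0315   # commonly cited coefficients
    X = G / (np.linalg.norm(G) + 1e-7)  # normalize so the iteration converges
    if transposed := X.shape[0] > X.shape[1]:
        X = X.T                          # iterate on the wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X

def muon_step(W, grad, buf, lr=0.02, beta=0.95):
    """One illustrative Muon update for a 2D weight matrix."""
    buf = beta * buf + grad                            # momentum accumulation
    scale = max(1.0, W.shape[0] / W.shape[1]) ** 0.5   # one common shape scaling
    return W - lr * scale * newton_schulz(buf), buf

# Toy usage with a random weight matrix and gradient.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
W, buf = muon_step(W, rng.standard_normal(W.shape), np.zeros_like(W))
```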
Windsurf (a coding tool like Cursor or Claude Code) was rumored to be acquired by OpenAI for $3B. Apparently friction with Microsoft (OpenAI’s early and main funder) delayed that deal and Google swooped in. Concerns are being raised for staff and early engineers, since they don’t appear to be in line to reap the rewards.
DSPy has been rising in prominence in online discussions. It has a simple framing: “programming, not prompting, your LLMs”. This short video is a good introduction; it starts with the assertion that organizations are still learning how to build reliable AI software, then leads into the practical data and technology decisions that need to be made in building an AI product. A longer video (unrelated to the one above) is here.
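To give a flavor of the “programming, not prompting” idea, here is a minimal sketch in DSPy’s declarative style. The API has shifted across versions and the model string is my assumption, so treat the exact calls as approximate rather than canonical.

```python
import dspy

# Assumed model identifier; requires an OPENAI_API_KEY in the environment.
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

# Declare what the program should do (a signature), not how to prompt for it;
# DSPy compiles the signature into prompts behind the scenes.
qa = dspy.ChainOfThought("question -> answer")

result = qa(question="What is retrieval-augmented generation?")
print(result.answer)
```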