Fun with Agents
Hi all,
Got a special midweek edition for you today. There is still a lot of material to go through, since the agent industry has been busy the last few weeks. Hope you're enjoying NYTW so far. Only 1,000 more events to go!
Cheers,
Ivan
Invite a friend to join AGT NYC at agtnyc.com!
Events
- Jun 3 - Hard-Won Lessons from Deploying AI Agents in Regulated Environments - #NYTechWeek
- Jun 4 - Agentic AI Summit
- Jun 5 - Agent State Happy Hour - #NYTechWeek
- Jun 10 - Arize Builders Meetup - NYC
- Jun 16 - AWS Summit New York Welcome Party: Agents at Scale
- Jun 23 - Daytona AI Builders w/ Oracle & Datadog
Find more events on the AGT NYC Luma calendar.
Launches
- NVIDIA Releases Major Collection of Open Source Agent Tools and Skills for Physical AI
- Datasette Agent, an extensible AI assistant for Datasette
- Introducing Proton Pass for AI agents: The password manager for AI that keeps you in control
- Sweet Attack - the first AI red team agent
- Bevaya launches AI agent platform for insurance
This one's a grab bag across various spaces. It's cool to see utilitarian projects, like password managers and data explorers, get into agents too.
Deals
- Netomi ($110M) - agentic customer experience
- Cosmon ($31M) - mechanical engineering agents
- Nace.AI ($21M) - enterprise workflow agents
- Tribal ($10M) - agents inside enterprise
- Astrix Security acquired by Cisco
These companies may seem a bit stodgy, but there's something interesting and unexpected about all of them. Maybe enterprise can be fun with agents?
Projects
- Koog 1.0 - Kotlin agent framework
- Quarq Agent - memory-first agent
- Dexto - open agent harness
- Pipelock - AI agent firewall
- agent-desktop - agent desktop automation
If last year everyone was releasing an agent framework, this year everyone is releasing an agent harness. Projects that have understood that the vibe has shifted are spinning out their harnesses as fast as possible.
Learning
- Agent Optimization with Pydantic AI: GEPA, Evals, Feedback Loops — Samuel Colvin, Pydantic [1:20:39]
- Evaluating Deep Agents using LangSmith on AWS
- Self-Training Agents: Hermes Agent, HF Traces, Skills, MCP & Finetuning — Merve Noyan, Hugging Face [19:10]
- Better Harness: A Recipe for Harness Hill-Climbing with Evals
- Agents on the Canvas in tldraw — Steve Ruiz, tldraw [19:53]
The general theme of these links is "agent optimization," which I think will become more important over time. If the major names in open source are driving in this direction, that is a big sign of what's to come. The great thing is that there is a lot to optimize: agents have a lot of moving parts.
Research
- ECHO: Terminal Agents Learn World Models for Free
- Agent Harness Engineering: A Survey
- BenchLLM: Agent & Tool-Use Benchmarks
- Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations
- Latent Agents: A Post-Training Procedure for Internalized Multi-Agent Debate
One interesting thread lately is making use of the "data exhaust" that agents, and AI applications in general, produce. Whether it's traces, logs, or just the end result of some activity, it turns out this provides valuable context that can meaningfully improve performance. Maybe agents do their best in a full-context, noisy environment.
Comments, suggestions? Reply to this email and let me know what you think!