Harness Your Agents, Part Two
Hi all,
Happy Pi Day!
As promised, this is the second part of this week's newsletter.
Also: I'm launching an agent workshop series soon - AGT NYC members have priority (and free tickets) - reply to this email if you'd like to collaborate.
Cheers,
Ivan
Invite a friend to join AGT NYC at agtnyc.com!
Events
- Apr 2 - MCP in the Wild: The Future of Data Agents in Production (NYC)
- Apr 4 - Personalized Agents Hackathon
- Apr 8 - Tech Talk: Building Agentic Media Buying Platforms
- Apr 10 - Enterprise Agents Hackathon
- Apr 16 - NYC Product Leaders Breakfast: Let the AI Agents Work (but on what?)
- May 4 - AI Agent Conference
- May 12 - AGENT 2026, Brazil
- Jun 6 - Agentic & Applied AI for the Enterprise Conference, Atlanta
- Aug 4 - Ai4, Las Vegas
- Oct 26 - Oracle AI World, Las Vegas
- Oct 28 - AGENTICS 2026, France
You can find more events on the AGT NYC Luma calendar.
News
- Nvidia plans open-source AI agent platform ‘NemoClaw’ for enterprises: Wired
- AI vs AI: Agent hacked McKinsey's chatbot and gained full read-write access in just two hours
- State of Agentic AI Report: Key Findings
- This AI agent freed itself and started secretly mining crypto
- Orchestrated Multi-Agent AI Systems Outperforms Single Agents in Health Care
I think we'll see many more stories about agents going rogue and hacking organizations (and individuals) in the future. Now that the tech is becoming more accessible, there are bound to be cases of it going off the rails.
Launches
- AWS launches a new AI agent platform specifically for healthcare
- Clawcard: inbox, number, card for agents
- ADP Marketplace Launches AI Agents to Help Make Work Easier, Smarter
- monday.com Welcomes AI Agents to Its Platform, Marking a Shift in How Work Gets Done
Companies in legacy industries keep rolling out agents, and there's good reason to think that will continue. Being agent-native will be much harder for them than for startups, and I doubt any large company will try to take that on any time soon.
Deals
- Armadin ($190M) - agents for security
- Gumloop ($50M) - agent builder platform
- Lio ($30M) - agents for procurement
- Amigo AI ($11M) - clinical agent training
- AgentMail ($6M) - email for agents
- Promptfoo acquired by OpenAI
Interesting that a first-gen agent company like Gumloop is still able to raise money. I wonder how adjacent companies, e.g. CrewAI, are going to weather the changes brought on by Claude Code and other emerging agent harnesses.
Articles
- Why I Ditched OpenClaw and Built a More Secure AI Agent on Blink + Mac Mini by Eric Paulsen
- Filesystems are having a moment by Daniel Madalitso Phiri
- Agent Orchestration UI by Luke Wroblewski
- Harness engineering: leveraging Codex in an agent-first world by OpenAI
- From model to agent: Equipping the Responses API with a computer environment by OpenAI
- Lessons from Building Claude Code: Seeing like an Agent by Thariq Shihipar
- OpenAI Operator scores 43% on hard web tasks. We scored 81%. Here are all 300 runs. by TinyFish
- How we built LangChain's GTM Agent
- Improving Deep Agents with Harness Engineering by Vivek Trivedy
- AI agents are ‘aeroplanes for the mind’: five ways to ensure that scientists are responsible pilots in Nature
- Agentic UX: 7 principles for designing systems with agents by Alexandra Vasquez
There is a lot to absorb around harness engineering and the code-first approach of the current wave of agents. I wonder if we're going to consolidate around a few patterns/tools over the next several months - there can't be that many unique approaches.
Projects
- Understudy - teachable desktop agent
- Wardgate - agent security gateway
- PinchTab - agent Chrome control
- Awesome Claws - OpenClaw-inspired agents
- OpenStinger - agent memory harness
- Jido - Elixir agent framework
- Hodoscope - agent trajectory analysis
A good deal of advanced agent tech is coming out both at the application layer and the infrastructure layer. The bare minimum required to build and operate agents seems to increase on a monthly basis. The "hello world" basics of 2025 are a distant memory now.
Learning
- Practical Guide to Evaluating and Testing Agent Skills
- Building Reliable Agents
- Agent Harness
- Why AI Agents Need A Human in the Loop Now [7:26]
- Elements of AI Agents
- Agent Experience
- LangChain & LangSmith Skills: Teach Your AI to Build Agents [4:40]
It's good to see learning materials get shorter and more practical. There's also been enough relevant content lately that agents could be taught to build agents fairly easily.
Research
- RuneBench: Agent Benchmark on RuneScape Gameplay Tasks
- SoK: Agentic Skills -- Beyond Tool Use in LLM Agents
- SkillCraft: Can LLM Agents Learn to Use Tools Skillfully?
- EvoSkill: Automated Skill Discovery for Multi-Agent Systems
- Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing
- Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data
- EmbeWebAgent: Embedding Web Agents into Any Customized UI
- Multi-agent cooperation through in-context co-player inference
- AI agent in healthcare: applications, evaluations, and future directions
Very interesting to see research focus on skills (and on advancing tool use in general). What started out as a simple feature in Claude Code is now paving the way to sophisticated agent development.
Comments, suggestions? Reply to this email and let me know what you think!