40 minutes of investigation in 8 seconds
Hey,
Article 7 of the AI-Native IDP series is live:
AI-Assisted Incident Response
At 3am the invoice-api returns 500 errors. The on-call engineer opens the logs, scrolls through pages of JSON, checks the database, checks the connection string, checks recent deployments. 40 minutes later, they find the problem: someone changed Host= to Server= in the config.
All of that information was already in the system. The catalog knows the dependencies. GitHub has the deployment history. GOTCHA.md has the rules. Nobody connected the dots.
In this article:
A Backstage plugin that gathers context from the catalog, GitHub commits, and GOTCHA.md
An AI endpoint that analyzes errors with full service context
A webhook for your monitoring system (Alertmanager, Grafana, etc.)
An IncidentCard on entity pages for on-demand analysis
The engineer still decides. But instead of starting from zero, they start with a hypothesis.
Code is on GitHub: victorZKov/forge
Read it here: AI-Assisted Incident Response
Next up: The Reference Architecture — how all 7 AI features connect, how to deploy them, and how to extend the platform.
Victor