How an AI SRE agent can manage 20,000 workloads

This talk will be held in English.

IT systems are not neat self-contained boxes with a few components. A single application deployment has hundreds of moving parts. An IT landscape contains thousands of them.

On the other side sit AI agents with vast world knowledge. Like humans, they struggle with context overload. They can't keep a whole datacenter in their heads at once.

Neither can we. We navigate.

In this talk we'll show how an agent can navigate and operate hundreds of thousands of components, working from a live map of the infrastructure and a connector layer that acts on what it reveals.

The result: an agent that adapts itself to the landscape it walks into, instead of a bespoke setup for every shape of IT.

  • Basic familiarity with modern IT operations: Kubernetes, observability stacks (Prometheus, Loki), and the usual alert-to-incident workflow.
  • Some exposure to LLMs or AI agents is useful but not required.
  • The talk is conceptual with a live demo. No coding knowledge needed.

  • Why large LLM context windows alone don't solve IT complexity.
  • How a precomputed infrastructure map lets an agent navigate hundreds of thousands of components.
  • How a connector layer turns that map into action.
  • Where the approach breaks.
  • How to tell whether it's worth trying in your own environment.
Benjamin Hofmann

Benjamin Hofmann brings deep expertise in software testing, engineering, and cloud architectures, with a strong focus on GenAI. He has worked across microservices, platform engineering, and observability, championing Agile and DevOps along the way. As co-founder of Hyground, he's pushing generative AI forward in complex cloud operations.