Meet LiteLLM Agent Platform: A Kubernetes-Based, Self-Hosted Infrastructure Layer for Isolated Agent Sandboxes and Persistent Session Management in Production
Running AI agents in a local script is easy. Running them reliably in production, across teams, across restarts, with isolated environments per context, is a different problem entirely. BerriAI, the company behind the LiteLLM AI Gateway, is now open-sourcing a purpose-built answer to that problem: the LiteLLM Agent Platform. The platform is described as a simple, self-hosted infrastructure platform for running multiple agents in production.
What Problem Does It Solve?
It helps to understand what happens when you try to scale agents beyond a single process. Agents are stateful: they carry session history, tool call results, and intermediate reasoning across turns. If the container running your agent crashes, restarts, or gets replaced during a deployment, that session state is gone unless something is explicitly managing it. At the same time, different teams often need different runtime environments, different tools, different secrets, and different access scopes, which means you cannot throw all agents into one shared container.
The platform manages two things: per-team and per-context sandboxes, and session continuity across pod restarts and upgrades. These two capabilities are the core infrastructure primitives the platform provides.
Architecture and Technical Stack
The platform is a standalone Next.js dashboard for LiteLLM v2 managed agents, covering sessions chat, agent CRUD, and live status. The codebase is primarily TypeScript (92.8%), with shell scripts for provisioning, a Dockerfile for containerization, and CSS for the dashboard UI.
The architecture separates concerns cleanly. A web process runs on port 3000 and serves the Next.js dashboard. A worker process handles async agent tasks. Postgres is used as the persistent backing store, and a schema migration runs as an init container on startup, so the database is always in the correct state before the application boots.
For the sandbox layer (the isolated runtime environment where agents actually execute), sandboxes run on Kubernetes via the kubernetes-sigs/agent-sandbox CRD. Local development uses kind. If you are not already familiar with it: kind (Kubernetes in Docker) lets you spin up a full Kubernetes cluster locally using Docker containers as nodes, without needing a cloud provider. The agent-sandbox CRD (Custom Resource Definition) is a Kubernetes extension from kubernetes-sigs that the platform installs to manage the lifecycle of individual sandbox environments.
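Once a cluster is up, the sandbox machinery is visible through ordinary Kubernetes tooling. A quick sanity check might look like the sketch below; the exact resource and controller names are assumptions, so confirm them with kubectl api-resources on your own cluster:
kubectl api-resources | grep -i sandbox   # CRD types registered by the agent-sandbox controller
kubectl get pods -A | grep -i sandbox     # the controller pods themselves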
The platform also includes a harness system under harnesses/opencode, which contains the configuration for running coding agents, such as Claude Code or OpenAI Codex, inside isolated sandboxes with a vault proxy for credential management. The BerriAI team also maintains a separate litellm-agent-runtime repository, described as a coding-agent runtime that runs inside per-session VMs provisioned by a LiteLLM proxy, generic by design, with customization happening via harness configuration or a hydrate payload.
One practical detail worth noting is how environment variables are handled across sandbox containers. Anything in .env prefixed with CONTAINER_ENV_ is injected into every sandbox container with the prefix stripped; for example, CONTAINER_ENV_GITHUB_TOKEN=ghp_... means the container sees GITHUB_TOKEN=ghp_... This gives teams a clean way to pass secrets into sandboxed agent sessions without modifying container images.
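As a concrete sketch, a .env might contain entries like these (the token values are placeholders, and the gateway URL value is an illustrative assumption; LITELLM_GATEWAY_URL itself is the variable the quickstart asks you to set):
CONTAINER_ENV_GITHUB_TOKEN=ghp_xxxx       # sandbox sees GITHUB_TOKEN=ghp_xxxx
CONTAINER_ENV_NPM_TOKEN=npm_xxxx          # sandbox sees NPM_TOKEN=npm_xxxx
LITELLM_GATEWAY_URL=http://host.docker.internal:4000   # no prefix, so never injected into sandboxes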
https://github.com/BerriAI/litellm-agent-platform
Getting Started
The prerequisites for local development are Docker Desktop, kind, kubectl, helm, and a LiteLLM gateway. No cloud credentials are required to get started locally. The quickstart is two commands:
bin/kind-up.sh
docker compose up
bin/kind-up.sh is idempotent: it provisions a kind cluster named agent-sbx, installs the agent-sandbox controller, and loads the harness image. docker compose up boots Postgres, runs the schema migration, and starts the web process on port 3000 along with the worker.
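If you want to confirm the cluster came up before starting the services, a couple of read-only checks work; note that kind prefixes its kubeconfig contexts with kind-, so the context for the agent-sbx cluster should be kind-agent-sbx:
kind get clusters                               # should list agent-sbx
kubectl cluster-info --context kind-agent-sbx   # confirms the API server is reachable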
For production deployment, the recommended path is AWS EKS for the sandbox cluster and Render for the web and worker processes. bin/eks-up.sh provisions the EKS cluster, and a Render Blueprint provides a one-click deployment option.
Relationship to the LiteLLM Gateway
The Agent Platform is a layer on top of the existing LiteLLM ecosystem, not a replacement for it. LiteLLM's core is a Python SDK and Proxy Server (an AI Gateway) that calls 100+ LLM APIs in OpenAI format, with cost tracking, guardrails, load balancing, and logging, supporting providers including Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, SageMaker, HuggingFace, vLLM, and NVIDIA NIM. The Agent Platform consumes a running LiteLLM gateway as a dependency and builds agent orchestration and session management infrastructure on top of it. Model routing, cost tracking, and rate limiting stay in the gateway layer. Sandbox isolation, session continuity, and the management dashboard are handled by the Agent Platform.
Marktechpost's Visual Explainer
01 / 06: What Is LiteLLM Agent Platform?
BerriAI open-sourced this platform on May 8, 2026. It is a self-hosted infrastructure layer for running multiple AI agents in production, built on top of the LiteLLM AI Gateway.
🧱 Self-Hosted
Runs entirely on your own infrastructure. No data leaves your environment. Suited to regulated industries and teams with data residency requirements.
🤖 Multi-Agent
Designed to run multiple agents in parallel, with full isolation between teams and contexts using per-session sandboxes.
🔁 Session Continuity
Agent sessions persist across pod restarts and upgrades, so stateful work isn't lost when containers are replaced.
⚡ Open Source (MIT)
Fully open source under the MIT license. Repo: github.com/BerriAI/litellm-agent-platform. File issues and contribute directly.
Prerequisite Knowledge
This guide assumes familiarity with Docker, basic command-line usage, and a general understanding of what an AI agent is (a model that calls tools and runs multi-step tasks). Kubernetes experience helps but isn't required to follow along.
02 / 06: Key Concepts to Know First
Before running the platform, understand these four building blocks. They appear throughout the setup and configuration.
A. LiteLLM Gateway
The underlying AI Gateway that the Agent Platform depends on. It routes requests to 100+ LLM providers (OpenAI, Anthropic, Bedrock, VertexAI, etc.) using a unified OpenAI-format API. The Agent Platform doesn't include the gateway; you must have one running separately and point the platform at it.
B. Sandbox
An isolated container environment where a single agent session executes. Each sandbox is independent, meaning one agent cannot access the filesystem, secrets, or state of another. Sandboxes are provisioned and torn down per session using the kubernetes-sigs/agent-sandbox CRD (Custom Resource Definition).
C. Harness
A configuration layer that defines how a specific type of coding agent (such as Claude Code or OpenAI Codex) runs inside a sandbox. The platform ships with an opencode harness under harnesses/opencode/. The harness image is loaded into the kind cluster during setup.
D. CRD (Custom Resource Definition)
A Kubernetes extension that lets you define new resource types. The platform uses the kubernetes-sigs/agent-sandbox CRD to teach your Kubernetes cluster how to manage agent sandboxes as first-class resources, the same way it manages pods or deployments.
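To make "first-class resources" concrete: a custom resource instance is declared in YAML the same way a pod is. The manifest below is purely hypothetical; the apiVersion and field names are guesses for illustration, not values taken from the agent-sandbox project, so inspect the installed CRD with kubectl get crd for the real schema:
# Hypothetical illustration only -- names are NOT from the real agent-sandbox schema
cat <<'EOF' > sandbox-example.yaml
apiVersion: agents.x-k8s.io/v1alpha1
kind: Sandbox
metadata:
  name: demo-session-sandbox
EOF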
03 / 06: How the Platform Is Structured
The platform has four main components. Understanding how they connect helps when debugging or deploying to production.
Component: web (:3000)
What it does: Next.js dashboard. Provides the UI for sessions chat, agent CRUD operations, and live status monitoring.
Tech: Next.js, TypeScript
Component: worker
What it does: Background process that handles async agent tasks, decoupled from the web server.
Tech: TypeScript
Component: postgres
What it does: Persistent backing store for session state, agent configs, and metadata. A schema migration runs automatically as an init container on startup.
Tech: PostgreSQL
Component: sandbox cluster
What it does: Kubernetes cluster where individual agent sandboxes run, managed via the agent-sandbox CRD controller. Locally: kind. In production: AWS EKS.
Tech: Kubernetes (kind / EKS)
Separation of Concerns
The LiteLLM gateway handles model routing, cost tracking, rate limiting, and guardrails. The Agent Platform handles sandbox lifecycle, session management, and the management dashboard. They run as separate services, and the Agent Platform consumes the gateway as a dependency.
04 / 06: Prerequisites Before You Start
Install and verify these tools before running any setup commands. The quickstart will not work without all five.
1. Docker Desktop
Required to build and run containers, and to power kind (which runs Kubernetes nodes as Docker containers). Download from docker.com/products/docker-desktop. Verify with:
docker --version
2. kind (Kubernetes in Docker)
Used to provision a local Kubernetes cluster for running sandboxes. Install via Homebrew on macOS (brew install kind) or from kind.sigs.k8s.io. Verify with:
kind --version
3. kubectl
The Kubernetes command-line tool. Used by the setup scripts to interact with the kind cluster. Install from kubernetes.io/docs/tasks/tools. Verify with:
kubectl version --client
4. helm
The Kubernetes package manager. Used to install the agent-sandbox controller into the kind cluster. Install from helm.sh/docs/intro/install. Verify with:
helm version
5. A Running LiteLLM Gateway
The Agent Platform requires a LiteLLM gateway URL to route model calls. If you don't have one running, start with the official LiteLLM quickstart at docs.litellm.ai. You'll point the Agent Platform at this URL during configuration.
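If you just need something local to point the platform at, one way to bring up a gateway is the LiteLLM proxy CLI. This is a sketch based on the public LiteLLM quickstart, so defer to docs.litellm.ai if flags or packages have changed:
pip install 'litellm[proxy]'
export OPENAI_API_KEY=sk-...         # any provider key LiteLLM supports
litellm --model gpt-4o --port 4000   # serves an OpenAI-format API at http://localhost:4000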
05 / 06: Local Quickstart
Clone the repo and run two commands to get the full platform running locally. No cloud credentials needed for local development.
1. Clone the repository
Pull the repo from GitHub:
git clone https://github.com/BerriAI/litellm-agent-platform
cd litellm-agent-platform
2. Configure your .env file
Copy the example env file and fill in your LiteLLM gateway URL and any secrets:
cp .env.example .env
# Edit .env and set your LITELLM_GATEWAY_URL and other required values
3. Provision the local kind cluster
This script is idempotent, meaning it is safe to run multiple times. It provisions a kind cluster named agent-sbx, installs the agent-sandbox controller via helm, and loads the harness image:
bin/kind-up.sh
4. Start all services
Boots Postgres, runs the schema migration as an init container, and starts the web server on port 3000 and the worker process:
docker compose up
5. Open the dashboard
Navigate to http://localhost:3000 in your browser. You should see the LiteLLM Agent Platform dashboard with options to create agents, open sessions, and monitor live status.
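A quick terminal-level check is also possible; this sketch just confirms the web process is answering:
curl -sI http://localhost:3000 | head -n 1   # expect an HTTP 200 status line from the dashboard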
Passing Secrets into Sandboxes
Any variable in .env prefixed with CONTAINER_ENV_ is automatically injected into every sandbox container with the prefix stripped. Example: CONTAINER_ENV_GITHUB_TOKEN=ghp_… means the sandbox sees GITHUB_TOKEN=ghp_… This is the intended way to pass credentials into agent sessions.
06 / 06: Production Deployment
The recommended production setup separates the sandbox cluster (AWS EKS) from the web and worker processes (Render). The repo ships scripts and a Blueprint for both.
1. Provision the EKS sandbox cluster
The bin/eks-up.sh script provisions an AWS EKS cluster configured to run agent sandboxes. This replaces kind as the sandbox backend. Requires AWS credentials in your environment:
bin/eks-up.sh
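After the script finishes, you will typically point kubectl at the new cluster before debugging sandboxes. The cluster name and region below are assumptions; use the values bin/eks-up.sh actually reports:
aws eks update-kubeconfig --name agent-sbx --region us-east-1   # name/region are assumptions
kubectl get nodes                                               # confirm the EKS nodes are Ready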
2. Deploy web and worker to Render
The repo includes a Render Blueprint under deploy/render/ that deploys the web and worker services to Render with one click. See deploy/render/README.md for the Blueprint URL and required environment variables.
3. Use the Developer API directly (optional)
You can interact with the platform programmatically via its REST API using curl or any HTTP client. The full API reference, covering how to create an agent, open a session, send a message, and read the reply, is at src/server/DEVELOPER.md in the repo.
# Example: create an agent session via curl
curl -X POST http://localhost:3000/api/sessions \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "your-agent-id"}'
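A plausible follow-up is sending a message into the session you just created. The endpoint path and payload below are hypothetical, written only to illustrate the shape of such a call; the real routes are documented in src/server/DEVELOPER.md:
# Hypothetical endpoint -- consult src/server/DEVELOPER.md for the real route
curl -X POST http://localhost:3000/api/sessions/SESSION_ID/messages \
  -H "Content-Type: application/json" \
  -d '{"message": "Summarize the repository README"}'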
Architecture Summary for Production
AWS EKS runs the sandbox cluster where agent sessions execute in isolation. Render hosts the Next.js web dashboard and the async worker. Postgres (managed or self-hosted) persists session state. The LiteLLM gateway runs separately and handles all model API routing. These four components communicate over the network and can be scaled independently.
The platform is currently in alpha public preview. File issues at github.com/BerriAI/litellm-agent-platform. Architecture details are in docs/k8s-backend.md in the repo.
Published by Marktechpost | AI/ML News and Research for Developers and Engineers
Key Takeaways
BerriAI open-sourced the LiteLLM Agent Platform, a self-hosted infrastructure layer for running multiple AI agents in production with per-team sandbox isolation and session continuity across pod restarts.
Sandboxes run on Kubernetes via the kubernetes-sigs/agent-sandbox CRD, locally with kind and in production with AWS EKS; no cloud credentials are needed to get started.
The platform sits on top of the existing LiteLLM Gateway, which handles model routing, cost tracking, and rate limiting across 100+ LLM providers in OpenAI format.
The quickstart is two commands: bin/kind-up.sh provisions the kind cluster and installs the sandbox controller; docker compose up boots Postgres, web (:3000), and the worker.
Released under the MIT license and currently in alpha public preview.
Naveed Ahmad is a technology journalist and AI writer at ArticlesStock, covering artificial intelligence, machine learning, and emerging tech policy.