Kubernetes became the default control plane for running software by standardizing how containers are deployed, scaled, and operated across environments.

LLM-powered systems are now hitting a similar inflection point:

We can run LLMs almost anywhere, but we don’t yet have a universal way to manage them.

That’s why many engineers describe LlamaStack as “Kubernetes, but for LLMs.”
The analogy is useful, provided we understand which layer it applies to and where it intentionally differs.

This post breaks down that analogy from a technical control-plane perspective.


What Kubernetes actually standardized

Kubernetes is not just “container management.” It is a control plane that introduced stable primitives and reconciliation loops:

  • Workload primitives: Pod, Deployment, Job
  • Service primitives: Service, Ingress
  • Control loops: desired state → continuously reconciled
  • Portability: cluster becomes the common substrate

Minimal mental model:

flowchart LR
  Dev[Developer] -->|kubectl / GitOps| API[Kubernetes API Server]
  API --> C[Controllers]
  C --> S[Scheduler]
  S --> N[Nodes]
  N --> P[Pods / Containers]
  C -->|reconcile desired state| N

Kubernetes doesn’t just run containers; it provides a uniform control surface for lifecycle, scaling, and placement.
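
To make the control-loop idea concrete, here is a minimal reconciliation sketch in Python. It is purely conceptual; the names and data structures are illustrative, not Kubernetes API objects.

# Conceptual sketch of a Kubernetes-style reconciliation loop.
# Names and structures are illustrative, not actual Kubernetes APIs.
import time

desired_state = {"web": {"replicas": 3}}    # what the user declared
observed_state = {"web": {"replicas": 1}}   # what is actually running

def reconcile(name: str, desired: dict, observed: dict) -> None:
    """Drive observed state toward desired state."""
    diff = desired["replicas"] - observed["replicas"]
    if diff != 0:
        print(f"{name}: adjusting replicas by {diff}")
        observed["replicas"] += diff

# Real controllers loop forever; bounded here so the example terminates.
for _ in range(3):
    for name, desired in desired_state.items():
        reconcile(name, desired, observed_state[name])
    time.sleep(1)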


The “pre-Kubernetes” state of LLM applications

Many LLM applications today resemble early cloud systems before Kubernetes:

  • Hard-coded model providers
  • Ad-hoc routing logic
  • No consistent pattern for:
    • model selection and fallback
    • tool invocation
    • policy enforcement
    • environment portability

Teams repeatedly rebuild the same orchestration glue.
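
That glue usually looks something like the sketch below: provider choice, routing rules, and fallbacks hard-coded into application code. The provider functions are placeholders, not real SDK calls.

# Typical ad-hoc orchestration glue. The provider functions are placeholders,
# not real SDK calls.

def call_hosted_provider(prompt: str) -> str:
    raise NotImplementedError  # stand-in for a hosted-provider SDK call

def call_local_model(prompt: str) -> str:
    return f"[local model reply to: {prompt}]"  # stand-in for a local runtime

def answer(prompt: str, user_tier: str) -> str:
    # Routing, fallback, and policy baked directly into application logic.
    if user_tier == "premium":
        try:
            return call_hosted_provider(prompt)   # hard-coded provider
        except Exception:
            return call_local_model(prompt)       # ad-hoc fallback
    return call_local_model(prompt)

print(answer("Summarize this ticket", user_tier="premium"))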


What LlamaStack is trying to standardize

LlamaStack focuses on LLM-native primitives, not infrastructure:

  • Model access (local, hosted, multi-provider)
  • Routing & execution (choose a model per request)
  • Tools (standard tool-calling patterns)
  • Agents (reasoning systems composed of models + tools)
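
The shape of that abstraction is roughly the sketch below: one client surface, with the provider and model resolved behind it per request. This is an illustration of the idea only; the class and method names are invented for the example and are not the actual LlamaStack SDK.

# Illustrative sketch of a unified LLM control surface.
# Class and method names are invented for this example; they are not
# the LlamaStack client API.
from dataclasses import dataclass
from typing import Protocol

class ModelProvider(Protocol):
    def chat(self, model: str, messages: list[dict]) -> str: ...

@dataclass
class LLMControlPlane:
    providers: dict[str, ModelProvider]    # e.g. {"local": ..., "hosted": ...}
    routes: dict[str, tuple[str, str]]     # alias -> (provider name, model id)

    def chat(self, alias: str, messages: list[dict]) -> str:
        provider_name, model = self.routes[alias]          # routing & selection
        return self.providers[provider_name].chat(model, messages)

# Application code depends only on the alias, never on a specific provider:
#   plane.chat("default-chat", [{"role": "user", "content": "hello"}])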

A useful reframe:

Kubernetes abstracts compute execution.
LlamaStack abstracts intelligence execution.


Kubernetes vs LlamaStack as control planes

flowchart TB
  subgraph K8s[Kubernetes Control Plane]
    KAPI[API Server]
    KCTRL[Controllers]
    KSCHED[Scheduler]
    KWORK[Pods / Deployments]
    KAPI --> KCTRL --> KWORK
    KAPI --> KSCHED --> KWORK
  end

  subgraph LS[LlamaStack Control Plane]
    LAPI[LLM API / SDK Surface]
    LROUTE[Routing / Selection Logic]
    LPROV[Provider Abstraction]
    LEXEC[Model Execution]
    LTOOL[Tool Invocation]
    LAGENT[Agent Runtime]
    LAPI --> LROUTE --> LPROV --> LEXEC
    LAPI --> LTOOL
    LAPI --> LAGENT
  end

Kubernetes manages where and how code runs.
LlamaStack aims to manage where and how intelligence is invoked.


Concept mapping (analogy, not identity)

flowchart LR
  subgraph K8sConcepts[Kubernetes Concepts]
    KP[Pod]
    KD[Deployment]
    KS[Service]
    KCRD[CRD]
    KNS[Namespace]
  end

  subgraph LlamaConcepts[LlamaStack Concepts]
    LM[Model]
    LR[Request Routing]
    LAPI[Unified APIs]
    LEXT[Provider / Tool Plugins]
    LPROJ[Project / App Scope]
  end

  KP --- LM
  KD --- LR
  KS --- LAPI
  KCRD --- LEXT
  KNS --- LPROJ

This mapping explains roles, not implementation:

  • Pod ↔ Model execution unit
  • Deployment ↔ routing policy
  • Service ↔ stable API surface
  • CRD ↔ extensibility mechanism
  • Namespace ↔ project-level scope
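
One way to see why “Deployment ↔ routing policy” is a fair pairing: both are declarations of intent that a control plane enforces, rather than imperative calls. The structures below are hypothetical, not real LlamaStack or Kubernetes resource definitions.

# Hypothetical declarative routing policy, shown side by side with a
# Deployment-like spec. Neither is a real resource definition.

routing_policy = {
    "name": "default-chat",
    "spec": {
        "primary":  {"provider": "local", "model": "small-chat-model"},
        "fallback": {"provider": "hosted", "model": "large-chat-model"},
        "maxLatencyMs": 2000,   # trip the fallback, like a failed health check
    },
}

deployment_like = {
    "name": "web",
    "spec": {
        "replicas": 3,          # declared intent; controllers keep it true
        "image": "web:1.2.0",
    },
}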

Where the analogy breaks (by design)

Kubernetes is:

  • Infrastructure-level
  • Declarative-first
  • Resource-oriented
  • Mature and conservative

LlamaStack is:

  • Application/platform-level
  • API-first
  • Intelligence-oriented
  • Rapidly evolving

So the accurate framing is:

LlamaStack is not Kubernetes today.
It is an early attempt at a Kubernetes-like control plane for LLM primitives.


The layered architecture that will likely win

Rather than one replacing the other, the future points to complementary control planes:

flowchart TB
  U[Users / Applications]
  AICP[AI Control Plane: LlamaStack]
  AR[Agent Runtime Layer]
  ICP[Infrastructure Control Plane: Kubernetes]
  HW[Hardware: GPUs / CPUs / Accelerators]

  U --> AICP
  AICP --> AR
  AR --> ICP
  ICP --> HW

  • Kubernetes handles placement, scaling, and networking
  • LlamaStack handles model abstraction and AI execution patterns
  • Agent frameworks focus on reasoning and task orchestration
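
In practice the layering can be as thin as the sketch below: the application targets the AI control plane’s endpoint, and whether that endpoint is a local process or a Kubernetes Service behind cluster DNS is an infrastructure decision the application never sees. The URLs and the client class are placeholders.

# Illustrative layering sketch. The URLs and the client class are placeholders,
# not a real SDK or real endpoints.
import os

# Local dev: the AI control plane runs as a process on your machine.
# In a cluster: the same surface is exposed as a Kubernetes Service and is
# scheduled, scaled, and networked by the infrastructure control plane,
# e.g. http://llamastack.ai-platform.svc.cluster.local:8000
AI_CONTROL_PLANE_URL = os.environ.get("AI_CONTROL_PLANE_URL", "http://localhost:8000")

class AIClient:
    """Placeholder client; the application depends only on this surface."""
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

client = AIClient(AI_CONTROL_PLANE_URL)
# The application never needs to know which nodes, GPUs, or providers
# ultimately serve a request.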

Takeaway

A concise, technically accurate summary:

Kubernetes standardized how software runs.
LlamaStack is an early attempt to standardize how intelligence is invoked, routed, and composed.

Or even shorter:

Kubernetes manages containers. LlamaStack wants to manage intelligence.