 and left behind a graveyard of shims."
---

K8s needed Docker. Then it didn't. The breakup took five years and left behind a graveyard of shims.

---

## Before the divorce: why they married

In 2014, "container" meant Docker. There was nothing else. K8s launched and hardcoded Docker API calls into kubelet. No abstraction. No interface. `kubelet` knew how to talk to Docker the same way your code knows how to talk to `console.log`. Directly, with no layer in between.

This worked. Docker was the only runtime. Why build an abstraction for one implementation?

---

## The affair: rkt shows up

In late 2014, CoreOS announced rkt (pronounced "rocket"). A competing container runtime. Different architecture. By 2015 it wanted to plug into K8s.

K8s had a problem. Supporting rkt meant writing a second set of hardcoded calls in kubelet. Then if a third runtime appeared, a third set. Every new runtime meant more hardcoded integration code in kubelet.

So K8s did what any engineer does when two implementations exist: define an interface. They called it **CRI**. Container Runtime Interface. A gRPC API. kubelet would speak CRI. Any runtime that understood CRI could plug in. Clean.

rkt implemented CRI. containerd implemented CRI. Docker didn't.

rkt forced CRI into existence, then lost the war. CoreOS got acquired by Red Hat in 2018. Red Hat already had CRI-O. CNCF archived rkt in 2019.

---

## Docker never implemented CRI

Docker had its own orchestrator: **Swarm**. Same job as K8s, simpler interface. `docker swarm init` and you had a cluster. `docker service scale web=5` and you had five replicas. No YAML walls, no CRD, no etcd.

Docker never implemented CRI. Why would they? CRI was K8s's standard, and Docker had Swarm. Implementing CRI meant acknowledging K8s as the platform and Swarm as just another runtime. That's not a technical decision. That's a pride decision.

---

## K8s blinks first: dockershim

K8s couldn't drop Docker. In 2016, Docker was the only runtime most clusters knew. Dropping support meant losing the entire user base.

So K8s caved. It wrote the translator itself: `dockershim`. A shim that converted CRI calls into Docker API calls. It lived inside the K8s codebase. K8s maintained it. K8s paid the cost. Every CRI change meant updating dockershim too. Docker didn't write a single line of this. K8s did all the work to keep Docker compatible with a standard Docker refused to implement.
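
The translation job can be pictured with a small adapter. This is a minimal sketch, not dockershim's real code: `FakeDockerEngine` and `ShimSketch` are hypothetical names, and the engine only records requests instead of sending them. The paths mirror Docker Engine API endpoints (`/containers/create`, `/containers/{id}/start`, `/images/create`), but the JSON bodies that real requests carry are left out.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDockerEngine:
    """Hypothetical stand-in for dockerd: records requests, sends nothing."""
    requests: list = field(default_factory=list)
    created: int = 0

    def post(self, path: str) -> None:
        self.requests.append(f"POST {path}")

    def create_container(self, name: str) -> str:
        self.post(f"/containers/create?name={name}")
        self.created += 1
        return f"c{self.created}"


class ShimSketch:
    """CRI-shaped methods on top, Docker-shaped requests underneath."""

    def __init__(self, docker: FakeDockerEngine):
        self.docker = docker

    def run_pod_sandbox(self, pod_name: str) -> str:
        # Docker has no Pod concept, so the sandbox is modelled as one extra container.
        sandbox_id = self.docker.create_container(f"sandbox_{pod_name}")
        self.docker.post(f"/containers/{sandbox_id}/start")
        return sandbox_id

    def pull_image(self, image: str) -> None:
        self.docker.post(f"/images/create?fromImage={image}")

    def create_container(self, sandbox_id: str, name: str) -> str:
        # A real request would also join the sandbox's namespaces via the
        # JSON body (for example HostConfig.NetworkMode); omitted in this sketch.
        return self.docker.create_container(name)

    def start_container(self, container_id: str) -> None:
        self.docker.post(f"/containers/{container_id}/start")


if __name__ == "__main__":
    engine = FakeDockerEngine()
    shim = ShimSketch(engine)
    sandbox = shim.run_pod_sandbox("web")
    shim.pull_image("nginx")
    shim.start_container(shim.create_container(sandbox, "web_nginx"))
    print("\n".join(engine.requests))
```

Every method on `ShimSketch` is something K8s had to keep in sync with CRI. Nothing in `FakeDockerEngine` had to change.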

> **Era 0** (2014–2016): Hardcoded Docker calls in kubelet. No interface. No cost.
>
> **Era 1** (2016+): CRI exists. Docker refuses. K8s writes dockershim to accommodate. (100% K8s's work.)
>
> `kubelet → dockershim → Docker Engine → containerd → runc → container`
>
> Four hops to reach what matters. K8s doesn't need the Docker Engine layer. But Docker won't let go.

The irony: Docker Engine internally used containerd to manage containers and runc to create them. K8s was talking to Docker, who was talking to containerd. The middleman added latency for zero value.

K8s defined a standard. The biggest player refused to follow it. And K8s, not Docker, wrote the compatibility layer. The one who set the standard ended up doing the grunt work for the one who ignored it. That shim haunted the codebase for six years.

---

## The breakup, in four steps

### Step 1: K8s + containerd community bypass Docker (containerd 1.0, 2017)

Someone finally asked the obvious question: "Why are we talking to Docker if Docker is just talking to containerd? Why not skip the middleman?"

K8s SIG-node and containerd maintainers built `CRI-containerd` together. A standalone daemon that translated CRI calls directly to containerd. Docker had nothing to do with this.

> **Era 2**: Docker bypassed. But a new daemon added. (K8s + containerd joint effort.)
>
> `kubelet → CRI-containerd (standalone daemon) → containerd → runc → container`

Docker Engine gone from the path. Progress. But CRI-containerd was yet another process to deploy and babysit.

### Step 2: containerd absorbs the translator (containerd 1.1, 2018)

The containerd community built CRI support as a built-in plugin. CRI-containerd the standalone daemon disappeared into containerd itself. K8s didn't need to do anything here. containerd made itself directly usable.

> **Era 3**: One call. No middlemen. (containerd community's initiative.)
>
> `kubelet → containerd (built-in CRI plugin) → runc → container`

This is what most clusters run today. Three eras and two throwaway components (dockershim, CRI-containerd) to arrive at "just talk to containerd directly."

### Step 3: Red Hat starts from scratch (CRI-O)

While everyone was busy removing Docker from the path, Red Hat built CRI-O from scratch. Purpose-built for K8s. No Docker legacy. No image build. No extras. Speaks CRI on top, speaks OCI on bottom. Does exactly what K8s needs and nothing more.

> Alternative path: `kubelet → CRI-O → runc → container`

OpenShift uses CRI-O by default. containerd can do more things. CRI-O's pitch is that it shouldn't have to.

### Step 4: K8s deletes dockershim (K8s 1.24, 2022)

K8s finally stopped accommodating. Six years of maintaining a translator for someone else's refusal. Deleted. containerd became the de facto default runtime. Running containers worked fine. But something else broke.

**The collateral damage: docker.sock**

Before 1.24, every Node ran Docker Engine. Docker Engine exposed `/var/run/docker.sock`. Many CI pipelines (Jenkins, GitLab Runner, Tekton) ran as Pods inside the cluster and mounted that socket to build images on the Node's Docker.

Before 1.24, a typical Node had two daemons:
- `Docker Engine` → `/var/run/docker.sock` ← CI Pods mounted this
- `containerd` → `/run/containerd/containerd.sock` ← Docker used this internally

After 1.24, the Node has one:
- `containerd` → `/run/containerd/containerd.sock` ← kubelet talks directly to this
- Docker Engine → gone
- docker.sock → gone

containerd has its own socket. But `docker build` doesn't speak containerd's API. It only talks to Docker's socket. Different program, different protocol.

> `docker build` → looks for `/var/run/docker.sock` → file doesn't exist → fails

The problem was never "no socket." It was "Docker is gone from this Node, and docker CLI can't talk to anything else."
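
A quick way to see which situation a Node is in is to check which sockets exist. The sketch below is a hypothetical helper, not part of any K8s or Docker tooling. The two paths are the conventional defaults and are an assumption here; a given Node may be configured differently.

```python
import os
import stat

# Conventional default locations (assumed; a Node may override them).
DOCKER_SOCK = "/var/run/docker.sock"
CONTAINERD_SOCK = "/run/containerd/containerd.sock"


def is_unix_socket(path: str) -> bool:
    """True if path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


def diagnose_node(docker_sock: str = DOCKER_SOCK,
                  containerd_sock: str = CONTAINERD_SOCK) -> str:
    """Describe what a socket-mounting CI Pod would find on this Node."""
    if is_unix_socket(docker_sock):
        return "docker.sock present: socket-mounting builds still work"
    if is_unix_socket(containerd_sock):
        return "containerd only: docker CLI has nothing to talk to"
    return "no known runtime socket found"


if __name__ == "__main__":
    print(diagnose_node())
```

On a post-1.24 Node without cri-dockerd, the middle branch is the one you would expect to hit.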

**Three ways out:**

1. **Install cri-dockerd** → Docker Engine comes back → docker.sock returns → pipeline unchanged
2. **Switch to kaniko** → builds images in user space → no daemon, no socket, no privileged mode
3. **Switch to buildah** → same idea as kaniko, Red Hat's version

kaniko and buildah gained adoption as daemon-less alternatives. cri-dockerd kept the lights on for teams that couldn't rewrite their pipelines overnight. But the call chain tells you why it's a dead end:

> `kubelet → cri-dockerd → Docker Engine → containerd → runc`

Three hops to reach containerd. The mainstream path reaches it in one.
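
The hop counts are plain arithmetic on the call chains. A tiny sketch, with a hypothetical `hops_to` helper, makes the comparison explicit:

```python
def hops_to(chain: str, target: str) -> int:
    """Count the arrows between kubelet (first element) and target."""
    # Drop parenthetical notes such as "(built-in CRI plugin)" before matching.
    names = [part.split("(")[0].strip() for part in chain.split("→")]
    return names.index(target)


LEGACY = "kubelet → cri-dockerd → Docker Engine → containerd → runc"
MAINSTREAM = "kubelet → containerd (built-in CRI plugin) → runc → container"

if __name__ == "__main__":
    print(hops_to(LEGACY, "containerd"), hops_to(MAINSTREAM, "containerd"))
```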

---

## The three layers

Container management splits into three layers. Each layer has an interface. Each interface has multiple implementations.

```
High-level management     High-level runtime      Low-level runtime
(who gives orders)        (who manages lifecycle)  (who creates containers)
                │                    │                       │
                │       CRI          │          OCI          │
                │    (gRPC API)      │      (spec + API)     │
                ▼                    ▼                       ▼
Kubernetes ─────────→ containerd ──────────→ runc ─────→ container
crictl          ────→ CRI-O    ──────────→ kata-runtime → VM + container
docker          ────→ Docker   ──────────→ gVisor (runsc) → sandbox + container
podman          ────→ libpod
```

Every name in the left column:

- **Kubernetes / kubelet**: talks CRI to whatever runtime is configured.
- **crictl**: CLI debugging tool for CRI runtimes. Think of it as "`docker ps` for containerd." Speaks CRI directly. `crictl ps` shows K8s containers that `docker ps` can't see. Maintained by the Kubernetes project in the cri-tools repository.
- **docker**: the Docker CLI. Talks to Docker Engine (dockerd) over Docker's own API. Nothing to do with CRI.
- **podman**: Red Hat's Docker replacement for local development. Same CLI (`podman build`, `podman run`). **Daemonless**: each command runs its own process. Uses **libpod** as its container library. Not a K8s runtime. Not CRI. A developer tool for running containers without Docker.

### CRI: the left interface

Container Runtime Interface. A gRPC API that kubelet speaks. Any high-level runtime that implements CRI can plug into K8s.

CRI only asks the minimum. It doesn't care what's behind the interface: Linux namespace, VM, sandbox. As long as you can "create an isolated environment, run a process, report status," you're a valid CRI implementation. That's why virtlet could run VMs pretending to be Pods. kubelet never asked "are you a container or a VM?" because CRI doesn't have that question.

The API splits into two gRPC services:

**RuntimeService** (Pod and container lifecycle):
- `RunPodSandbox` → create the sandbox (network namespace, etc.)
- `StopPodSandbox` → stop it
- `CreateContainer` → create a container inside a sandbox
- `StartContainer` → start it
- `StopContainer` → stop it
- `ListContainers` → what's running?
- `ContainerStatus` → what state is it in?
- `ExecSync` → run a command inside a container
- `Attach` → attach stdin/stdout
- `PortForward` → forward a port

**ImageService** (image management):
- `PullImage` → pull from registry
- `ListImages` → what's cached?
- `RemoveImage` → delete from cache
- `ImageStatus` → size, digest, etc.

Full protobuf definition: [kubernetes/cri-api/api.proto](https://github.com/kubernetes/cri-api/blob/v0.33.1/pkg/apis/runtime/v1/api.proto)

kubelet doesn't care if it's talking to containerd or CRI-O. It calls `RunPodSandbox`, `CreateContainer`, `StartContainer`. The runtime handles the rest.
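
The order of those calls can be sketched with an in-memory stand-in. `RecordingRuntime` and `start_pod` are hypothetical names. The method names follow the CRI RPCs listed above, but the real API is gRPC with richer request and response messages, and the real kubelet does far more bookkeeping than this.

```python
class RecordingRuntime:
    """Hypothetical runtime that logs each CRI-style call it receives."""

    def __init__(self):
        self.calls = []
        self.images = set()

    # RuntimeService
    def run_pod_sandbox(self, pod: str) -> str:
        self.calls.append("RunPodSandbox")
        return f"sandbox-{pod}"

    def create_container(self, sandbox_id: str, image: str) -> str:
        self.calls.append("CreateContainer")
        return f"{sandbox_id}/{image}"

    def start_container(self, container_id: str) -> None:
        self.calls.append("StartContainer")

    # ImageService
    def image_status(self, image: str) -> bool:
        self.calls.append("ImageStatus")
        return image in self.images

    def pull_image(self, image: str) -> None:
        self.calls.append("PullImage")
        self.images.add(image)


def start_pod(runtime, pod: str, images: list) -> str:
    """Simplified kubelet loop: sandbox first, then each container."""
    sandbox = runtime.run_pod_sandbox(pod)
    for image in images:
        if not runtime.image_status(image):  # pull only when not cached
            runtime.pull_image(image)
        runtime.start_container(runtime.create_container(sandbox, image))
    return sandbox


if __name__ == "__main__":
    rt = RecordingRuntime()
    start_pod(rt, "web", ["nginx", "nginx"])
    print(rt.calls)
```

`start_pod` never checks which runtime it was handed. Swap in any object with the same methods and the loop is unchanged, which is the whole point of CRI.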

### OCI: the right interface

Open Container Initiative. Two specs:

**Image spec**: how container images are packaged. Layers, manifests, digests. Docker invented this format. Then donated it to OCI. `docker pull nginx` and `ctr image pull docker.io/library/nginx:latest` get the same bytes (`ctr` needs the fully qualified reference that `docker` fills in for you). Docker's manifest v2 and OCI's manifest have minor structural differences. Registries and runtimes handle both transparently.

**Runtime spec**: how containers are created. A standard directory layout: `config.json` (namespaces, cgroups, mounts, env vars) plus a `rootfs` (the filesystem). Any low-level runtime that reads this format can run the container.

containerd doesn't care if it calls runc or kata-runtime. It passes an OCI bundle. The low-level runtime reads `config.json`, sets up isolation, starts the process.
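
To make the bundle layout concrete, here is a minimal sketch that writes one to disk. `write_bundle` is a hypothetical helper, and the `config.json` it produces is a small subset of the runtime spec; `runc spec` generates a complete template. The `rootfs` here is empty, so the result shows the shape of a bundle and is not something a runtime could actually start.

```python
import json
import tempfile
from pathlib import Path


def write_bundle(bundle_dir: str, args: list) -> Path:
    """Create <bundle_dir>/config.json and <bundle_dir>/rootfs/."""
    bundle = Path(bundle_dir)
    (bundle / "rootfs").mkdir(parents=True, exist_ok=True)
    config = {
        "ociVersion": "1.0.2",
        "process": {
            "args": list(args),                 # command to exec inside the container
            "cwd": "/",
            "env": ["PATH=/usr/bin:/bin"],
        },
        "root": {"path": "rootfs", "readonly": True},
        "linux": {
            # Each entry asks the low-level runtime for a fresh namespace.
            "namespaces": [{"type": "pid"}, {"type": "mount"}, {"type": "network"}],
        },
    }
    (bundle / "config.json").write_text(json.dumps(config, indent=2))
    return bundle


if __name__ == "__main__":
    path = write_bundle(tempfile.mkdtemp(), ["sleep", "60"])
    print(sorted(p.name for p in path.iterdir()))
```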

---

## Every name explained

### containerd

High-level runtime. Pulls images, manages snapshots (filesystem layers), calls runc to create containers, monitors their lifecycle. Doesn't create containers itself. Delegates to a low-level runtime.

containerd started inside Docker. Docker's monolith got too big, so Docker split itself into pieces. containerd was one of them. In 2017 Docker donated it to CNCF (the same foundation that hosts K8s). In 2019 it graduated as an independent project. Now it belongs to nobody. Docker uses it. K8s uses it. Neither owns it.

### runc

Low-level runtime. The default. Takes an OCI bundle, sets up Linux namespaces, cgroups, and the root filesystem (`pivot_root` by default), then exec's the container process. Finishes in milliseconds. Then exits.

That's the key: **runc exits after creating the container.** It's a one-shot tool. Start the container, leave. Someone else needs to babysit.

### containerd-shim

runc exits after creating the container. shim stays as the container's parent process. containerd can crash, restart, upgrade. shim keeps the container alive. Uses a double fork trick to reparent to PID 1 (systemd), cutting the process tree link to containerd.

v2 (`containerd-shim-runc-v2`) changed from per-container to per-Pod, gRPC to ttrpc, and added pluggable runtime naming.
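
The double fork itself is a generic Unix technique and can be demonstrated without containerd. The sketch below is not the shim's code; `spawn_detached` is a hypothetical helper, and it needs a Unix-like OS. It forks twice, lets the intermediate process exit, and has the grandchild report who its parent is afterwards.

```python
import os


def spawn_detached():
    """Double fork; return (intermediate_pid, parent PID seen by the grandchild)."""
    go_r, go_w = os.pipe()      # parent -> grandchild: "report now"
    out_r, out_w = os.pipe()    # grandchild -> parent: observed parent PID

    intermediate = os.fork()
    if intermediate == 0:
        if os.fork() == 0:
            # Grandchild: plays the role of the shim.
            os.close(go_w)
            os.close(out_r)
            os.read(go_r, 1)    # wait until the intermediate is gone
            os.write(out_w, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)             # intermediate exits at once, orphaning the grandchild

    os.close(go_r)
    os.close(out_w)
    os.waitpid(intermediate, 0)  # intermediate has exited; grandchild is reparented
    os.write(go_w, b"x")
    os.close(go_w)
    observed = int(os.read(out_r, 32).decode())
    os.close(out_r)
    return intermediate, observed


if __name__ == "__main__":
    mid, seen = spawn_detached()
    print(f"intermediate={mid} grandchild's parent is now {seen}")
```

The grandchild's parent is no longer the process that forked it. On a typical host it is PID 1, or the nearest subreaper if one is set. Either way the original caller can die without taking the grandchild with it.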

Deep dive: [Inside containerd: How shim and runc Actually Work](/v2/containerd-shim-and-runc)

### CRI-O

High-level runtime. Red Hat's answer to containerd. Built specifically for K8s. No Docker legacy, no image build, no extras. Speaks CRI, calls OCI runtimes. Versions match K8s versions (CRI-O 1.28 for K8s 1.28).

### dockershim (dead)

Translator that lived in K8s source code. Converted CRI calls to Docker API calls. Removed in K8s 1.24. If you still need Docker as a runtime, use `cri-dockerd` instead.

### cri-dockerd (the afterlife of dockershim)

Mirantis pulled dockershim's code out of K8s's repo, renamed it `cri-dockerd`, and maintained it independently. Same translation logic. Different address. See [Step 4](#step-4-k8s-deletes-dockershim-k8s-124-2022) for why it exists and why most teams skip it.

### The full routing picture

Three paths into runc. Every one ends at the same place.

```
                    ┌─ containerd (CRI plugin) → runc → container    ← mainstream
                    │
kubelet ── CRI ─────┤─ CRI-O → runc → container                     ← OpenShift
                    │
                    └─ cri-dockerd → Docker → containerd → runc      ← legacy detour
```

---

## Low-level runtime alternatives

Containers are like apartments in a building. Every apartment (container) shares the same foundation (host kernel). runc builds apartments this way. Fast to move in, cheap to run. But if someone cracks the foundation, everyone's apartment is compromised.

The question: how do you give tenants better isolation?

### Option 1: Lock the doors (seccomp, AppArmor)

Don't change the building. Just restrict what tenants can do. "You can use the kitchen and bathroom, but you can't touch the electrical panel or the gas pipes."

runc already supports this. containerd ships with default restrictions out of the box. Lightweight, almost no performance cost.

Problem: the foundation is still shared. A crack in the foundation bypasses every locked door. You're controlling what tenants do, but you can't control what the building itself does when it breaks.

### Option 2: Give everyone their own house (traditional VM)

Separate foundation for each tenant. Total isolation. But now each "apartment" takes seconds to build, eats GBs of memory. You had 50 containers on one server. Now you fit 5 VMs.

### Option 3: gVisor (fake floor)

Google's approach. Build a fake floor between the apartment and the foundation. The tenant (app) thinks it's standing on the real foundation (Linux kernel). It's actually standing on gVisor's imitation floor (a user-space kernel called Sentry). Sentry intercepts every request to the real foundation and decides what to pass through.

```
Normal container:  app → host kernel (direct contact)
gVisor container:  app → Sentry (fake floor) → host kernel (filtered)
```

Safer. Slower (every request goes through an extra layer). Used for running untrusted code, like a cloud function platform where you don't know what users will upload.

### Option 4: Kata (tiny house, real foundation)

Build a real, tiny house for each tenant. Each house has its own foundation (a lightweight VM with its own kernel). A crack in one foundation doesn't affect the others.

```
Normal container:  app → shared host kernel
Kata container:    app → guest kernel (own VM) → host kernel
```

Safest. Heaviest. Requires hardware virtualization support. Used when you absolutely cannot share a kernel (multi-tenant clouds, regulated workloads).

### All of them plug into the same slot

gVisor and Kata are both OCI-compatible. Register the alternative runtime as a handler in containerd or CRI-O, expose it to K8s as a `RuntimeClass`, and a workload opts in with one `runtimeClassName` line in its Pod spec. The app inside the container doesn't know the difference. Same image, nearly the same deployment YAML, different isolation level underneath.
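
On the K8s side, the opt-in is a `RuntimeClass` object plus a `runtimeClassName` field on the Pod. The sketch below builds both manifests as plain dictionaries. The helper names and the `gvisor` / `runsc` values are illustrative assumptions: the handler must match whatever name the Node's containerd or CRI-O configuration actually registers.

```python
import json


def runtime_class(name: str, handler: str) -> dict:
    """Cluster-level object mapping a class name to a runtime handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,  # must match a handler configured on the Node
    }


def sandboxed_pod(name: str, image: str, runtime_class_name: str) -> dict:
    """Ordinary Pod manifest with a single extra field selecting the runtime."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(runtime_class("gvisor", "runsc"), indent=2))
    print(json.dumps(sandboxed_pod("untrusted", "nginx", "gvisor"), indent=2))
```

Remove the `runtimeClassName` key and the same Pod runs on the Node's default runtime.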

---

## containerd namespace isolation

containerd isolates K8s workloads from Docker workloads using namespaces (not K8s namespaces, containerd's own concept).

```bash
$ sudo ctr namespaces ls
NAME    LABELS
k8s.io          # kubelet's containers
moby            # Docker's containers
```

- `kubelet → containerd → containerd-shim (k8s.io namespace) → runc → container`
- `docker → containerd → containerd-shim (moby namespace) → runc → container`

Two clients sharing one containerd. They can't see each other's containers. `docker ps` won't show K8s pods. `crictl ps` won't show Docker containers. Same runtime, isolated worlds.
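
The isolation amounts to scoping every lookup by namespace. A toy model, with a hypothetical `NamespacedStore` class that only imitates the behaviour and says nothing about containerd's internals:

```python
from collections import defaultdict


class NamespacedStore:
    """One store, many namespaces; each listing sees only its own namespace."""

    def __init__(self):
        self._containers = defaultdict(list)

    def create(self, namespace: str, container_id: str) -> None:
        self._containers[namespace].append(container_id)

    def list_containers(self, namespace: str) -> list:
        return list(self._containers[namespace])


if __name__ == "__main__":
    store = NamespacedStore()
    store.create("k8s.io", "pod-web-nginx")   # created on behalf of kubelet
    store.create("moby", "my-local-redis")    # created on behalf of Docker
    print(store.list_containers("k8s.io"))
    print(store.list_containers("moby"))
```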

---

## The timeline

| Year | Event |
|---|---|
| 2014 | K8s launches. Docker hardcoded in kubelet. No interface |
| 2015 | rkt (announced by CoreOS in late 2014) wants to plug into K8s. Can't. Docker is hardcoded |
| 2016 | K8s defines CRI. rkt and containerd implement it. Docker doesn't. K8s writes dockershim |
| 2017 | containerd 1.0. Docker donates it to CNCF. CRI-containerd runs as standalone daemon |
| 2018 | containerd 1.1. CRI plugin built in. Clean kubelet → containerd path |
| 2019 | CRI-O matures. Red Hat ships it with OpenShift. Docker adds K8s support to Docker Desktop. Swarm has lost |
| 2020 | K8s announces dockershim deprecation. Panic ensues |
| 2022 | K8s 1.24. dockershim removed. docker.sock disappears from Nodes. CI pipelines break. kaniko and buildah gain adoption |

Eight years from "Docker is everything" to "Docker is optional." The tools changed. The lesson didn't: interfaces (CRI, OCI) outlive implementations. That's why K8s defined them.

---

## Where Docker ended up

Docker never implemented CRI. Not in 2016. Not in 2018. Not now. Docker Engine still speaks only Docker's own API.

The industry routed around Docker. containerd got CRI support and became the mainstream K8s runtime. Docker Engine became irrelevant on production clusters.

So Docker pivoted. Instead of fighting the runtime war it had already lost, Docker focused on the other end of the pipeline: **building images, not running them**.

| What Docker kept | What Docker lost |
|---|---|
| Docker Desktop (developer local environment) | Runtime on K8s clusters |
| Docker Hub (image registry) | Swarm (orchestration) |
| `docker build` (image construction) | Docker Engine as K8s default |
| Docker Compose (local multi-container dev) | dockershim (K8s deleted it, Mirantis picked up the pieces) |

The current state of the industry:

```
Developer laptop:
  docker build → build image
  docker push → push to registry

Production K8s cluster:
  kubelet → containerd → runc → container
  (No Docker anywhere in this path.)
```

Build with Docker. Run with containerd. Docker is everywhere on laptops. Docker is nowhere on clusters.

### Why Docker donated its core

Docker donated containerd (2017) and runc (2015) to neutral foundations. That sounds like giving away the crown jewels. The logic:

If containerd stays inside Docker Engine, Google and Red Hat build their own runtime. The ecosystem splits. Docker's technology becomes one of many.

If containerd goes to CNCF, everyone uses the same runtime. Docker's technology becomes the industry standard. Docker Desktop and Docker Hub become more valuable because they sit on top of a universal foundation.

Docker bet that **donating the technology would strengthen the brand**. containerd became K8s's default runtime. Docker's tech won. Docker Engine didn't need to.

---

## References

- [CRI API protobuf definition (RuntimeService, ImageService)](https://github.com/kubernetes/cri-api/blob/v0.33.1/pkg/apis/runtime/v1/api.proto)
- [K8s official: Container Runtime Interface (CRI)](https://kubernetes.io/docs/concepts/architecture/cri/)
- [K8s blog: Introducing CRI (2016, architecture diagrams)](https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/)
- [K8s blog: Dockershim Historical Context (2022, full story)](https://kubernetes.io/blog/2022/05/03/dockershim-historical-context/)
- [K8s blog: Don't Panic — Kubernetes and Docker (2020)](https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/)
- [K8s blog: Dockershim Removal FAQ (2022)](https://kubernetes.io/blog/2022/02/17/dockershim-faq/)
- [Alibaba Cloud: Container Runtime evolution (before/after architecture diagrams)](https://www.alibabacloud.com/blog/a-discussion-on-container-runtime---starting-with-dockershim-being-deleted-by-kubernetes_600118)
- [K8s official: Container Runtimes setup guide](https://kubernetes.io/docs/setup/production-environment/container-runtimes/)
- [containerd components: the role of containerd-shim-runc-v2 (cnblogs, in Chinese)](https://www.cnblogs.com/zhangmingcheng/p/17524721.html)
