2025.07.24

Cache System for Container Images in Kubernetes

Hidehito Yabuuchi

This is a machine-translated version of Kubernetes 向けコンテナイメージのキャッシュシステム using PLaMo Translation.

Introduction

Preferred Networks (PFN) develops and operates a machine learning infrastructure using Kubernetes. This article introduces CIRC (pronounced “Sark”), the cache system we developed to accelerate container image pulls and reduce network traffic in this infrastructure. It is based on our session at the inaugural KubeCon + CloudNativeCon Japan, held in June 2025: “New Cache Hierarchy for Container Images and OCI Artifacts in Kubernetes Clusters Using Containerd”. The presentation materials and session recording are available via the provided links.

Background: The Need for a Container Image Cache System

PFN’s Kubernetes clusters are primarily deployed on-premises, while we use managed cloud services for container image registries. This approach was adopted to enable unified user account management and maintain interoperability across various environments, including the clusters themselves.

The clusters and container registries are therefore separated, with limited network bandwidth between them. Moreover, the images used in the clusters tend to be large because they often contain machine learning libraries and accelerator runtimes; some exceed 20 GB even when compressed.

Given these circumstances, we faced the following challenges:

  • Prolonged image pull times
    • Image pulling begins only after a pod is assigned to a node, so the resources allocated to the pod cannot do useful work until the pull finishes. Shortening pulls therefore improves resource utilization in the cluster.
  • Significant network usage from container image transfers
    • In particular, egress traffic from the registry is subject to usage-based billing, making it essential to reduce this volume.

Overview of CIRC

To address these challenges, we developed CIRC. Counting its predecessor, CIRC has been in operation for over five years.

CIRC introduces a new cache hierarchy specifically for container images. This additional caching layer is shared across the entire cluster, making cached images available for pulls from all nodes. This significantly improves cache utilization compared to plain Kubernetes configurations that rely solely on node-specific caching.

PFN’s clusters use containerd as the container runtime. When containerd pulls an image, the request is routed to CIRC. If the image is already cached, CIRC returns it immediately; if not, CIRC retrieves the image from the origin registry, stores it in the cache, and then returns it to containerd. Consequently, each image is fetched from the origin only once unless its cache entry is invalidated, and most image pulls complete entirely within the high-speed intra-cluster network.
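The pull flow described above is a read-through cache. The following is a minimal Python sketch of that idea only, not CIRC's actual code: the class name, the in-memory dictionary (standing in for SCS), and the `fetch_from_origin` callable are all assumptions made for illustration.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Toy stand-in for a cluster-wide image cache (hypothetical, not CIRC itself)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        # Assumption: a plain dict replaces the real distributed cache storage.
        self._store: Dict[str, bytes] = {}
        self.origin_fetches = 0

    def pull(self, digest: str) -> bytes:
        blob = self._store.get(digest)
        if blob is None:
            # Cache miss: fetch from the origin once, then keep the result.
            blob = self._fetch_from_origin(digest)
            self.origin_fetches += 1
            self._store[digest] = blob
        return blob


if __name__ == "__main__":
    cache = ReadThroughCache(lambda digest: b"layer-bytes-for-" + digest.encode())
    cache.pull("sha256:abc")  # miss: goes to the origin
    cache.pull("sha256:abc")  # hit: served from the cache
    print("origin fetches:", cache.origin_fetches)
```

Because the cache sits in front of every node, a second pull of the same digest from any node takes the hit path.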

For cache storage, we use SCS, a distributed cache service we developed. SCS is described in more detail in the linked article and was also presented at KubeCon NA 2024.

CIRC adheres to the OCI Distribution Specification, an industry-standard protocol for container image distribution established by the Open Container Initiative (OCI). Because CIRC implements this standard, it is compatible with a wide range of origin registries and with different versions of containerd.

Key Features of CIRC

CIRC has four key features, which we examine in detail below.

Transparency

Transparency means that users can benefit from CIRC’s functionality without needing to be aware of its presence. Specifically, no modifications to Kubernetes manifests are required, and CIRC operates seamlessly with any registry, including popular platforms like Docker Hub and Quay.io.

To achieve this, CIRC leverages containerd’s Registry Configuration feature. This feature allows redirecting image pull requests from origin registries to alternative servers. By configuring it to target any origin registry, all requests can be routed through CIRC.
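As an illustration of what such a configuration might look like, the sketch below renders a containerd `hosts.toml` that applies to every registry when placed in the `_default` directory of containerd's registry configuration path. PFN's actual configuration is not shown in this article; the helper name and the URL `http://circ.example.internal` are hypothetical.

```python
def render_default_hosts_toml(mirror_url: str) -> str:
    """Render a hosts.toml that sends pulls for any registry to one mirror.

    Intended location (per containerd's registry host configuration layout):
    <config_path>/_default/hosts.toml
    """
    return (
        f'[host."{mirror_url}"]\n'
        '  capabilities = ["pull", "resolve"]\n'
    )


if __name__ == "__main__":
    # Hypothetical in-cluster address for the cache service.
    print(render_default_hosts_toml("http://circ.example.internal"))
```

Listing only the `pull` and `resolve` capabilities keeps the mirror on the read path; whether CIRC's production setup uses exactly these settings is an assumption.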

After receiving requests from containerd, CIRC must fetch images from the origin registry if the images are not already cached. How does CIRC identify which origin registry to use? While we could statically configure this in CIRC, this approach would limit compatibility with arbitrary registries.

containerd itself provides the answer. When the Registry Configuration feature forwards an HTTP request to a configured server, it sets the origin registry’s domain in the ns query parameter, for example ns=quay.io. CIRC parses this parameter to determine which registry to fetch the image from.
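A minimal sketch of this lookup, using only the Python standard library, might look as follows. The function name and the choice to rebuild an `https://` URL from the `ns` value are assumptions for illustration; CIRC's real routing logic is not described in this article.

```python
from urllib.parse import parse_qs, urlsplit


def origin_url_for(request_target: str) -> str:
    """Map a forwarded request target to a URL on the origin registry.

    request_target is the path and query containerd sends to the mirror,
    e.g. "/v2/<name>/manifests/<reference>?ns=quay.io".
    """
    parts = urlsplit(request_target)
    ns_values = parse_qs(parts.query).get("ns")
    if not ns_values:
        raise ValueError("request has no ns query parameter")
    origin_host = ns_values[0]
    # Assumption: the origin is reachable over HTTPS at the ns host.
    return f"https://{origin_host}{parts.path}"


if __name__ == "__main__":
    print(origin_url_for("/v2/example/app/manifests/latest?ns=quay.io"))
```

A real implementation would also have to handle registries whose API host differs from the namespace; Docker Hub, for instance, is addressed as docker.io but served from registry-1.docker.io.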

By utilizing containerd’s built-in functionality, we’ve successfully implemented CIRC’s transparency feature. It’s worth noting that the ns query parameter is currently under discussion for inclusion in the OCI Distribution Specification, with containerd being a pioneer in implementing this feature. We anticipate that its inclusion in the specification will encourage other container runtimes to adopt it as well.

Multi-Tenancy

At PFN, we share clusters among various teams and projects. Since CIRC’s cache is shared across the entire cluster, ensuring secure cache sharing in multi-tenant environments becomes critical. If unauthorized access to images belonging to different tenants were possible through the cache, it could lead to serious security issues.

CIRC achieves secure cache sharing through authentication and authorization. When receiving image pull requests, CIRC first verifies whether the user has permission to access the image. Specifically, it sends an HTTP HEAD request to the origin registry to query authorization status. If access is permitted, CIRC returns the image; otherwise, it responds with a 404 Not Found error. By implementing image-level authentication and authorization, CIRC enables efficient cache sharing across the entire cluster while maintaining strict security for images.
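The check-before-serve order can be sketched as below. This is an illustration under stated assumptions: the function and parameter names are hypothetical, the HEAD request is injected as a callable so the example runs without a network, and forwarding the client's Authorization header to the origin is our assumption about how the permission query is made.

```python
from typing import Callable, Optional, Tuple

# (HTTP status code, body) returned to the client.
Response = Tuple[int, bytes]


def authorize_then_serve(
    origin_url: str,
    client_authorization: Optional[str],
    head_origin: Callable[[str, Optional[str]], int],
    load_blob: Callable[[str], bytes],
) -> Response:
    """Serve a blob only if the origin registry accepts the caller's credentials."""
    # Ask the origin whether this caller may access the image (HTTP HEAD).
    status = head_origin(origin_url, client_authorization)
    if not 200 <= status < 300:
        # Denied or unknown: reveal nothing about what the cache holds.
        return 404, b""
    # Allowed: load_blob may serve from the cache or fall through to the origin.
    return 200, load_blob(origin_url)
```

The important property is that the cache is consulted only after the origin has confirmed access, so a cached image is never returned to a tenant who could not have pulled it directly.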

Preheating

CIRC accelerates image pulls only when the image is already cached. The first time a new image is used, it must still be fetched from the origin registry. What if we could eliminate even this initial fetch, so that CIRC accelerates pulls from the very first request?

Preheating is a feature that enables this optimization by supporting image push operations through CIRC. When building images, users can push them directly to CIRC rather than to the origin registry. CIRC stores received images in its cache while simultaneously uploading them to the origin registry for persistence. Since the image is then already in the cache, users can pull it from CIRC quickly, even on the first request.
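In caching terms, preheating behaves like a write-through path. The sketch below illustrates that idea with hypothetical names; it uploads to the origin sequentially for simplicity, whereas the text above describes the upload happening at the same time as the cache write.

```python
from typing import Callable, Dict, Optional


class PreheatingCache:
    """Toy write-through cache: a push fills the cache and persists to the origin."""

    def __init__(self, upload_to_origin: Callable[[str, bytes], None]) -> None:
        self._upload_to_origin = upload_to_origin
        # Assumption: a plain dict replaces the real distributed cache storage.
        self._store: Dict[str, bytes] = {}

    def push(self, digest: str, blob: bytes) -> None:
        # Keep a copy in the cache so the first pull is already a hit.
        self._store[digest] = blob
        # Persist to the origin registry, which remains the source of truth.
        self._upload_to_origin(digest, blob)

    def pull(self, digest: str) -> Optional[bytes]:
        # Returns None on a miss; the read-through path is omitted here.
        return self._store.get(digest)
```

After a push, the first pull of that digest is served without any fetch from the origin.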

Regarding image push operations, CIRC adheres to the OCI Distribution Specification, allowing images from a wide range of clients to be pushed to it.

Compatibility with OCI Artifacts

Within the Kubernetes and container runtime communities, development is underway on the OCI VolumeSource feature, which enables pods to mount OCI artifacts.

OCI artifacts store arbitrary content in the OCI image format, and the use cases for OCI VolumeSource include distributing AI models. AI model artifacts in particular can reach several tens of GB, which makes fast retrieval of large OCI artifacts crucial.

CIRC also supports pulling OCI artifacts. Because OCI artifacts are stored in the same format as standard OCI images and distributed through container registries, CIRC’s compliance with the OCI Distribution Specification covers them as well. We expect CIRC to speed up pulls of even large OCI artifacts.

Operational Challenges and Solutions

During actual deployment of CIRC, we encountered several operational challenges. Below we outline how we addressed these issues.

Bootstrapping Problem

CIRC itself runs as a Kubernetes Deployment within the cluster, so it can become temporarily unavailable during cluster maintenance or for other reasons. To start up again, CIRC needs its own container image, but the CIRC instance that would serve that pull is the one that is down: a bootstrapping problem. Fortunately, containerd’s Registry Configuration feature provides a fallback mechanism: if forwarding a request fails, containerd falls back to the origin registry. As a result, image pulls continue even during CIRC downtime, and CIRC eventually restarts and serves cached images again.

However, we observed that containerd’s fallback took too long to activate. Image pulls could proceed during CIRC downtime, but progress was extremely slow.

Upon investigation, we discovered that the timeout for containerd’s fallback mechanism was hardcoded at 30 seconds per blob. We therefore proposed an enhancement to containerd that makes this timeout configurable.

This enhancement is included in containerd 2.1.0 and later. By running such a version at PFN and shortening the timeout, image pulls now proceed at a practical speed even before CIRC is fully operational.
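To show why the timeout matters, here is a minimal sketch of host fallback with a configurable per-attempt timeout. It is not containerd's implementation: the function name, the injected `fetch` callable, and the host names in the usage note are hypothetical.

```python
from typing import Callable, Optional, Sequence


def pull_with_fallback(
    hosts: Sequence[str],
    fetch: Callable[[str, float], bytes],
    timeout_seconds: float,
) -> bytes:
    """Try each host in order and return the first successful result.

    fetch(host, timeout_seconds) is expected to raise TimeoutError or
    ConnectionError when the host does not answer in time.
    """
    last_error: Optional[Exception] = None
    for host in hosts:
        try:
            return fetch(host, timeout_seconds)
        except (TimeoutError, ConnectionError) as exc:
            # This host is unavailable; remember why and try the next one.
            last_error = exc
    raise RuntimeError("all hosts failed") from last_error
```

With the cache listed first and the origin second, every blob requested while the cache is down waits up to `timeout_seconds` before the origin is tried, so lowering that value directly shortens pulls during downtime.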

Thundering Herd Problem

Operations such as rolling out a DaemonSet cause many nodes to pull the same image at the same time. All of these requests reach CIRC, and if each were processed independently, CIRC would send many simultaneous requests to the origin registry: the well-known thundering herd problem. This would largely defeat the purpose of the cache.

To address this issue, CIRC employs the singleflight concurrency pattern. With singleflight, when multiple identical requests arrive at the same time, only one is processed while the others wait for its result. Retrieval from the origin therefore occurs only once, and the waiting requests are then served quickly from the cache.
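The pattern itself can be sketched within a single process as follows. This is an illustrative Python version using threads, with hypothetical class names; it is not CIRC's code, and it does not cover coordination across multiple CIRC instances.

```python
import threading
from typing import Any, Callable, Dict, Optional


class _Call:
    """State shared by the leader and the waiters of one in-flight key."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.result: Any = None
        self.error: Optional[BaseException] = None
        self.waiters = 0


class SingleFlight:
    """Collapse concurrent calls for the same key into a single execution."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._calls: Dict[str, _Call] = {}

    def do(self, key: str, fn: Callable[[], Any]) -> Any:
        with self._lock:
            call = self._calls.get(key)
            leader = call is None
            if call is None:
                call = _Call()
                self._calls[key] = call
            else:
                call.waiters += 1
        if leader:
            try:
                call.result = fn()
            except Exception as exc:
                call.error = exc
            finally:
                # Later callers start a fresh flight; current waiters wake up.
                with self._lock:
                    del self._calls[key]
                call.done.set()
        else:
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result
```

The first caller for a key becomes the leader and runs `fn`; callers that arrive while it is running block on the event and receive the same result or the same exception.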

To implement singleflight across the cluster, we use Kubernetes Lease objects for leader election. Goroutines compete for the lock on a Lease object, so only one CIRC goroutine in the entire cluster performs a given image retrieval at a time.
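The acquire-or-wait logic behind a lease-based lock can be sketched with an in-memory stand-in. The field names below mirror the Lease spec (`holderIdentity`, `renewTime`, `leaseDurationSeconds`), but everything else is an assumption: how CIRC names its Leases and handles renewal is not described here, and a real implementation would update the object through the Kubernetes API, relying on its optimistic concurrency so that only one competing writer succeeds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """In-memory stand-in for the relevant fields of a Kubernetes Lease."""

    holder_identity: Optional[str] = None
    renew_time: float = 0.0
    lease_duration_seconds: float = 15.0


def try_acquire(lease: Lease, candidate: str, now: float) -> bool:
    """Take or renew the lease if it is free, already ours, or expired.

    Not atomic: this sketch is only safe when called from a single thread.
    """
    expired = now >= lease.renew_time + lease.lease_duration_seconds
    if lease.holder_identity in (None, candidate) or expired:
        lease.holder_identity = candidate
        lease.renew_time = now
        return True
    return False
```

A candidate that fails to acquire the lease waits for the holder to finish and then reads the result from the cache; expiry ensures that a crashed holder does not block retrieval forever.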

Real-World Outcome

This section demonstrates the actual performance improvements achieved in PFN’s production clusters after implementing CIRC.

First, pod startup time (defined as the duration from node allocation until the container begins execution) was reduced by approximately 20%. This improvement was primarily due to faster image pulls through CIRC. Thanks to its transparency, users benefited from this speedup without needing to be aware of CIRC’s presence.

Additionally, data transfer from the origin registry to the cluster was reduced by about 23 TB over the course of one week. CIRC’s cache hit rate exceeded 98%, helped by the preheating functionality. We anticipate even greater savings once we leverage the OCI VolumeSource feature.

Note that these performance metrics can vary significantly depending on environment and workload characteristics.

Summary

We have introduced CIRC, a container image caching system designed for Kubernetes clusters. The implementation of CIRC has successfully reduced image pull times and decreased network data transfer volumes in environments where clusters and container registries are separated.

The development and achievements of CIRC are built upon the foundation provided by containerd’s features and OCI standardization. Without the efforts of these communities, CIRC would not have been possible. PFN will continue to actively contribute to and collaborate with the container and Kubernetes communities.
