Caching

The same idea, everywhere. From CPU registers to CDNs.

The Universal Pattern

Caching is not a "technique" - it's a fundamental response to physics. Whenever there's a speed gap between two layers, a cache emerges. CPU registers to L1. RAM to disk. Server to database. Browser to origin. The pattern is always the same.

Fast thing ←→ [CACHE] ←→ Slow thing

Caches Are Everywhere

CPU Registers: 0.5ns
L1 Cache: 1ns
L2 Cache: 4ns
L3 Cache: 40ns
RAM: 100ns
Page Cache: ~RAM
SSD: 100μs
Network: 1-100ms
From fastest (0.5ns) to slowest (100ms): a 200,000,000x difference

The Journey

First, the universal principles. Then, see them applied at every layer of the stack.

1

The Speed Gap

Why caches exist (it's physics, not preference)

From CPU registers (0.5ns) to disk (10ms), the gap is 20,000,000x. Caches exist because physics created speed gaps.

Key Insight: Every layer boundary is a potential cache location. The speed gap creates the need.
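
To make the speed gap concrete, here is a small arithmetic sketch in Python. The latencies are the figures quoted on this page; the helper names and the 95% hit rate are illustrative assumptions, and the model is simplified (a miss pays only the slow layer's cost).

```python
# Latencies quoted on this page, in nanoseconds.
LATENCY_NS = {
    "register": 0.5,
    "ram": 100.0,
    "ssd": 100_000.0,       # 100 microseconds
    "disk": 10_000_000.0,   # 10 milliseconds
}

def speed_gap(fast: str, slow: str) -> float:
    """How many times slower `slow` is than `fast`."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]

def average_access_ns(hit_rate: float, hit_ns: float, miss_ns: float) -> float:
    """Expected latency when a fraction `hit_rate` of requests is served by the cache.
    Simplified model (assumption): a miss pays only the slow layer's cost."""
    return hit_rate * hit_ns + (1.0 - hit_rate) * miss_ns

gap = speed_gap("register", "disk")
avg = average_access_ns(0.95, LATENCY_NS["ram"], LATENCY_NS["ssd"])
print(f"register -> disk gap: {gap:,.0f}x")
print(f"RAM cache in front of SSD at 95% hits: {avg:,.0f} ns on average")
```

Even at a 95% hit rate, the misses dominate the average, which is why small changes in hit rate matter so much when the gap is large.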
2

The Caching Contract

Coming Soon

What every cache must promise

A cache pretends to BE the slow thing, but faster. It bets you'll ask for the same thing again soon.

Key Insight: A cache is a bet: 'You'll probably ask for this again soon.'
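
One way to picture the contract is a read-through wrapper that exposes the same interface as the slow thing it fronts. This is a minimal sketch: `SlowStore` and `ReadThroughCache` are hypothetical names, and the cache is unbounded and never invalidated, which a real cache could not get away with.

```python
class SlowStore:
    """Stand-in for the slow layer (hypothetical); counts how often it is read."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._data[key]


class ReadThroughCache:
    """Exposes the same get() as the store it wraps, so callers cannot tell them apart."""

    def __init__(self, backing):
        self._backing = backing
        self._entries = {}

    def get(self, key):
        if key not in self._entries:      # miss: pay the slow cost once
            self._entries[key] = self._backing.get(key)
        return self._entries[key]         # hit: answered from fast memory


store = SlowStore({"user:1": "Ada"})
cache = ReadThroughCache(store)
first = cache.get("user:1")    # miss, goes to the store
second = cache.get("user:1")   # hit, the bet paid off
print(first, second, store.reads)
```

The caller sees identical answers either way; only the number of trips to the slow store changes.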
3

Eviction - The Hard Choice

Coming Soon

What to forget when memory is full

LRU, LFU, FIFO, ARC - every policy is a bet about future access patterns. There is no perfect answer.

Key Insight: There is no perfect eviction policy. Every policy is a bet about future access patterns.
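
As an illustration of one such bet, here is a minimal LRU sketch in Python built on `collections.OrderedDict`. The class name `LRUCache` and its methods are illustrative choices, not taken from any particular library. LRU bets that whatever was used least recently is least likely to be needed next.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()   # oldest access first, newest last

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        self._entries.move_to_end(key)  # a read counts as a use
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)   # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now more recent than "b"
cache.put("c", 3)    # over capacity: "b" is evicted
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

A workload that scans a large dataset once would defeat this policy, since every scanned item looks "recent" and pushes out the genuinely hot entries.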
4

Invalidation - The Hardest Problem

Coming Soon

"There are only two hard things in CS..."

TTL, write-through, event-based - how does the cache know its data is stale? The unavoidable consistency vs performance tradeoff.

Key Insight: You cannot have instant invalidation AND maximum performance. Choose your tradeoff explicitly.
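
The TTL end of that tradeoff can be sketched as follows. `TTLCache` is a hypothetical class, and the injectable `clock` parameter is an assumption added so the behavior can be shown without waiting. A longer TTL means fewer trips to the source but a longer window of staleness; `invalidate()` stands in for the event-based alternative.

```python
import time


class TTLCache:
    """Entries expire `ttl_seconds` after they are written."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # key -> (value, expires_at)

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self.ttl)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:   # too old to trust: treat as a miss
            del self._entries[key]
            return default
        return value

    def invalidate(self, key):
        """Event-based invalidation: drop the entry as soon as the source changes."""
        self._entries.pop(key, None)


now = [0.0]                                   # fake clock for the demonstration
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
cache.put("price", 100)
fresh = cache.get("price")                    # within the TTL
now[0] = 31.0
expired = cache.get("price")                  # past the TTL
print(fresh, expired)
```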
5 - Hardware

CPU Caches

Coming Soon

The cache you never control (but must understand)

L1/L2/L3 are invisible to your code, but they can make the same code run 10x faster or 10x slower. Cache lines, spatial locality, and why sequential access wins.

Key Insight: The CPU cache behaves like an approximately-LRU cache for memory, with cache lines that are typically 64 bytes. Access memory sequentially.
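
Why sequential access wins can be seen by counting cache lines rather than timing anything. The sketch below assumes 64-byte lines and 8-byte items (both assumptions; real line sizes vary by CPU) and counts how many distinct lines two access patterns touch. It models addresses only and does not measure real hardware.

```python
LINE_BYTES = 64   # assumed cache line size
ITEM_BYTES = 8    # assumed item size, e.g. one 64-bit integer


def lines_touched(indices) -> int:
    """Number of distinct cache lines covered by the given array indices."""
    return len({(i * ITEM_BYTES) // LINE_BYTES for i in indices})


n = 1024
sequential = lines_touched(range(n))           # 1024 adjacent items
strided = lines_touched(range(0, n * 8, 8))    # 1024 items, 64 bytes apart
print(sequential, strided)
```

Both patterns read 1024 items, but the sequential walk gets eight items out of every line it fetches, while the strided walk fetches a whole new line for each item.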
6 - Operating System

The Page Cache

Coming Soon

The OS's gift you didn't know you were using

Every buffered file read goes through the page cache. On a hit, read() doesn't read from disk - it copies from RAM. The OS already caches your I/O.

Key Insight: When you add application-level caching, you're caching a cache.
7 - Application

Application Caches

Coming Soon

When the OS cache isn't enough (RocksDB Block Cache)

The page cache stores file bytes exactly as they sit on disk, so compressed files stay compressed and every read pays for decompression. Application caches store the TRANSFORMED version - decompressed, parsed, computed.

Key Insight: Application caches add value when they cache a TRANSFORMED version of data.
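
A minimal sketch of that idea, using `zlib` from the Python standard library. The `BlockCache` class here is hypothetical and unbounded for brevity; it is not RocksDB's API, only an illustration of caching the decompressed form so the transformation runs once per block instead of once per read.

```python
import zlib


class BlockCache:
    """Caches decompressed blocks keyed by block id (hypothetical sketch)."""

    def __init__(self, compressed_blocks):
        self._compressed = compressed_blocks   # stands in for bytes as stored on disk
        self._decompressed = {}
        self.decompress_calls = 0

    def read(self, block_id):
        block = self._decompressed.get(block_id)
        if block is None:                      # miss: do the expensive transform
            self.decompress_calls += 1
            block = zlib.decompress(self._compressed[block_id])
            self._decompressed[block_id] = block
        return block                           # hit: already in usable form


raw = b"key=value;" * 1000
cache = BlockCache({0: zlib.compress(raw)})
for _ in range(5):
    data = cache.read(0)
print(len(data), cache.decompress_calls)
```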
8 - Distributed System

Distributed Caches

Coming Soon

When one machine's memory isn't enough (Redis, Memcached)

100 app servers each caching the same data = waste. Shared cache tier solves this, but now you fight the network.

Key Insight: Distributed caches trade latency for shared capacity. Same patterns, now with network.
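
The trade can be put into rough numbers. In this sketch the 4 GB per-server cache is a hypothetical figure, the hit latencies reuse the RAM and low-end network figures quoted earlier on this page, and pooling all memory into one shared tier is a deliberate simplification.

```python
SERVERS = 100
LOCAL_CACHE_GB = 4        # hypothetical per-server cache size
LOCAL_HIT_MS = 0.0001     # about 100ns, the RAM figure quoted on this page
SHARED_HIT_MS = 1.0       # low end of the 1-100ms network range quoted on this page


def distinct_data_gb(servers: int, gb_each: int, duplicated: bool) -> int:
    """Distinct data cached across the fleet.
    duplicated=True models every server holding the same hot set."""
    return gb_each if duplicated else servers * gb_each


local = distinct_data_gb(SERVERS, LOCAL_CACHE_GB, duplicated=True)
shared = distinct_data_gb(SERVERS, LOCAL_CACHE_GB, duplicated=False)
print(f"per-server caches: {local} GB distinct, {LOCAL_HIT_MS} ms per hit")
print(f"shared tier:       {shared} GB distinct, {SHARED_HIT_MS} ms per hit")
```

Under these assumptions the shared tier holds 100x more distinct data, and every hit becomes several orders of magnitude slower than a local one.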
9 - Global

CDNs and Edge Caches

Coming Soon

Caching across the planet

Server in Virginia, user in Tokyo = roughly 200ms per round trip (physics!). Copy the data to the edge. Same eviction/invalidation tradeoffs, global scale.

Key Insight: CDNs are geographically distributed caches, commonly with LRU-style eviction, that speak HTTP.
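
Because the protocol is HTTP, the origin tells edge caches how long a response may be reused through the `Cache-Control` header. The sketch below is a deliberately simplified freshness check for a shared cache: the function names are hypothetical, and it ignores heuristic freshness, `Expires`, and most other directives that real caches honor.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header into {directive: value} (value is '' if absent)."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"')
    return directives


def can_serve_without_revalidation(cache_control: str, age_seconds: int) -> bool:
    """Simplified rule for a shared cache: s-maxage takes precedence over max-age."""
    d = parse_cache_control(cache_control)
    if "no-store" in d or "no-cache" in d:
        return False
    lifetime = d.get("s-maxage") or d.get("max-age")
    if lifetime is None or not lifetime.isdigit():
        return False              # no explicit lifetime: this sketch plays it safe
    return age_seconds < int(lifetime)


print(can_serve_without_revalidation("public, max-age=60", 30))
print(can_serve_without_revalidation("public, max-age=60", 61))
print(can_serve_without_revalidation("max-age=60, s-maxage=10", 30))
```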
10

The Unifying Patterns

Coming Soon

What every cache has in common

Latency tradeoff, consistency spectrum, capacity curves, cold start, observability. Once you understand these, you understand every cache.

Key Insight: The implementation differs, the principles don't.

Connections to Other Chapters

File I/O

Page cache is the OS's file cache. Writes land there first and reach the disk later, which is why fsync() matters.

SST Files

Block cache stores decompressed SST blocks. Bloom filters let a lookup skip files that cannot contain the key, so those reads never reach the cache at all.

LSM Compaction

Compaction replaces old SST files with new ones. How does block cache handle invalidation?