The Disk Doesn't Care
You ask for 1 byte. The disk reads 4 KiB. The hardware speaks in blocks, not bytes.
[Figure: a 32 KiB file drawn as 8 blocks of 4 KiB each. The highlighted dot marks the byte being read, at offset 5000.]
What actually happens
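For the byte at offset 5000, the arithmetic can be sketched as below. This is a minimal illustration, assuming zero-based block numbering and the 4 KiB block size from the diagram; `locate` and `BLOCK_SIZE` are hypothetical names chosen for this example, not part of any real API.

```python
BLOCK_SIZE = 4096  # bytes per block (4 KiB), as in the diagram

def locate(offset, block_size=BLOCK_SIZE):
    """Map a byte offset to (block index, offset within block, block start, block end)."""
    block_index = offset // block_size      # which block holds the byte
    within = offset % block_size            # position of the byte inside that block
    start = block_index * block_size        # first byte of the block
    end = start + block_size - 1            # last byte of the block (inclusive)
    return block_index, within, start, end

print(locate(5000))  # (1, 904, 4096, 8191)
```

So the request for one byte at offset 5000 lands in block 1, and the whole range 4096 to 8191 is what gets read.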
Why 4 KiB blocks?
HDD (spinning)
Sector = 512 B or 4 KiB. Seek latency is high (around 10 ms), so once the head is in position, reading one extra sector adds no seek time. It costs only bandwidth.
SSD / NVMe
Page = 4–16 KiB. Flash reads an entire page even for 1 byte. Writes also go to whole pages, and an erase block (128–512 KiB) is the minimum erase unit.
Filesystem (ext4)
Default block size = 4 KiB. Aligns with both HDD sectors and SSD pages. The page cache tracks file data at 4 KiB granularity (the page size on x86-64).
// kernel/Documentation/filesystems
block_size = 4096; // bytes
// "The smallest unit that the
// filesystem can allocate"
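To check these sizes on a real machine, something like the following sketch may help. It assumes a Unix-like system, where Python exposes `os.statvfs` and `os.sysconf`; the helper name `storage_granularity` is made up for this example, and the values reported depend on the filesystem and CPU architecture.

```python
import os

def storage_granularity(path="."):
    """Return (filesystem block size, memory page size) in bytes, as reported by the OS."""
    fs_block = os.statvfs(path).f_bsize   # block size reported for the filesystem holding `path`
    page = os.sysconf("SC_PAGE_SIZE")     # size of one memory page
    return fs_block, page

fs_block, page = storage_granularity()
print(f"filesystem block: {fs_block} B, page: {page} B")
```

On a typical ext4 volume on x86-64, both numbers would be expected to come out as 4096, but other setups can differ.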
Key Insight
Random byte access is a convenient lie. The OS presents you with a byte-addressable abstraction, but underneath, every read that misses the page cache transfers at least one 4 KiB block from disk into the cache. If you read 1 byte, you've paid the cost of reading 4096 bytes.
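The same idea can be imitated in user space. The sketch below is only an illustration of the principle, not how the kernel is implemented: it assumes a Unix-like system (for `os.pread`) and a 4 KiB block size, and `read_byte_via_block` is a hypothetical helper. To hand back one byte, it fetches the whole aligned block that contains it.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size (4 KiB)

def read_byte_via_block(fd, offset, block_size=BLOCK_SIZE):
    """Fetch the whole aligned block containing `offset`, then pick one byte out of it.

    Returns (byte value, number of bytes transferred).
    """
    block_start = (offset // block_size) * block_size
    block = os.pread(fd, block_size, block_start)  # one aligned, block-sized read
    return block[offset - block_start], len(block)

# Build a 32 KiB scratch file (8 blocks) where each byte equals its offset mod 251.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(bytes(i % 251 for i in range(8 * BLOCK_SIZE)))
    path = f.name

fd = os.open(path, os.O_RDONLY)
try:
    value, transferred = read_byte_via_block(fd, 5000)
finally:
    os.close(fd)
    os.unlink(path)

print(value, transferred)  # 231 4096
```

One byte comes back to the caller, but 4096 bytes were moved to get it.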
This is why SST files pack records into blocks — you're paying for 4 KiB whether you like it or not, so you might as well fill the block. It's also why sequential reads are so much faster than random: you're reading blocks that contain data you'll actually use.
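A rough way to see the difference is to count how many distinct blocks a set of reads touches. The sketch below ignores caching and readahead, and assumes a 4 KiB block size and a hypothetical fixed record size of 64 bytes; `blocks_touched` is an illustrative helper, not taken from any storage engine.

```python
import random

BLOCK_SIZE = 4096   # assumed block size (4 KiB)
RECORD_SIZE = 64    # hypothetical fixed record size in bytes

def blocks_touched(offsets, length=RECORD_SIZE, block_size=BLOCK_SIZE):
    """Count distinct blocks covered by reads of `length` bytes at each offset."""
    touched = set()
    for off in offsets:
        first = off // block_size
        last = (off + length - 1) // block_size
        touched.update(range(first, last + 1))
    return len(touched)

file_size = 1024 * BLOCK_SIZE   # a 4 MiB file, 1024 blocks
n_records = 64                  # 64 records x 64 B = exactly one block of data

# Records packed back to back from the start of the file.
sequential = [i * RECORD_SIZE for i in range(n_records)]

# The same number of records at random record-aligned positions.
rng = random.Random(0)          # fixed seed so the run is repeatable
slots = file_size // RECORD_SIZE
scattered = [rng.randrange(slots) * RECORD_SIZE for _ in range(n_records)]

print(blocks_touched(sequential))  # 1
print(blocks_touched(scattered))   # at most 64
```

The packed records all live in a single block, so one block read serves all 64 of them. The scattered records can each land in a different block, so the same amount of useful data may cost up to 64 block reads.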
But there's another layer of indirection above the disk: the page cache. When you call write(), the data doesn't go to disk immediately. It goes to RAM. And if the power cuts out before the OS flushes it...