Building a Log File

Combine framing + codec + batching + fsync to build a durable write-ahead log from scratch.

Interactive demo: a 64-byte write buffer that flushes when full, a list of log entries, an operations log, and a crash recovery panel.

Record format

crc (1B): CRC-8 of payload
len (4B): payload length, u32 LE
seq (8B): sequence number, u64 LE
key_len + key: length-prefixed string
val_len + val: length-prefixed string
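
The layout above can be sketched as a small codec. This is a minimal sketch under assumptions the text does not state: the CRC-8 polynomial is 0x07 with a zero initial value, key_len and val_len are u32 LE, and the payload covers seq plus both length-prefixed strings. The names crc8, encode_record, and decode_record are hypothetical.

```python
import struct

HEADER = struct.Struct("<BI")  # crc (1B) + len (4B), little-endian, no padding


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, init 0x00. The polynomial is an assumption."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def encode_record(seq: int, key: bytes, val: bytes) -> bytes:
    """struct -> bytes: header (crc, len) followed by the payload."""
    payload = (
        struct.pack("<Q", seq)             # seq, u64 LE
        + struct.pack("<I", len(key)) + key  # key_len (assumed u32 LE) + key
        + struct.pack("<I", len(val)) + val  # val_len (assumed u32 LE) + val
    )
    return HEADER.pack(crc8(payload), len(payload)) + payload


def decode_record(buf: bytes, offset: int = 0):
    """bytes -> struct: return (seq, key, val, next_offset).

    Raises ValueError if the record is truncated or fails its CRC check.
    """
    if offset + HEADER.size > len(buf):
        raise ValueError("truncated header")
    crc, length = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    payload = buf[start:start + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    if crc8(payload) != crc:
        raise ValueError("CRC mismatch")
    (seq,) = struct.unpack_from("<Q", payload, 0)
    (key_len,) = struct.unpack_from("<I", payload, 8)
    key = payload[12:12 + key_len]
    (val_len,) = struct.unpack_from("<I", payload, 12 + key_len)
    val = payload[16 + key_len:16 + key_len + val_len]
    return seq, key, val, start + length
```

Calling decode_record(encode_record(7, b"user", b"alice")) returns the original fields plus the offset of the next record, so a reader can walk the file one record at a time.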

Key Insight — It All Comes Together

This log file combines every concept from this module:

Raw interface: read() / write(), just bytes at offsets
Framing: length-prefix + CRC header per record
Codec: struct → bytes (encode) / bytes → struct (recover)
Block I/O + batching: buffer until full, then a single write()
Durability: fdatasync() after each flush
Crash recovery: CRC check on each record, skipping corrupted ones
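
The pieces listed above might fit together roughly as follows. This is a sketch, not the demo's actual implementation: LogWriter, recover, and the helper names are hypothetical, the CRC-8 polynomial (0x07) and u32 key/value lengths are assumptions, and recovery trusts the length field to step over a record whose CRC fails. A corrupted length field would need a resynchronizing scan, which this sketch does not attempt.

```python
import os
import struct

BUFFER_CAPACITY = 64  # bytes, matching the 64 B write buffer
HEADER = struct.Struct("<BI")  # crc (1B) + len (4B)
# fdatasync() is not available on every platform; fall back to fsync().
_sync = getattr(os, "fdatasync", os.fsync)


def crc8(data: bytes) -> int:
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ 0x07) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def encode_record(seq: int, key: bytes, val: bytes) -> bytes:
    payload = (struct.pack("<QI", seq, len(key)) + key
               + struct.pack("<I", len(val)) + val)
    return HEADER.pack(crc8(payload), len(payload)) + payload


def decode_payload(payload: bytes):
    seq, key_len = struct.unpack_from("<QI", payload, 0)
    key = payload[12:12 + key_len]
    (val_len,) = struct.unpack_from("<I", payload, 12 + key_len)
    return seq, key, payload[16 + key_len:16 + key_len + val_len]


class LogWriter:
    def __init__(self, path: str):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.buffer = bytearray()
        self.next_seq = 1

    def append(self, key: bytes, val: bytes) -> None:
        record = encode_record(self.next_seq, key, val)
        self.next_seq += 1
        # Flush first if this record would overflow the buffer.
        if self.buffer and len(self.buffer) + len(record) > BUFFER_CAPACITY:
            self.flush()
        self.buffer += record
        if len(self.buffer) >= BUFFER_CAPACITY:
            self.flush()

    def flush(self) -> None:
        data = bytes(self.buffer)
        if not data:
            return
        while data:  # normally one write(); the loop covers short writes
            data = data[os.write(self.fd, data):]
        _sync(self.fd)  # durability: the batch is on disk before we continue
        self.buffer.clear()

    def close(self) -> None:
        self.flush()
        os.close(self.fd)


def recover(path: str):
    """Return every record whose CRC checks out, skipping corrupted ones."""
    with open(path, "rb") as f:
        data = f.read()
    records, offset = [], 0
    while offset + HEADER.size <= len(data):
        crc, length = HEADER.unpack_from(data, offset)
        end = offset + HEADER.size + length
        if end > len(data):
            break  # torn tail: the last record was cut off mid-write
        payload = data[offset + HEADER.size:end]
        if length >= 16 and crc8(payload) == crc:
            records.append(decode_payload(payload))
        offset = end  # step over the record using its length field
    return records
```

Records still sitting in the buffer at the moment of a crash are lost. That is the trade-off batching makes: fewer write() and fdatasync() calls in exchange for a small window of unflushed data.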

How This Connects to LSM Databases

What you just built is essentially a simplified version of RocksDB's WAL. The real WAL adds 32 KiB fixed-size blocks, which make crash recovery even simpler (after a corrupted record the reader seeks to the next block boundary instead of scanning byte by byte), uses CRC-32C instead of CRC-8, and prepends a WriteBatch header for grouping operations. The principles are identical.
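
To see why fixed blocks help, consider how a reader would find a safe place to resume after hitting a corrupted record. The sketch below only illustrates the boundary arithmetic. It is not RocksDB's reader, and the function name next_block_boundary is hypothetical.

```python
BLOCK_SIZE = 32 * 1024  # 32 KiB fixed blocks


def next_block_boundary(offset: int) -> int:
    """Return the start of the first block strictly after `offset`."""
    return (offset // BLOCK_SIZE + 1) * BLOCK_SIZE


# A reader that finds corruption at byte 40000 can jump straight to the
# start of the next block instead of probing every following byte.
resume_at = next_block_boundary(40_000)
```

Because records never begin at an arbitrary position relative to a block, the start of every block is a known-good place to begin parsing again.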

And now you understand why the MemTable exists: it's the in-memory buffer waiting to be flushed (just like your write buffer). The SST file is the batched, sorted, indexed, block-aligned output. The WAL is exactly this log, but with fixed-size blocks for faster recovery.