Topic — 3 essays
Data Engineering
The mechanics of moving data safely: idempotent pipelines, change data capture, and the batch-versus-streaming decision — built for re-runs, retries, and 3 a.m. failures.
01 Field Notes
Batch vs Streaming: How to Actually Decide
Batch vs streaming isn't legacy vs modern. The real question: what latency does the decision consuming the data actually require? Default to batch; promote pipelines...
02 Field Notes
What Is Change Data Capture (CDC), and When Do You Need It?
Change data capture identifies inserts, updates, and deletes in a source database and delivers them downstream. Here's how log-based, trigger-based, and query-based ...
03 Field Notes
How to Make a Data Pipeline Idempotent
An idempotent data pipeline produces the same result whether it runs once or five times. Here are the concrete patterns — partition overwrite, merge on keys, delete-...