📚 Building a 25-Year Backfill Pipeline for the National Library of Korea API

How I Designed a Reliable, Auto-Resuming ETL to Collect Decades of Book Data, Without Airflow

1. Why I Built This

The National Library of Korea (NLK) provides a public API called Seoji, a bibliographic catalog of all registered books in Korea. I wanted to collect the entire dataset, from January 2000 to December 2024, and store it in my PostgreSQL database (Supabase). It sounded simple at first: just a loop over API pages. But in practice, I had to solve: ...

October 22, 2025

How PostgreSQL Surprises You: Booleans, Text I/O, and ETL Gotchas

PostgreSQL is a powerful, standards-compliant database, but it has its quirks. One of those is how it handles boolean values, especially when exporting data in text format.

🧠 PostgreSQL Boolean Behavior: It's Not What You Think

Internally, PostgreSQL stores each boolean value in a single byte. But when you convert those values to text, say in a query or an export via COPY, things look… different: ...
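As a rough illustration of the kind of text-format handling an ETL step may need, here is a minimal Python sketch that normalizes boolean fields read from a text export. It assumes the export uses COPY's default text format, where booleans are written as `t` and `f` and NULL as `\N`. The function name `parse_pg_bool` and its `null_marker` parameter are hypothetical, not part of any library.

```python
from typing import Optional

# Spellings PostgreSQL accepts as boolean input (matched case-insensitively).
# COPY's text format writes t / f; a cast to text yields true / false.
_TRUE = {"t", "true", "y", "yes", "on", "1"}
_FALSE = {"f", "false", "n", "no", "off", "0"}


def parse_pg_bool(field: str, null_marker: str = "\\N") -> Optional[bool]:
    """Convert one text-format boolean field to True, False, or None.

    null_marker defaults to the backslash-N sequence that COPY's text
    format uses for NULL; pass a different marker if the export was
    produced with a custom NULL option.
    """
    if field == null_marker:
        return None
    token = field.strip().lower()
    if token in _TRUE:
        return True
    if token in _FALSE:
        return False
    # Fail loudly rather than silently coercing an unknown value to False.
    raise ValueError(f"not a PostgreSQL boolean: {field!r}")


if __name__ == "__main__":
    for raw in ["t", "f", "true", "\\N"]:
        print(repr(raw), "->", parse_pg_bool(raw))
```

Raising on unrecognized input is a deliberate choice in this sketch: a generic truthiness check such as `bool("f")` would treat every non-empty string as true, which is an easy way to corrupt a boolean column during a load.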

June 10, 2025