Blog - Guru, Eerla | Data Engineering

If You're Not Letting AI Write Code, You're Already Behind - But Don't Hand It the Keys

April 22, 2026 in ai software-engineering development

AI is accelerating development, but it's also changing how knowledge is created, shared, and reproduced.

Tags: ai llm software-development software-engineering

Not all Data Pipelines Fail — They Succeed with Wrong Data

April 11, 2026 in data-engineering data-pipelines cicd

Most data pipelines don't fail loudly. They fail quietly — and keep running. That's the real problem.

Tags: data-engineering data-pipelines data-quality cicd

If You Think You Know Python, These Will Prove You Wrong

April 11, 2026 in python programming gotchas

Most of us get comfortable because our code works, not because we fully understand why. And that illusion breaks the moment you hit edge cases that don't behave the way you expect.

Tags: python advanced-programming common-mistakes

You're Not Competing with AI - You're Competing with Engineers Who Use It

March 25, 2026 in ai-engineering

I’m not saying this after a weekend of trying AI tools. I’m saying this after 2 years of using Cursor consistently - while working a demanding full-time job. And I’ll be direct: The way most engineers are still writing code today is already outdated.

Tags: ai cursor engineering productivity tools

Airflow Works Best When It Does Less

March 23, 2026 in airflow orchestration data-engineering

If your Airflow tasks are doing real computation, your system is already mis-designed.

Tags: airflow orchestration data-engineering best-practices

I Dug Into Delta Lake's Transaction Log - This Is How ACID Actually Works on S3

March 22, 2026

I used to treat object stores like what they are: cheap, durable, and completely unreliable for transactional work. Great for dumping data. Terrible for updates, deletes, or anything resembling correctness.

Tags: delta-lake acid s3 data-lake transaction-log

The Real Cost of Data Observability

February 15, 2024 in data-engineering

Tags: observability data-quality monitoring cost

dbt Changed Data Engineering Forever

February 10, 2024 in data-engineering

Tags: dbt transformation sql data-warehouse

You Don't Need Kafka for Everything

February 05, 2024 in data-engineering

Tags: kafka messaging architecture system-design

Airflow is Not a Data Pipeline Tool

February 01, 2024 in data-engineering

If your Airflow tasks are doing real computation, your system is already mis-designed.

Tags: airflow orchestration workflow data-pipeline

Batch > Real-Time (Most of the Time)

January 25, 2024 in data-engineering

Tags: batch-processing real-time streaming architecture

Your Data Lake is Probably a Swamp

January 20, 2024 in data-engineering

Tags: data-lake data-governance architecture quality

Spark on GCP is Overkill - Use BigQuery Instead

January 15, 2024 in data-engineering

If you’re running a persistent Spark footprint on GCP for analytics and ELT, you should at least ask whether you still need it. In my experience as a lead data/platform engineer, for the majority of analytics-heavy workloads BigQuery is faster to operate, cheaper to run (once tuned), and dramatically simpler to own than self-managed Spark clusters. Treat Spark as an occasional specialist tool - not the default.

Tags: spark performance optimization system-design BigQuery GCP