
Small team data platform architecture on GCP
In this article, let us look at a simple data platform architecture on GCP that works well for a small team. This approach is useful when you need to ingest files or events, do some light to medium...

In this article, let us see how to put together a small team data platform on AWS without creating a huge platform engineering project for ourselves. This kind of setup is useful when the team want...

In this article, let us look at a simple CDC ingestion pattern that works well when you want to bring source system changes into a data lake or warehouse without building something too fancy on day one....

In this article, let us see how to handle retries and idempotency in ETL jobs, and why this matters when a pipeline fails halfway and we need to run it again without creating bad data. Most teams st...

In this article, let us look at how to set up a proper CI/CD pipeline for Terraform using GitHub Actions. If you have been running Terraform from your local machine, you might have noticed it works...

In this article, I want to walk through how we approach partitioning for data lake tables. I have seen this done wrong enough times that I think it is worth writing down what actually works in prac...

In this article, let us walk through the medallion architecture pattern (landing, bronze, silver, and gold layers) and why teams use this approach when building data lakehouses. If you are coming ...

In this article, let us see how to create external tables in BigQuery on top of files stored in GCS, why you might choose this approach, and what limitations you should keep in mind before using it ...

In this article, let us see how to use AWS Step Functions together with AWS Glue for simple orchestration, and why this is often a good choice when you do not want to build a full scheduler or a he...

In this article, let us see how to get started with AWS Glue crawlers, what problem they solve, and why you might want to use them in a simple data lake setup. If you are keeping files in S3 and wa...