Archives
- 22 Jul dbt Basics for Analytics Engineering Teams: A Practical Guide
- 15 Jul Querying Your S3 Data Lake with Amazon Athena — A Practical Guide
- 08 Jul S3 Data Lake Folder Design – Best Practices from the Trenches
- 01 Jul Batch vs Streaming: A Practical Guide for Beginner Data Engineers
- 24 Jun ETL vs ELT: A Practical Guide with Real Examples
- 17 Jun Medallion Architecture Explained: A Practical Guide to Bronze, Silver, and Gold Layers
- 10 Jun Databricks Workflows for Scheduled Jobs: A Practical Guide
- 03 Jun Getting Started with GCP Dataflow for Batch Pipelines: A Practical Guide
- 27 May BigQuery Partitioning and Clustering: A Practical Guide for Data Engineers
- 20 May AWS CDK for Data Platform Teams: A Practical Guide
- 13 May AWS Step Functions for Data Pipeline Orchestration: A Practical Guide
- 06 May Terraform Basics for Data Engineers: A Practical Walkthrough
- 29 Apr GitHub Actions for Data Pipeline CI/CD: A Practical Starting Point
- 22 Apr Building Your First Airflow DAG for ETL: A Practical Walkthrough
- 15 Apr Iceberg vs Delta Lake: A Practical Guide for Data Engineers
- 08 Apr Delta Lake Basics for Data Engineers — A Practical Guide
- 01 Apr Getting Started with Apache Spark on Databricks
- 25 Mar Apache Spark Transformations Every Data Engineer Should Know
- 18 Mar Databricks notebooks vs jobs for production work
- 11 Mar Small team data platform architecture on GCP
- 04 Mar A practical AWS data platform architecture for a small team
- 25 Feb Building a simple CDC ingestion pattern
- 18 Feb Handling retries and idempotency in ETL jobs
- 11 Feb Setting Up CI/CD for Terraform with GitHub Actions
- 04 Feb Partitioning Strategy for Data Lake Tables: A Practical Walkthrough
- 28 Jan Landing, Bronze, Silver, and Gold Layers Explained
- 27 Jan Creating external tables in BigQuery on GCS data
- 26 Jan Using Step Functions with AWS Glue for simple orchestration
- 25 Jan Getting started with AWS Glue crawlers
- 24 Jan Unity Catalog basics for beginners
- 23 Jan Using Databricks Delta tables for analytics pipelines
- 22 Jan Cost optimization basics for BigQuery workloads
- 21 Jan Cost optimization basics for AWS data pipelines
- 20 Jan Data quality checks every beginner team should add
- 19 Jan Schema evolution without breaking downstream jobs
- 18 Jan Building reliable backfills in data pipelines
- 17 Jan dbt basics for analytics engineering teams
- 16 Jan Using Athena to Query Data Lake Files
- 15 Jan S3 data lake folder design best practices
- 14 Jan Batch vs streaming for beginner data engineers
- 13 Jan ETL vs ELT with practical examples
- 12 Jan Medallion architecture explained simply
- 11 Jan Databricks Workflows for Scheduled Jobs
- 10 Jan Getting started with GCP Dataflow for simple batch pipelines
- 09 Jan BigQuery partitioning and clustering basics for faster and cheaper queries
- 08 Jan AWS CDK basics for data platform teams
- 07 Jan Using AWS Step Functions to Orchestrate a Data Pipeline
- 06 Jan Terraform basics for data engineering infrastructure
- 05 Jan Using GitHub Actions for simple data pipeline CI/CD
- 04 Jan Building your first Airflow DAG for ETL
- 03 Jan Iceberg vs Delta Lake for beginners
- 02 Jan Getting Started with Delta Lake: A Practical Guide for Data Engineers
- 18 Apr BigQuery Data Transfer Service: What, why and how?
- 08 Mar Productionizing dbt as a Cloud Run Job: Infrastructure Management with Terraform and CI/CD with GitHub Actions - Part 3
- 04 Mar Productionizing dbt as a Cloud Run Job: Infrastructure Management with Terraform and CI/CD with GitHub Actions - Part 2
- 01 Mar Productionizing dbt as a Cloud Run Job: Infrastructure Management with Terraform and CI/CD with GitHub Actions - Part 1