Archives
- 18 Mar Databricks notebooks vs jobs for production work
- 11 Mar Small team data platform architecture on GCP
- 04 Mar A practical AWS data platform architecture for a small team
- 25 Feb Building a simple CDC ingestion pattern
- 18 Feb Handling retries and idempotency in ETL jobs
- 11 Feb Setting Up CI/CD for Terraform with GitHub Actions
- 04 Feb Partitioning Strategy for Data Lake Tables: A Practical Walkthrough
- 28 Jan Landing, Bronze, Silver, and Gold Layers Explained
- 27 Jan Creating external tables in BigQuery on GCS data
- 26 Jan Using Step Functions with AWS Glue for simple orchestration
- 25 Jan Getting started with AWS Glue crawlers
- 24 Jan Unity Catalog basics for beginners
- 23 Jan Using Databricks Delta tables for analytics pipelines
- 22 Jan Cost optimization basics for BigQuery workloads
- 21 Jan Cost optimization basics for AWS data pipelines
- 20 Jan Data quality checks every beginner team should add
- 19 Jan Schema evolution without breaking downstream jobs
- 18 Jan Building reliable backfills in data pipelines
- 17 Jan dbt basics for analytics engineering teams
- 16 Jan Using Athena to Query Data Lake Files
- 15 Jan S3 data lake folder design best practices
- 14 Jan Batch vs streaming for beginner data engineers
- 13 Jan ETL vs ELT with practical examples
- 12 Jan Medallion architecture explained simply
- 11 Jan Databricks Workflows for Scheduled Jobs
- 10 Jan Getting started with GCP Dataflow for simple batch pipelines
- 09 Jan BigQuery partitioning and clustering basics for faster and cheaper queries
- 08 Jan AWS CDK basics for data platform teams
- 07 Jan Using AWS Step Functions to Orchestrate a Data Pipeline
- 06 Jan Terraform basics for data engineering infrastructure
- 05 Jan Using GitHub Actions for simple data pipeline CI/CD
- 04 Jan Building your first Airflow DAG for ETL
- 03 Jan Iceberg vs Delta Lake for beginners
- 02 Jan Getting Started with Delta Lake: A Practical Guide for Data Engineers
- 18 Apr BigQuery Data Transfer Service: What, why and how?
- 08 Mar Productionizing dbt as a Cloud Run Job: Infrastructure Management with Terraform and CI/CD with GitHub Actions - Part 3
- 04 Mar Productionizing dbt as a Cloud Run Job: Infrastructure Management with Terraform and CI/CD with GitHub Actions - Part 2
- 01 Mar Productionizing dbt as a Cloud Run Job: Infrastructure Management with Terraform and CI/CD with GitHub Actions - Part 1