
Creating external tables in BigQuery on GCS data
In this article, let us see how to create external tables in BigQuery on top of files stored in GCS, why you might choose this approach, and what limitations you should keep in mind before using it ...

In this article, let us see how to use AWS Step Functions together with AWS Glue for simple orchestration, and why this is often a good choice when you do not want to build a full scheduler or a he...

In this article, let us see how to get started with AWS Glue crawlers, what problem they solve, and why you might want to use them in a simple data lake setup. If you are keeping files in S3 and wa...

In this article, let us understand the basics of Unity Catalog in Databricks, why someone would use it, and how to think about it when you are just getting started. If you have been working with Da...

In this article, let us go through how to use Databricks Delta tables in analytics pipelines and why this approach is useful when you want something more reliable than plain Parquet files. If you ar...

In this article, let us go through some practical basics for reducing BigQuery cost before it turns into a painful surprise in your monthly bill. If you are using BigQuery for analytics, reporting, ...

In this article, let us see some cost optimization basics for AWS data pipelines, why they matter early, and what simple changes usually reduce the bill without making the platform too complicated. ...

In this article, let us see a few data quality checks that every beginner team should add early in their pipeline. This approach is useful because most data problems are not fancy platform problems...

In this article, let us see how to handle schema evolution in a data pipeline without breaking all the jobs that depend on it. This becomes important when your source system adds a column, renames ...

In this article, let us see how to build reliable backfills in data pipelines, why we need them, and what things usually break when we run them in a hurry. Backfills sound simple at first. We missed...