Best practices for data workflows, integrations with the Modern Data Stack (MDS), Infrastructure as Code (IaC), Cloud Provider Services
Updated Mar 24, 2025 - HCL
CI/CD repository template to automate deployments of your production flows
Easily deploy Airflow infrastructure on an AWS VPC using Terraform.
Wife approved HomeOps driven by Kubernetes and GitOps using ArgoCD
Yelp Data Processing Pipeline on GCP
Automated setup of Apache Iceberg on Amazon S3 using Terraform and AWS Glue Data Catalog. Explore the power of a Lakehouse architecture for data management and analysis, featuring schema discovery, metadata management, and efficient querying with Amazon Athena.
...an automated data pipeline that retrieves cryptocurrency data from the CoinCap API, processes and transforms it for analysis, and presents key metrics on a near-real-time dashboard
A data engineering project with dbt, Docker, Kestra, Terraform, GCP and Looker.
Bring Infrastructure as Code best practices to your data workflows with Kestra and Terraform
End-to-end data pipeline to ingest and visualize the ACM ICPC World Finals dataset
This project gives an overview of key data-enabling activities. These are governance-related objectives, with a focus on managing and using data to deliver business value.
🚕 A containerized Polars ELT data pipeline deployed to AWS ECS using Terraform, orchestrated by Airflow, with data stored in Delta Lake on S3 and CI/CD using GitLab.
Project to learn data analytics on AWS using Twitter data
Google Cloud resources for Masthead Data agent integration.
A Terraform module to copy BigQuery datasets across regions
An end-to-end AWS data engineering pipeline for processing and visualizing historical climate data for Russia.
Data Engineering Zoomcamp course assignments and notes.
Analyzes Chicago taxi trips using BigQuery, dbt, and Terraform to track top drivers’ tip trends over time. Results are stored in Google Sheets.
This repository contains infrastructure code for the Wizeline Data Engineering Bootcamp (DEB) 2023. It is one of two repositories for the DEB. The other (deb-application) houses the application code.