Utilizes Airflow's built-in functionality to create a reusable ETL pipeline. Source data resides in an S3 bucket, the pipeline includes data quality checks, and the data is processed within AWS Redshift.
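A minimal Airflow 2.x sketch of such a pipeline, assuming a hypothetical `raw_events` table, an illustrative bucket/key layout, and the default `redshift_default`/`aws_default` connection IDs; the quality check simply fails the run if the load produced no rows:

```python
# Hypothetical sketch: table, bucket, and connection IDs are assumptions,
# not taken from the repo.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook

def check_row_count(**_):
    """Fail the DAG run if the target table loaded no rows."""
    hook = PostgresHook(postgres_conn_id="redshift_default")
    rows = hook.get_first("SELECT COUNT(*) FROM public.raw_events")
    if not rows or rows[0] == 0:
        raise ValueError("Data quality check failed: public.raw_events is empty")

with DAG(
    dag_id="s3_to_redshift_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    copy_to_redshift = S3ToRedshiftOperator(
        task_id="copy_to_redshift",
        schema="public",
        table="raw_events",
        s3_bucket="my-source-bucket",          # placeholder bucket
        s3_key="events/{{ ds }}/",             # placeholder key layout
        copy_options=["FORMAT AS JSON 'auto'"],
        redshift_conn_id="redshift_default",
        aws_conn_id="aws_default",
    )
    quality_check = PythonOperator(
        task_id="quality_check",
        python_callable=check_row_count,
    )
    copy_to_redshift >> quality_check
```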
An example repo that aggregates Apache Airflow DAGs from multiple Apache Maven repositories into a single Git branch, which can then be used with git-sync in the Airflow Helm Chart (User Community).
Uses Apache Airflow and a weather API to clean data and automatically save the results into separate folders. Specifically, the weather data of Los Angeles (LA) is used.
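A hedged sketch of what such a DAG could look like, assuming the OpenWeatherMap current-weather endpoint and an illustrative `raw/<date>/` and `clean/<date>/` folder layout (neither is confirmed by the repo):

```python
# Illustrative sketch; endpoint choice and folder layout are assumptions.
import json
import os
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

LA_COORDS = {"lat": 34.05, "lon": -118.24}

def extract_weather(ds, **_):
    # Fetch current LA weather and save the raw JSON under raw/<date>/.
    resp = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={**LA_COORDS, "appid": os.environ["OWM_API_KEY"], "units": "metric"},
        timeout=10,
    )
    resp.raise_for_status()
    os.makedirs(f"raw/{ds}", exist_ok=True)
    with open(f"raw/{ds}/la_weather.json", "w") as f:
        json.dump(resp.json(), f)

def clean_weather(ds, **_):
    # Keep only the fields of interest and save under clean/<date>/.
    with open(f"raw/{ds}/la_weather.json") as f:
        data = json.load(f)
    cleaned = {
        "date": ds,
        "temp_c": data["main"]["temp"],
        "humidity": data["main"]["humidity"],
        "conditions": data["weather"][0]["main"],
    }
    os.makedirs(f"clean/{ds}", exist_ok=True)
    with open(f"clean/{ds}/la_weather.json", "w") as f:
        json.dump(cleaned, f)

with DAG("la_weather_cleaning", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_weather)
    clean = PythonOperator(task_id="clean", python_callable=clean_weather)
    extract >> clean
```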
This project focuses on utilizing Apache Airflow to orchestrate an ETL (Extract, Transform, Load) process using data from the Stack Overflow API. The primary objective is to determine the most prominent tags on Stack Overflow for the current month.
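For illustration, the tag aggregation could run against the Stack Exchange `/2.3/questions` endpoint (which is real); the month windowing, page count, and rate-limit handling below are simplified assumptions, not the repo's code:

```python
# Sketch of counting tags on questions asked since the start of the month.
import collections
import time
from datetime import datetime

import requests

def top_tags_this_month(pages=5):
    now = datetime.utcnow()
    fromdate = int(datetime(now.year, now.month, 1).timestamp())
    counts = collections.Counter()
    for page in range(1, pages + 1):
        resp = requests.get(
            "https://api.stackexchange.com/2.3/questions",
            params={
                "site": "stackoverflow",
                "fromdate": fromdate,
                "pagesize": 100,
                "page": page,
            },
            timeout=10,
        )
        resp.raise_for_status()
        payload = resp.json()
        for item in payload.get("items", []):
            counts.update(item.get("tags", []))
        if not payload.get("has_more"):
            break
        time.sleep(1)  # crude pause to stay under the API's rate limit
    return counts.most_common(10)
```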
AirflowDataPipeline is a daily data collection project that extracts JSON data from a website's API and loads it into a SQL database using Airflow. This project offers an automated and reliable solution for managing data pipelines.
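A minimal sketch of a daily collect-and-load task, with a placeholder API URL and SQLite standing in for whichever SQL database the project actually targets:

```python
# Placeholder URL and database path; structure only, not the repo's code.
import json
import sqlite3
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

def fetch_and_load(ds, **_):
    records = requests.get("https://example.com/api/data", timeout=10).json()
    conn = sqlite3.connect("/data/pipeline.db")
    conn.execute("CREATE TABLE IF NOT EXISTS daily_data (ds TEXT, payload TEXT)")
    conn.executemany(
        "INSERT INTO daily_data VALUES (?, ?)",
        [(ds, json.dumps(r)) for r in records],
    )
    conn.commit()
    conn.close()

with DAG("daily_json_to_sql", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False) as dag:
    PythonOperator(task_id="fetch_and_load", python_callable=fetch_and_load)
```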
ETL pipeline that extracts web-scraped forex data and Google News library data on a daily basis and stores them in a PostgreSQL database for market analysis and insights.
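A sketch of the forex leg of such a DAG, assuming a hypothetical FX-rates endpoint and an illustrative `fx_rates` table; `PostgresHook.insert_rows` comes from the Airflow Postgres provider:

```python
# Sketch only: the FX endpoint URL and table name are hypothetical.
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook

def load_forex(ds, **_):
    # Hypothetical endpoint returning {"rates": {"EUR": 0.92, ...}}.
    rates = requests.get(
        "https://fx.example.com/latest", params={"base": "USD"}, timeout=10
    ).json()["rates"]
    hook = PostgresHook(postgres_conn_id="postgres_default")
    hook.insert_rows(
        table="fx_rates",
        rows=[(ds, ccy, rate) for ccy, rate in rates.items()],
        target_fields=["ds", "currency", "rate"],
    )

with DAG("forex_news_etl", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False) as dag:
    PythonOperator(task_id="load_forex", python_callable=load_forex)
    # A second task would load Google News headlines into a news table
    # in the same way.
```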
This repository conducts weekly analyses of European Reddit data via a data pipeline orchestrated with Airflow.
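One plausible collection step, sketched with PRAW; the credentials, the choice of r/europe, and the handoff via the task's return value (XCom) are assumptions:

```python
# Hedged sketch; not the repo's code.
from datetime import datetime

import praw
from airflow import DAG
from airflow.operators.python import PythonOperator

def collect_top_posts(**_):
    reddit = praw.Reddit(
        client_id="...",            # supplied via Airflow connections/env in practice
        client_secret="...",
        user_agent="weekly-europe-analysis",
    )
    posts = [
        {"title": s.title, "score": s.score, "created_utc": s.created_utc}
        for s in reddit.subreddit("europe").top(time_filter="week", limit=100)
    ]
    return posts  # pushed to XCom for the downstream analysis task

with DAG("weekly_reddit_analysis", start_date=datetime(2024, 1, 1),
         schedule="@weekly", catchup=False) as dag:
    PythonOperator(task_id="collect_top_posts", python_callable=collect_top_posts)
```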
An end-to-end pipeline that ingests raw data from CSV files through Airflow DAGs into BigQuery. From there, it uses dbt to normalize and clean the data and then to apply the transformations that produce the relevant reports.
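A hedged sketch of that ingest-then-transform shape, with assumed dataset/table names, a local CSV path, and a dbt project location:

```python
# Names and paths are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator
from google.cloud import bigquery

def load_csv_to_bq(**_):
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
    )
    with open("/data/raw/input.csv", "rb") as f:
        client.load_table_from_file(
            f, "my_project.raw.input_data", job_config=job_config
        ).result()  # block until the load job finishes

with DAG("csv_to_bigquery_dbt", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False) as dag:
    load = PythonOperator(task_id="load_csv_to_bq", python_callable=load_csv_to_bq)
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="cd /opt/dbt_project && dbt run",
    )
    load >> transform
```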
This project presents a robust data pipeline using Apache Airflow for orchestration, Apache Kafka for real-time data streaming, and MongoDB for data storage. It automates the process of web scraping to collect large companies' data, transforms and processes this data, and then stores it efficiently.
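A minimal producer/consumer sketch using kafka-python and pymongo; the topic name, company list, and scraping target are illustrative placeholders:

```python
# Structure only: scrape -> Kafka topic -> MongoDB collection.
import json

import requests
from kafka import KafkaConsumer, KafkaProducer
from pymongo import MongoClient

TOPIC = "company_data"

def produce():
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    for company in ["acme", "globex"]:  # placeholder company list
        html = requests.get(
            f"https://example.com/companies/{company}", timeout=10
        ).text
        producer.send(TOPIC, {"company": company, "raw_html_len": len(html)})
    producer.flush()

def consume():
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="earliest",
    )
    collection = MongoClient("mongodb://localhost:27017")["scraper"]["companies"]
    for message in consumer:
        collection.insert_one(message.value)  # store each processed record
```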