AWS Data Pipeline Restore
Updated Aug 29, 2017 - JavaScript
💸 A Python module for building a portfolio assessment pipeline
This is a basic example of using a pipeline in data science.
Learn to design models, build data warehouses and data lakes, automate data pipelines and work with massive datasets.
This is the data pipeline for the url-shortner application. Deprecated in favor of https://github.com/Dukes-Wine-Co/request-parsing-api
ETL pipeline with PySpark on Dataproc for data lake on Google Cloud Storage
UC Davis Distributed Computing with Spark SQL (with Databricks) and Databricks Apache Spark SQL for Data Analysts
An easy-to-use, reliable, and well-designed Python module that domain experts and data scientists can use to fetch, visualise, and transform publicly available satellite and LIDAR data.
Get started with Prefect by scheduling your Prefect flows with GitHub Actions
This project performs exploratory data analysis of the listings in the Airbnb NYC 2019 dataset (48,895 rows × 16 columns). Airbnb has a global reach, and data analysis plays a crucial role in its operations.
Checking the scalability of a data pipeline involving MySQL, Spark, and machine-learning models by measuring latency.
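A scalability check like the one described usually reduces to timing each pipeline stage under load. The sketch below is a minimal stdlib-only illustration of per-stage latency measurement; the `transform` stage is a stand-in, since the repo's actual MySQL/Spark components are not reproduced here.

```python
import time
from statistics import mean, quantiles

def timed(stage_fn, payload, samples=100):
    """Run a pipeline stage repeatedly and collect per-call latencies in ms."""
    latencies = []
    for _ in range(samples):
        start = time.perf_counter()
        stage_fn(payload)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies

def transform(rows):
    # Stand-in for a real stage (e.g. a Spark job or model inference).
    return [r * 2 for r in rows]

lat = timed(transform, list(range(1000)))
print(f"mean={mean(lat):.3f}ms  p95={quantiles(lat, n=20)[18]:.3f}ms")
```

Comparing the mean against a high percentile (p95 here) distinguishes stages that are uniformly slow from stages with occasional spikes, which matters when deciding what to scale.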
This repository contains code for comparing the performance of three different ELT (Extract, Load, Transform) methods on CSV files of different sizes. The three methods are implemented in Python using different approaches and libraries, and their execution times are compared and plotted for analysis.
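The comparison described above follows a common pattern: load the same CSV input through different code paths and time each one. This is a self-contained sketch of that pattern using only the standard library; the repo's actual three ELT methods and libraries are not specified, so the two loaders here are illustrative stand-ins.

```python
import csv
import io
import time

def load_with_csv(text):
    """Parse via the csv module (handles quoting correctly)."""
    return list(csv.reader(io.StringIO(text)))

def load_with_split(text):
    """Naive string split — faster, but breaks on quoted fields."""
    return [line.split(",") for line in text.splitlines()]

def benchmark(loader, text, repeats=50):
    start = time.perf_counter()
    for _ in range(repeats):
        rows = loader(text)
    return time.perf_counter() - start, rows

sample = "\n".join(f"{i},name{i},{i * 1.5}" for i in range(1000))
for loader in (load_with_csv, load_with_split):
    elapsed, rows = benchmark(loader, sample)
    print(f"{loader.__name__}: {elapsed:.4f}s for {len(rows)} rows x 50 runs")
```

Plotting elapsed time against input size for each loader, as the repository does, then shows how the methods diverge as files grow.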
House prices dataset exploration and prediction. The workflow includes useful examples of TensorFlow pipelines, including a k-Nearest Neighbors imputer, Decision Tree regression, and XGBoost regression.
The project works with aerospace component data: a high number of part removals end up being unscheduled, which reduces flight time for the aircraft. Using historical data, the project applies survival analysis to predict part failures at the component level and avoid mechanically induced disruptions.
A custom Airbyte connector to fetch football data from the Football-Data.org API. It allows users to retrieve match results, league tables, and player statistics for specific leagues, making it a versatile tool for football data analysis.
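The core of such a connector is flattening the API's nested JSON into flat records. The sketch below shows that step on a hypothetical payload loosely modeled on Football-Data.org's matches response; the real schema and the Airbyte connector machinery are not reproduced, so the field names here are assumptions.

```python
import json

# Hypothetical response shape — the real Football-Data.org schema may differ.
raw = json.dumps({
    "matches": [
        {"homeTeam": {"name": "Arsenal"}, "awayTeam": {"name": "Chelsea"},
         "score": {"fullTime": {"home": 2, "away": 1}}},
    ]
})

def to_records(payload):
    """Flatten nested match JSON into simple result rows."""
    records = []
    for m in json.loads(payload).get("matches", []):
        ft = m["score"]["fullTime"]
        records.append({
            "home": m["homeTeam"]["name"],
            "away": m["awayTeam"]["name"],
            "result": f'{ft["home"]}-{ft["away"]}',
        })
    return records

print(to_records(raw))
```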
Batch/stream ETL pipeline for the NOAA GLM dataset, using the Python frameworks Dagster and PySpark with Parquet storage.
Airflow DAG tutorial with docker compose local setup
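For local experimentation, Airflow's standalone mode can run in a single container. The fragment below is a minimal sketch of such a compose file; the image tag, port, and volume path are assumptions to adjust for your environment (the tutorial repo's actual compose file is not reproduced here).

```yaml
# docker-compose.yaml — minimal single-container Airflow for local development.
services:
  airflow:
    image: apache/airflow:2.9.2
    command: standalone        # runs scheduler, webserver, and DB in one process
    ports:
      - "8080:8080"            # Airflow web UI
    volumes:
      - ./dags:/opt/airflow/dags
```

Standalone mode is for development only; production deployments separate the scheduler, webserver, workers, and metadata database into their own services.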
Optimizing offers/discounts sent to Starbucks clients using machine-learning models on historical data.
Logverz Portal Access is the "login" component of the Logverz application bundle (LogverzReleases repository). Logverz is a serverless adaptive data pipeline, the fastest route from AWS S3 to instant reports.
Automated data pipeline for extracting and storing weather forecasts for the tourism sector.
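An extract-and-store pipeline of this kind boils down to two steps: fetch forecast records, then persist them. The sketch below illustrates that shape with the standard library only; the forecast payload is faked, since the original repo's weather API and table schema are not specified.

```python
import sqlite3

def extract():
    # Stand-in for a real API call — a hypothetical forecast record.
    return [{"city": "Lisbon", "date": "2024-07-01", "temp_max_c": 29.5}]

def load(rows, conn):
    """Persist forecast rows into a local table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS forecast (city TEXT, date TEXT, temp_max_c REAL)"
    )
    conn.executemany(
        "INSERT INTO forecast VALUES (:city, :date, :temp_max_c)", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(extract(), conn)
print(conn.execute("SELECT COUNT(*) FROM forecast").fetchone()[0])
```

Swapping the in-memory database for a file path and scheduling the script (e.g. via cron or an orchestrator) turns the sketch into a recurring pipeline.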