Data-Driven Software Engineering Studies
Updated May 28, 2024 - Dockerfile
Efficient data transformation and modeling framework that is backwards compatible with dbt.
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
This course is designed to provide learners with the fundamental skills needed for data engineering using Python. The objective is to introduce anyone interested in the topic to Python's data engineering-related features.
An open-source project dedicated to constructing robust data pipelines and scalable software infrastructure. We leverage industry-standard tools favored by developers to enhance efficiency and reliability. Uniquely, these pipelines are field-tested on farms across Sumatra, Indonesia, ensuring real-world applicability and resilience.
Apache Beam demo projects
This is a repo with links to everything you'd ever want to learn about data engineering
10 years of Premier League team statistics for analysis
The developer framework for your data & analytics stack
A web app that combines Selenium with PyWebIO to scrape news from the web (count of 10) and then translate it into Arabic
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
In this project, I'll be building a real-time data streaming pipeline, covering each phase from data ingestion to processing and finally storage. I'll use a stack of tools and technologies including Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra, all containerised using Docker.
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Enhance your data testing seamlessly with this Dataform package, featuring robust common assertions to ensure the accuracy and integrity of your warehouse data.
A simple and lightweight data engineering template using Apache Airflow and Metabase
Black Women Data 2022 - Power BI workshop materials from a 4-hour personal dashboard challenge.
An open source development framework to help you build data workflows and modern data architecture on AWS.