Distributed Systems group project
Updated Dec 2, 2017 - CSS
Big Data project for the Mackenzie master's program
This project aims to predict delays in the Yellow taxi dataset by implementing an application based on Apache Flink.
A Reliable Benchmark Framework for Streaming Systems. An instrumentation tool built on DS2, Flink, and NEXMark for evaluating the performance of streaming systems.
Includes examples for Apache Flink
Distributed Random Forest in Apache Flink
Stream processing of AIS data with Flink
Prink (Privacy-Preserving Flink) is a data anonymization solution for Apache Flink that provides k-anonymity and l-diversity for data streams.
Flink Example
Demo 2020
This project focuses on building a real-time streaming pipeline using Apache Flink and Apache Kafka. The goal is to enrich checkout data with user information, identify the first click leading to a checkout, and log the attributed checkouts into a Postgres sink table. The project implements concepts like state management, time attributes, watermark
Using Apache Flink to write to s3 in Apache Iceberg format
Installs packages from https://archive.apache.org/dist/
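The checkout-attribution pipeline described above (enrich checkouts with user information, find the first click that led to each checkout) can be illustrated with a minimal plain-Python sketch. This is not the project's code and does not use the Flink API: the `Click` and `Checkout` records, the `attribute_checkouts` function, and the rule that a user's stored click is cleared after each checkout are all assumptions made for illustration. The dictionary stands in for per-user keyed state, and sorting by timestamp stands in for event-time ordering.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class Click:  # hypothetical record type
    user_id: str
    url: str
    ts: int  # event time, e.g. epoch seconds


@dataclass(frozen=True)
class Checkout:  # hypothetical record type
    user_id: str
    order_id: str
    ts: int


def attribute_checkouts(
    events: List[Union[Click, Checkout]],
    users: Dict[str, str],
) -> List[dict]:
    """Attribute each checkout to the user's earliest click since their last checkout."""
    # Stand-in for keyed state: one stored click per user_id.
    first_click: Dict[str, Click] = {}
    attributed: List[dict] = []

    # Stand-in for event-time ordering; a real stream would rely on watermarks.
    for event in sorted(events, key=lambda e: e.ts):
        if isinstance(event, Click):
            # Keep only the earliest click for this user.
            first_click.setdefault(event.user_id, event)
        else:
            # Assumption: state is cleared once a checkout is attributed.
            click: Optional[Click] = first_click.pop(event.user_id, None)
            attributed.append(
                {
                    "order_id": event.order_id,
                    "user_id": event.user_id,
                    "user_name": users.get(event.user_id),  # enrichment lookup
                    "first_click_url": click.url if click else None,
                }
            )
    return attributed
```

In an actual Flink job the dictionary would typically be replaced by managed keyed state, and the resulting rows would be written to a sink table rather than returned as a list.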