hadoop-hdfs
Here are 286 public repositories matching this topic...
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lakes, built for billions of files. The blob store has O(1) disk seek and cloud tiering. The filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, and erasure coding.
-
Updated
May 31, 2024 - Go
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
-
Updated
May 26, 2024 - Jupyter Notebook
A fully functional Hadoop YARN cluster as a docker-compose deployment.
-
Updated
May 25, 2024 - Shell
More than 2,000 data engineer interview questions.
-
Updated
May 20, 2024
Big data project: information storage in HDFS.
-
Updated
May 18, 2024
A comprehensive solution for real-time football analytics, using Apache Spark on YARN for both streaming and batch processing, Hadoop HDFS for distributed storage, Kafka for real-time data ingestion, RethinkDB for live data updates, and Next.js for data visualization, plus a custom-built search engine.
-
Updated
May 14, 2024 - TypeScript
ETL pipeline for Spar Nord Bank, analyzing the refill frequency of ATMs across Europe.
-
Updated
May 7, 2024 - Jupyter Notebook
PyHDFS: Scalable & resilient distributed file system. Components: Zookeeper, NameNode, DataNode, Metadata service, Client. Setup guide for AWS & local. Explore distributed storage!
-
Updated
Apr 27, 2024 - Python
Hadoop Ecosystem - Building a Hadoop cluster environment for large-scale frequent pattern mining
-
Updated
Apr 23, 2024 - Shell
ETL process
-
Updated
Apr 15, 2024 - Jupyter Notebook
Docker image builds for Hadoop sandbox.
-
Updated
Apr 2, 2024 - Dockerfile
Netflix Filtering and Recommendation Project
-
Updated
Apr 1, 2024 - Jupyter Notebook
Average Temperature - Hadoop - Mapper - Reducer
-
Updated
Mar 26, 2024 - Scala
Leverage the power of Apache Spark for large-scale data processing and analysis
-
Updated
Mar 21, 2024 - Jupyter Notebook
Big data analysis of a travel website (partial Ctrip data) - Hadoop course project (undergraduate coursework level)
-
Updated
Mar 16, 2024 - Java
Implementation of a pipeline for predicting Parkinson's disease using IoT, cloud, and big data tools
-
Updated
Mar 3, 2024 - Python
My first data analytics project, created alongside the Data Analytics Essentials course by Cisco Networking Academy.
-
Updated
Feb 20, 2024 - Python