hadoop-ecosystem

Here are 40 public repositories matching this topic...

madd86 / awesome-system-design

A curated list of awesome System Design (A.K.A. Distributed Systems) resources.

distributed-systems microservices nosql interview stream-processing microservices-architecture relational-database message-broker hadoop-ecosystem

Updated Mar 26, 2024

ZuInnoTe / hadoopoffice

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

spark hive hadoop excel bigdata office poi flink hadoop-ecosystem hadoopoffice analyze-office-documents

Updated Oct 29, 2022
Java

dhkdn9192 / data_engineer_career

Everything you need for a data engineering (DE) role

interview-questions data-engineer hadoop-ecosystem

Updated May 24, 2024
Jupyter Notebook

Jayvardhan-Reddy / BigData-Ecosystem-Architecture

Life cycle: internal workings of HDFS, Sqoop, Hive, Spark, HBase, and Kafka, with code.

kafka big-data spark hive hadoop architecture bigdata hbase zookeeper spark-streaming hdfs sqoop hadoop-ecosystem architecture-components yarn-hadoop-cluster bigdata-module hbase-cluster big-data-essentials hadooparchitecture

Updated Sep 10, 2019
Shell

Cigna / ibis

IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.

workflow hadoop ingestion oozie sqoop sqoop2 workflow-automation workflow-scheduler hadoop-ecosystem hadoop-framework ibis cigna

Updated Apr 13, 2022
Python

jodth07 / hadoop-installation

Instructions for setting up Hadoop, HDFS, Java, sbt, Kafka, Scala, Spark, and Flume on Ubuntu 18.04

scala kafka spark hadoop sbt installation flume kafka-installation hadoop-ecosystem hadoop-installation hadoop-hdfs spark-installation sbt-installation scala-installation

Updated Jul 17, 2021
Shell

pfisterer / apache-knox-docker

Dockerfile for running Apache Knox (http://knox.apache.org/) in Docker

dockerfile hadoop rest-api hadoop-cluster hadoop-ecosystem apache-knox gateway-server

Updated Mar 21, 2022
Dockerfile

SarahAyaz / YouTube_Data_Analysis

Analysis of YouTube data using the Hadoop MapReduce framework in Java.

java linux youtube hadoop analysis hdfs mapreduce hadoop-filesystem hadoop-mapreduce hadoop-ecosystem mapreduce-java hadoop-hdfs partitioner

Updated Jan 30, 2022
Java

meliodaseren / spark-sql-demo

SparkSQL Quick Start Tutorial

spark sparksql hadoop-ecosystem

Updated Oct 7, 2017
Scala

AnkitaSinha98 / Customer360-Data-Analysis

Stores and analyzes customer big data using Hadoop and related tools such as Hive, ZooKeeper, HBase, and Sqoop. All customer details are analyzed and the results are reported, which is useful for companies.

hive hadoop hbase zookeeper dataset pig sqoop hadoop-ecosystem big-data-analytics

Updated Feb 10, 2021

satyajeetmaharana / floodprediction

The goal of this project is to identify flood-prone areas and estimate the probability of flooding in counties on a future date, using Spark MLlib.

machine-learning spark hadoop geospatial geojson-data hdfs sparksql tableau geojson-schema spark-mllib hadoop-ecosystem big-data-analytics hadoop-framework geojson-polygon flood-predictions

Updated Jan 20, 2020
Scala

uncleislearning / learning-Hadoop

Principles and hands-on practice for HDFS, MapReduce, Hive, and ZooKeeper

hadoop hadoop-cluster hadoop-filesystem hadoop-mapreduce hadoop-ecosystem

Updated Feb 15, 2018

saitejavishalj / Hotspot-analysis-of-Geospatial-data

Built a large-scale distributed data processing system for streaming analytics using the Hadoop ecosystem (Apache Spark and HDFS) in the cloud, for real-time spatial analytics.

distributed-systems apache-spark hdfs data-analysis sparksql large-scale hadoop-ecosystem streaming-analytics apache-hadoop

Updated Jun 4, 2021
Scala

rahulsakore7 / Unstructured-data-mart-sentimental-analysis

visualization tableau predictive-modeling datamart hadoop-ecosystem unstructured-data dataanalytics

Updated Jun 16, 2018
Jupyter Notebook

mayankskb / Hadoop-Times

Practice programs in the Hadoop ecosystem, for reference

hive hadoop mapreduce hadoop-ecosystem

Updated Sep 14, 2018

alex-ber / docker-hive

EMR 5.25.0 cluster single-node Hadoop Docker image, with Amazon Linux, Hadoop 2.8.5, and Hive 2.3.5

Updated Jan 6, 2020
Shell

f2e-awesome / HadoopEcosystem

Hadoop ecosystem

hive hadoop avro hbase zookeeper mahout pig hdfs flume ambari bigtable sqoop hadoop-filesystem hadoop-mapreduce hadoop-ecosystem hcatalog

Updated Jul 5, 2018
JavaScript

hyeonsangjeon / dataplatform

Hadoop 3.2 in single-node or cluster mode with the GoTTY web terminal, Spark, Jupyter with PySpark, Hive, and other ecosystem tools.

hive hadoop hadoop-cluster hadoop-mapreduce hadoop-docker pyspark-notebook zeppelin-notebook hadoop-ecosystem

Updated Nov 7, 2019
Shell

reggert / cumulative

[Work in progress] Client library for simplified access to Apache Accumulo

scala spark bigdata accumulo hadoop-ecosystem

Updated May 7, 2020
Scala

ArwaEiad / TMDB-Project

This project focuses on analyzing movie data using PySpark, tailored for efficient data processing on the Hadoop Distributed File System (HDFS).

pyspark hdfs hadoop-ecosystem

Updated May 6, 2024
Jupyter Notebook
