A Spark assignment using SparkSQL and Spark Streaming with Kafka. Calculates spaceship consumption.
Updated Jan 1, 2019 - Scala
Reproducing Census SIPP Reports Using Apache Spark
Spark with Scala, including RDDs, transformations, actions, HDFS, SparkSQL, DataFrames, and MLlib
Designed a machine learning model that takes a newsgroup dataset and performs binary classification to predict whether a given document has Atheistic or Christian sentiment. Used the LIME library and PySpark. Performed feature selection to improve the classifier's performance.
OpenStreetMap data analysis with the Python programming language.
Weather Data Analysis using Python, Pandas, SparkSQL, AutoRegression Model
Contains an analysis of key home sales metrics using SparkSQL and Python to manage large amounts of data.
In this repository, Google Colab is paired with SparkSQL to determine key metrics about home sales data. Spark is also used to create temporary views, partition data, and cache/uncache a temporary table in the process.
A fun place for me to blog about distributed databases, aerial arts, and life in general