spark-rdd

Here are 21 public repositories matching this topic...

This project uses PySpark DataFrames and PySpark RDDs to implement item-based collaborative filtering. By calculating cosine similarity scores or identifying the movies with the highest number of shared viewers, the system recommends 10 movies similar to a given target movie, in line with users’ preferences.

  • Updated Jul 30, 2023
  • Jupyter Notebook
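As a rough illustration of the item-based cosine similarity mentioned in the description above, here is a minimal sketch in plain Python rather than PySpark, so it runs without a cluster. The ratings data, the function names `item_similarities` and `top_similar`, and the `min_shared` threshold are all hypothetical and are not taken from the repository.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy ratings: (user_id, movie_id, rating). Not from the repository.
ratings = [
    (1, "A", 5.0), (1, "B", 4.0), (1, "C", 1.0),
    (2, "A", 4.0), (2, "B", 5.0), (2, "C", 2.0),
    (3, "A", 1.0), (3, "B", 2.0), (3, "C", 5.0),
]


def item_similarities(ratings):
    """Return {(movie1, movie2): (cosine_similarity, shared_viewers)}."""
    # Group ratings by user, mirroring a self-join on user_id.
    by_user = defaultdict(list)
    for user, movie, rating in ratings:
        by_user[user].append((movie, rating))

    # For each movie pair, accumulate sum(x*x), sum(y*y), sum(x*y) and a count
    # over the users who rated both movies.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for rated in by_user.values():
        for (m1, r1), (m2, r2) in combinations(sorted(rated), 2):
            acc = sums[(m1, m2)]
            acc[0] += r1 * r1
            acc[1] += r2 * r2
            acc[2] += r1 * r2
            acc[3] += 1

    result = {}
    for pair, (xx, yy, xy, shared) in sums.items():
        denom = sqrt(xx) * sqrt(yy)
        result[pair] = (xy / denom if denom else 0.0, shared)
    return result


def top_similar(target, sims, k=10, min_shared=1):
    """Return up to k (movie, score) pairs most similar to target."""
    candidates = []
    for (m1, m2), (score, shared) in sims.items():
        if shared < min_shared or target not in (m1, m2):
            continue
        other = m2 if m1 == target else m1
        candidates.append((other, score))
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:k]


sims = item_similarities(ratings)
recommendations = top_similar("A", sims)
```

In a Spark version, the per-user grouping would typically become a join or `groupByKey` on the user ID, and the per-pair accumulation a keyed aggregation; the arithmetic stays the same.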

This project uses PySpark RDDs and the breadth-first search (BFS) algorithm to find the shortest path and degrees of separation between two given Marvel superheroes, based on their appearances together in the same comic books, so that users can discover connections between their favourite superheroes in the Marvel universe.

  • Updated Jul 30, 2023
  • Jupyter Notebook
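The BFS search described above can be sketched on a single machine as follows. This is a plain-Python illustration, not the repository's RDD implementation; the co-appearance graph, the hero IDs, and the function name `shortest_path` are hypothetical.

```python
from collections import deque

# Hypothetical co-appearance graph: hero ID -> heroes sharing a comic book.
graph = {
    "hero_a": ["hero_b", "hero_c"],
    "hero_b": ["hero_a", "hero_d"],
    "hero_c": ["hero_a"],
    "hero_d": ["hero_b", "hero_e"],
    "hero_e": ["hero_d"],
    "hero_f": [],  # never appears alongside anyone else
}


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    if start == goal:
        return [start]
    parents = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour in parents:
                continue
            parents[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start.
                path = [neighbour]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


path = shortest_path(graph, "hero_a", "hero_e")
# Degrees of separation = number of edges on the shortest path.
degrees = len(path) - 1 if path else None
```

Because BFS explores the graph level by level, the first time the goal is reached is guaranteed to be along a path with the fewest edges, which is what makes it suitable for degrees of separation in an unweighted graph.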
