Welcome to the repository for the Data Analytics course (INFOB2DA) at the University of Utrecht, completed during the Erasmus program. It contains my group's work for the course.
The repository is organized into four assignments, each located in its own folder. Here's a brief overview of each assignment:
- Assignment 1: Data Understanding and Preprocessing
- Description: Students are given a dataset of anonymized mammographic masses. The core task is to explore basic functionality for visualization, data understanding, and data reduction (PCA, TSVD).
- Folder: Mammography masses
- Assignment 2: Dashboard visualization and coordinated view
- Description: The aim is to build a dashboard that interactively visualizes insights from a dataset of American flight delays in 2008. We used Plotly to generate the dashboard.
- Folder: Airlines delay
- Assignment 3: Clustering methods and distance functions
- Description: Students are given a dataset of customer behaviour on an online shopping platform. The aim is to identify different types of customers using clustering algorithms such as BIRCH or DBSCAN.
- Folder: Online shoppers intentions
- Assignment 4: Classification methods and model evaluation
- Description: The aim is to apply classification algorithms to predict whether a blood donor will return to the clinic in the following month. The techniques used include, for example, KNN and SVC.
- Folder: Blood transfusion