This project is part of the course TDT4173 - Machine Learning at NTNU. The project proposal is available in project_proposal.md.
Clustering methods being evaluated:
- Agglomerative Clustering
- BIRCH
- DBSCAN
- k-Means
- Mean Shift
- Spectral Clustering
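Assuming the models are built with scikit-learn (an assumption; the pinned versions live in requirements.txt), the six methods can be instantiated and compared on toy data roughly like this. The hyperparameters below are placeholders, not the values used in the project:

```python
import numpy as np
from sklearn.cluster import (
    AgglomerativeClustering, Birch, DBSCAN, KMeans, MeanShift, SpectralClustering
)
from sklearn.metrics import silhouette_score

# Toy data: two well-separated blobs, standing in for the real country features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])

# One instance per method under evaluation (hyperparameters are placeholders).
models = {
    "Agglomerative Clustering": AgglomerativeClustering(n_clusters=2),
    "BIRCH": Birch(n_clusters=2),
    "DBSCAN": DBSCAN(eps=1.0, min_samples=3),
    "k-Means": KMeans(n_clusters=2, n_init=10, random_state=0),
    "Mean Shift": MeanShift(),
    "Spectral Clustering": SpectralClustering(n_clusters=2, random_state=0),
}

for name, model in models.items():
    labels = model.fit_predict(X)
    print(f"{name}: {len(set(labels))} clusters, "
          f"silhouette = {silhouette_score(X, labels):.3f}")
```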
This project uses the "Mortality Risk of COVID-19" dataset from Our World in Data (https://ourworldindata.org/mortality-risk-covid). The dataset contains country-by-country data on the mortality risk of the COVID-19 pandemic.
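As a minimal sketch of what the OWID data looks like once loaded (a tiny inline sample stands in for the real CSV files in the data folder; the column names follow the OWID schema, but the numbers are made up):

```python
import pandas as pd

# Inline stand-in for the raw OWID export (values are illustrative only).
df = pd.DataFrame({
    "location": ["Norway", "Sweden"],
    "total_cases": [100_000, 200_000],
    "total_deaths": [500, 2_000],
})

# Case fatality rate: confirmed deaths divided by confirmed cases --
# one of the mortality-risk measures discussed on the OWID page.
df["case_fatality_rate"] = df["total_deaths"] / df["total_cases"]
print(df)
```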
All project dependencies are listed, pinned to specific versions, in requirements.txt and can easily be installed.
If you are using conda, run the following at the command-line:
conda install --file requirements.txt
If you are using pip, run the following at the command-line:
pip install -r requirements.txt
Important! All the Python scripts are assumed to be executed from the project root. Do not run them from a subdirectory (such as src).
👍 Example of correct usage:
.../covid-19-clustering$ python src/preprocessing.py
👎 Example of incorrect usage:
.../covid-19-clustering/src$ python preprocessing.py
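The reason for this constraint is presumably that the scripts open data files via paths relative to the working directory. A minimal sketch of how such a relative path resolves (the filename is hypothetical):

```python
from pathlib import Path

# A relative path such as "data/clean.csv" resolves against the *current
# working directory*, not against the script's own location -- so launching
# a script from src/ would look for src/data/clean.csv and fail.
relative = Path("data") / "clean.csv"
print(relative.resolve())  # absolute path under the directory you ran from
```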
The files in the notebooks folder are Jupyter notebooks. The data folder contains the raw CSV files from OWID, as well as the processed and cleaned files.
📂covid-19-clustering
┣ 📁.github (CI config)
┣ 📁.vscode (vscode editor config)
┣ 📁data (raw, clean, and processed csv files)
┣ 📁models (persisted models with metadata)
┣ 📁notebooks (jupyter notebooks)
┣ 📁results (clustering assignment and metrics for each model as well as plots)
┣ 📁src
┃ ┣ 📁evaluation (Python scripts for comparing models)
┃ ┣ 📁model (Python scripts for training and persisting the models)
┃ ┣ 📁visualization (Python scripts for making visualizations)
┃ ┣ 📜preprocessing.py (data cleaning and preprocessing; script version of the EDA notebook in 📁notebooks)
┃ ┣ 📜utils.py
┣ 📁tests
┣ 📜.flake8
┣ 📜.gitignore
┣ 📜project_proposal.md
┣ 📜README.md (this file)
┣ 📜requirements.txt (3rd-party dependencies / packages)
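As a rough illustration of what "persisted models with metadata" in the models folder could look like (a hedged sketch, not the actual format used by the scripts in src/model; the filenames and metadata fields are hypothetical):

```python
import json
from pathlib import Path

import joblib
import numpy as np
from sklearn.cluster import KMeans

# Fit a model on placeholder data (the real features come from data/).
X = np.random.default_rng(0).normal(size=(30, 2))
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Persist the fitted estimator plus a small JSON metadata sidecar.
out = Path("models")
out.mkdir(exist_ok=True)
joblib.dump(model, out / "kmeans.joblib")
(out / "kmeans.json").write_text(json.dumps({
    "model": "k-Means",
    "n_clusters": 3,
    "inertia": float(model.inertia_),
}))
```

A persisted model can then be reloaded with `joblib.load("models/kmeans.joblib")` for evaluation or plotting.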