GitHub - sergiorosendo/sale_price_prediction: A classic machine learning problem. This project showcases what I aspire for in a solution using Python. Code modularization, concise notebooks and easy installation.

Instructions

To run this project some python libraries are required. These are defined in requirements.txt.

The recommended way to install them is to use conda, or pip, and virtual environments.

Installing with conda

With Anaconda installed, execute the following in a anaconda command prompt :

Change to this project's folder

$ cd "path/to/folder"
Create a virtual environment.

$ conda create -y --name env-name
Activate your new environment.

$ conda activate env-name
Install the required libraries.

$ conda install --force-reinstall -y --name env-name -c conda-forge --file requirements.txt

Installing with pip

With Python 3+ and pip installed, execute the following in a command prompt :

Change to this project's folder

$ cd "path/to/folder"
Create a virtual environment.

$ py -m venv env-name
Activate your new environment.

$ .\env-name\Scripts\activate
Install the required libraries.

$ pip install -r requirements.txt

Using conda is preferred, pip may fail to install libraries like shap in Windows, if some microsoft visual studio resources are not already installed.

Executing Notebooks

Once you have the libraries installed you can edit the source code and interact with the notebooks yourself.

To open the notebooks, with the virtual environment activated, in this project's folder, execute:

$ jupyter notebook

This should open a browser tab in which you can select and run the notebooks.

Notebooks

Data Analisys (1_notebook)

The initial notebooks are for data exploration, they explore the data's quality and help to understand the features's overall distribution before going into predictions.

Model (2_notebook)

Here models are built, tested and interpreted.

Rerunning the prediction

It is possible to rerun the prediction process step by step.

To do so, trough the notebook IDE in notebook #2:

Restart the kernel, keeping only markdowns and source code:
```
 Go to Kernel > Restart & Clear Outputs
```

Start executing cells:

 Click '>| Run' to execute selected cell's source code

Testing different models

The notebook is designed as to allow for the easy testing of different models.

With little code edits it is possible to perform feature and classifier selection.

Feature selection

To test the impact of feature selection, edit Cell 3, taking features in and out of the corresponding list.

Classifier selection

To use a different classifier import it from scikit's module and use it as clf, like so:

from sklearn.ensemble import RandomForestRegressor
clf = RandomForestRegressor()

When defining the classifier it is also possible to use different hyper-parameters, using its optional arguments.

This way, with little programming experience, one can build, test and validate its own models and get a good grasp of the machine learning workflow.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
data		data
notebooks		notebooks
src		src
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

notebooks

notebooks

src

src

.gitignore

.gitignore

README.md

README.md

requirements.txt

requirements.txt

Repository files navigation

Instructions

Installing with conda

Installing with pip

Executing Notebooks

Notebooks

Data Analisys (1_notebook)

Model (2_notebook)

Rerunning the prediction

Testing different models

Feature selection

Classifier selection

About

Releases

Packages

Languages

sergiorosendo/sale_price_prediction

Folders and files

Latest commit

History

Repository files navigation

Instructions

Installing with conda

Installing with pip

Executing Notebooks

Notebooks

Data Analisys (1_notebook)

Model (2_notebook)

Rerunning the prediction

Testing different models

Feature selection

Classifier selection

About

Resources

Stars

Watchers

Forks

Languages