Machine Learning, EDA, Binary Classification task weather dataset, ANN, SVM, LR

sondosaabed/Weather-Dataset-Analysis

Weather-Dataset-Analysis:

This project was created as part of the Machine Learning course at BZU. The results of analyzing the provided weather data can be found in the accompanying document. The conclusions drawn from the analysis support better decision making about whether it will rain tomorrow.

Features overview:

Histograms showing the feature distributions:

image

Outlier detection using box plots:

image

Data correlation shown using a heat map:

image

Models trained

Logistic Regression:

image

Support Vector Machine:

image

Artificial Neural Network:

image

Conclusions

The analysis of the weather dataset revealed missing values and outliers, which were handled using a KNN imputer and capping and flooring, respectively. It also showed that features with larger ranges could dominate the others in their contribution to the classification task, so feature scaling was performed.
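The preprocessing steps described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the simplified `knn_impute` helper, the 1.5 × IQR bounds used for capping and flooring, the choice of min-max scaling, and the toy `rows` and `rainfall` values are all assumptions made for the example.

```python
import math
from statistics import quantiles


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Simplified sketch: distance is computed over features observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours must have feature j observed.
            donors = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            )
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def cap_and_floor(values, whisker=1.5):
    """Clip values to [Q1 - whisker*IQR, Q3 + whisker*IQR] (assumed bounds)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return [min(max(v, lower), upper) for v in values]


def min_max_scale(values):
    """Rescale values to the [0, 1] range so no feature dominates by magnitude."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy data (hypothetical, not taken from the weather dataset).
rows = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [10.0, 100.0]]
imputed = knn_impute(rows, k=2)

rainfall = [1, 2, 3, 4, 5, 6, 7, 8, 100]
capped = cap_and_floor(rainfall)
scaled = min_max_scale(capped)
```

In this toy run the missing entry is filled from its two nearest rows, the extreme value 100 is pulled down to the upper bound, and the capped column is mapped onto [0, 1].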

Multivariate analysis was performed to determine the correlation between the features and the target. This led to the removal of features with low positive correlation with the target, and of features highly correlated with one another, to avoid redundancy.
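A correlation-based selection of this kind might look like the sketch below. The thresholds (`min_target_corr`, `max_pair_corr`), the use of absolute Pearson correlation, and the feature names and values are assumptions chosen for illustration; the source does not state which cut-offs were used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)


def select_features(features, target, min_target_corr=0.1, max_pair_corr=0.9):
    """Drop weakly relevant features, then drop redundant ones.

    Thresholds are hypothetical defaults, not values from the project.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    kept = [name for name in features if relevance[name] >= min_target_corr]
    # Visit the most relevant features first, so that within a highly
    # correlated pair the less relevant member is the one discarded.
    kept.sort(key=lambda name: relevance[name], reverse=True)
    selected = []
    for name in kept:
        if all(
            abs(pearson(features[name], features[other])) <= max_pair_corr
            for other in selected
        ):
            selected.append(name)
    return selected


# Toy data with hypothetical feature names.
target = [0, 0, 1, 1, 0, 1]
features = {
    "humidity_3pm": [40, 45, 80, 85, 50, 90],
    "humidity_9am": [42, 46, 79, 86, 52, 88],          # near-duplicate of humidity_3pm
    "pressure": [1015, 1005, 1004, 1012, 1013, 1003],
    "noise": [5, 7, 5, 7, 6, 6],                       # uncorrelated with the target
}
selected = select_features(features, target)
```

Here `noise` is removed for lack of correlation with the target, and `humidity_9am` is removed because it is almost perfectly correlated with the more relevant `humidity_3pm`.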

After evaluating three classification algorithms (LR, ANN, and SVM), the ANN classifier performed best, with the highest ROC/AUC score and precision. ANN was therefore selected to predict whether it will rain tomorrow.
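To make the comparison criteria concrete, the sketch below computes ROC AUC and precision for two sets of predicted probabilities and picks the one with the higher AUC. The names `model_a` and `model_b`, their scores, and the 0.5 decision threshold are hypothetical and do not reproduce the project's results.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision(y_true, scores, threshold=0.5):
    """Share of predicted positives that are truly positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(y_true, predicted) if p and y == 1)
    fp = sum(1 for y, p in zip(y_true, predicted) if p and y == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labels and predicted rain probabilities (hypothetical values).
y_true = [0, 0, 1, 1, 0, 1]
predictions = {
    "model_a": [0.1, 0.4, 0.35, 0.8, 0.2, 0.9],
    "model_b": [0.1, 0.2, 0.6, 0.8, 0.3, 0.9],
}

report = {
    name: {"auc": roc_auc(y_true, s), "precision": precision(y_true, s)}
    for name, s in predictions.items()
}
best = max(report, key=lambda name: report[name]["auc"])
```

ROC AUC is threshold-independent, while precision depends on the chosen cut-off, which is why reporting both gives a fuller picture of each classifier.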

These results can be valuable for Al-Bireh municipality in making informed decisions about weather predictions, such as allocating resources for potential rain-related events or planning outdoor activities based on the predicted weather. By using the best performing classifier, the municipality can have more confidence in its weather predictions and respond more effectively to potential weather-related challenges.