In this repository, we use a National Hospital Care Survey (NHCS) dataset on patient drug use to understand patterns of health care delivery and utilization in the U.S., with a focus on opioid and other drug overdoses from 2020 to 2023.
For our final project, we'll perform the following operations on this dataset:
- Data Cleaning/Preparation
- Exploratory Data Analysis
- Model Selection
- Model Analysis
- Conclusion and Recommendations
The NHCS collects data on patient care in hospital-based settings to offer insights into health care delivery patterns in the U.S. While the data from 2020-2023 is preliminary and not nationally representative, it can provide valuable insights into the use of opioids and other overdose drugs.
- Data come from 25 hospitals for inpatient settings and 25 hospitals for emergency departments (ED).
- Data span January 1, 2020, to May 27, 2023.
- The dataset includes indicators related to drug use, such as overall drug use, comorbidities, drug overdose, and polydrug overdose.
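The date window and indicator structure described above can be checked with a short pandas sketch. The column names (`week_start`, `setting`, `indicator`, `percent`) and the inline sample rows are hypothetical stand-ins, since the real file layout in `data/` is not documented here, and dropping rows with missing values is only one of several reasonable choices.

```python
import io

import pandas as pd

# Hypothetical stand-in for a raw file in data/; real column names may differ.
RAW_CSV = """week_start,setting,indicator,percent
2019-12-29,ED,Opioid overdose,1.2
2020-01-05,ED,Opioid overdose,1.4
2020-01-05,Inpatient,Polydrug overdose,
2023-05-21,ED,Opioid overdose,1.1
2023-06-04,ED,Opioid overdose,0.9
"""

# Study window stated in this README.
WINDOW_START = pd.Timestamp("2020-01-01")
WINDOW_END = pd.Timestamp("2023-05-27")


def load_and_clean(source):
    """Parse dates, keep rows inside the study window, drop rows with no value."""
    df = pd.read_csv(source, parse_dates=["week_start"])
    in_window = df["week_start"].between(WINDOW_START, WINDOW_END)
    df = df.loc[in_window].dropna(subset=["percent"])
    return df.reset_index(drop=True)


cleaned = load_and_clean(io.StringIO(RAW_CSV))
print(cleaned)
```

To run this against a real file, pass its path to `load_and_clean` instead of the in-memory sample and adjust the column names to match.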
- `data/`: Folder containing the raw data.
- `notebooks/`: Jupyter notebooks for data analysis, model selection, and evaluation.
- `README.md`: This file.
- Introduction
  - Background of the NHCS and its importance.
  - Overview of the opioid crisis and the relevance of the dataset.
- Data Cleaning/Preparation
  - Data wrangling steps.
  - Handling missing values, outliers, and data transformations.
- Exploratory Data Analysis
  - Data distributions, trends, and patterns.
  - Visualizations of key metrics and features.
- Model Selection
  - Criteria for model selection.
  - Comparisons of different models and their performance metrics.
- Model Analysis
  - Detailed analysis of the selected model.
  - Feature importance, model evaluation, and validation.
- Conclusion and Recommendations
  - Key findings from the analysis.
  - Recommendations for health care policy, hospital practices, and further research.
- Appendix
  - Output of code from the technical Jupyter Notebook.
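The model selection step in the outline above could be sketched as a cross-validated comparison with scikit-learn. This is only an illustration: it assumes a classification task (which this README does not specify), uses synthetic placeholder data instead of the NHCS dataset, and the two candidate models are arbitrary examples, not the ones used in the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic placeholder data; the project's real features and target are not shown here.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical candidate models to compare.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Pick the candidate with the highest mean score.
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best)
```

Accuracy is used here for simplicity; a different `scoring` value may be more appropriate if the real target is imbalanced.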
- Clone this repository.
- Run `jupyter notebook`.
- Navigate to the `notebooks/` directory and open the project notebook.
- Follow the instructions in the notebook to run the analysis.
- Python
- Jupyter Notebook
- Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn
Brian Morris, Will Kencel
This project is licensed under the MIT License.