Statistical Modeling of Sickle Cell Disease

The Problem

We wish to determine if an experimental treatment for sickle cell disease (SCD) is effective. Patients with SCD suffer from vaso-occlusive episodes (VOEs), which can be severe enough to require hospitalization. The treatment occurs at a single point in time, and we wish to determine if the rate of episodes after the treatment is significantly lower than the rate before the treatment.

(Because the treatment is invasive, ethical considerations prevent a double-blinded experiment.)

Installation

We assume you have already installed Python 3 and pip3.

To download this repo and the required libraries:

git clone https://github.com/wfbradley/sickle_stats.git
cd sickle_stats
pip3 install -r requirements.txt --user --upgrade

If the pip install fails for permission reasons, one can instead try

sudo -H pip3 install -r requirements.txt --upgrade

Data

The data is confidential, so it cannot be provided on GitHub. All confidential data should be put in the "confidential_data/" subdirectory.

The data is structured as follows...

Data is cleaned and put into a standard form by running

python 00_clean_data.py

Running Everything

All the principal scripts can be run, in order, with a single command:

python Master_sickle.py

Runtime should be under a minute. By default, output goes to data; figures go to data/figures; and no figures are plotted to screen. To display the figures, run

python Master_sickle.py --draw_plots

Models

The rate of episodes differs between patients, so many of the models have a hierarchical structure: for each patient, we first sample the severity of the disease for that individual; then, conditional on the severity, we sample the times of the episodes for that patient.
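The two-stage sampling can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in this repo: the function name, the Gamma severity distribution, the homogeneous Poisson process for episode times, and all parameter values are hypothetical choices made for the example.

```python
import random


def sample_patient_episodes(rng, shape, scale, follow_up_years):
    """Two-stage draw: patient severity first, then episode times given severity.

    Hypothetical sketch; severity is taken to be an episode rate per year.
    """
    # Stage 1: per-patient severity, assumed here to be Gamma distributed.
    rate = rng.gammavariate(shape, scale)

    # Stage 2: episode times on [0, follow_up_years), assuming a homogeneous
    # Poisson process, built from exponential gaps between episodes.
    times = []
    t = rng.expovariate(rate)
    while t < follow_up_years:
        times.append(t)
        t += rng.expovariate(rate)
    return rate, times


# Simulate three hypothetical patients over five years of follow-up.
rng = random.Random(2018)
for patient in range(3):
    rate, times = sample_patient_episodes(rng, shape=2.0, scale=1.5, follow_up_years=5.0)
    print(patient, round(rate, 2), len(times))
```

Passing an explicit `random.Random` instance keeps the simulation reproducible without touching the global random state.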

Hierarchical Poisson

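The episode process for patient i is modeled as a Poisson process of rate lambda_i. The lambda_i are drawn from, say, a Gamma(r,alpha) distribution. The resulting episode counts follow a Gamma-Poisson mixture, which is a negative binomial distribution. (This is not the same as the compound Poisson-gamma distribution, a type of Tweedie distribution, which describes the sum of a Poisson-distributed number of Gamma-distributed terms.)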
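For reference, the marginal distribution of the episode count can be worked out directly. The derivation below assumes that Gamma(r,alpha) means shape r and rate alpha (the parameterization is not specified above) and introduces T for the length of the observation window and N_i for the number of episodes patient i has in that window; both symbols are notation added here.

```latex
\begin{align*}
N_i \mid \lambda_i &\sim \mathrm{Poisson}(\lambda_i T), \qquad
\lambda_i \sim \mathrm{Gamma}(r, \alpha), \\
P(N_i = n)
  &= \int_0^\infty \frac{(\lambda T)^n e^{-\lambda T}}{n!}\,
     \frac{\alpha^r \lambda^{r-1} e^{-\alpha\lambda}}{\Gamma(r)}\, d\lambda \\
  &= \frac{T^n \alpha^r}{n!\,\Gamma(r)} \cdot \frac{\Gamma(n+r)}{(\alpha+T)^{n+r}} \\
  &= \frac{\Gamma(n+r)}{n!\,\Gamma(r)}
     \left(\frac{\alpha}{\alpha+T}\right)^{r}
     \left(\frac{T}{\alpha+T}\right)^{n}.
\end{align*}
```

Under these assumptions the count has mean rT/alpha and variance (rT/alpha)(1 + T/alpha), so it is always overdispersed relative to a plain Poisson count with the same mean.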

Hierarchical Negative Binomial Process

The episode process for patient i is modeled as a negative binomial process with parameters (r_i,p_i), i.e. NB(r_i,p_i). The r_i can be sampled from another negative binomial distribution, and the p_i from a beta distribution. (These should probably be correlated.)
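A minimal simulation sketch of this model follows. It is not the repo's implementation, and it rests on several assumptions: NB(r,p) is taken to count failures before the r-th success, r_i is shifted by 1 so that it is never zero, episode counts are drawn independently per period, and r_i and p_i are drawn independently (so the correlation mentioned above is ignored). The function names and hyperparameter values are hypothetical.

```python
import random


def sample_negative_binomial(rng, r, p):
    """Number of failures before the r-th success, with success probability p."""
    failures = 0
    successes = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures


def sample_patient_counts(rng, n_periods, r_hyper=(3, 0.5), p_hyper=(2.0, 2.0)):
    """Draw (r_i, p_i) for one patient, then one episode count per period."""
    # r_i from a negative binomial, shifted by 1 so that r_i >= 1 (assumption).
    r_i = 1 + sample_negative_binomial(rng, *r_hyper)
    # p_i from a beta distribution, drawn independently of r_i (assumption).
    p_i = rng.betavariate(*p_hyper)
    # Conditional on (r_i, p_i), counts per period are NB(r_i, p_i).
    counts = [sample_negative_binomial(rng, r_i, p_i) for _ in range(n_periods)]
    return r_i, p_i, counts


# Simulate one hypothetical patient over 12 periods.
rng = random.Random(2018)
r_i, p_i, counts = sample_patient_counts(rng, n_periods=12)
print(r_i, round(p_i, 3), counts)
```

Introducing correlation between r_i and p_i would require a joint prior, which this sketch deliberately leaves out.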

Authors

This code was conceived and written by William Bradley and Karl Knaub in 2018.

License

This project is licensed under the MIT License. See LICENSE file for the complete license.

Copyright (c) 2018 William Bradley and Karl Knaub

Acknowledgements
