personalized-nlp

Personalized prediction applied to various subjective natural language processing (NLP) tasks

Download data

To download the data, enter the personalized_nlp folder and run:

dvc pull

How to run experiments

First, preprocess the selected dataset with the scripts.process_data pipeline to assign folds to texts:

python -m scripts.process_data --annotations_df_path .../personalized-nlp/storage/data/unhealthy_conversations/uc_annotations.csv --texts_df_path .../personalized-nlp/storage/data/unhealthy_conversations/uc_texts.csv --annotator_col annotator_id --num_folds 5 --text_col text_id
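The fold assignment step can be pictured roughly as follows. This is a minimal sketch, not the code in scripts.process_data: the function name assign_folds, the seed argument, and the shuffle-then-round-robin strategy are assumptions made for illustration only.

```python
import random
from collections import Counter


def assign_folds(text_ids, num_folds=5, seed=0):
    """Map each unique text id to a fold index in [0, num_folds)."""
    unique_ids = sorted(set(text_ids))
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    rng.shuffle(unique_ids)
    # Round-robin over the shuffled ids keeps fold sizes within one of each other.
    return {text_id: i % num_folds for i, text_id in enumerate(unique_ids)}


# Hypothetical example: 23 text ids spread over 5 folds.
folds = assign_folds(range(23), num_folds=5)
fold_sizes = Counter(folds.values())
```

Folds are assigned per text rather than per annotation, so every annotation of a given text ends up in the same fold.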

Then, you can run experiments with:

python -m personalized_nlp.experiments.uhnealthy

How to add DataModule for a new dataset

Copy one of the existing dataset classes (personalized_nlp/datasets/) and modify paths and settings. Next, copy one of the experiments (personalized_nlp/experiments/) and customize the settings.

How to select folding setup

Set the stratify_folds_by argument of the datamodule: None for a standard train-val-test split, 'users' for a split by users (past-present-future1-future2), and 'texts' for a split by text folds.
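To illustrate what splitting by text folds means, here is a minimal sketch. It is not the datamodule's implementation: the function split_by_text_folds, the val_fold and test_fold parameters, and the row layout (text_id, annotator_id, label) are all hypothetical.

```python
def split_by_text_folds(annotations, text_folds, val_fold, test_fold):
    """Split (text_id, annotator_id, label) rows by the fold of their text."""
    split = {"train": [], "val": [], "test": []}
    for row in annotations:
        fold = text_folds[row[0]]  # fold of the annotated text
        if fold == test_fold:
            split["test"].append(row)
        elif fold == val_fold:
            split["val"].append(row)
        else:
            split["train"].append(row)
    return split


# Hypothetical data: 10 texts, each annotated by two annotators.
annotations = [(t, a, 0) for t in range(10) for a in ("a1", "a2")]
text_folds = {t: t % 5 for t in range(10)}
split = split_by_text_folds(annotations, text_folds, val_fold=3, test_fold=4)
```

With this kind of split, no text appears in more than one of the train, validation, and test parts, while the same annotator may appear in all of them.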
