Data and source code for the paper "Gamified Incentives: A Badge Recommendation Model to Improve User Engagement in Social Networking Websites". [pdf]
Contains badge datasets for different time periods, which are extracted from the `Badges.xml` data of Stack Overflow using `dataset_generation/extract_badges.py`.
`dataset2008`: Contains randomly generated train and test datasets for badges awarded in the year 2008, produced from the badges dataset using `dataset_generation/train_test_generation.py`. The complete datasets for the years 2008 to 2010 are compressed in the `datasets.zip` file and should be uncompressed, like `dataset2008`, before use.
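As an illustration of what a random train/test split over badge data can look like, here is a minimal sketch. It is not the code in `train_test_generation.py`: the per-user hold-out scheme, the function name `split_user_badges`, and the `test_fraction` and `seed` parameters are all assumptions made for this example.

```python
import random


def split_user_badges(badges_by_user, test_fraction=0.2, seed=0):
    """Randomly hold out part of each user's badges as a test set.

    Assumed scheme (hypothetical, the repository's script may differ):
    users with a single badge keep it in the training set.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = {}, {}
    for user_id, badges in badges_by_user.items():
        shuffled = list(badges)
        rng.shuffle(shuffled)
        n_test = 0
        if len(shuffled) >= 2:
            # Hold out at least one badge, but leave at least one for training.
            n_test = min(len(shuffled) - 1,
                         max(1, round(len(shuffled) * test_fraction)))
        test[user_id] = shuffled[:n_test]
        train[user_id] = shuffled[n_test:]
    return train, test


# Toy input: user id -> list of badge names.
badges_by_user = {
    "3": ["Teacher", "Student", "Editor", "Critic", "Scholar"],
    "5": ["Teacher"],
}
train_set, test_set = split_user_badges(badges_by_user)
```

Holding out badges per user, rather than holding out whole users, keeps every user in the training set so that a recommender has some history to work from.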
`extract_badges.py`: Extracts the badges awarded to each user between `begin_year` and `end_year` from the `Badges.xml` file. It writes the output as a `csv` file in the format `UserId,badge1,badge2,...`.
`train_test_generation.py`: Generates the train and test sets from the `badges.csv` file for the `recommendation/` algorithms.
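The extraction step described for `extract_badges.py` can be sketched as follows. This is a hedged illustration, not the repository's code: it assumes each `<row>` in `Badges.xml` carries `UserId`, `Name`, and `Date` attributes (as in the public Stack Overflow data dump), that `Date` starts with a four-digit year, and that both `begin_year` and `end_year` are inclusive. The function names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def extract_badges(xml_source, begin_year, end_year):
    """Map each UserId to the badges awarded in [begin_year, end_year]."""
    badges_by_user = defaultdict(list)
    # iterparse reads the file incrementally instead of in one pass.
    for _event, row in ET.iterparse(xml_source, events=("end",)):
        if row.tag != "row":
            continue
        year = int(row.attrib["Date"][:4])  # assumes ISO-style dates
        if begin_year <= year <= end_year:  # assumption: inclusive bounds
            badges_by_user[row.attrib["UserId"]].append(row.attrib["Name"])
        row.clear()  # drop the attributes of rows already handled
    return dict(badges_by_user)


def write_badges_csv(badges_by_user, out_file):
    """Write one line per user in the form UserId,badge1,badge2,..."""
    writer = csv.writer(out_file)
    for user_id, badges in badges_by_user.items():
        writer.writerow([user_id, *badges])


# Tiny in-memory stand-in for Badges.xml.
sample_xml = (
    b'<badges>'
    b'<row Id="1" UserId="3" Name="Teacher" Date="2008-09-15T08:55:03.923" />'
    b'<row Id="2" UserId="3" Name="Student" Date="2009-01-01T00:00:00.000" />'
    b'<row Id="3" UserId="5" Name="Editor" Date="2008-10-01T00:00:00.000" />'
    b'</badges>'
)
badges_2008 = extract_badges(io.BytesIO(sample_xml), 2008, 2008)
csv_buffer = io.StringIO()
write_badges_csv(badges_2008, csv_buffer)
```

With a real dump, `xml_source` can be a file path and `out_file` a file opened with `newline=""`, as the `csv` module recommends.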
`collaborative_filtering.py`: Implementation of the item-based collaborative filtering method to recommend badges and evaluate the results.
`popular_badge_baseline.py`: Implementation of the baseline algorithm, which recommends popular badges to each user.
`datasets.py`: This module contains the file paths of the datasets for easier access from other modules. If you change the repository structure or move the `data` directory, modify `_DATASET_ROOT` in this module so that it points to the root directory where the datasets reside.
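To make the two recommendation approaches concrete, here is a minimal sketch of item-based collaborative filtering next to a popularity baseline. It is an illustration under stated assumptions, not the implementation in `collaborative_filtering.py` or `popular_badge_baseline.py`: it assumes binary badge ownership, cosine similarity between badges, and a score that sums similarities to the badges a user already holds. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def badge_similarities(train):
    """Cosine similarity between badges over binary ownership (assumed)."""
    owners = Counter()            # badge -> number of users holding it
    co_owned = defaultdict(int)   # (badge_a, badge_b) -> users holding both
    for badges in train.values():
        unique = sorted(set(badges))
        owners.update(unique)
        for a, b in combinations(unique, 2):
            co_owned[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), n in co_owned.items():
        s = n / math.sqrt(owners[a] * owners[b])
        sims[a][b] = s
        sims[b][a] = s
    return sims, owners


def recommend_item_based(user_badges, sims, k=3):
    """Rank unowned badges by summed similarity to the user's badges."""
    owned = set(user_badges)
    scores = defaultdict(float)
    for badge in owned:
        for other, s in sims.get(badge, {}).items():
            if other not in owned:
                scores[other] += s
    # Ties are broken alphabetically to keep the output deterministic.
    return sorted(scores, key=lambda b: (-scores[b], b))[:k]


def recommend_popular(user_badges, owners, k=3):
    """Baseline: the most widely held badges the user does not have yet."""
    owned = set(user_badges)
    ranked = sorted(owners, key=lambda b: (-owners[b], b))
    return [b for b in ranked if b not in owned][:k]


# Toy training data: user id -> list of badge names.
train = {
    "u1": ["Teacher", "Student", "Editor"],
    "u2": ["Teacher", "Student"],
    "u3": ["Teacher", "Critic"],
    "u4": ["Editor", "Critic"],
}
sims, owners = badge_similarities(train)
cf_recs = recommend_item_based(["Critic"], sims, k=2)
popular_recs = recommend_popular(["Critic"], owners, k=2)
```

On this toy data the two methods rank the candidates differently for a user who holds only `Critic`: the item-based scorer puts `Editor` first because it co-occurs with `Critic` relatively often, while the baseline puts `Teacher` first because it is the most widely held badge.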