LarsSven/EBSE_SATD_Replication

EBSE Replication Package

This repository contains the code and data needed to reproduce the results of our research project on Self-Admitted Technical Debt in Pull Requests, carried out for the course Evidence-Based Software Engineering.

0. Install prerequisites

$ pip install -U requests lizard scipy scikit-learn

1. Preprocess original dataset

$ cd Preprocessing
$ python preprocess.py
$ cp new.csv ../QualitativeAnalysis/Round1/sampling_input.csv

2. Perform sampling for qualitative analysis rounds

Rounds 1 and 2 (replace <x> with the round number):

$ cd ../QualitativeAnalysis/Round<x>
$ python ../subsample.py
$ cp nonsampled.csv ../Round<x+1>/sampling_input.csv

Round 3:

$ cd ../Round3

In subsample.py, set NUM_SAMPLES to 172 (one third of the total). Then run the sampling once for each researcher (Lars, Germán, Koen):

$ python ../subsample.py

Use the nonsampled.csv output as the input file for the next researcher.
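
The chained sampling described above can be sketched as follows. This is only an illustration of the idea, not the code in subsample.py: the function names `subsample` and `split_among`, the `seed` parameter, and the use of an in-memory list instead of CSV files are all assumptions made for the example.

```python
import random


def subsample(rows, num_samples, seed=None):
    """Split rows into (sampled, nonsampled) without replacement.

    Hypothetical helper; the real subsample.py may work differently.
    """
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(rows)), num_samples))
    sampled = [row for i, row in enumerate(rows) if i in chosen]
    nonsampled = [row for i, row in enumerate(rows) if i not in chosen]
    return sampled, nonsampled


def split_among(rows, names, num_samples, seed=0):
    """Draw one sample per researcher, feeding each leftover set to the next."""
    remaining = rows
    assigned = {}
    for offset, name in enumerate(names):
        # The nonsampled output of one run is the input of the next run.
        sampled, remaining = subsample(remaining, num_samples, seed + offset)
        assigned[name] = sampled
    return assigned, remaining


if __name__ == "__main__":
    # Assumes 516 rows in total, i.e. three samples of 172.
    rows = list(range(516))
    assigned, leftover = split_among(rows, ["Lars", "German", "Koen"], 172)
    for name, sample in assigned.items():
        print(name, len(sample))
    print("left over:", len(leftover))
```

Because each run only sees the rows the previous run did not pick, no item is assigned to two researchers.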

Round 4 (verification):

Set NUM_SAMPLES to 17. Then sample from Round3/sampled_{Lars,German,Koen}.csv:

$ cd ../Round4
$ python ../subsample.py

3. Calculate kappa score

$ cd Agreement
$ python kappa.py
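
For readers unfamiliar with the statistic, the sketch below computes Cohen's kappa for two raters in plain Python. It assumes the agreement score is Cohen's kappa between two label lists; kappa.py may use a different variant or a library call. The function name `cohen_kappa` and the labels "debt" and "none" are made up for the example. scikit-learn, installed in step 0, provides `sklearn.metrics.cohen_kappa_score` for the same statistic.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items (illustrative)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    rater_a = ["debt", "debt", "none", "none"]  # hypothetical labels
    rater_b = ["debt", "none", "none", "none"]
    print(cohen_kappa(rater_a, rater_b))  # 0.5
```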

4. Retrieve code changes

$ cd ../../QuantitativeAnalysis
$ python codechanges.py

5. Retrieve project languages

$ cd Languages
$ python languages.py

6. Merge dataset with categories

$ cd ../Data_analysis
$ python merge_categories.py

7. Generate raw data for source code analysis

$ python generate_metrics.py

8. Perform statistical significance tests

$ cd ../Statistical_Significance
$ python levene.py
$ python ttest.py
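
The two tests are commonly run in sequence: Levene's test checks whether two groups have equal variances, and its outcome decides between Student's and Welch's t-test. The sketch below shows that pattern with SciPy. It is an assumption that levene.py and ttest.py are combined this way; the function name `compare_groups`, the `alpha` threshold, and the sample values are hypothetical.

```python
from scipy import stats


def compare_groups(group_a, group_b, alpha=0.05):
    """Run Levene's test, then a t-test that matches its outcome (illustrative)."""
    # Levene's test: the null hypothesis is that both groups have equal variance.
    _, levene_p = stats.levene(group_a, group_b)
    equal_var = bool(levene_p >= alpha)
    # equal_var=True gives Student's t-test, equal_var=False gives Welch's t-test.
    t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=equal_var)
    return {
        "levene_p": float(levene_p),
        "equal_var": equal_var,
        "t_stat": float(t_stat),
        "t_p": float(t_p),
    }


if __name__ == "__main__":
    # Made-up metric values for two groups of pull requests.
    with_satd = [1.0, 2.0, 3.0, 4.0, 5.0]
    without_satd = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(compare_groups(with_satd, without_satd))
```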
