
git-persistence/git-persistence


Introduction

git-persistence aims to measure the git contributions of individuals to a repository and to score them based on the quality of those contributions. It is language-independent and can track code that has been slightly modified or moved, attributing each character of code to its rightful author. The tool addresses many of the shortcomings of git-blame and similar tools.

For a more detailed technical description of how it works, refer to the following article (information to be updated soon):

Tsikerdekis, M. (in-press). Persistent Code Contribution: A Ranking Algorithm for Code Contribution in Crowdsourced Software. Empirical Software Engineering, (xx) xx, doi: 10.1007/s10664-017-9575-4.

Link: article link link2
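
To illustrate what character-level attribution means, here is a minimal sketch built on Python's standard difflib module. It is not the algorithm used by git-persistence (see the article above for that), and unlike git-persistence it does not detect moved code. The function name attribute and the owner-list representation are assumptions made for this example only.

```python
from collections import Counter
from difflib import SequenceMatcher


def attribute(old_text, old_owners, new_text, author):
    """Return one owner per character of new_text.

    Characters that difflib reports as unchanged keep their previous
    owner; every other character is credited to `author`.
    Hypothetical helper for illustration, not part of git-persistence.
    """
    new_owners = [author] * len(new_text)
    matcher = SequenceMatcher(None, old_text, new_text, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged span: carry the earlier ownership forward.
            new_owners[j1:j2] = old_owners[i1:i2]
    return new_owners


text = "This is a test!"
owners = ["user1"] * len(text)

revised = "This is a big test!"
owners = attribute(text, owners, revised, "user2")

# Number of characters currently credited to each author.
counts = Counter(owners)
print(counts)
```

In this toy run, user2 is credited only with the four inserted characters, while the original fifteen characters stay with user1.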

Development and Future Updates

This project is under active development as we aim to build additional features and optimizations. A major upcoming version will include updated features as well as a C++ implementation. The core porting and optimization work is being conducted by Wyatt Chapman. Once the C++ version is stable, the existing Python 3 implementation will be deprecated and retained in a sub-folder for reference.

Licence

For licence information regarding the Python implementation, see LICENCE.txt.

Documentation

The git_persistence.py module is extensively documented, and autogenerated documentation (built with epydoc) can be found in the html directory.

Installation

Requires: Python 3

python setup.py install

Example use (Python 3 implementation)

Using the git_persistence.py module

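from git_persistence import GitPersistence

# Start tracking a text whose first version was written by user1.
file1 = GitPersistence("This is a test!", "user1")
# Record two later revisions, each attributed to a different user.
file1.update("I just changed your test!", "user2")
file1.update("I changed you test!\nThank you for helping out!", "user3")
# Calculate ownership of the tracked text across the three users.
file1.calculate_ownership()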

Examples directory

pip3 install psutil # optional, but important if you don't install git-persistence globally with setup.py
cd examples
git clone https://github.com/github/scientist.git
env PYTHONPATH=.. python3 run_git_persistence.py scientist

run_git_persistence.py writes several output files; which ones are produced depends on how the script has been modified.

  • commits.tsv - a formatted list of all commits, based on git log.
  • diff.html - may be produced to show the final tracked characters in visual form (unlikely for the whole repo).
  • files.tmp - keeps track of the files to be processed (unimportant for output).
  • output-parallel.log - logs the results of the parallel execution of git-persistence on the repository.
  • pa_per_rev.tsv - git-persistence score results for each revision and file in the repository (this can be used to establish how scores change over time).
  • persistence_scores.tsv - final persistence score results for the whole repository (per file). Results include an aggregate score for each user's character contributions as well as the mean score.
  • times.tsv - the time it took to process each file (for debugging purposes).
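
The .tsv files listed above can be inspected with Python's standard csv module. The following is a minimal sketch; the helper name read_tsv is hypothetical, and because the column layout of these files is not described here, rows are returned as plain lists of strings without assuming a header row.

```python
import csv


def read_tsv(path):
    """Read a tab-separated file into a list of rows (lists of strings).

    Hypothetical helper: it makes no assumption about the columns
    that git-persistence writes, or about whether a header is present.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

For example, read_tsv("persistence_scores.tsv") could be called from the examples directory after a run, and the first few rows printed to see which columns are present.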

About

git-persistence is a replacement for git-blame that can track code moves and the character-level quality of contributions.
