Mark/Tag repo as dead/unactive #2536

Open
x-N0 opened this issue Jan 7, 2021 · 12 comments
Labels
discussion open-ended issues that haven't yet defined what needs to be worked on has questions Reviewer has outstanding questions about the pull request

Comments

@x-N0

x-N0 commented Jan 7, 2021

I found out that the website is suggesting some repos whose latest commit dates back 5 years. :/

@x-N0 x-N0 changed the title Report repo as dead/unactive Mark/Tag repo as dead/unactive Jan 7, 2021
@x-N0
Author

x-N0 commented Jan 7, 2021

This can be done based on the latest commit's date.
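
As a rough illustration of that idea, here is a minimal sketch of a commit-date check. The two-year cutoff and the function name `is_inactive` are assumptions made for the example; the thread does not define what counts as inactive, and timestamps are assumed to be ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical cutoff: the thread never says how old is "too old".
INACTIVITY_THRESHOLD = timedelta(days=365 * 2)


def is_inactive(last_commit_iso: str, now: Optional[datetime] = None) -> bool:
    """Return True if the latest commit is older than the threshold.

    `last_commit_iso` must carry a UTC offset (e.g. "+00:00") so it can be
    compared with the timezone-aware `now`.
    """
    now = now or datetime.now(timezone.utc)
    last_commit = datetime.fromisoformat(last_commit_iso)
    return now - last_commit > INACTIVITY_THRESHOLD
```

Passing `now` explicitly keeps the check reproducible, for example `is_inactive("2016-01-07T00:00:00+00:00", datetime(2021, 1, 7, tzinfo=timezone.utc))`.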

@ritwik12
Collaborator

ritwik12 commented Jan 8, 2021

@x-N0 We do remove deprecated projects, which are either removed or are archived by the repo owner. I don't think there is a way to judge a project by its latest commit.

Someone can light a fire with a single matchstick if there is a piece of grass.

What I mean is that even if a project has been inactive for quite a long time, someone may still make use of it in some way. I do understand that some projects are not moving forward at all, and we can remove those from here once we receive a PR for it. But doing it in an automated manner does not seem good to me. I am open to suggestions here.

@ritwik12 ritwik12 added discussion open-ended issues that haven't yet defined what needs to be worked on has questions Reviewer has outstanding questions about the pull request labels Jan 8, 2021
@shiftkey
Member

shiftkey commented Jan 10, 2021

We use the last-updated field as a proxy for activity on the project, which looks for the most recent "last updated" date for any issue with the relevant label we're tracking on UfG.

https://github.com/up-for-grabs/tooling/blob/ab73ed98057008ca3a6f41b6205b5072416330a6/lib/queries/github_repository_label_active_check.rb#L75-L97

This feels more relevant than looking at the commit time as we're interested in the issues side of a project.
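
The linked Ruby check is not reproduced here. As a loose sketch of the same idea, the snippet below picks the most recent `updated_at` among issues that carry a tracked label. The dictionary shape (`labels`, `updated_at`) and the function name `last_updated` are assumptions for illustration, not the tooling's actual data model.

```python
from datetime import datetime
from typing import Iterable, Optional


def last_updated(issues: Iterable[dict], label: str) -> Optional[datetime]:
    """Most recent `updated_at` among issues carrying `label`, or None.

    Each issue is assumed to look like:
        {"labels": ["up-for-grabs"], "updated_at": "2021-01-05T10:00:00+00:00"}
    """
    stamps = [
        datetime.fromisoformat(issue["updated_at"])
        for issue in issues
        if label in issue.get("labels", [])
    ]
    # max() with default=None handles projects with no labelled issues.
    return max(stamps, default=None)
```

Issues without the tracked label are ignored, so a busy tracker with no labelled activity still reports no "last updated" date.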

This is also already available through the content we render on the site: you can hover over the count for a GitHub project and see a tooltip showing when the project was last updated.

I paused the work there because I didn't really have an opinion about what more we should do with this information, but as @ritwik12 mentions we do aggressively prune projects that are archived/deleted/etc, rather than just quiet.

@matkoniecz
Contributor

We do remove deprecated projects, which are either removed or are archived by the repo owner.

What about dead projects, especially ones where PRs wait for years for reviews?

For example this should not be listed and right now is prominently listed:

FsCMS/FsCMS#2 "The CMS needs marketable branding: smile Name Logo" - project is dead

See also https://news.ycombinator.com/item?id=27498602

The idea is great but...

I checked out about 15+ smaller repositories with the "help wanted" labels and only one had active development.

Most projects had several unanswered pull-requests since 2018/2019 contributed by strangers wanting to help or issues with comments from people asking to help with no response.

I actually looked for one project that i could potentially help out a tiny bit, but the experience so far has been discouraging.

@ritwik12
Collaborator

@matkoniecz What you say makes sense. We don't want to direct new contributors to projects where they would feel lost or ignored. That makes for a bad experience, and it leads people to stop contributing to open source. I have had the same experience many times.

Do you have a solution or suggestion for this? How should we tackle it?

  • One way to do this is to track the latest commit on the repo. But a repo may simply be hard to work with: the maintainers are still reviewing PRs and issues, yet nothing useful gets committed for a long time. This happens.

@matkoniecz
Contributor

Sort by the latest commit date? That would put inactive and trickier repositories at the end, with dead projects easier to spot for manual removal (a project without commits but with actual contributions would stay on the list).
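
A minimal sketch of that ordering, assuming each project is a dictionary with a `latest_commit` date (a hypothetical field name chosen for this example, not one taken from the site's data):

```python
from datetime import date
from typing import List


def sort_by_latest_commit(projects: List[dict]) -> List[dict]:
    """Return projects ordered from most to least recently committed.

    Nothing is removed: quiet projects only sink to the end of the list,
    where they are easier to review by hand.
    """
    return sorted(projects, key=lambda p: p["latest_commit"], reverse=True)


# Hypothetical sample data for illustration only.
projects = [
    {"name": "quiet-project", "latest_commit": date(2016, 2, 1)},
    {"name": "active-project", "latest_commit": date(2021, 6, 1)},
    {"name": "slow-project", "latest_commit": date(2019, 9, 15)},
]
ordered = sort_by_latest_commit(projects)
```

Sorting rather than filtering matches the suggestion above: the decision to remove a project stays manual.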

@ritwik12
Collaborator

@matkoniecz Yes that makes sense, we can have a label/filter to sort by that.

@fundamental

My personal opinion is that abandoned repos, or repos with a sufficiently poor workflow, should be aggressively pruned. You want a good experience for the contributors, after all. When I've checked random repositories in the past, they were very often unmaintained or poorly maintained, which makes for a pretty lousy experience. Since git activity does not seem to be deemed good enough for determining whether a repo is dead, I threw together some scripts to parse issue/PR activity per project and score each project with some crude heuristics (limited activity, no issue triage, stale PRs, overwhelming bot activity, etc.).

Anyone looking to prune repositories can find plenty of abandoned repos in the attached score file (lower scores are worse). Scores below -4 likely correspond to repos that could be removed after some quick manual inspection, which would remove about 40% of the listed repos. That matches my experience when casually looking through listings. I'm not a contributor to this specific project, so I won't open any PRs trying to remove references to other repos, but I'll leave the score info for anyone interested in figuring out where to draw the line.

repository-issue-tracker-score-sorted.txt
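
The scripts themselves are not shown in the thread, so the following is only a sketch of how such a heuristic score could be put together. The four signals mirror the ones named in the comment; every field name, weight, and cutoff inside `score` is an invented assumption. Only the "below -4" review threshold comes from the comment.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    # All fields are hypothetical inputs, not the original scripts' data.
    recent_issue_events: int   # issue/PR events in some recent window
    untriaged_issues: int      # open issues with no maintainer response
    stale_open_prs: int        # PRs left open without review for a long time
    bot_event_share: float     # fraction of events made by bots, 0.0 to 1.0


# From the comment: scores below -4 likely merit manual inspection.
REVIEW_THRESHOLD = -4


def score(stats: RepoStats) -> int:
    """Crude health score; lower is worse. Weights are illustrative only."""
    total = 0
    if stats.recent_issue_events < 5:   # limited activity
        total -= 2
    if stats.untriaged_issues > 10:     # no issue triage
        total -= 2
    total -= min(stats.stale_open_prs, 3)  # stale PRs, capped penalty
    if stats.bot_event_share > 0.8:     # overwhelming bot activity
        total -= 2
    return total


def needs_manual_review(stats: RepoStats) -> bool:
    """Flag a repo for human inspection rather than removing it outright."""
    return score(stats) < REVIEW_THRESHOLD
```

The function only flags candidates, in line with the comment's point that removal should follow a quick manual inspection.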

@ritwik12
Collaborator

ritwik12 commented Jul 4, 2021

@fundamental Thanks for the information, that makes sense. We can develop scripts based on this and parse all the existing projects on up-for-grabs.net to check for dead/inactive projects every week or every few days. The result can be opened as a PR, as we already do for project stats updates and archived projects.

I'm not a contributor to this specific project, so I won't open any PRs trying to remove references to other repos,

@fundamental Sure, every single contribution is helpful. If you can't open a PR, maybe someone else will, using the details you provided. I will try to help myself, depending on my availability.

@ritwik12
Collaborator

ritwik12 commented Jul 4, 2021

@fundamental The scores that you provided, are these the projects hosted at up-for-grabs.net or random projects?

@fundamental

The scores are based upon parsing _data/projects/*.yml and using the ['upforgrabs']['link'] field to derive the organization and repository name. If it's preferred I can dump things in the format of "score, _data/projects/source-file.yml", but a quick grep should show the correspondences even if the original contributor named the .yml file something abnormal.
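
A small sketch of that derivation step, assuming the project file has already been loaded into a dictionary and that the link is a GitHub-style URL whose first two path segments are the organization and the repository. The function name `owner_and_repo` and the sample link are hypothetical.

```python
from typing import Tuple
from urllib.parse import urlparse


def owner_and_repo(project: dict) -> Tuple[str, str]:
    """Derive (organization, repository) from project['upforgrabs']['link'].

    Assumes a URL shaped like https://github.com/<org>/<repo>/...;
    links hosted elsewhere may not follow this layout.
    """
    link = project["upforgrabs"]["link"]
    segments = [part for part in urlparse(link).path.split("/") if part]
    if len(segments) < 2:
        raise ValueError(f"cannot derive org/repo from link: {link!r}")
    return segments[0], segments[1]


# Hypothetical project entry for illustration only.
example = {
    "upforgrabs": {
        "link": "https://github.com/up-for-grabs/tooling/labels/up-for-grabs"
    }
}
org, repo = owner_and_repo(example)
```

Raising on short paths makes oddly formatted links visible instead of silently producing a wrong repository name.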

@ritwik12
Collaborator

ritwik12 commented Jul 4, 2021

@fundamental Yes, that is good. Thanks for sharing.


5 participants