[RFC] A more flexible project format #1552

Open
shiftkey opened this issue Oct 14, 2019 · 0 comments
Labels
discussion open-ended issues that haven't yet defined what needs to be worked on

This is probably the most radical change to come to Up for Grabs in many years, and I wanted to lay out some context about why this is important, where my thinking is currently heading, and how I plan to experiment to confirm this is worth spending time on.

Some History

There have been a number of ideas and requests on the issue tracker, some opened years ago, which were limited by how we structured the project file:

"Would be nice to promote a query with 'simple' issues for a repo (we do not have too many yet marked with 'simple', but it's coming ...)." - #578

"It is possible to add additional tags for up-for-grabs issues, to make a distinction between things we are just accepting help for, versus things we think would be good first contributions?" - #344

"I have up-for-grabs labels in two repos, so I'd like to have two links. I guess I could add a whole separate "ConfigR Samples" entry but it would be nicer if both links were somehow shown against the single ConfigR entry." - #244

For context, the important parts of the project (where users can find issues) have been limited to these two fields:

upforgrabs:
  name: up-for-grabs
  link: https://github.com/up-for-grabs/up-for-grabs.net/labels/up-for-grabs

We did this early on for convenience - it was easy to render a link on the site - but this approach has proven problematic:

  • the format itself is limiting - more on this later
  • it is unclear which link is the right or optimal one to use, so a reviewer has to check and provide feedback
  • the name field has to be checked by hand to confirm the label is actually correct

Enabling future ideas

In addition to the limited support of the current format, various features have been blocked as not being possible to tackle:

I want to keep these cases in mind with this new project format, and now that we have project stats being updated periodically using GitHub Actions I think these are now things we can entertain and plan to support, if we get the format right.

What would the new project format require?

Before walking through examples, I wanted to explain the information required. I've settled on these two fields as a minimum for each project.

  • label - this seems to be a commonality across all projects we've seen
  • source - the location where the project is hosted, and depending on the source additional criteria may be needed

Current source values I've had in mind for this experiment:

  • github - projects hosted on GitHub
  • gitlab - projects hosted on GitLab
  • classic - a way to port projects that aren't covered by the previous two sources

GitHub metadata

If you specify the github source, you are then required to provide one of these additional values:

  • repo - for projects hosted on a single repository
  • organization - for projects which span multiple repositories, using a common label

And these fields would be optional:

  • exclude - an array of labels to omit from any results - some projects use labels to track the state of issues, to indicate they are not available for new contributors
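
As a rough illustration of how label and exclude might combine, the sketch below turns an entry into a GitHub issue search query. The repository and label names are made up, `github_search_query` is a hypothetical helper, and using the `-label:` search qualifier for exclusions is an assumption about how this could be implemented, not something the proposal has settled on.

```ruby
require 'yaml'

# Hypothetical entry in the proposed format; repo and label names are invented.
ENTRY = YAML.safe_load(<<~YAML)
  source: github
  repo: example-org/example-repo
  label: help wanted
  exclude:
    - in progress
    - blocked
YAML

# Build a GitHub issue search query from an entry.
# `-label:"..."` asks the search to skip issues carrying that label.
def github_search_query(entry)
  # `user:` is the scope qualifier used in the NativeScript link in this issue.
  scope = entry['repo'] ? "repo:#{entry['repo']}" : "user:#{entry['organization']}"
  terms = ['is:open', 'is:issue', scope, %(label:"#{entry['label']}")]
  # Array(nil) is [], so entries without `exclude` add no extra terms.
  terms += Array(entry['exclude']).map { |name| %(-label:"#{name}") }
  terms.join(' ')
end

puts github_search_query(ENTRY)
```

Keeping exclude as plain data means the same list could feed a browser link or an API call without changing the project file.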

GitLab metadata

If you specify the gitlab source, you are then required to provide one of these additional values:

  • repo - for projects hosted on a single repository
  • group - for projects which span multiple repositories, using a common label

And these fields would be optional:

  • exclude - an array of labels to omit from any results - some projects use labels to track the state of issues, to indicate they are not available for new contributors

Classic metadata

For the classic source, you are then required to provide this additional value:

  • url - a link to the list of tasks that are available to new contributors

I have a prototype that uses JSON Schema to validate these new fields. Yes, JSON and YAML aren't the same thing, but I'm able to validate the deserialized Ruby object against the schema, so it'll be fine. I think that, combined with friendly error messaging, would make it easy to maintain all the projects currently in use.
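
The rules above can be sketched without any schema library. This is not the JSON Schema prototype mentioned here, just a hand-rolled approximation using Ruby's standard library; `REQUIRED_ONE_OF` and `validate_entry` are hypothetical names, and reading "one of these additional values" as "exactly one" is an assumption.

```ruby
require 'yaml'

# Extra keys each source needs, taken from the lists in this issue.
# Assumption: exactly one of them must be present.
REQUIRED_ONE_OF = {
  'github'  => %w[repo organization],
  'gitlab'  => %w[repo group],
  'classic' => %w[url]
}.freeze

# Returns a list of human-readable problems; an empty list means the entry passes.
def validate_entry(entry)
  errors = []
  errors << "missing 'label'" if entry['label'].to_s.strip.empty?

  allowed = REQUIRED_ONE_OF[entry['source']]
  if allowed.nil?
    errors << "unknown source #{entry['source'].inspect}, expected one of: #{REQUIRED_ONE_OF.keys.join(', ')}"
    return errors
  end

  present = allowed.select { |key| entry.key?(key) }
  if present.length != 1
    errors << "source '#{entry['source']}' needs exactly one of: #{allowed.join(', ')}"
  end

  if entry.key?('exclude') && !entry['exclude'].is_a?(Array)
    errors << "'exclude' must be a list of labels"
  end
  errors
end

doc = YAML.safe_load(<<~YAML)
  projects:
    - source: github
      label: up-for-grabs
      repo: up-for-grabs/up-for-grabs.net
    - source: gitlab
      label: Easy
YAML

doc['projects'].each_with_index do |entry, index|
  problems = validate_entry(entry)
  puts "projects[#{index}]: #{problems.empty? ? 'ok' : problems.join('; ')}"
end
```

The second entry is deliberately missing repo or group, to show the kind of message a contributor could see during review.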

Migrating projects to the new format

Here's what the Up for Grabs listing itself would look like with this new format:

# before
upforgrabs:
  name: up-for-grabs
  link: https://github.com/up-for-grabs/up-for-grabs.net/labels/up-for-grabs
# after
projects:
  - source: github
    label: up-for-grabs
    repo: up-for-grabs/up-for-grabs.net

Less noise, more clarity. But this is a case we're definitely managing well. What about something more complex, like NativeScript:

# before
upforgrabs:
  name: help wanted
  link: https://github.com/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+user%3ANativeScript
# after
projects:
  - source: github
    organization: NativeScript
    label: help wanted

This is where the new format really comes into its own, because instead of carrying around the complex URL we can apply some conventions when building up the URL (either on the client or when querying via the API) and strip the project file back to the important bits.
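
One possible shape for those conventions, sketched in Ruby: the URL patterns are copied from the two "before" links above, and `github_issues_url` is a hypothetical helper, not code from the site.

```ruby
require 'erb'
require 'uri'

# Turn a github entry into the page a contributor would browse.
def github_issues_url(entry)
  label = entry.fetch('label')
  if entry['repo']
    # Single repository: link straight to the label page.
    # url_encode escapes spaces as %20, which is what a path segment needs.
    "https://github.com/#{entry['repo']}/labels/#{ERB::Util.url_encode(label)}"
  else
    # Organization: global issue search scoped with `user:`.
    query = %(is:open is:issue label:"#{label}" user:#{entry.fetch('organization')})
    "https://github.com/issues?q=#{URI.encode_www_form_component(query)}"
  end
end

puts github_issues_url({ 'source' => 'github', 'repo' => 'up-for-grabs/up-for-grabs.net', 'label' => 'up-for-grabs' })
puts github_issues_url({ 'source' => 'github', 'organization' => 'NativeScript', 'label' => 'help wanted' })
```

Both calls reproduce the original link values, which suggests the link field carried no information beyond the source, the label and the repo or organization.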

What about the request in #244 to provide links to two different repositories? We've not been able to support that, but with the new format we can now represent it:

# before
upforgrabs:
  name: up-for-grabs
  link: https://github.com/config-r/config-r/labels/up-for-grabs
# after
projects:
  - source: github
    repo: config-r/config-r
    label: up-for-grabs
  - source: github
    repo: config-r/config-r-samples
    label: up-for-grabs

Projects hosted on GitLab can benefit from this approach:

# before
upforgrabs:
  name: Easy
  link: https://gitlab.com/openstreetcraft/openstreetcraft-api/issues?label_name%5B%5D=Easy
# after
projects:
  - source: gitlab
    repo: openstreetcraft/openstreetcraft-api
    label: Easy

I need to figure out how to work with the GitLab API to generate stats, but having a consistent format here will definitely help too.

We have some groups on GitLab too (these seem to mirror organizations on GitHub) and these could also be simplified significantly:

# before
upforgrabs:
  name: Accepting merge requests
  link: https://gitlab.com/groups/gitlab-org/-/issues?state=opened&label_name[]=Accepting%20merge%20requests
# after
projects:
  - source: gitlab
    group: gitlab-org
    label: Accepting merge requests
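
The same idea could apply to the two GitLab shapes. In this sketch the URL patterns mirror the "before" links in the examples above, `gitlab_issues_url` is a hypothetical helper, and whether GitLab keeps serving these exact paths is an assumption.

```ruby
require 'uri'

# Build a browsable GitLab issue list for a repo or group entry.
def gitlab_issues_url(entry)
  # GitLab filters by label through a repeated `label_name[]` query parameter.
  label_filter = { 'label_name[]' => entry.fetch('label') }
  if entry['repo']
    "https://gitlab.com/#{entry['repo']}/issues?#{URI.encode_www_form(label_filter)}"
  else
    params = { 'state' => 'opened' }.merge(label_filter)
    "https://gitlab.com/groups/#{entry.fetch('group')}/-/issues?#{URI.encode_www_form(params)}"
  end
end

puts gitlab_issues_url({ 'source' => 'gitlab', 'repo' => 'openstreetcraft/openstreetcraft-api', 'label' => 'Easy' })
puts gitlab_issues_url({ 'source' => 'gitlab', 'group' => 'gitlab-org', 'label' => 'Accepting merge requests' })
```

The group URL comes out with `%5B%5D` and `+` where the hand-written link used `[]` and `%20`; these are equivalent encodings of the same query string.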

The goal with these approaches is to separate the data from the implementation of how we get the data:

  • navigating to the list of issues for the project
  • using an API to find stats for a project
  • using an API to discover interesting trends about projects

By extracting these things out of the project schema, we can start to explore more of these possibilities.

For example, we have a project hosted on Launchpad but we don't currently know how to query for its stats. That can be ported to our classic source:

# before
upforgrabs:
  name: bitesize
  link: https://bugs.launchpad.net/evergreen/+bugs?field.tag=bitesize
# after
projects:
  - source: classic
    label: bitesize
    url: https://bugs.launchpad.net/evergreen/+bugs?field.tag=bitesize

If we find ourselves able to interact with an API for this in the future, we could convert it to its own source and validate it like we do other things:

# after
projects:
  - source: classic
    label: bitesize
    url: https://bugs.launchpad.net/evergreen/+bugs?field.tag=bitesize
# future?
projects:
  - source: launchpad
    project: evergreen
    label: bitesize
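
To give a feel for how mechanical the conversions above are, here is a sketch that maps a legacy name and link pair onto the new fields. It is not the existing extraction code mentioned elsewhere in this issue; `migrate_entry` is a hypothetical helper, and the patterns only cover the link shapes shown in these examples, with everything else falling back to classic.

```ruby
require 'yaml'

# Convert a legacy { 'name' => ..., 'link' => ... } pair into a new-format entry.
def migrate_entry(legacy)
  label = legacy.fetch('name')
  link  = legacy.fetch('link')

  case link
  when %r{\Ahttps://github\.com/([^/]+/[^/?#]+)/(?:labels|issues)\b}
    # owner/repo followed by a labels or issues page
    { 'source' => 'github', 'repo' => Regexp.last_match(1), 'label' => label }
  when %r{\Ahttps://github\.com/issues\?.*user%3A([\w.-]+)}i
    # global search scoped to one account with user:<name>
    { 'source' => 'github', 'organization' => Regexp.last_match(1), 'label' => label }
  when %r{\Ahttps://gitlab\.com/groups/([^/?#]+)/}
    { 'source' => 'gitlab', 'group' => Regexp.last_match(1), 'label' => label }
  when %r{\Ahttps://gitlab\.com/([^/?#]+/[^/?#]+)/(?:-/)?issues}
    { 'source' => 'gitlab', 'repo' => Regexp.last_match(1), 'label' => label }
  else
    # Anything unrecognised keeps its link verbatim under the classic source.
    { 'source' => 'classic', 'label' => label, 'url' => link }
  end
end

legacy = {
  'name' => 'help wanted',
  'link' => 'https://github.com/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+user%3ANativeScript'
}
puts({ 'projects' => [migrate_entry(legacy)] }.to_yaml)
```

Links that match none of the patterns are never lost, only left as classic entries for a human to revisit.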

What feedback am I looking for right now?

Before going too far into the implementation of this, I'm looking for feedback in two areas:

  • are there projects that you're aware of that might not fit in this format?
  • are there scenarios or cases to support that are worth exploring at this stage?

I want to try and support as many different projects as possible, but I want to be able to experiment with live data as soon as possible, rather than fleshing out this format too much.

Next Steps

I have a few questions that I'd like to answer sooner rather than later:

  • how will this change affect the site, and in particular listing the details of each project?
  • how does the caching of stats fit into the new schema changes?
  • how do reviews of new projects affect this? Can we automate more of the review work using schema validation and better integrate it into GitHub?

Things I'm not worried about currently:

  • migrating projects - I've already got code lying around to extract the label from each project, and GitHub API integration, so I'm fairly confident already that I can safely port most of these projects to the new format
  • client-side code - if we make the new project format additive initially I think we can fall back to the current behaviour in a bunch of places to save time

@shiftkey added the discussion label (open-ended issues that haven't yet defined what needs to be worked on) on Oct 14, 2019
@shiftkey pinned this issue on Oct 14, 2019
@shiftkey unpinned this issue on Apr 1, 2024