GSoC: bowtie-trend: Long-Term Reporting With Bowtie #607

Open
Julian opened this issue Jan 31, 2024 · 13 comments
Labels
gsoc Google Summer of Code Project Idea

Comments

@Julian
Member

Julian commented Jan 31, 2024

Project title

Long-Term Reporting With Bowtie

Brief Description

The JSON Schema test suite is a collection of JSON schemas, instances (data) and expected validation results -- meaning information about what result an implementation of JSON Schema is expected to produce. These "known correct results" are used by a huge number of implementations of JSON Schema in order to ensure they behave correctly under the JSON Schema specification.

The recent Bowtie tool was written to compare results of running this suite across many implementations. It produces a report, accessible at https://bowtie.report, showing how many tests pass or fail on each implementation.

But how does that information change over time, as implementations fix bugs, or as new tests are added to the test suite?

Let's write a way to track compliance numbers over time, such that we can graph or query how the results change!
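As a very rough sketch of the idea (the data shapes below are hypothetical, not Bowtie's actual report format), aggregating dated run summaries into a series we can graph or query might look something like this:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RunSummary:
    """A (hypothetical) summary of one Bowtie run for one implementation."""

    run_date: date
    implementation: str
    dialect: str
    total_tests: int
    failed_tests: int


def failed_test_trend(summaries: list[RunSummary], implementation: str, dialect: str):
    """Collect failed-test counts over time for one implementation and dialect."""
    points = [
        (s.run_date, s.failed_tests)
        for s in summaries
        if s.implementation == implementation and s.dialect == dialect
    ]
    return sorted(points)  # oldest first, ready to graph or query


if __name__ == "__main__":
    runs = [
        RunSummary(date(2023, 6, 1), "python-jsonschema", "2020-12", 1200, 40),
        RunSummary(date(2023, 9, 1), "python-jsonschema", "2020-12", 1250, 25),
        RunSummary(date(2024, 1, 1), "python-jsonschema", "2020-12", 1300, 10),
    ]
    print(failed_test_trend(runs, "python-jsonschema", "2020-12"))
```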

Refs: bowtie-json-schema/bowtie#34

Expected Outcomes

  • A new bowtie trend command which aggregates results from Bowtie runs, producing a trend report
  • A page on the website which uses this trend report to produce graphs of failed tests over time

Skills Required

  • Comfort with Python as well as TypeScript
  • Basic data processing skills
  • Fundamental knowledge of JSON Schema validation

Mentors

@Julian

Expected Difficulty
Medium

Expected Time Commitment
350 hours

Related issue in the JSON Schema org repo: #607

@Julian Julian added the gsoc Google Summer of Code Project Idea label Jan 31, 2024
@benjagm
Collaborator

benjagm commented Feb 22, 2024

Thanks for your interest! Let's continue the discussion in this issue inside the JSON Schema project: #607

@Akshaybagai52

Hey @Julian @benjagm
I would love to work on this task under GSoC 2024 and would love to discuss it further.
Thanks

@Julian
Member Author

Julian commented Feb 22, 2024

Welcome folks.

Here are some suggestions on how to get started:

Before anything else, you're highly encouraged to read through some basic JSON Schema materials, which you can find on the JSON Schema website. A good way to test yourself is to think about whether you understand the basics of what JSON Schema is used for, and what a tool that implements it does at a high level.

After that, you can have a quick read through Bowtie's documentation, though I wouldn't read it in depth, and it isn't fully comprehensive (if you see specific gaps, you're welcome to send PRs to improve any bit of it!)

For Bowtie itself then:

  • there are a number of issues tagged with good first issue, but many or most are already in progress. If you see one that isn't, you're welcome to try it! Ask as well on the issue if you're unsure or need clarification.
  • Keep an eye on the above, as I'll add more good first issues as existing ones get solved
  • Beyond the above, a large general area of improvement is UI tests! We don't have very many. You can find the few we do have and see if you can think of any additional small, self-contained tests we can add!

Good luck!

@alexialg05

alexialg05 commented Feb 26, 2024

@Julian @benjagm I would love to work on this project during GSoC 2024. Are there any starter contributions I can begin with that you could assign to me? Some of them seemed to be taken already. Looking forward to working with you!

@benjagm
Collaborator

benjagm commented Feb 27, 2024

Thanks a lot for joining JSON Schema org for this edition of GSoC!!

Qualification tasks will be published as comments in the project ideas by Thursday/Friday of this week. In addition, I'd like to invite you to an office hours session this Thursday at 18:30 UTC, where we'll present the ideas and the relevant dates to consider at this stage of the program.

Please use this link to join the session:
🌐 Zoom
📅 2024-02-29 18:30 UTC

See you there!

@ashmit-coder

Hey @Julian, is there no specific qualification task for this project, or is it still in progress?

@Julian
Member Author

Julian commented Mar 5, 2024

The task is #607 (comment) -- i.e. "get the project up and running, fix an outstanding issue, or if you don't see one, send a PR adding a UI test"!

@sd1p

sd1p commented Mar 8, 2024

Hey! @Julian @benjagm, I wanted to express my interest in working on this project. I'm proficient in TypeScript and Python, and I've already taken the time to familiarize myself with the codebase.

@officeneerajsaini

Hey @Julian, I am interested in the #607 project; it looks great to me, and I want to contribute. I've already read about what a contributor needs for this project. First, I need to learn about JSON Schema documents, try to implement and test some validation. After that, I need to go through the Bowtie documentation. Skills needed include Python and TypeScript, basic data processing skills, and understanding the fundamentals of JSON Schema.

Why am I better for this project? I have over one year of MERN Stack experience, having completed several projects. I am proficient in both Python and TypeScript. Additionally, I hold a badge as a Postman API Fundamentals Student Expert, showcasing my understanding of data processing. Furthermore, I've merged three pull requests in Node.js and created more than 4-5 pull requests or issues in JSON Schema. I plan to learn JSON Schema and Bowtie projects soon.

@adwait-godbole

adwait-godbole commented Mar 15, 2024

Hello @Julian, I have been thinking about some ideas for how we can tackle this project :)

I would like to share some thoughts here:

as implementations fix bugs, or as new tests are added to the test suite?

  1. I assume here that bug fixes likely land in different releases of the implementation, right? So we can perhaps have a -i flag on the bowtie trend command for the implementation and a -d flag for the dialect. Once we have those two values, e.g. in the command bowtie trend -i python-jsonschema -d 2020, the ideal choice would be to have images for the different releases of python-jsonschema already deployed as ghcr.io packages, I guess? (Here I think we can default to having images for the latest 10 or 15 releases, since running this command is going to be computationally expensive! More thought is required here, as it might run into the problem of pulling multiple Docker images from #918 :). We can use tags on these deployed release images to uniquely identify them: just like we currently have a latest tag, we can have e.g. a v4.18.3 tag on an image to indicate the implementation library release, and similarly tags like v4.18.1, v4.18.0, etc. corresponding to other releases.

We could perhaps also think of having one single image that is responsible for running the test suite against different releases of the implementation library, but I am not quite sure how to do that. At first thought it feels like writing code inside each implementation to run tests against different releases of the library, which seems harder than having multiple Docker images targeting different releases. A catch here is that the latest 10 or 15 releases might not even show significant changes in failedTests. Giving this more thought, we could also keep release images starting from the very first release through the 10 or 15 after that (which might or might not include the current latest release), where we would be more likely to see significant changes in failedTests.

  2. For the above, I am also likely missing the fact that some release of an implementation might not have supported the dialect the user passed. I think we just skip it in that case? But if so, how do we even come to the conclusion that we should skip? Does that mean hardcoding the supported dialects inside the code for each release image separately? I would love to hear your thoughts on this one.

  3. If I am on the right track with the above, and if we manage to successfully accomplish all of that, then how exactly we are going to display the output of running e.g. bowtie trend -i python-jsonschema -d 2020 is a big question. One thought I had is in the snippet below (with some random dummy numbers inserted):

[Screenshot: mockup of the proposed trend output]

Again, the above snippet doesn't really show the exact tests that differed between e.g. v4.18.3 and v4.18.4. For that, I guess we can default the command bowtie trend -i python-jsonschema -d 2020 (without --format pretty) to output various {{ implementation }}/{{ draft }}/{{ version }}.json files inside a directory called trend-report, and those JSON files would hold not just those 4 stats but all of the JSON test result data for the different releases, just like how we output data for e.g. bowtie suite, except in this case for multiple releases. We can then use those files with bowtie diff for diffing between, say,

bowtie diff trend-report/python-jsonschema/2020-12/v4.18.3.json trend-report/python-jsonschema/2020-12/v4.18.4.json

  4. Once we are able to successfully handle all of the above, we can likely fall back on using GitHub workflows to generate these trend reports across implementations, for different releases and for different dialects as well. We can likely do so just like how we are doing it right now in report.yml, shown below:

[Screenshot: the existing report.yml workflow]

We could do something similar for bowtie trend as well :). Again, we will have to be cautious here: since we are in the workflow environment, we will have to avoid running into #918 again.

  5. Once we have these reports generated, then for the frontend we can likely use Chart.js to plot line graphs (see the sketch after this list). If, for example, the user visits https://bowtie.report/#/implementations/python-jsonschema/trend-report, we can show a line graph there with a dropdown to first select the dialect version (2020, 2019, 7, etc.) and then select one of erroredCases, erroredTests, skippedTests and failedTests. Whatever the user selects in that dropdown, we can easily pull those values, for example for failedTests, from the generated trend report for that dialect and plot them across the different releases of the implementation. By this I mean that the X-axis will represent the different python-jsonschema release names and the Y-axis the failedTests count for those releases, and we simply generate a line graph with a legend for the dialect version. If the user selects erroredTests instead, we simply change the Y-axis values to correspond to erroredTests and the graph changes dynamically. Below is a snippet for better visualization.

[Screenshot: mockup of the trend line graph UI]

  6. If we want the user to be able to actually query the diffed results, e.g. between python-jsonschema/2020/v4.18.3 and python-jsonschema/2020/v4.18.4, then we can show a first dropdown to select the draft version and then two further dropdowns, say X and Y. If the user selects v4.18.3 for X and v4.18.4 for Y, we simply show the diffed results between these two releases for that draft. (Doing this is not really straightforward: since we don't have a backend server, it again means defaulting to GitHub workflows and performing bowtie diff on the different output JSON files produced by bowtie trend, which would result in a lot of permutations and combinations of bowtie diff runs.) So that only leaves us with one choice: perform the diff dynamically on the frontend itself and output something similar to what bowtie diff outputs on the terminal.
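To make the plotting idea in point 5 a bit more concrete, here is a minimal Python sketch. It assumes the hypothetical trend-report/{{ implementation }}/{{ draft }}/{{ version }}.json layout proposed above, with each file containing a summary object that includes the four counts; it builds the per-release series a line graph would consume:

```python
import json
from pathlib import Path


def release_series(trend_report: Path, implementation: str, dialect: str, metric: str):
    """Return [(release, value), ...] for one metric, e.g. "failedTests".

    Assumes a hypothetical layout like trend-report/<implementation>/<dialect>/<version>.json,
    where each file holds a summary object containing the metric keys.
    """
    series = []
    # Note: alphabetical order here; real code would sort by parsed version number.
    for report in sorted((trend_report / implementation / dialect).glob("*.json")):
        summary = json.loads(report.read_text())
        series.append((report.stem, summary[metric]))  # e.g. ("v4.18.3", 12)
    return series


# Hypothetical usage: the values the frontend would plot on the Y-axis.
points = release_series(Path("trend-report"), "python-jsonschema", "2020-12", "failedTests")
```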

I've said a lot above, but this project is indeed a challenging one and will require more thought! I would love to hear your opinions on the above, @Julian.

@benjagm
Collaborator

benjagm commented Mar 18, 2024

🚩 IMPORTANT INSTRUCTIONS REGARDING HOW AND WHERE TO SUBMIT YOUR APPLICATION 🚩

Please join this discussion in the JSON Schema Slack to get the final, very important details on how to best submit your application to JSON Schema.

See communication here.

@Julian
Member Author

Julian commented Mar 18, 2024

Nice! Good thoughts! Indeed, there are lots of choices we can make here, and we'll probably try to get as much done as we can.

the ideal choice would be to have images for the different releases of python-jsonschema already deployed as ghcr.io packages

Yes, definitely starting to move from only having a latest release to tagging releases-by-release (i.e. by implementation release version) seems like a good step!

One thought I had was in the below snippet (just some random dummy numbers I've inserted here):

Probably worth experimenting with a few different designs, but I suspect having rows for each metric would be an immediate improvement over showing the JSON objects there.

Once we are able to successfully handle all the above, likely we can fallback on using GitHub workflows to generate these trend reports

I'm sure GitHub workflows will play in here, but keep in mind the current workflow is mostly meant to stay up to date on new releases -- if we anyway have a mechanism for noticing new releases, the need to continuously re-run the report on old versions decreases -- in other words we might not need to run it on an ongoing basis for old releases, maybe just the new ones.
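A tiny sketch of that idea, assuming the same hypothetical trend-report layout discussed above: a workflow step could skip any release that already has a stored report and only run Bowtie for the new ones.

```python
from pathlib import Path


def releases_needing_a_run(known_releases: list[str], report_dir: Path) -> list[str]:
    """Return only the releases that don't already have a stored report file."""
    existing = {p.stem for p in report_dir.glob("*.json")}
    return [release for release in known_releases if release not in existing]


# e.g. known_releases would be fetched by the workflow from the registry or package index
todo = releases_needing_a_run(
    ["v4.18.3", "v4.18.4"],
    Path("trend-report/python-jsonschema/2020-12"),
)
```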

Like the UI thoughts as well!

All in all, I think you touched on lots of the key areas to think about!

@adwait-godbole

Thanks a lot @Julian for your review! It means a lot, as this is my first time participating in GSoC! It gives me more confidence that I am on the right track here!

Probably worth experimenting with a few different designs, but I suspect having rows for each metric would be an immediate improvement over showing the JSON objects there.

Yes agreed.

I'm sure GitHub workflows will play in here, but keep in mind the current workflow is mostly meant to stay up to date on new releases -- if we anyway have a mechanism for noticing new releases, the need to continuously re-run the report on old versions decreases -- in other words we might not need to run it on an ongoing basis for old releases, maybe just the new ones.

Yes, I proposed this thought in my GSoC proposal but forgot to write it down over here. We will definitely have to find a way so that the GitHub workflow only runs the jobs on new releases and is already aware of the old releases it has run on earlier.

Like the UI thoughts as well!
All in all, I think you touched on lots of the key areas to think about!

Thanks! I've mentioned a few more details and a UI mockup in my GSoC proposal, which I've sent you over Slack.
