
Most obvious CSV data is two years out of date #343

Open
nedbat opened this issue Mar 3, 2023 · 4 comments · May be fixed by #344

Comments


nedbat commented Mar 3, 2023

The home page says:

CSV data
The data is available on Google Cloud Storage and can be downloaded via:

web browser: commondatastorage.googleapis.com/ossf-criticality-score/index.html

That page has handy per-language files, but they are dated 2020-12-30. Newer data should be made easier to find, or at least stale data should be removed as an attractive nuisance.

nathannaveen added a commit to nathannaveen/criticality_score that referenced this issue Mar 3, 2023
- Fixes ossf#343

Signed-off-by: nathannaveen <42319948+nathannaveen@users.noreply.github.com>
@nathannaveen nathannaveen linked a pull request Mar 3, 2023 that will close this issue
@calebbrown (Contributor)

I've rearranged the objects in the bucket - does this help?


nedbat commented Mar 5, 2023

Are the files in "archive" the same 2020 files? It helps in that the old files are now in "archive", but now their dates are 2023, which is itself misleading. Is there a reason to keep the old files at all? Why not produce "top 200" files for current data?

@calebbrown (Contributor)

I've put them in a folder roughly correlating to the date they were originally created.

As for producing "top 200" files for current data - I'm interested in how you might be using these.

I had been leaning towards not producing top-200 sets for each language group, and just supplying a script for producing them locally.

However, if the top-200 sets are providing value, I'm more than happy to work on getting them produced automatically.
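A local script of the kind described above might look like the sketch below. It groups rows of a criticality score CSV by language and keeps the highest-scoring entries per language. The column names `repo.url`, `language`, and `default_score` are assumptions here, as is the inline sample data standing in for the real download; the actual CSV schema should be checked against the published files.

```python
import csv
import io
from collections import defaultdict
from heapq import nlargest

def top_n_per_language(rows, n=200, lang_col="language", score_col="default_score"):
    """Group CSV rows by language and keep the n highest-scoring repos per language."""
    by_lang = defaultdict(list)
    for row in rows:
        by_lang[row[lang_col]].append(row)
    return {
        lang: nlargest(n, items, key=lambda r: float(r[score_col]))
        for lang, items in by_lang.items()
    }

# Tiny synthetic sample standing in for the real CSV download (hypothetical columns).
sample = io.StringIO(
    "repo.url,language,default_score\n"
    "https://github.com/a/a,Python,0.9\n"
    "https://github.com/b/b,Python,0.7\n"
    "https://github.com/c/c,Go,0.8\n"
)
tops = top_n_per_language(csv.DictReader(sample), n=1)
for lang, items in tops.items():
    print(lang, items[0]["repo.url"])
```

For the real data set, replacing the synthetic sample with a `csv.DictReader` over the downloaded `all.csv` (or whichever file the bucket provides) and setting `n=200` would reproduce the per-language top-200 files locally.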

@nedbat
Copy link
Author

nedbat commented Mar 5, 2023

TBH, I'm new to this data set and am not sure how I would use the data. I wrote this issue as feedback from a new user trying to understand the data set. The link from the README sounds enticing, but it leads to a raw web server page full of old files. My suggestion is simply to present the data you value in a way that makes it easy for people to find and understand.
