Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul structs, refactor JSON parser and saver #133

Merged
merged 1 commit into from
Apr 26, 2022
Merged

Overhaul structs, refactor JSON parser and saver #133

merged 1 commit into from
Apr 26, 2022

Conversation

ianling
Copy link
Collaborator

@ianling ianling commented Apr 11, 2022

Refactors the JSON parser/writer so it no longer uses reflect. The new approach is incomparably more maintainable and less error-prone, and requires substantially fewer unit tests.

I rewrote the unit tests for the JSON parser and saver to be very simple. I manually translated the official JSON example document from here into a Go struct: https://github.com/spdx/spdx-spec/tree/development/v2.2.2/examples.

  • Parser test: parses the official JSON example document and compares the result to my handwritten value.
  • Writer test: converts the known-good handwritten struct to JSON, then parses the JSON back into a Go struct and ensures that it matches the handwritten struct.

There are a number of breaking API changes here (I changed some of the structs to align more with the spec). Maybe once full support for all the file types included in the spec is added, it warrants an API-breaking major version release (v1.0.0)?

File type support at time of writing:

@ianling ianling marked this pull request as draft April 11, 2022 21:11
@ianling ianling marked this pull request as ready for review April 12, 2022 00:11
@ianling
Copy link
Collaborator Author

ianling commented Apr 12, 2022

@swinslow @CatalinStratu -- I think this is ready for review. Sorry again for the size of the change, but I think it is simplest to refactor the entirety of the structs and make all the existing parsers/writers work with the refactored structs all in one go.

@swinslow
Copy link
Member

Thank you @ianling! I've been away from my desk for the past several days -- I will take a look at this within the next few days.

@ianling
Copy link
Collaborator Author

ianling commented Apr 20, 2022

@swinslow -- to clarify, each of my PRs build off of the others, so each one in the chain needs to be fully reviewed and merged before we can start talking about the next one in the chain. In other words, we have to wrap up reviewing and merging this one before we can move on to #134, and then we have to get that one cleaned up and merged before moving on to the next one.

If necessary, I will rebase the PRs after each one is merged.

@swinslow
Copy link
Member

Understood -- thanks @ianling for all your hard work on this and the other PRs.

Apologies that I've been delayed in reviewing. I'm hoping to get through this one over the weekend, and also planning to email the spdx-tech mailing list to give folks a heads-up about the API changes that would be occurring for 0.4.0 and going forward.

Separately, I've been chatting with a couple of folks about maintainership of the Golang tools going forward, so longer-term I'm hoping to bring in a few additional folks to help the Golang tools move forward and not be bottlenecked on me. Given the extensive changes in these PRs, assuming they land I expect it would make sense to see if you'd be open to participating in that as well. Can discuss down the road but just mentioning.

I'll try to get through this in the next few days and to either merge it or else circle back with any concerns. Thank you again!

Copy link
Member

@swinslow swinslow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ianling, this is fantastic and was clearly a ton of work. Thanks for all the changes which I agree will make things much easier to maintain.

I've reviewed in some detail, and I have a few specific bugs / changes that I think we may want to make before merging this PR. I've got some future ones down the road which are less important and we can wait till later to handle those. Let me know your thoughts on the following, and whether these are things that you could address in the PR before we merge it:

  1. tvsaver: Note that the outputted Tags for certain fields need to align with the existing field names in the SPDX spec for tag-value documents, even where these are different from the JSON field names. The following are singular for the tag-value field names, even if the JSON field names are plural:
  • for both saver2v1 and saver2v2, in save_file.go: FileType, LicenseInfoInFile, and FileContributor should each be singular, not plural, for what gets printed as the tag name (see clause 8 of the spec)
  • with corresponding reversions for those tags in the save_file_test.go and save_package_test.go files
  1. json: When testing one of the examples (8-jsontotv), I notice that the resulting tag-value file has double prefixes for all the IDs. For example, I'm seeing SPDXID: SPDXRef-SPDXRef-DOCUMENT, ExternalDocumentRef: DocumentRef-DocumentRef-spdx-tool-1.2 ..., etc. I'm not sure whether this is being caused by the changes to the json loader or the tag-value saver, but it looks like one or the other is causing a problem here.

  2. json: Similarly, when testing example 9-tvtojson which uses the tag-value loader / json saver, the opposite problem is happening: some ID prefixes are being omitted. E.g., "SPDXID":"DOCUMENT" instead of "SPDXID":"SPDXRef-DOCUMENT", etc.

  3. json/json_test.go brings in github.com/r3labs/diff/v3, which is MPL-2.0 licensed and appears to have a lot of subdependencies. Can we leave out json_test.go for now from this PR, and re-evaluate whether there's an alternative library we could use for the diffing for this test? If it's necessary for us to bring in subdependencies, we can do so, but I'm ideally trying to keep them as limited as possible (especially where they are under non-permissive licenses).

If you're able to address these, then I'm good with merging the PR after that. Or if you think these can be easily handled after we merge this PR, I'm fine with that too. Thank you again!

@ianling
Copy link
Collaborator Author

ianling commented Apr 25, 2022

Thanks a lot for your time @swinslow!

re: point 1 above: I think that was an accident with my IDE's refactoring tools; it refactored instances of LicenseInfoInFile (for example) not only where it was being used as a struct field, but also wherever that was used as a string. Thanks for catching that. This is fixed now.

For point 2 and point 3, I ended up fixing those in #137. I ran the tests and examples in that PR and it now works as expected there.

Agreed on point 4, I have dropped that dependency and replaced it with https://github.com/google/go-cmp. Is that okay? Or should we just drop the test entirely/comment it out for now? I would prefer not to have to manually check each individual field of the struct as that would be extremely tedious.

Signed-off-by: Ian Ling <ian@iancaling.com>
Copy link
Member

@swinslow swinslow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ianling! This looks like it takes care of my concerns.

For point 4, yes, go-cmp looks fine to me -- BSD-3-Clause and just bringing in one other sub-dependency which is also BSD-3-Clause. That should be fine to work with this project's license (choice of either Apache-2.0 or GPL-2.0-or-later).

I'm going ahead and merging this. I haven't reviewed the subsequent PRs yet, feel free to tag me in whichever order they should be added and I'll take a look.

Also cc'ing @lumjjb for visibility as he and I were chatting about this earlier today.

Thanks again for all your hard work here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request large
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants