Enhancement: Summarize "duplicate components" schema error #37

Open
mrutkows opened this issue Jun 8, 2023 · 5 comments
Labels
enhancement, help wanted

Comments

@mrutkows
Contributor

mrutkows commented Jun 8, 2023

Extracted this feature request from issue #35

I ran this on an SBOM with 9928 components. There were duplicate components.

	1. Type: [unique], Field: [components], Description: [array items[3,243] must be unique] 
	Failing object: [[
	  {
	    "name": "acl",
	    "publisher": "Guillem Jover <guillem@ ... (truncated)

The message is correct: components[3] and components[243] were duplicates. However, the "failing object" output is truncated and starts at components[0], which is unrelated to the error. The message would be much easier to read if it showed components[3] instead of components[0].

Consider creating special "handlers" for common error types, starting with "duplicates". For example, the handler could identify and extract the duplicated object (the first occurrence) and then format the error message to show that entire JSON object.

mrutkows added the enhancement, good first issue, and help wanted labels on Jun 8, 2023
@SamantaTarun

I'd like to give a try.

@mrutkows
Contributor Author

mrutkows commented Jun 9, 2023

> I'd like to give a try.
That would be quite welcome; please let me know if you want to discuss offline.

If you look in cmd/validate.go you will see the function

func FormatSchemaErrors(errs []gojsonschema.ResultError) string {

which shows how we currently colorize, truncate, etc.

Ideally, we would produce our own "ResultError" struct type(s) containing only the data needed for output (already pre-processed). We could then define general "formatters" (interfaces) that operate on that reliable data structure and produce different output formats (see the JSON error output format request in issue #26).

The "pre-processors" would be registered to curate (reduce, simplify, enhance, etc.) the data in the utility's "ResultError" structs. Each would be called only when field values in the result error struct match it, and would run before the data is passed to the output formatters. The first "pre-processor" would be triggered by Type=="unique" and could use other fields in the struct to decide on any special processing (e.g., Field: [components]).
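To make the idea concrete, here is a minimal sketch of how registered pre-processors and pluggable formatters could fit together. Every name in it (SchemaError, PreProcessor, Formatter, RegisterPreProcessor, Curate, TextFormatter, JSONFormatter) is hypothetical and chosen for illustration only; it is not the design that was eventually implemented.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// SchemaError is a hypothetical, output-ready record holding only the
// fields a formatter needs.
type SchemaError struct {
	Type        string `json:"type"`
	Field       string `json:"field"`
	Description string `json:"description"`
	Value       string `json:"value,omitempty"`
}

// PreProcessor curates (reduces, simplifies, enhances) one record.
type PreProcessor func(SchemaError) SchemaError

// Formatter renders curated records in one output format.
type Formatter interface {
	Format(errs []SchemaError) (string, error)
}

// preProcessors maps an error Type (e.g. "unique") to its pre-processor.
var preProcessors = map[string]PreProcessor{}

// RegisterPreProcessor associates a pre-processor with an error Type.
func RegisterPreProcessor(errType string, p PreProcessor) {
	preProcessors[errType] = p
}

// Curate runs the matching pre-processor, if any, on each record.
func Curate(errs []SchemaError) []SchemaError {
	out := make([]SchemaError, 0, len(errs))
	for _, e := range errs {
		if p, ok := preProcessors[e.Type]; ok {
			e = p(e)
		}
		out = append(out, e)
	}
	return out
}

// TextFormatter renders one numbered line per error.
type TextFormatter struct{}

func (TextFormatter) Format(errs []SchemaError) (string, error) {
	var sb strings.Builder
	for i, e := range errs {
		fmt.Fprintf(&sb, "%d. Type: [%s], Field: [%s], Description: [%s]\n",
			i+1, e.Type, e.Field, e.Description)
	}
	return sb.String(), nil
}

// JSONFormatter renders the same records as a JSON array.
type JSONFormatter struct{}

func (JSONFormatter) Format(errs []SchemaError) (string, error) {
	b, err := json.Marshal(errs)
	return string(b), err
}
```

A "unique" handler would be registered once, for example with RegisterPreProcessor("unique", ...), and both formatters would then receive the same curated slice from Curate.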

@mrutkows
Contributor Author

Please see PR #40 as it introduces an error handling framework and shows how the ItemsMustBeUniqueError has a special handler that reduces output size by only showing the value of the failing duplicate item (as in the unique component error mentioned above).

@mrutkows
Contributor Author

Please see if we can use this to identify truly duplicate json (value) objects: https://github.com/mitchellh/hashstructure

mrutkows removed the good first issue label on Jun 27, 2023
@esnible

esnible commented Jul 4, 2023

A library might be overkill. A very simple hash key for an SBOM component is that component's serialized JSON.

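// components is assumed here to hold the SBOM's decoded "components" array.
seenComponents := map[string]int{}
for i, component := range components {
   by, err := json.Marshal(component)
   if err != nil {
      continue // skip anything that cannot be serialized
   }
   strComponent := string(by)
   if nPrevious, seenBefore := seenComponents[strComponent]; seenBefore {
      // i and nPrevious are duplicates
      fmt.Printf("components[%d] duplicates components[%d]\n", i, nPrevious)
   }
   seenComponents[strComponent] = i
}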
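For completeness, a self-contained variant of the same idea is sketched below. It assumes the components arrive as a raw JSON array and decodes them into generic values first; because encoding/json writes map keys in sorted order, two objects that differ only in key order then produce the same string. The names DuplicatePair and findDuplicates are hypothetical.

```go
package main

import "encoding/json"

// DuplicatePair records that items[Index] repeats items[FirstIndex].
type DuplicatePair struct {
	FirstIndex int
	Index      int
}

// findDuplicates (hypothetical helper) reports every array item whose
// JSON value was already seen earlier in the array.
func findDuplicates(rawArray []byte) ([]DuplicatePair, error) {
	// Decode into generic values so objects become map[string]interface{}.
	var items []interface{}
	if err := json.Unmarshal(rawArray, &items); err != nil {
		return nil, err
	}
	seen := make(map[string]int, len(items))
	var dups []DuplicatePair
	for i, item := range items {
		// Marshal sorts map keys, which gives an order-independent key.
		key, err := json.Marshal(item)
		if err != nil {
			return nil, err
		}
		if first, ok := seen[string(key)]; ok {
			dups = append(dups, DuplicatePair{FirstIndex: first, Index: i})
			continue
		}
		seen[string(key)] = i
	}
	return dups, nil
}
```

Unlike the loop above, this version always reports the first occurrence as the original. It also keeps one serialized string per distinct item in memory, which should be acceptable for an SBOM with a few thousand components but is worth measuring on very large inputs.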
