Feature request: Allow validation of $schema specified in JSON file #310

Open
alex1701c opened this issue Aug 26, 2023 · 6 comments
Labels
enhancement New feature or request needs-investigation Requires study to find an appropriate resolution

Comments

@alex1701c

Heyho,
I have created a generic schema that I want to run against JSON files in different projects. This works great when I pass a schemafile to the validator. But some projects/files could use a more specific schema based on the generic one.

I could specify a $schema in the JSON contents, but it would be ignored in favor of the schemafile passed as a CLI argument. With $schema in the file, users of a JSON language server would also get the proper type hints.

Could the $schema be respected by this CLI tool? Maybe as an opt-in.

@sirosen
Member

sirosen commented Aug 26, 2023

Hi there; I'm happy to see if there's a way to meet your needs with a new feature, but I don't think what you're looking for is $schema.

That keyword is defined rather specifically. It defines the dialect of JSON Schema used by a schema document. It isn't defined for use in any non-schema documents, and I would be very hesitant to use it there.

I think there are other ways we can address this, but I'll need to make sure I understand your use case and think about it more.

Is your situation basically that you want to be able to embed a reference to a schema in your instances, and have each instance validate against its referenced schema?
Is it important that the schema reference comes from inside your documents, rather than some external config file or other source?

@sirosen sirosen added the needs-investigation Requires study to find an appropriate resolution label Aug 27, 2023
@alex1701c
Author

> That keyword is defined rather specifically. It defines the dialect of JSON Schema used by a schema document. It isn't defined for use in any non-schema documents, and I would be very hesitant to use it there.

Hmm, jsonls also provides extensive autocompletion for $schema values (json.schemastore.org). VS Code respects a schema set this way too; see https://code.visualstudio.com/docs/languages/json#_mapping-in-the-json for the docs. They confirm that specifying it this way is non-standard, but it seems like an established practice.

> Is it important that the schema reference comes from inside your documents, rather than some external config file or other source?

Yeah, a project might have JSON files for different schemas. Having this info inside the documents ensures that IDEs provide proper autocompletion without manual configuration. The same goes for the actual linting, which would otherwise need a mapper-like config file. But those are not standardized across text editors/IDEs AFAIK.

@sirosen
Member

sirosen commented Sep 11, 2023

I asked around a little to see if the spec maintainers have strong feelings about this usage.
The answer is that it's not part of the spec, as noted by VSCode, but that there's no longer much interest in actively discouraging this usage and that tools should feel free to implement it where it makes sense.

My plan is therefore to add a flag for this. Exact name TBD, but my first thought is --schema-from-instances.
Input on the naming is very much welcome, since that's pretty long.

I'll need to take some time to work out behavior for this. There are some details to work out -- e.g. if given multiple files, do you load all of their relevant schemas first, or do you process them one-at-a-time? -- but I'm sure everything is solvable.
Consider the feature request accepted; but I don't have a clear idea of when I'll get around to it.
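
The ordering question above (load all relevant schemas first, or process files one at a time) can be illustrated with a minimal sketch. This is not check-jsonschema code: `group_by_schema`, `pair_batched`, `pair_streaming`, and the `load_schema` callback are hypothetical names, the input is assumed to be a mapping of filename to raw JSON text, and every instance is assumed to carry a string `$schema`.

```python
import json
from collections import defaultdict


def group_by_schema(instances: dict[str, str]) -> dict[str, list[str]]:
    """Map each "$schema" value to the instance names that reference it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, text in instances.items():
        groups[json.loads(text)["$schema"]].append(name)
    return dict(groups)


def pair_batched(instances, load_schema):
    """Load-all-first: resolve each distinct schema once, then pair it with its instances."""
    groups = group_by_schema(instances)
    schemas = {ref: load_schema(ref) for ref in groups}
    return [(name, schemas[ref]) for ref, names in groups.items() for name in names]


def pair_streaming(instances, load_schema):
    """One-at-a-time: resolve the schema again for every instance as it is read."""
    for name, text in instances.items():
        yield name, load_schema(json.loads(text)["$schema"])
```

The batched variant calls `load_schema` once per distinct reference but has to parse every instance before validating anything; the streaming variant can start immediately but repeats schema loads unless the loader caches.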

@sirosen sirosen added the enhancement New feature or request label Sep 11, 2023
@alex1701c
Author

> Exact name TBD, but my first thought is --schema-from-instances. Input on the naming is very much welcome, since that's pretty long.

How would that flag work in case I run it on files where only some of them have $schema set? And would that flag discard the schemafile that might be provided as another option? (not allowing both to be set could be valid too)

> Consider the feature request accepted; but I don't have a clear idea of when I'll get around to it.

Amazing! Thanks a lot :)

@sirosen
Member

sirosen commented Sep 11, 2023

> How would that flag work in case I run it on files where only some of them have $schema set?

I presume it should fail on those files. The interface would be something like

check-jsonschema --schema-from-instances foo.json bar.json

During such usage -- where everything is fully explicit -- if bar.json doesn't have $schema and foo.json does, I can't see a strong argument for anything other than treating it as an error.

Realistic usage may look more like

check-jsonschema --schema-from-instances myfiles/**/*.json

But that doesn't move the needle significantly. First because that's "just" a shorthand which expands to some explicit argument list. But second, what if the caller meant to specify only files with $schema, but one of them accidentally had it unset? Again, better to error.
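
A minimal sketch of the "better to error" behavior described above, assuming plain JSON files on disk. The function `schema_refs_or_fail` and the exception `MissingSchemaReference` are hypothetical names for illustration, not part of check-jsonschema.

```python
import json
from pathlib import Path


class MissingSchemaReference(Exception):
    """Raised when an instance file has no string "$schema" value (hypothetical)."""


def schema_refs_or_fail(paths: list[str]) -> dict[str, str]:
    """Return {path: $schema value}, failing if any file lacks a reference."""
    refs: dict[str, str] = {}
    missing: list[str] = []
    for path in paths:
        doc = json.loads(Path(path).read_text(encoding="utf-8"))
        # Only a top-level object can carry a "$schema" key.
        ref = doc.get("$schema") if isinstance(doc, dict) else None
        if isinstance(ref, str):
            refs[path] = ref
        else:
            missing.append(path)
    if missing:
        # Report every offending file at once rather than stopping at the first.
        raise MissingSchemaReference("no $schema in: " + ", ".join(missing))
    return refs
```

Because a glob expands to an explicit argument list before the tool sees it, the same check covers both invocations shown above.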

> And would that flag discard the schemafile that might be provided as another option? (not allowing both to be set could be valid too)

The two would be mutually exclusive, like the other schema selection options which exist today. (--check-metaschema and --builtin-schema)
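
As a rough illustration of mutually exclusive schema selection, here is a sketch using the standard library's `argparse` as a stand-in; it does not reflect how check-jsonschema actually builds its CLI. The `--schema-from-instances` flag is only the name proposed in this thread, and `build_parser` and the program name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser where exactly one schema selection mode must be chosen."""
    parser = argparse.ArgumentParser(prog="check-jsonschema-sketch")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--schemafile")
    mode.add_argument("--builtin-schema")
    mode.add_argument("--check-metaschema", action="store_true")
    # Proposed in this thread; name not final.
    mode.add_argument("--schema-from-instances", action="store_true")
    parser.add_argument("instancefiles", nargs="+")
    return parser
```

With this setup, passing both `--schemafile` and `--schema-from-instances` makes `argparse` print a usage error and exit, which mirrors the "mutually exclusive" behavior described above.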

If I had designed check-jsonschema to have subcommands when I first created it, this would probably be a dedicated subcommand, as would each of the other schema selection modes. Unfortunately, it's a tricky change to make in a backwards-compatible way with the current implementation, so I haven't put any significant time into reworking that yet.

@Destroy666x

Destroy666x commented Oct 19, 2023

> I presume it should fail on those files.

What if you'd like to scan all .json (and .yaml/.toml) files in the repo for $schema, but only check the files that have $schema, each against its specified schema?
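
One possible reading of this request, as a sketch only: split already-parsed documents into those that carry a `$schema` reference and those that should be skipped rather than reported as errors. `partition_by_schema_ref` is a hypothetical helper, and nothing in this thread confirms that check-jsonschema will offer such a lenient mode. The input is assumed to be parsed data, so it does not matter whether a document came from JSON, YAML, or TOML.

```python
def partition_by_schema_ref(docs: dict[str, object]) -> tuple[dict[str, str], list[str]]:
    """Split parsed documents into {name: $schema value} and a list of skipped names."""
    with_ref: dict[str, str] = {}
    skipped: list[str] = []
    for name, doc in docs.items():
        # Only a mapping at the top level can carry a "$schema" key.
        ref = doc.get("$schema") if isinstance(doc, dict) else None
        if isinstance(ref, str) and ref:
            with_ref[name] = ref
        else:
            skipped.append(name)
    return with_ref, skipped
```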

3 participants