Publishing alternative meta-schemas useful for schema writers #11

Open
jgonzalezdr opened this issue Nov 26, 2018 · 7 comments

Comments

@jgonzalezdr

Problem description

Schema writers often use the wrong keyword, e.g. mistakenly using "minLength" to limit the number of allowed properties or array items, so the intended constraint is not enforced by the schema.

Schema writers also too often introduce typos in keyword names, e.g. declaring "mininum" instead of "minimum", and again the intended constraint is not enforced.

These kinds of mistakes are hard to detect initially, because the schemas validate properly against the meta-schema, and too often the schemas are checked only with "proper" test instances (i.e. instances that have the expected number of array items), and these test instances of course validate as expected.

Of course, testing schemas against "invalid" test instances as well helps detect these errors, but generating every single negative test for a complex schema is often not feasible.
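To make the failure mode concrete, here is a minimal sketch in Python. `toy_validate` is a hypothetical, deliberately tiny validator written only for this illustration (it is not any real library's API). It handles four keywords and, like JSON Schema validation in general, ignores keywords it does not recognize and lets keywords that do not apply to the instance's type pass.

```python
def toy_validate(schema, instance):
    """Toy validator (hypothetical, illustration only): supports four keywords.

    Unknown keywords are ignored, and keywords that do not apply to the
    instance's type pass trivially, mirroring how JSON Schema behaves.
    """
    def is_number(value):
        # bool is a subclass of int in Python, but it is not a JSON number.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    checks = {
        "minimum": lambda limit: not is_number(instance) or instance >= limit,
        "minLength": lambda limit: not isinstance(instance, str) or len(instance) >= limit,
        "minItems": lambda limit: not isinstance(instance, list) or len(instance) >= limit,
        "items": lambda sub: not isinstance(instance, list)
        or all(toy_validate(sub, item) for item in instance),
    }
    # Only keywords present in `checks` are evaluated; everything else is skipped.
    return all(checks[key](value) for key, value in schema.items() if key in checks)


# Intended: at least two items, each a number >= 0.
intended = {"minItems": 2, "items": {"minimum": 0}}
# Written: wrong keyword ("minLength") and a typo ("mininum").
written = {"minLength": 2, "items": {"mininum": 0}}

good_instance = [1, 2, 3]  # the kind of instance a "positive" test uses
bad_instance = [-5]        # too short, and contains a negative number

print(toy_validate(intended, good_instance), toy_validate(written, good_instance))
print(toy_validate(intended, bad_instance), toy_validate(written, bad_instance))
```

Both schemas accept the good instance, so a positive-only test suite cannot tell them apart; only the negative instance exposes that the written schema enforces nothing.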

Solution proposal

This problem has already been discussed in json-schema-org/json-schema-spec#682 regarding a spec change. However, as suggested by @handrews, and since the features planned for draft-08 will allow defining new meta-schemas based on the existing vocabularies (#561), this problem could be solved by publishing on json-schema.org some meta-schemas for the vocabularies that are stricter than the "default" ones (and than the specs), for schema writers to use.

These "stricter" schemas could be considered recommendations or best practices for schema writers, and as such could even be documented in some annex of the specs, but would otherwise have no other impact on the specs.

You may find here an example of a stricter meta-schema for draft-07 that I use to check schemas.

@gregsdennis
Member

gregsdennis commented Nov 26, 2018

The idea of "strict" processing and the idea of using vocabularies to do it are good. I even like the idea of creating a repository of vocabularies.

However, I do have a concern about json-schema.org becoming a repository for vocabularies. Perhaps another site could serve this purpose; maybe something analogous to http://schemastore.org/json/.

@handrews
Contributor

@gregsdennis The idea of "strict" processing and the idea of using vocabularies to do it are good. I even like the idea of creating a repository of vocabularies.

The vocabularies themselves should be the same; it's the meta-schemas that enforce varying levels of strictness in the usage of those vocabularies.

Vocabularies are "here are the keywords that I am telling you I might use"

Meta-schemas are "here is the structure within which I might use those keywords"

I don't think there's really a notion of a vocabulary being strict or lax, although I'd welcome an example to the contrary. I don't count adding more format values as a strict/lax thing, because the semantics of format itself aren't changed: it's basically an open-ended enum, and adding values just expands the set of known enum values.

The "official" meta-schemas are intentionally the most permissive meta-schemas that still prevent blatant syntax errors. I can't remember if there's currently anything that could be made more strict without forbidding schemas that are technically in conformance with the spec. I don't think there is, but I'm not sure.

My thought here would be to continue to publish the existing lax schemas, because we really do need meta-schemas that permit the entire specification to be used. I would also consider publishing a closed-keyword-set version, where we basically just slap "unevaluatedProperties": false on everything.

There are a whole bunch of other gradations you could do, although you start getting into the question of whether using a meta-schema as a linter is really the right option! But the "please tell me if I misspell a property" use case is extremely common.
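The "please tell me if I misspell a property" use case can be sketched outside a meta-schema too. The following Python snippet is an assumption-laden illustration, not a real tool: `KNOWN_KEYWORDS` is a small, incomplete keyword subset picked for the example, and `find_unknown_keywords` is a hypothetical helper that approximates what a closed-keyword-set meta-schema would reject, adding a spelling suggestion via the standard library's `difflib`.

```python
import difflib

# Assumption: a small, incomplete subset of keywords, chosen for the example.
KNOWN_KEYWORDS = {
    "$schema", "$id", "$ref", "type", "properties", "items", "required",
    "minimum", "maximum", "minLength", "maxLength", "minItems", "maxItems",
    "minProperties", "maxProperties",
}
# Keywords whose value is a single subschema, or a map of names to subschemas.
# (The array form of "items" is not handled in this sketch.)
SUBSCHEMA_KEYWORDS = {"items"}
SUBSCHEMA_MAP_KEYWORDS = {"properties"}


def find_unknown_keywords(schema, path="#"):
    """Return (location, keyword, suggestions) for every unknown keyword."""
    problems = []
    if not isinstance(schema, dict):
        return problems
    for key, value in schema.items():
        if key not in KNOWN_KEYWORDS:
            guesses = difflib.get_close_matches(key, sorted(KNOWN_KEYWORDS), n=1)
            problems.append((path, key, guesses))
        elif key in SUBSCHEMA_KEYWORDS:
            problems += find_unknown_keywords(value, f"{path}/{key}")
        elif key in SUBSCHEMA_MAP_KEYWORDS and isinstance(value, dict):
            # Property names are user data, not keywords: recurse into values only.
            for name, sub in value.items():
                problems += find_unknown_keywords(sub, f"{path}/{key}/{name}")
    return problems


schema = {
    "type": "object",
    "properties": {"age": {"type": "integer", "mininum": 0}},
}
for location, keyword, guesses in find_unknown_keywords(schema):
    print(location, keyword, guesses)
```

A closed keyword set has the obvious cost that any custom or extension keyword is also reported, which is why it fits an optional, stricter meta-schema rather than the default one.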

So beyond those two, I would tend to agree with @gregsdennis that some other site or organization could fill that niche. But I think having those two would be useful.

@jgonzalezdr
Author

@gregsdennis However, I do have a concern about json-schema.org becoming a repository for vocabularies. Perhaps another site could serve this purpose; maybe something analogous to http://schemastore.org/json/.

I agree with you, json-schema.org should not be a repository for vocabularies or meta-schemas.

Nevertheless, json-schema.org is the entry point for most people learning JSON Schema, so it could be convenient to add information to the "learn" section of the website about some useful alternative meta-schemas for the official vocabularies that schema writers can use to avoid common pitfalls, even if these meta-schemas are managed somewhere else.

@jgonzalezdr
Author

@handrews I don't think there's really a notion of a vocabulary being strict or lax, although I'd welcome an example to the contrary.

In fact the notion is really there, and the JSON Schema specs are very lax. Take into account that a vocabulary, in addition to semantics, has some structural definition (the grammar), even if minimal.

An example that the vocabulary could be stricter is in json-schema-org/json-schema-spec#682, where I proposed a stricter vocabulary by making a stricter grammar, by making the type keyword mandatory and adding some presence dependencies between keywords and the instance type.
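As a rough illustration of such a stricter grammar, the Python sketch below checks a single schema object for a mandatory type and for keywords that do not apply to the declared type. The table `KEYWORD_APPLIES_TO` and the helper `check_type_dependencies` are hypothetical names made up for this example; the exact rules proposed in #682 may differ.

```python
# Which instance type each keyword constrains (subset chosen for the example).
KEYWORD_APPLIES_TO = {
    "minLength": "string", "maxLength": "string", "pattern": "string",
    "minItems": "array", "maxItems": "array", "items": "array",
    "minProperties": "object", "maxProperties": "object", "properties": "object",
    "minimum": "number", "maximum": "number",
}


def check_type_dependencies(schema):
    """Return messages for a single schema object (no recursion, for brevity)."""
    if "type" not in schema:
        return ['"type" is required by this strict dialect']
    declared = schema["type"]
    declared = set(declared) if isinstance(declared, list) else {declared}
    if "integer" in declared:
        declared.add("number")  # numeric keywords also apply to integers
    return [
        f'"{key}" applies to {KEYWORD_APPLIES_TO[key]} instances, '
        f"but the declared type is {schema['type']!r}"
        for key in schema
        if key in KEYWORD_APPLIES_TO and KEYWORD_APPLIES_TO[key] not in declared
    ]


print(check_type_dependencies({"type": "array", "minLength": 2}))  # wrong keyword
print(check_type_dependencies({"type": "array", "minItems": 2}))   # consistent
print(check_type_dependencies({"minItems": 2}))                    # no "type"
```

This kind of rule catches the "minLength" on an array mistake from the problem description, at the price of rejecting schemas that the specification itself allows.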

Also, the JSON Schema specs could be more lax, for example by not making the type of some keywords mandatory (e.g. properties could take any value type) and indicating that when taking some types the behavior is not defined (e.g. properties can take array values, but the behavior in such a case is not defined).

As with programming languages (some are stricter than others, for example regarding variable typing), stricter approaches prevent more user errors and make runtime behavior more predictable, but are also harder to implement.

In any case, I think the current laxness/strictness level of the JSON Schema specs is good 😉.

@handrews
Contributor

@jgonzalezdr

An example that the vocabulary could be stricter is in json-schema-org/json-schema-spec#682, where I proposed a stricter vocabulary by making a stricter grammar, by making the type keyword mandatory and adding some presence dependencies between keywords and the instance type.

Using your definitions from the basic vocabulary issue, I would consider this a strict dialect rather than a strict vocabulary. type and the dependent keywords still have the same semantics as in the "standard" dialect.

Also, the JSON Schema specs could be more lax, for example by not making the type of some keywords mandatory (e.g. properties could take any value type) and indicating that when taking some types the behavior is not defined (e.g. properties can take array values, but the behavior in such a case is not defined).

I don't really follow this part, but since you're talking about the incompatible syntax having undefined behavior, it's still just a dialect. I'm going to take the view that if you allow more flexible syntax to pass validation but don't define behavior, then the semantics are unchanged. This is effectively what happens when people skip validating schemas against meta-schemas and put in an array value for properties: maybe you get a useful error, maybe you don't.

In any case, I think the current laxness/strictness level of the JSON Schema specs is good

😁

@jgonzalezdr
Author

@handrews Regarding laxness/strictness, I was not talking specifically about JSON Schema "$vocabularies", but about vocabularies at a more abstract level. Meta-specification? 😵 I mean, the way the JSON Schema specification is designed is pretty lax, but it could have been designed to be stricter or laxer.

Now getting back to the JSON Schema vocabularies discussion, you're right, the stricter vocabulary that I described can be a dialect, but it could also be "upgraded" to a vocabulary just by defining a URI for it.

I do not agree regarding the laxer vocabulary: it can never be a dialect, because the validation vocabulary's rule for properties is defined, and it does not allow anything that is not an object. However, it can become a brand new vocabulary, incompatible with the original one, if somebody writes the specs for it (e.g. indicating that implementations shall just issue a warning and skip the keyword if the type is not supported) and defines a URI for it.

In practical terms: a dialect can be defined just by modifying a meta-schema that references the original vocabulary in $vocabulary; a vocabulary, however, is defined by writing its specs (which can be very simple if it just extends, constrains or relaxes another vocabulary) and defining a URI for it, even if that is only done in an application-specific private context.
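The dialect half of this distinction can be sketched in Python as plain data manipulation. Everything here is an assumption made for illustration: the keyword names and vocabulary URIs are the ones later published as draft 2019-09 (the release that draft-08 became), `https://example.com/meta/strict` is a hypothetical identifier, `derive_dialect` is a made-up helper, and the resulting meta-schema has not been checked against any implementation.

```python
import json

# Abbreviated stand-in for the standard meta-schema: only the parts used below.
# Assumption: URIs as later published for draft 2019-09.
BASE_META_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "https://json-schema.org/draft/2019-09/schema",
    "$vocabulary": {
        "https://json-schema.org/draft/2019-09/vocab/core": True,
        "https://json-schema.org/draft/2019-09/vocab/applicator": True,
        "https://json-schema.org/draft/2019-09/vocab/validation": True,
        "https://json-schema.org/draft/2019-09/vocab/meta-data": True,
        "https://json-schema.org/draft/2019-09/vocab/format": False,
        "https://json-schema.org/draft/2019-09/vocab/content": True,
    },
}


def derive_dialect(base, dialect_id, extra_constraints):
    """Build a dialect meta-schema: same $vocabulary, extra structural rules."""
    return {
        "$schema": base["$schema"],
        "$id": dialect_id,
        # The vocabularies (keyword semantics) are copied unchanged ...
        "$vocabulary": dict(base["$vocabulary"]),
        "$recursiveAnchor": True,
        # ... and only the permitted schema structure is narrowed.
        "allOf": [{"$ref": base["$id"]}, extra_constraints],
    }


strict = derive_dialect(
    BASE_META_SCHEMA,
    "https://example.com/meta/strict",  # hypothetical URI
    {"required": ["type"]},             # example rule: "type" is mandatory
)
print(json.dumps(strict, indent=2))
```

No new specification text is needed for `strict`, because every keyword keeps its original meaning; a new vocabulary, by contrast, would need its own URI in `$vocabulary` and a written definition of its semantics.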

As a consequence, a dialect could be considered "ill" if schemas validated by the dialect meta-schema are not valid according to the rules of the original vocabulary or vocabularies. It's a spec design decision whether implementations must detect this situation and report an error, or whether the behavior in such cases is simply not defined. Both options are valid, since it is the meta-schema author's responsibility to ensure that it is correct and makes sense.

In the end, dialects or vocabularies that are defined on top of other vocabularies in a semantically compatible way will form vocabulary families.

@handrews
Contributor

I'm going to bump this out of draft-08. We can experiment with what else we want to do with vocabularies after that along with everyone else in the JSON Schema community.

@philsturgeon transferred this issue from json-schema-org/json-schema-spec on Jan 15, 2020