Descriptions for each enum value #47

Open
a-akimov opened this issue Jan 12, 2021 · 20 comments

@a-akimov

TL;DR Moving the discussion from OAI/OpenAPI-Specification#348 to JSON Schema folks.

Hi All,

For enums, we can now only list the values, for example:

cvv_check:
  type: string
  title: CVV check
  description: |
    When processed, result from checking the CVV/CVC value on the transaction.
  enum:
    - D
    - I
    - M
    - N
    - P
    - S
    - U
    - X

But it's a very common practice to explain what these values are, and if there are any important considerations for each value. Because of the current design, we need to duplicate all these values in the description of the enum, which leads to confusion and mistakes. For example:

cvv_check:
  type: string
  title: CVV check
  description: |
    When processed, result from checking the CVV/CVC value on the transaction.
    * `D` - Suspicious transaction
    * `I` - Failed data validation check
    * `M` - Match
    * `N` - No Match
    * `P` - Not Processed
    * `S` - Should have been present
    * `U` - Issuer unable to process request
    * `X` - Card does not support verification
  enum:
    - D
    - I
    - M
    - N
    - P
    - S
    - U
    - X
  maxLength: 1
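
The drift between the bullet list and the `enum` array can at least be detected mechanically. The following is a minimal sketch, not part of the original post: it assumes the description lists each value as a bullet of the form "* `X` - text" exactly as in the example above, and the function name `enum_doc_drift` is hypothetical.

```python
import re

# Matches bullets such as "* `D` - Suspicious transaction".
BULLET = re.compile(r"^\s*\*\s+`([^`]+)`\s+-\s+(.*)$")

def enum_doc_drift(schema):
    """Return (enum values missing from the description, documented values not in enum)."""
    documented = []
    for line in schema.get("description", "").splitlines():
        match = BULLET.match(line)
        if match:
            documented.append(match.group(1))
    enum_values = [str(value) for value in schema.get("enum", [])]
    missing = [v for v in enum_values if v not in documented]
    stale = [v for v in documented if v not in enum_values]
    return missing, stale

# Deliberately inconsistent example: "N" is undocumented, "Z" is no longer in the enum.
cvv_check = {
    "type": "string",
    "description": (
        "Result of the CVV/CVC check.\n"
        "* `D` - Suspicious transaction\n"
        "* `M` - Match\n"
        "* `Z` - Removed value\n"
    ),
    "enum": ["D", "M", "N"],
}
print(enum_doc_drift(cvv_check))  # (['N'], ['Z'])
```

A check like this only catches missing or stale values; it cannot tell whether a description is attached to the right value.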

It would be much nicer to have either this:

cvv_check:
  type: string
  title: CVV check
  description: When processed, result from checking the CVV/CVC value on the transaction.
  enum:
    D: Suspicious transaction
    I: Failed data validation check
    M: Match
    N: No Match
    P: Not Processed
    S: Should have been present
    U: Issuer unable to process request
    X: Card does not support verification
  maxLength: 1

Or better this:

cvv_check:
  type: string
  title: CVV check
  description: When processed, result from checking the CVV/CVC value on the transaction.
  enum:
    - value: D
      description: Suspicious transaction
    - value: I
      description: Failed data validation check
    - value: M
      description: Match
    - value: N
      description: No Match
    - value: P
      description: Not Processed
    - value: S
      description: Should have been present
    - value: U
      description: Issuer unable to process request
    - value: X
      description: Card does not support verification

According to https://json-schema.org/understanding-json-schema/reference/generic.html#enumerated-values, the enum keyword is the one that "is used to restrict a value to a fixed set of values". Thus, if somebody wants to find all the enums in a schema (manually or programmatically), currently they can just search for "enum" and get all the enum instances.

If we encourage people to use oneOf+const for enums (as was also suggested in OAI/OpenAPI-Specification#348), it will become problematic to find all enums and also to support in relevant tooling.

  oneOf:
    - const: D
      description: Suspicious transaction

I personally really like the idea of the following format, but not sure what's your reasoning against it:

enum:
    - value: D
      description: Suspicious transaction

In general, this issue seems to be super-important for proper adoption of JSON Schema and OpenAPI formats by the documentation communities. Currently, we have to struggle a lot to keep the description in sync with the actual lists of enum values, and see a lot of problems caused by that (e.g. some values are missing, some exposed by mistake, some have spelling errors or mistakenly follow PascalCase instead of camelCase).

Also, the number of comments, votes and mentions in the original thread OAI/OpenAPI-Specification#348 shows how important this problem is. Please consider extending JSON Schema to properly support descriptions of enum values.

@notEthan

this is discussed previously at json-schema-org/json-schema-spec#57

@a-akimov
Author

@notEthan, thanks for the link! Haven't seen it before and nobody was pointing to it :-(

Although now I see all the pros and cons, it still looks like the oneOf+const approach suggested in 2016 didn't get enough traction. Maybe one of the reasons is that when someone has just an enum and simply needs to add descriptions to its values, replacing enum with oneOf looks like quite a big change just for the sake of having descriptions.

Also, currently it's quite easy to find all the enums - you just look for "enum". And if some of my enums are oneOf, then I look both for "enum" and "oneOf", but not all "oneOf" are enums, so I need to build additional logic for that. Looks like it complicates things quite a lot, and having "enum" for all enum instances would be much cleaner.
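
To illustrate the "additional logic" mentioned here, below is a rough sketch of a schema walker that reports both plain `enum` keywords and `oneOf`/`anyOf` lists in which every branch carries a `const`. The heuristic and the names `is_const_enum` and `find_enums` are assumptions made for illustration; the walker is naive and would, for instance, misreport a property that happens to be named "enum".

```python
def is_const_enum(schema):
    """Heuristic: a non-empty oneOf/anyOf whose every branch is a schema with `const`."""
    for keyword in ("oneOf", "anyOf"):
        branches = schema.get(keyword)
        if isinstance(branches, list) and branches and all(
            isinstance(branch, dict) and "const" in branch for branch in branches
        ):
            return True
    return False

def find_enums(node, path="#"):
    """Yield pointer-like paths of every enum or enum-like schema under `node`."""
    if isinstance(node, dict):
        if "enum" in node or is_const_enum(node):
            yield path
        for key, child in node.items():
            if key in ("enum", "const"):
                continue  # these hold instance values, not subschemas
            yield from find_enums(child, f"{path}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_enums(child, f"{path}/{index}")

schema = {
    "type": "object",
    "properties": {
        "cvv_check": {"enum": ["D", "M"]},
        "status": {"oneOf": [{"const": "a", "description": "A"}, {"const": "b"}]},
        "shape": {"oneOf": [{"type": "string"}, {"type": "number"}]},
    },
}
print(list(find_enums(schema)))
# ['#/properties/cvv_check', '#/properties/status']
```

The `shape` property is skipped because its `oneOf` branches are not `const` schemas, which is the distinction a plain text search for "oneOf" cannot make.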

Please consider if this can have a cleaner solution in a future version of JSON schema.

@gregsdennis
Member

Also note that we strive for backward compatibility, and changing enum in this way is a breaking change.

I suggest proposing a new keyword. Or better yet, define said keyword in a vocab.

@karenetheridge
Member

I personally really like the idea of the following format, but not sure what's your reasoning against it:

{"enum": [ {"value": "D", "description": "Suspicious transaction" }, .. ] }

The short answer there is enums can be any type, including an object, not just a string -- so what this construct is actually saying is "the value should be a literal object with two properties, 'value' and 'description'". We would need a new keyword to provide enum values and titles/descriptions side by side.. but that's basically duplicating "oneOf", only less generically (which arguably makes it easier for tools to recognize, but also clutters the syntax).
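
A small sketch may make this concrete. The `enum_accepts` function below is a hypothetical, simplified stand-in for a validator's `enum` check (it uses plain Python equality, so it ignores corner cases such as `True == 1`); it is not taken from any real library.

```python
def enum_accepts(schema, instance):
    """Simplified `enum` check: the instance must equal one of the listed values."""
    return any(instance == allowed for allowed in schema["enum"])

# Under today's semantics, this schema lists one allowed value: a literal object.
schema = {"enum": [{"value": "D", "description": "Suspicious transaction"}]}

print(enum_accepts(schema, "D"))  # False
print(enum_accepts(schema, {"value": "D", "description": "Suspicious transaction"}))  # True
```

The string "D" is rejected and only the whole object is accepted, which is why the proposed syntax cannot be distinguished from an existing, valid enum of objects.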

@gregsdennis
Member

The other syntax

{ "enum": { "D": "Suspicious transaction", ... } }

is viable because the value is presented as an object rather than an array, and so can be distinguished from the current format. This can be extended to support more metadata as well:

{ "enum": { "D": { "description": "Suspicious transaction", ... }, ... } }

though the semantics of this format differ from other keywords that contain subschemas (e.g. properties) in that those subschemas are expected to validate an instance. These subschemas would be required to only have their annotations processed.

I still think that this needs to be defined in an external vocabulary, which would mean that implementation support for it would be sparse, and it definitely should not reuse the enum name.

@Maldris

Maldris commented Jan 13, 2021

My big concern with the object form of enum is that it then constrains the types of values that can be enumerated to those that are valid keys in JSON.

i.e. the following would not be a valid schema in most JSON parsers

{
  "enum": {
    [1, 2, 3]: {"description": "some description"},
    ["a", "b", "c"]: {"description": "some other description"}
  }
}

or

{
  "enum": {
    {}: {"description": "some description"},
    {"some": "content"}: {"description": "some other description"}
  }
}

meaning we limit enumerations to only numbers and strings (and some JSON parsers don't support numbers as keys) when we want to use the descriptive enum format.
and while that covers most cases, it does still undercut the efforts to make enum and keywords like it as general as possible.
It also occurs to me that then we are repeating the mistake of the two forms items had, which recently got split out into multiple keywords.

Having another enum like keyword to describe the behaviour would be helpful for a number of cases, especially the above OpenAPI example, and things like UI generation.
The question would then be, what format would such a new keyword take, and should it be part of any core vocabulary, or in a separate vocabulary?
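
As a quick illustration of the key restriction, here is how Python's standard `json` module behaves; other parsers may differ in detail, so treat this as one data point rather than a statement about every implementation.

```python
import json

# An array used as an object key is not valid JSON, so parsing fails.
array_keyed = '{"enum": {[1, 2, 3]: {"description": "some description"}}}'
try:
    json.loads(array_keyed)
    parsed = True
except json.JSONDecodeError:
    parsed = False
print(parsed)  # False

# Integer keys are coerced to strings on serialization, so the original type is lost.
round_tripped = json.loads(json.dumps({"enum": {1: "one", 2: "two"}}))
print(list(round_tripped["enum"]))  # ['1', '2']
```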

@jdesrosiers
Member

The oneOf/anyOf+const+description pattern is sufficient to describe and document an enum. All of the alternatives proposed here require changing enum to have two responsibilities: 1) asserting an enum and 2) documenting enum values. We prefer that keywords have one responsibility and are composed as needed.

The problem is that it's difficult for documentation generators to detect when a oneOf or anyOf is intended to represent an enum. Ideally, there would be a JSON Schema vocabulary for documentation generation that progressively enhances a standard JSON Schema to assist documentation generators.

{
  "docHint": "enum",
  "anyOf": [
    { "const": "a", "description": "A" },
    { "const": "b", "description": "B" },
    { "const": "c", "description": "C" }
  ]
}

docHint is a terrible name, but I think you get the idea. When a documentation generator encounters this flag it knows that this schema represents an enum-like construct and should be processed as such. Validators can ignore it without consequence (progressive enhancement) and enum doesn't have to be redefined in a way that gives it more than one responsibility.

I don't think anyone on our team has experience with documentation generators, but if this option is the direction this proposal goes, we'd be excited to assist anyone who wants to take on the task of defining a JSON Schema vocabulary for documentation generators.
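
For readers wondering what a documentation generator might do with such a flag, here is a minimal sketch. It assumes the hypothetical `docHint` keyword from the example above (not a real JSON Schema keyword), and the function name `render_enum_docs` and the bullet format are likewise illustrative choices.

```python
import json

def render_enum_docs(schema):
    """Return Markdown bullets for a schema flagged with the hypothetical docHint, else None."""
    if schema.get("docHint") != "enum":
        return None
    branches = schema.get("anyOf") or schema.get("oneOf") or []
    lines = []
    for branch in branches:
        # json.dumps keeps non-string values (numbers, objects, null) unambiguous.
        value = json.dumps(branch["const"])
        description = branch.get("description", "")
        lines.append(f"* `{value}` - {description}")
    return "\n".join(lines)

schema = {
    "docHint": "enum",
    "anyOf": [
        {"const": "a", "description": "A"},
        {"const": "b", "description": "B"},
        {"const": "c", "description": "C"},
    ],
}
print(render_enum_docs(schema))
```

Because each value and its description live in the same branch, the generated list cannot drift out of sync with the values that are actually validated.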

@Simran-B

Simran-B commented Feb 4, 2021

I would like to add that oneOf / anyOf + const tends to produce harder-to-understand validation errors if an invalid value is in the data, whereas enum gives you fewer errors (because there is no branching) and it's more obvious that the value must be wrong. Could an enum hint perhaps suppress some of the misleading validation errors?

@jdesrosiers
Member

That's a good point about validation errors. A hint keyword wouldn't change the standard output, but it could be used when processing the standard output to produce user friendly error messaging.
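
One way such post-processing could look is sketched below. It assumes error units that carry `keywordLocation` and `error` fields in the style of the basic output format, that the hinted schema is the root schema, and the hypothetical `docHint` keyword from earlier in the thread; the error texts in the example are invented, since real messages vary by implementation.

```python
import json
import re

# Matches the anyOf/oneOf keyword itself and the `const` of each branch.
BRANCH_ERROR = re.compile(r"^/(anyOf|oneOf)(/\d+/const)?$")

def summarize_errors(schema, errors):
    """Collapse per-branch `const` failures into one message for hinted enums."""
    if schema.get("docHint") != "enum":
        return [unit["error"] for unit in errors]
    other = [u["error"] for u in errors if not BRANCH_ERROR.match(u["keywordLocation"])]
    if len(other) == len(errors):
        return other  # nothing enum-related failed
    branches = schema.get("anyOf") or schema.get("oneOf") or []
    allowed = ", ".join(json.dumps(branch["const"]) for branch in branches)
    return other + [f"value must be one of: {allowed}"]

schema = {
    "docHint": "enum",
    "anyOf": [{"const": "a"}, {"const": "b"}, {"const": "c"}],
}
errors = [  # hypothetical validator output for the instance "z"
    {"keywordLocation": "/anyOf", "instanceLocation": "", "error": "no branch matched"},
    {"keywordLocation": "/anyOf/0/const", "instanceLocation": "", "error": "expected a"},
    {"keywordLocation": "/anyOf/1/const", "instanceLocation": "", "error": "expected b"},
    {"keywordLocation": "/anyOf/2/const", "instanceLocation": "", "error": "expected c"},
]
print(summarize_errors(schema, errors))  # ['value must be one of: "a", "b", "c"']
```

The validator's output is left untouched; only the presentation layer uses the hint.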

@handrews
Contributor

I have written multiple documentation generators or tools to support them, and I would prefer something like @jdesrosiers's docHint. Typically, I'd encourage a oneOf for this sort of thing as it feels more enum. Granted, enum says its values SHOULD be unique, not MUST, but when working with API docs I'd encourage oneOf.

@handrews
Contributor

handrews commented May 5, 2021

@jdesrosiers

That's a good point about validation errors. A hint keyword wouldn't change the standard output, but it could be used when processing the standard output to produce user friendly error messaging.

Are you working on the Understanding JSON Schema update? I feel like this would be a great topic to cover. A major motivation for the output format (both the error and annotation side) was to enable better error reporting for complex structures without the spec having to mandate the specifics.

@jdesrosiers
Member

Are you working on the Understanding JSON Schema update? I feel like this would be a great topic to cover.

I am. And I agree.

@gregsdennis
Member

One downside to this (and it might have already been mentioned) is that this only supports string-type values in the enum since the values have to be keys. This is contrary to the existing support for enum which allows values of any type.

@handrews
Contributor

handrews commented May 5, 2021

@gregsdennis you're referring to the enum object approach, right? Because the docHint approach does not have type limitations AFAICT.

@gregsdennis
Member

Yes, the enum object has this limitation. I haven't read through this completely.

@Altreus

Altreus commented Jun 15, 2021

An edge case (which I'm good at) is in the implication that a JSON-Schema can be used to infer a form structure, which is a function to which I am actually putting it. By judiciously using x- properties to suggest to the renderer what type of control is appropriate for given fields (with defaults, of course), one can fairly easily use a schema to create an HTML form that satisfies it, at least in simple cases.

(I've basically leant on JSON-Schema as the format for all schema definitions within several systems I've written, since all structured data I've come across so far has fitted it.)

Anyway, there are several form controls that list options for the user to select, and all of them are better if there is a human-readable label to go along with the machine-expected const value. I have been using x-labels next to the enum property, which both the form generator and an auto-documenter can read.

Is there any reason there couldn't be a formal version of this? Similarly, there is no reason I couldn't use a doc hint and a bit of extra logic to turn an anyOf into a select box instead of fieldsets like I currently do, so I'm not strongly invested in this given there's a workaround.

@drumphil

I came to a similar idea, but using the term "descriptions" (plural).
So it would look something like this:
{ "enum": [ "A", "B", "C" ], "descriptions": ["description for A", "description for B", "description for C"] }

@Simran-B

@drumphil Your approach doesn't solve the problem stated in the initial post:

we have to struggle a lot to keep the description in sync with the actual lists of enum values, and see a lot of problems caused by that

The descriptions are only linked to the enum values by index. It's easy to mess up and not immediately apparent with a few more values.
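
A short sketch of the index-based pairing shows where it can and cannot help. The `descriptions` keyword is the one proposed in the previous comment; the function name `pair_descriptions` is hypothetical.

```python
def pair_descriptions(schema):
    """Pair enum values with the proposed parallel `descriptions` array by index."""
    values = schema["enum"]
    descriptions = schema.get("descriptions", [])
    if len(values) != len(descriptions):
        raise ValueError(
            f"{len(values)} enum values but {len(descriptions)} descriptions"
        )
    return list(zip(values, descriptions))

ok = {
    "enum": ["A", "B", "C"],
    "descriptions": ["description for A", "description for B", "description for C"],
}
print(pair_descriptions(ok)[0])  # ('A', 'description for A')

# Two descriptions swapped: the lengths still match, so nothing is flagged.
swapped = {
    "enum": ["A", "B", "C"],
    "descriptions": ["description for B", "description for A", "description for C"],
}
print(pair_descriptions(swapped)[0])  # ('A', 'description for B')
```

A length check catches a missing entry, but a reordering passes silently, which is the failure mode described above.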

@handrews
Contributor

Is there any reason there couldn't be a formal version of this?

No, and in fact we created vocabularies so people could formally describe and require 3rd-party extensions. I'm going to move this over to the vocabularies repository where we keep extension ideas. There are also a couple of repos under this org for various specific vocabularies which folks might want to check out.

We don't plan to add this sort of thing to the core standard anytime soon — we added vocabularies so that there could be a way to use keywords interoperably without them having to go in the main spec.

@ionous

ionous commented May 18, 2024

I came to a similar idea, but using the term "descriptions" (plural) So would look something like this { "enum": [ "A", "B", "C" ], "descriptions": ["description for A", "description for B", "description for C"] }

VS Code uses that method. It calls the entry enumDescriptions (and markdownEnumDescriptions).

https://github.com/Microsoft/vscode/wiki/Setting-Descriptions

vscode schema example
