
Add output formatting tests #247

Closed
gregsdennis opened this issue Jan 7, 2019 · 20 comments · Fixed by #265
Labels
missing test: A request to add a test to the suite that is currently not covered elsewhere.

@gregsdennis
Member

gregsdennis commented Jan 7, 2019

In light of draft2019-09's output formatting, I think it might be prudent to have a set of tests that validate the output formatting.


Maybe this can be included in the draft2019-09 tests themselves (when they're added).

Something like:

Example test from draft-07's *properties.json*

{
  "description": "object properties validation",
  "schema": {
    "properties": {
      "foo": {"type": "integer"},
      "bar": {"type": "string"}
    }
  },
  "tests": [
    {
      ...
    },
    {
      "description": "one property invalid is invalid",
      "data": {"foo": 1, "bar": {}},
      "valid": false,
      "outputs": {
        "flag": {
          "valid": false
        },
        "basic":{
          "valid" : false,
          "errors" : [
            {
              "valid" : false,
              "keywordLocation" : "#/properties",
              "instanceLocation" : "#"
            },
            {
              "valid" : false,
              "keywordLocation" : "#/properties/foo/type",
              "instanceLocation" : "#/foo"
            },
            {
              "valid" : false,
              "keywordLocation" : "#/properties/bar/type",
              "instanceLocation" : "#/bar"
            }
          ]
        },
        "detailed": {
          "valid" : false,
          "keywordLocation" : "#/properties",
          "instanceLocation" : "#",
          "errors" : [
            {
              "valid" : false,
              "keywordLocation" : "#/properties/foo/type",
              "instanceLocation" : "#/foo"
            },
            {
              "valid" : false,
              "keywordLocation" : "#/properties/bar/type",
              "instanceLocation" : "#/bar"
            }
          ]
        },
        "verbose": {
          "valid" : false,
          "keywordLocation" : "#",
          "instanceLocation" : "#",
          "errors" : [
            {
              "valid" : false,
              "keywordLocation" : "#/properties",
              "instanceLocation" : "#",
              "errors" : [
                {
                  "valid" : false,
                  "keywordLocation" : "#/properties/foo",
                  "instanceLocation" : "#/foo",
                  "errors" : [
                    {
                      "valid" : false,
                      "keywordLocation" : "#/properties/foo/type",
                      "instanceLocation" : "#/foo"
                    }
                  ]
                },
                {
                  "valid" : false,
                  "keywordLocation" : "#/properties/bar",
                  "instanceLocation" : "#/bar",
                  "errors" : [
                    {
                      "valid" : false,
                      "keywordLocation" : "#/properties/bar/type",
                      "instanceLocation" : "#/bar"
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    },
    {
      ...
    }
  ]
}

Doing this on all the tests, though, seems like overkill. I think a set of tests specifically for output would be more concise.

@vearutop

I'd like to have sample outputs for *Of, if, then, else and not.

If we have a reference implementation we can easily create validation results for all test cases, which can be helpful before we have a dedicated output test suite.

I think it makes sense to stick to the basic test suite instead of a dedicated output test suite, as it is already mature and has good overall coverage.

@Relequestual Relequestual pinned this issue Jun 5, 2019
@Julian Julian reopened this Jun 30, 2019
@Julian Julian added the missing test and draft support labels Sep 27, 2019
@handrews handrews changed the title from "Draft-08 output formatting tests" to "Draft-08 / 2019-09 output formatting tests" Nov 11, 2019
@handrews handrews added this to the 2019-09 milestone Nov 11, 2019
@Julian Julian changed the title from "Draft-08 / 2019-09 output formatting tests" to "Draft2019-09 output formatting tests" Nov 29, 2019
@Julian Julian changed the title from "Draft2019-09 output formatting tests" to "Add output formatting tests" Jun 28, 2022
@gregsdennis
Member Author

Now that the current format has been out for a while, and we've received a lot of good feedback about it (which was the intent behind not making it a hard requirement), we're looking to update it for the next iteration of the spec. Discussion can be found here.

The question I'd like to look into here is whether this test suite needs to include tests for the 2019-09 / 2020-12 formats and the new formats (whatever those end up being). Based on the current test structure, I expect so, but it's still worth asking.

@Julian
Member

Julian commented Jul 26, 2022

I have to review that thread -- it's crossed my eye but I haven't tried to understand the change yet -- regardless, I think I agree with the point.

whether this test suite needs to include tests for the 2019-09 / 2020-12 formats and the new formats (whatever those end up being).

I think this is what you're suggesting too, but yeah we should have the 2019-09 and 2020-12 tests include output tests in the form defined "back then", and future drafts in the newer format once it's settled on in that thread.

@gregsdennis
Member Author

A nuance to this is that I'm thinking of including some language in the next iteration of the spec around implementations only needing to support output from the latest spec version. I think this reduces the burden for implementors quite a bit, but it does mean that a 2020-12 validation could output in the "draft-next" format if "draft-next" is supported.

My concern is that if the 2020-12 output is included, that means implementations which follow this advice won't pass the older format tests.

This also hints at https://github.com/orgs/json-schema-org/discussions/192 regarding deprecating older versions of the spec.

@Julian
Member

Julian commented Jul 29, 2022

That'll potentially be a problem going forward too, right? If we put draft-next output in draft-next, an implementation of draft-next-4 may eventually not be able to run them because it again changes the format. We may (or may eventually) just need to make a helper script which back-or-forward ports the output dynamically? But I'm definitely ok to start simple.

@gregsdennis
Member Author

I'm not sure how comprehensive we want to be with the test suite (e.g. do we want a test that captures whether title was collected as an annotation, or do we just want "this is what annotation output looks like?"), but if we have a 2020-12 output suite, then a 2023-XX output suite, I expect implementors could just grab the ones they conform to.

Sounds like this might also be related to deprecating spec versions, and possibly even moving output to its own spec.

@Julian
Member

Julian commented Jul 29, 2022

To me probably both, and they do independent things:

"this is what annotation output looks like?"

That tests that an implementation generates the output format correctly from a known set of errors or annotations.

do we want a test that captures whether title was collected as an annotation

and that tests that the implementation has properly generated annotations themselves as specified by the spec for each keyword.

I'd do the former before the latter since it should be a smaller amount of work, but yeah, it seems to me we should have both, no?

@gregsdennis
Member Author

Yeah, I agree. Start with "shape of annotations" by having e.g. title tests, but then continue to fill out other keywords over time.
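
Purely as an illustration (this shape is an assumption, not a settled format), a title test might look something like:

{
    "description": "title is collected as an annotation",
    "schema": {
        "title": "a title",
        "type": "string"
    },
    "tests": [
        {
            "description": "a valid instance collects the title's value",
            "data": "some string",
            "basic": {
                "valid": true,
                "keywordLocation": "",
                "instanceLocation": "",
                "annotations": [
                    {
                        "valid": true,
                        "keywordLocation": "/title",
                        "instanceLocation": "",
                        "annotation": "a title"
                    }
                ]
            }
        }
    ]
}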

@gregsdennis
Member Author

gregsdennis commented Aug 1, 2022

I'm starting to play with this a bit, and it seems like we need two kinds of tests:

  1. Output structure - Does the implementation build the expected output structure? These tests can be contrived so as to create sufficiently nested output structures that each format can be checked. For these, we're not checking keywords, per se.
  2. Keyword output - Does each keyword produce the expected annotation or error? Because we don't specify wording for errors, we're not worried about the error message itself, just whether there should be one.

There's obviously going to be some overlap here (and with the main suite), but by separating these out, I think we can create different test files that focus on different things. It'll help keep things straight in our heads at least.

1. Output structure tests

For the output structure tests, I think we can probably have a file structure like

[
    {
        "description": "valid results",
        "schema": {
            "type": "object",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            {
                "description": "empty object is valid",
                "data": {},
                "flag": { "valid": true },
                "basic": {
                    ...
                },
                "detailed": {
                    ...
                },
                "verbose": {
                    ...
                }
            },
            {
                "description": "object with string foo is valid",
                "data": { "foo": "bar" },
                "flag": { "valid": true },
                "basic": {
                    ...
                },
                "detailed": {
                    ...
                },
                "verbose": {
                    ...
                }
            },
            ...
        ]
    },
    {
        "description": "invalid results",
        "schema": {
            "type": "object",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            ...
        ]
    }
]

"valid results" and "invalid results" are probably not the best split, but... example.

Essentially, you have multiple instances per schema that produce different annotation/error sets, and you list all of the formats for each one. This is very similar to the existing test files, so it should be familiar.
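
To make that concrete, here's what just the two simplest formats might look like for the "empty object is valid" test above (the richer formats are omitted, and the exact contents are of course still up for discussion):

{
    "description": "empty object is valid",
    "data": {},
    "flag": { "valid": true },
    "basic": {
        "valid": true,
        "keywordLocation": "",
        "instanceLocation": ""
    }
}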

2. Keyword output tests

For the keyword output tests, we'll again do something similar to the existing tests.

[
    {
        "description": "properties outputs annotations",
        "schema": {
            "$id": "https://json-schema.org/tests/properties",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            {
                "description": "empty object has no annotations",
                "data": {},
                "basic": {
                    "valid": true,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": ""
                }
            },
            {
                "description": "object with string foo captures property as annotation",
                "data": { "foo": "bar" },
                "basic": {
                    "valid": true,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": "",
                    "annotations": [
                        {
                            "valid": true,
                            "keywordLocation": "/properties",
                            "absoluteSchemaLocation": "https://json-schema.org/tests/properties#/properties",
                            "instanceLocation": "",
                            "annotation": [ "foo" ]
                        }
                    ]
                }
            },
            {
                "description": "object with unspecified property captures no annotations",
                "data": { "bar": "foo" },
                "basic": {
                    "valid": true,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": ""
                }
            }
        ]
    },
    {
        "description": "properties outputs errors",
        "schema": {
            "$id": "https://json-schema.org/tests/properties",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            {
                "description": "object with number foo is invalid",
                "data": { "foo": 42 },
                "basic": {
                    "valid": false,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": "",
                    "errors": [
                        {
                            "valid": false,
                            "keywordLocation": "/properties",
                            "absoluteSchemaLocation": "https://json-schema.org/tests/properties#/properties",
                            "instanceLocation": "",
                            "error": "expected a string but found a number"
                        }
                    ]
                }
            }
        ]
    }
]

Here, instead of listing out all of the possible output formats, we only use the basic format because it's the simplest format that gives us annotations & errors.

Again, the focus for these is whether the right data comes out of the keyword rather than the shape of the data.

Having an error message here is a little concerning. For one thing, we're not trying to define any error messaging (the spec explicitly avoids it), but having something here may imply that we're endorsing this messaging as a standard. Secondly, implementations that use different messaging would have a hard time verifying that the output they produce is equivalent to what's expected by the test.

@Julian
Member

Julian commented Aug 2, 2022

Broadly, all of the above looks good to me. Minor bikeshedding: I'd slightly prefer an extra level of nesting, i.e.:

        "tests": [
            {
                "description": "empty object is valid",
                "data": {},
                "output": {
                    "flag": { "valid": true },
                    "basic": {
                        ...
                    },
                    "detailed": {
                        ...
                    },
                    "verbose": {
                        ...
                    }
                }
              }

to keep things a bit better grouped.

Here, instead of listing out all of the possible output formats, we only use the basic format because it's the simplest format that gives us annotations & errors.

I don't know a way around this, but this effectively forces implementations into supporting the basic format to use the tests. That's not currently the case, right (I'm being lazy and not looking)? By which I mean an implementation can choose to support the detailed format and not basic? If so, there's a small argument to be made for using the most expressive format, because we could always then downconvert it to basic. But I won't bikeshed this too much, so ignore it if need be.

Having an error message here is a little concerning.

Definitely agree with this too. Do we have to include it? Presumably we could just set all of them to "< error message here >" even if so, just not to explicitly bless anything?

@gregsdennis
Member Author

gregsdennis commented Aug 2, 2022

The spec says this:

An implementation SHOULD provide at least one of the "flag", "basic", or "detailed" format and MAY provide the "verbose" format. If it provides one or more of the "detailed" or "verbose" formats, it MUST also provide the "flag" format. Implementations SHOULD specify in their documentation which formats they support.

It's not stated well, but as is, nothing is really required. Increasing that requirement is part of the change (eventually) for the next version. (The MUST requirement was basically me saying, "if you put in the effort to write output formats, you might as well also add the flag version.")

The reason I picked basic is that, from my experience, most people seem to prefer the list. It's not set in stone, but you do have a good point that using it here kind of implies that it needs to be implemented.

I'd slightly prefer an extra level of nesting

Yeah, this makes sense, but I'd want that in both test sets to reduce modelling needs (especially for strongly-typed languages 😄).

Do we have to include [the error message]?

I think it's important to identify which keywords are expected to have some sort of message, regardless of the wording. I think a placeholder could do to say that we don't endorse any specific wording, but it doesn't resolve the "equality" checking burden.
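
For example, an error unit with a deliberately non-normative placeholder might look like this (the placeholder wording is illustrative only, not an endorsement of any message):

{
    "valid": false,
    "keywordLocation": "/properties",
    "instanceLocation": "",
    "error": "<error message here>"
}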

@gregsdennis
Member Author

Another thing we're considering is moving the output into its own spec. That would detach the output from the draft version.

How would that impact this suite? Would we have a separate folder for output formatting tests, or would it warrant a separate repo entirely?

@Julian
Member

Julian commented Aug 5, 2022

Either would be fine with me honestly I think.

@karenetheridge
Member

You're not going to be able to specify the exact structure of the expected result. There are too many ways that an exact comparison will not be correct. Even if you're only expecting a single error, the error message itself cannot be compared because every implementation will have its own. All the other fields (keywordLocation, instanceLocation, etc.) can be compared exactly, though, as there is no room for ambiguity there.

I suggest you indicate the expected result structure not with literal data, but with JSON Schemas. That way you can simply say "the error property must be a string" and leave it at that, and also allow for differences in ordering between errors by having a list of "contains": {...} schemas inside an allOf.
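
For example (a rough sketch of the idea, not something that exists in the suite), the expected basic output for the "object with number foo is invalid" case above could be described by a schema along these lines:

{
    "type": "object",
    "required": ["valid", "errors"],
    "properties": {
        "valid": { "const": false }
    },
    "allOf": [
        {
            "properties": {
                "errors": {
                    "contains": {
                        "type": "object",
                        "required": ["keywordLocation", "instanceLocation", "error"],
                        "properties": {
                            "keywordLocation": { "const": "/properties" },
                            "instanceLocation": { "const": "" },
                            "error": { "type": "string" }
                        }
                    }
                }
            }
        }
    ]
}

Each additional expected error would get its own "contains" branch inside the allOf, so the ordering of the errors array wouldn't matter.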

@gregsdennis
Member Author

Even if you're only expecting a single error, the error message itself cannot be compared because every implementation will have their own.

Yeah, we anticipate this. Not sure how to go about it, but there are some ideas above.

I like the idea of using schemas though.

@gregsdennis
Member Author

So.... The 2019/2020 output requirements are... not great. I'm not even sure that they're specified to a degree that can be tested.

@Julian what are your thoughts on giving those a pass and introducing output tests for draft-next?

@handrews
Contributor

They're SHOULDs and MAYs, what's wrong with them?

@gregsdennis
Member Author

They're SHOULDs and MAYs, what's wrong with them?

Are you suggesting that since they're not MUSTs, tests aren't necessary, and thus agreeing with me that we shouldn't include tests for them?


First and foremost, I don't think that I implemented them 100% right, and I used my implementation while writing the requirements. In trying to write even a few tests, I'm questioning the output that my implementation is giving me as a reference.

I've also been recommending (recently) that people hold off implementing output until draft-next is released.

@handrews
Contributor

No, I'm saying they're entirely testable, particularly SHOULDs which will at some point get their own directory.

We ought to test the format that is currently actually available.

@Julian
Member

Julian commented Jan 4, 2023

We should call this one closed for now by #619, yeah @gregsdennis? (Reopen, obviously, if you want to use it to track more work.)
