Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comprehensive test of keyword independence #385

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

karenetheridge
Copy link
Member

@karenetheridge karenetheridge commented Jun 3, 2020

Keywords that are only in effect with instance data of one type should never
interfere with keywords that require a different instance type. In particular,
implementations should not attempt to infer the type of the instance data
based on the keywords used by the schema.

The draft2019-09 data was generated with the script below. Drafts 6 and 7 were slightly adjusted by swapping out 'dependentRequired' and 'dependentSchemas' for 'dependencies'; draft4 removed the use of exclusiveMaximum and exclusiveMinimum because in this draft they require the presence of maximum or minimum. All combinations of keywords appear here, so long as the keywords require a different core data type; for each schema, four sets of data are created, representing the two data types expected by the schema, with a passing and failing test for each.

#!/usr/bin/env perl
use strict;
use warnings;

# data format:
# {
#   $type => { $keyword => { schema => $schema_value, data => [ $failing, $passing ] } },
#   ...,
# }

my $data = {
  number => {
    multipleOf => { schema => 2, data => [ 1, 2 ] },
    maximum => { schema => 2, data => [ 3, 2 ] },
    exclusiveMaximum => { schema => 2, data => [ 2, 1 ] },
    minimum => { schema => 2, data => [ 1, 2 ] },
    exclusiveMinimum => { schema => 2, data => [ 2, 3 ] },
  },
  string => {
    maxLength => { schema => 2, data => [ 'hello', 'hi' ] },
    minLength => { schema => 2, data => [ 'x', 'hi' ] },
    pattern => { schema => 'hi', data => [ 'hello', 'hihi' ] },
  },
  array => {
    maxItems => { schema => 1, data => [ [1,2], [1] ] },
    minItems => { schema => 2, data => [ [1], [1,2] ] },
    uniqueItems => { schema => \1, data => [ [ 1, 1 ], [1] ] },
    items => { schema => \0, data => [ [1], [] ] },
    contains => { schema => \1, data => [ [], [1] ] },
  },
  object => {
    maxProperties => { schema => 1, data => [ {x=>1,y=>2}, { x=>1 } ] },
    minProperties => { schema => 1, data => [ {}, { x=>1 } ] },
    required => { schema => ['x'], data => [ {}, {x=>1} ] },
    dependentRequired => { schema => {x=>['y']}, data => [ {x=>1}, {x=>1,y=>2} ] },
    dependentSchemas => { schema => {x=>\0}, data => [ {x=>1}, {} ] },
    properties => { schema => {x=>\0}, data => [ {x=>1}, {} ] },
    patternProperties => { schema => {hi=>\0}, data => [ {hihi=>1}, {hello=>1} ] },
    additionalProperties => { schema => \0, data => [ {x=>1}, {} ] },
    propertyNames => { schema => \0, data => [ {x=>1}, {} ] },
  },
};

my @tests;

foreach my $type1 (sort keys %$data) {
  foreach my $type2 (sort keys %$data) {
    next unless ($type1 cmp $type2) == -1;                                                                           

    foreach my $keyword1 (sort keys %{$data->{$type1}}) {
      foreach my $keyword2 (sort keys %{$data->{$type2}}) {
        push @tests, {
          description => "$keyword1 + $keyword2",
          schema => {
            $keyword1 => $data->{$type1}{$keyword1}{schema},
            $keyword2 => $data->{$type2}{$keyword2}{schema},
          },
          tests => [
            map {
              my $keyword = $_;
              my $type = $keyword eq $keyword1 ? $type1 : $type2;
              map {
                my $valid = $_;
                +{
                  description => "$type, $keyword ".($valid?'':'in').'valid',
                  data => $data->{$type}{$keyword}{data}[$valid],
                  valid => $valid ? \1 : \0,
                }
              } 0, 1
            } $keyword1, $keyword2
          ],
        },
      }
    }
  }
}

use JSON::MaybeXS;
print JSON::MaybeXS->new(canonical => 1, pretty => 1, indent_length => 4)->encode(\@tests);

@karenetheridge karenetheridge requested review from Julian and a team June 3, 2020 03:13
@karenetheridge karenetheridge marked this pull request as draft June 3, 2020 03:16
@ChALkeR
Copy link
Member

ChALkeR commented Jun 3, 2020

I know at least one impl which thinks that [1, 1] conforms to { required: [], uniqueItems: true }, because of required.

Could you include this test too, please?

@ChALkeR
Copy link
Member

ChALkeR commented Jun 3, 2020

Oh, I missed how large this test is. Let me recheck if it covers that, will update this comment.

Upd: apparently this test does not cover it, but that impl also fails other combinations.

@karenetheridge
Copy link
Member Author

I forgot that boolean schemas weren't allowed in draft4, which the CI tests helpfully spotted. fix squashed!

@ChALkeR

This comment has been minimized.

@ChALkeR

This comment has been minimized.

@karenetheridge karenetheridge force-pushed the ether/keyword-independence branch 2 times, most recently from a54ca4e to c6a2e5e Compare June 3, 2020 04:01
@karenetheridge
Copy link
Member Author

Upd: apparently this test does not cover it, but that impl also fails other combinations.

uniqueItems + required is one of the combinations tested. it starts on line 2450.

@karenetheridge karenetheridge marked this pull request as ready for review June 3, 2020 04:03
@ChALkeR

This comment has been minimized.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 3, 2020

@karenetheridge Ah. I noticed that it was labeled as ready for review 🙂

@karenetheridge
Copy link
Member Author

karenetheridge commented Jun 3, 2020

No, sorry, that was a mistake. I was still running tests and diffs on my end and fixing the last remaining errors from removing boolean schemas for draft4.

@karenetheridge karenetheridge marked this pull request as draft June 3, 2020 04:28
Keywords that are only in effect with instance data of one type should never
interfere with keywords that require a different instance type. In particular,
implementations should not attempt to infer the type of the instance data
based on the keywords used by the schema.
@karenetheridge
Copy link
Member Author

ok, I think this is good to go now. I have run all the files locally and they all check out (as they should, because they are dead-simple.. except for the errors I made swapping out the boolean schemas for draft4) 👯‍♂️

@karenetheridge karenetheridge marked this pull request as ready for review June 3, 2020 04:37
@ChALkeR
Copy link
Member

ChALkeR commented Jun 3, 2020

Seems to work for me (draft2019-09 untested).

Also, this is beneficial, I am aware of implementations which are triggered by these checks.
E.g. http://npmjs.com/is-my-json-valid

@karenetheridge
Copy link
Member Author

@ChALkeR thanks for the review! With large changes like this, especially across multiple drafts, it's quite tricky to ensure there are no mistakes and that the files are reasonably consistent. Even programmatically generating the data like I did this time, I still made several errors when translating between drafts, as you saw :)

With the number of test cases now, and the speed in which new ones are coming in, we are rapidly getting closer to needing an extra verification step where the tests are run across multiple implementations... e.g. #314 .

@ChALkeR
Copy link
Member

ChALkeR commented Jun 4, 2020

For reference: mafintosh/is-my-json-valid#179

@Julian
Copy link
Member

Julian commented Jun 5, 2020

I cant' review a 20000 line PR, but thanks for including the script, that I can review, so if that seems correct this should be fine, will see if I can Perl hard enough to see what it does.

@karenetheridge
Copy link
Member Author

karenetheridge commented Jun 5, 2020

yeah, the patch is a bit big :D

@ChALkeR
Copy link
Member

ChALkeR commented Jun 9, 2020

Could this perhaps include the code to rebuild those files, in some form?

tests/draft4/keyword-independence.json Outdated Show resolved Hide resolved
@karenetheridge
Copy link
Member Author

I'll find the code that generated this and add it (we probably need a new directory to store these sorts of things, since we're doing it in other PRs too now).

@Julian
Copy link
Member

Julian commented Aug 7, 2020

I'll find the code that generated this and add it (we probably need a new directory to store these sorts of things, since we're doing it in other PRs too now).

Yeah @ChALkeR had that good idea -- we should (whatever language is fine probably as long as it's reasonably easy to run. Perl I'm sure counts which is what I think you said you wrote this in originally?)

But as in #414 (comment) can you pick some reasonably small subset of these (5 say) that we'll merge to the regular suite and cover a majority of what these'd cover, and then the rest we document by saying if you want the full comprehensive generated thing run the script you'll add?

@karenetheridge karenetheridge self-assigned this Oct 23, 2020
@karenetheridge karenetheridge mentioned this pull request Jan 9, 2022
2 tasks
@Julian Julian added the waiting for author A pull request which is waiting for an update from its author. label Jun 24, 2022
@karenetheridge karenetheridge marked this pull request as draft June 25, 2022 03:29
@gregsdennis
Copy link
Member

I cant' review a 20000 line PR - @Julian

I've run these on my implementation. They all pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for author A pull request which is waiting for an update from its author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants