Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to interpret simple schemas that contains additionalProperties #12

Open
jonaslagoni opened this issue Oct 8, 2021 · 7 comments
Open

Comments

@jonaslagoni
Copy link
Member

We are gonna focus on the very basic schemas which has additionalProperties defined.

How do we want to interpret additionalProperties?

Defining additionalProperties allow people to provide additional properties beyond what properties define.

This means that the following schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/idl-schema",
  "name": "SomeTitle",
  "type": "object",
  "properties": {
    "StringProperty": {
      "type": "string"
    }
  },
  "additionalProperties": {
    "type": "string"
  }
}

Validates correctly against:

{ "StringProperty": "value1", "key2": "value2" }

And we need to figure out how we want to give access to key2, if we want that of course.

I see two options

From a data definition standpoint, I see two options for how we can interpret it.

  1. We ignore the properties completely and don't define a location for it (easiest to do and represent, no side effects, other than you cannot access the properties).
class SomeTitle {
   StringProperty: string;
}
  1. We provide an additionalProperties property for the "class", which uses the appropriate dictionary with a key of type string and value of type, whatever additionalProperties in the schema resolve.
class SomeTitle {
   StringProperty: string;
   additionalProperties: Map<string, string>;
}

Side effects of options 2

If we choose option 2, we have a couple of extra things we need to discuss:

  1. In case the schema defines a property with the name additionalProperties we need to decide how we want to handle such duplication of properties.
{
  "$schema": "https://json-schema.org/draft/2020-12/idl-schema",
  "name": "SomeTitle",
  "type": "object",
  "properties": {
    "additionalProperties": {
      "type": "string"
    }
  },
  "additionalProperties": {
    "type": "string"
  }
}
  1. If these classes are to be serialized/deserialized so they can be validated against the very same schema they were generated from, we will encounter some behaviors we need to address, mainly for serializing.
    1. When serializing, you want to unwrap the additionalProperties property, which can cause clashes of properties, say I have the very same property defined under additionalProperties as one of the core properties. Should we simply ignore such properties?
    2. When serializing, we will need to write our own behavior, as many standard libraries won't support the complex unwrapping, which can cause great complexity, which we need to be aware of.
@jonaslagoni jonaslagoni changed the title How to interpret simple schemas with additionalProperties How to interpret simple schemas that contains additionalProperties Oct 8, 2021
@jdesrosiers
Copy link
Member

My view is that from the type system's point of view addtionalProperties only constrain how a type can be extended and additional properties are not expected to be accessed. Types in a typical type system are open. If you have a function that takes a "Foo" it will take anything that derives from "Foo" as well. I'll explain with some TypeScript.

type Foo = { a: string };
function doSomething(n: Foo) {
  console.log(foo.a.endsWith("a")); // foo.a is a string
}

const aaa: Foo = { a: "a" };
doSomething(aaa); // No type error

const bbb = { a: "a", b: "b"};
doSomething(bbb); // No type error

"bbb" is a Foo with additional property "b" and "doSomething" will happily except it. The type system will complain if you try to use it, but there's no problem with it being there. The equivalent of Foo with "additionalProperties": false would be if there were a type error on that last line. As far as I know, there's no way to make TypeScript do that. In Java, you could use the final keyword to disallow a class from being extended. There would be no way to instantiate a variable that has additional properties, making it effectively equivalent to "addtionalProperties" false.

I think it's reasonable for this to extend to additionalProperties with a schema other than false. If the Foo schemas has "additionalProperties": { "type": "string" }, that would mean that you can only compose it with a schema that adds only string properties. From the perspective of a type system, you could only extend type Foo with properties that are strings. So, additionalPropeties doesn't describe additional values that you can access in Foo, it constrains what types can be derived from Foo.

So, for type generation, my suggestion would be to ignore additionalProperties entirely.

If people want to expose additional properties as a property with an object map, this is where an idl-vocab keyword could be useful to allow people to specify that that's what they want and the name of the property where they can access the object map of additional properties.

@jonaslagoni
Copy link
Member Author

Hmm, that is definitely another way to look at the keyword 🧐

It is only because TS assume any type by default.

const bbb: Foo = { a: 'a', b: 'b'}; // Type error

I see additionalProperties differently though, as I don't see it as extending a type but rather almost like a type on its own 😄

One of the use-cases for additionalProperties is to do something like:

{"name": "root", "properties": {"accounts": { "additionalProperties": {... Account info object } } } }

which enables data such as:

{ "accounts": { "account-1": { ... } } }

Where you really want to have access to the accounts to iterate over them.

There is also a huge difference between when additionalProperties is defined without properties keyword, since if additionalProperties is defined on it's own we could assume:

type Root = { 
  accounts: { [key: string]: Account } 
};

Otherwise, if properties are defined like this:

{"name": "root", "properties": { "accounts": { "name": "Accounts", "properties": { "prop": {...} }, "additionalProperties": {... Account info object } } } }

It becomes difficult to render the map of accounts, without specifying a property that contains the accounts in some way such as:

type Root = { accounts: Accounts };
type Accounts = { prop: ..., additionalProperties: { } };

If one would not want the object to contain extra properties why would one specify anything other than additionalProperties: false? This is the reason why I suggest we expect a property for any extra properties by default and not the other way around 😄 However, I don't think that there is a "right" default here, and it, therefore, might be better to do your suggestion 🤔

But I agree, we need a keyword to control this behavior regardless. Any suggestions? 😅

@chriskapp
Copy link

Hey, since @jonaslagoni asked me to provide some feedback, here some general thoughts how we handle this case at TypeSchema, which might be also useful for your case.

In our logic, if a schema contains an additionalProperties keyword then it represents a map. A map is basically a container for a specific type. I.e. if we take a look at the simple schema:

{
  "type": "object",
  "additionalProperties": {
    "type": "string"
  }
}

In Java we try to convert this schema to HashMap<string, string>, in go we use map[string]string and in TypeScript Record<string, string>. So there is also no option to combine the properties and additionalProperties keyword. If you use the properties keyword it represents a struct containing a fixed set of properties and if you use additionalProperties it represents a map. If we take a look at an more advanced example:

{
  "type": "object",
  "additionalProperties": {
    "$ref": "My_Type"
  }
}

In this case we would generate a Java type like HashMap<string, My_Type>, to be fair this would not work directly in your case since JsonSchema uses JSON Pointer to reference a different schema, at TypeSchema we have moved away from this and we can only reference types under the definitions location and because of this we can directly use the name as reference, which then makes it also really easy to generate the fitting reference.

So as general suggestion, if we try to build a schema for code generators I can only recommend to make some hard restrictions on the flexible nature of JsonSchema, otherwise you end up with cases which are too complicated to handle for code generators and which also limits the target languages which can implement such solutions so "keep it simple stupid". In the end we have also chosen this path for TypeSchema.

@jdesrosiers
Copy link
Member

It is only because TS assume any type by default.

No, it's because that's how polymorphism works. It works like this in every OO language. I could have defined a type that extends Foo and used that instead of an any type.

type Foo = { a: string };
type Bar = Foo & { b: string };

function doSomething(n: Foo) {
  console.log(foo.a.endsWith("a")); // foo.a is a string
}

const aaa: Foo = { a: "a" };
doSomething(aaa); // No type error

const bbb: Bar = { a: "a", b: "b"};
doSomething(bbb); // No type error

doSomething happily takes any type that extends Foo. The additional property added by Bar is there and ignored. The type system will complain if I try to access "b" in doSomething, but it doesn't matter that it's there. Here's the same behavior in Java.

class Foo {
  String a;

  public Foo(String a) {
    this.a = a;
  }
}

class Bar extends Foo {
  String b;

  public Bar(String a, String b) {
    super(a);
    this.b = b;
  }
}

class Main {
  public static void main(String[] args) {
    Foo aaa = new Foo("a");
    doSomething(aaa); // No type error

    Bar bbb = new Bar("a", "b");
    doSomething(bbb); // No type error
  }

  private static void doSomething(Foo foo) {
    System.out.println(foo.a.endsWith("a")); // foo.a is a string
  }

When you do something like const bbb: Foo = { a: 'a', b: 'b' }; in typescript, it's like using a constructor that includes all the properties of the type. That code doesn't work because the constructor doesn't allow additional properties, not because the type doesn't allow additional properties (it does). In the Java example, I can't create a Foo with additional properties because the constructor doesn't allow it, but I can still have a Foo with additional properties by doing this: Foo myFoo = new Bar("a", "b");

One of the use-cases for additionalProperties is to do something like: [object map]

I see this as an abuse of addtionalProperties, but I would be ok with this being considered a special case. If it only has additionalProperties and no properties, the whole thing can be considered a Map. The proper way to define a Map is to use patternProperties with a pattern that allows anything. Either way, object maps are a special case and I think a different discussion.

Otherwise, if properties are defined like this: [..] It becomes difficult to render the map of accounts, without specifying a property that contains the accounts in some way

It's difficult to express a type for this because it's not a usable data structure. Imagine what code that works with this type would look like. To get at the accounts, you would have to iterate over the whole object while ignoring known properties. It's not our responsibility to change their data structure to make it make sense by adding an artificial property. They should get what they ask for.

@jdesrosiers
Copy link
Member

@chriskapp Welcome! Glad to have you in the conversation!

Essentially, you side-step the problem we're trying to solve here by not allowing addtionalProperties and properties in the same schema. That's not a choice we can make, but it ends up being very similar to what I'm suggesting. addtionalProperties ends up getting ignored (from the code gen perspective) if there are any properties defined.

@gregsdennis
Copy link
Member

I'm on the fence between "ignore it" and "create a map-type property." But I don't think that the inheritance model holds up, except that additionalProperties: false would result in a sealed/final type.

It was touched on before, but I'll reiterate. If we have additionalProperties: {type: string}, then only subclasses that define string properties could be created, which doesn't fit into the polymorphic paradigm. Polymorphism doesn't put any restrictions on what a derived type can define, even sometimes to the point of redefining already-existing properties (see C# new modifier). Restricting properties on derived types to a single type is not really useful.

I think the best we could do for the polymorphic case is to support additionalProperties: false as sealed/final. Maybe additionalProperties: true as an abstract type (explicit base type to build upon; can't be instantiated).

@jdesrosiers
Copy link
Member

I agree that additionalProperties: { type: string } is not representable by most, if not all, typesystems. In that case, it would be ignored by code-gen. Code-gen is always going to be a best approximation anyway. Not everything in a schema is going to be able to be expressed as a type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants