Future of schema generators #101

Open
domoritz opened this issue May 31, 2019 · 109 comments

Comments

@domoritz
Member

domoritz commented May 31, 2019

This is a discussion about the future of Typescript JSON schema generators.

TL;DR

@domoritz
Member Author

From @HoldYourWaffle

I have created issues for most of the problems I encountered in YousefED/typescript-json-schema. I had to switch to that module because conditional types (particularly Omit) are not supported yet.

But there's something else I want to discuss. There are currently 3 modules that accomplish basically the same goal (at least that I know of), and they all have a lot of issues.
vega/ts-json-schema-generator seems to be the best overall solution, featuring clean(er) code and proper type alias support. However, it doesn't support conditional types (#100), which means Omit, Pick, Exclude etc. aren't supported either (#71, #93). This makes this module completely unusable for me because I heavily use these language features. YousefED/typescript-json-schema does support these constructs, but it contains a lot of bugs (I have reported 8 so far). There's also xiag-ag/typescript-to-json-schema, but I have no idea what the difference is between this module and its 'ancestor' (the README only says that this is an 'extended' version).
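
For readers less familiar with the utility types mentioned above, here is a small illustration. In TypeScript's standard library, Exclude is declared as a conditional type and Omit<T, K> is declared as Pick<T, Exclude<keyof T, K>>, so a generator that cannot resolve conditional types cannot resolve Omit either. The `User` interface and the other names are made up for this example.

```typescript
// Hypothetical domain type, used only for illustration.
interface User {
  id: number;
  name: string;
  password: string;
}

// Omit<T, K> expands to Pick<T, Exclude<keyof T, K>>.
type PublicUser = Omit<User, "password">;

// Pick keeps only the listed keys.
type Credentials = Pick<User, "name" | "password">;

// Exclude removes members from a union via a conditional type.
type Role = "admin" | "editor" | "guest";
type StaffRole = Exclude<Role, "guest">;

// Values that type-check against the derived types.
const publicUser: PublicUser = { id: 1, name: "Ada" };
const credentials: Credentials = { name: "Ada", password: "secret" };
const staffRole: StaffRole = "editor";
```

A schema generator has to evaluate these derived types to their final shape (for PublicUser, an object with only `id` and `name`) before it can emit anything useful.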

I see 3 possible solutions for this forkmania (though there could be more):

Fix YousefED/typescript-json-schema. This can sort of be seen as the 'default' option, as it would leave the current situation mostly untouched. This isn't my preferred solution because the architecture of vega/ts-json-schema-generator looks a lot better and it doesn't really solve the forkmania.

Write a new generator from scratch, using the knowledge and experience from the other three. If we were to do this we'd of course have the amazing power of hindsight and shared knowledge, which would allow us to write a clean, well-designed and future proof generator. However, this would be very time-consuming and since vega/ts-json-schema-generator is already very well designed I don't see much reason to do this.

Merge all the good parts into vega/ts-json-schema-generator and 'open it up' for general usage. I think this would be the best option because it will actually fix the fork issue without consuming a lot of time and effort. I think this module currently has the best/cleanest code and I don't see a reason for using one of the others if we increased flexibility and supported more use cases than what vega is doing.

If you're interested I can write up a more detailed overview in a couple of hours. No matter what option you/we choose I'd love to help/fix/develop/maintain.

@HoldYourWaffle

Good to see there's interest in my proposal! Perhaps it would be good to pin this issue so more people will see this and hopefully voice their opinion on the matter?

@domoritz
Member Author

Thank you for the comments. I will use this issue to explain some of the different philosophies behind the different libraries.

The goal of all of these libraries is to convert Typescript to JSON schema. There are different approaches to achieving this and also different interpretations that you can follow. One fundamental issue is that JSON schema and Types are not equivalent. Some things are not expressible in the other.

YousefED/typescript-json-schema

This was the original schema generator that I picked for my main use case, which is Vega-Lite. The philosophy of this library was to be flexible and configurable for different use cases. This means that some configurations work better than others because optimizing all of them is complex.

It worked fairly well once I extended it a bit to deal with some of the more complicated types we use. However, I constantly ran into issues with getting meaningful properties that correspond to aliased types. The fundamental problem is that this library uses the type hierarchy, and not the AST to create the JSON schema. Therefore I went on to look for a different library, which was xiag-ag/typescript-to-json-schema.

I stopped active development but occasionally review PRs and make releases. I consider this library to be in maintenance mode.

xiag-ag/typescript-to-json-schema

This library was mostly written by @mrix and uses the AST to create the schema. It is also much more modular. Rather than having one big file, there are separate modules for parsing and code generation for all different types of AST nodes and types. I had to extend it significantly to make it work for Vega-Lite. That's what vega/ts-json-schema-generator is.

vega/ts-json-schema-generator

This is the extended version of xiag-ag/typescript-to-json-schema. Besides extending the supported AST nodes, I also changed some of the behavior because it was not correct for some of my use cases. Some of these improvements have since been ported back into the original library but not all.

Overall, this library is much more robust than YousefED/typescript-json-schema and produces generally better schemas. It is also much more opinionated. I don't want to have options but instead, make particular design decisions and bake them into the code.

Even though this library still has some missing functionalities, it works in production for Vega-Lite, which is a complex piece of TypeScript. I keep fixing this library if it is necessary for Vega-Lite but otherwise don't have the cycles to do anything else.

I am happy to review PRs if they have sufficient tests, don't introduce options, and provide useful additions (I will reject support for functions because there is no clean mapping to JSON schemas).

I consider this library to be in active development and would be more than thrilled to have people help with it.

Conclusion

I have considered writing a new generator based on the things I have learned. Ideally, it would be a bit leaner than vega/ts-json-schema-generator and not have as many files. However, I don't have the cycles to do it and I'm not convinced that it will be much cleaner. Overall, I think investing resources into vega/ts-json-schema-generator would be the best way forward. I am happy to review PRs and make timely releases for this library. However, my number one priority is to support Vega-Lite and anything that is in conflict with that goal will be rejected (I don't see this as an issue, though).

Let me know what you think.

@domoritz domoritz pinned this issue May 31, 2019
@HoldYourWaffle

Since your comment is pretty big I'm just going to respond per section.


One fundamental issue is that JSON schema and Types are not equivalent. Some things are not expressible in the other.

This is of course true. I think the best way to handle this would be to have a section in the README that clearly lists all constructs that are not or only partially supported. It's annoying to discover something you need is unsupported, but discovering it after you've already started using an automated generator setup is way worse (I can unfortunately speak from experience).

Maybe we should also look into a way to manually override the generated schema in a fluent way. This way any shortcomings of the library can be manually filled in or corrected. In my own projects I've been using a script that manually changes the generated JSON object but this is very inflexible and error prone. I haven't really thought about how to implement such a mechanism, perhaps a JSDoc annotation with a JSON pointer could be something? I'll think about it.
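
One way the JSON pointer idea could look, sketched as a post-processing step rather than as a JSDoc annotation. This is only an illustration under assumptions: `applyOverrides`, `parsePointer` and the sample schema are hypothetical names, none of the generators discussed here provide them, and the sketch only walks through objects (not arrays).

```typescript
// Hypothetical post-processing helper; none of the generators discussed here ship it.
type JsonObject = { [key: string]: unknown };

function isJsonObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

/** Split an RFC 6901 JSON pointer such as "/properties/name" into unescaped tokens. */
function parsePointer(pointer: string): string[] {
  if (pointer.charAt(0) !== "/") {
    throw new Error("Expected a non-empty JSON pointer, got: " + pointer);
  }
  return pointer
    .slice(1)
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"));
}

/** Return a deep copy of `schema` with each pointer in `overrides` set to its value. */
function applyOverrides(schema: JsonObject, overrides: JsonObject): JsonObject {
  const result: JsonObject = JSON.parse(JSON.stringify(schema));
  for (const pointer of Object.keys(overrides)) {
    const parents = parsePointer(pointer);
    const last = parents.pop() as string; // split() always yields at least one token
    let node = result;
    for (const token of parents) {
      const next = node[token];
      if (!isJsonObject(next)) {
        throw new Error("No object at '" + token + "' while resolving " + pointer);
      }
      node = next;
    }
    node[last] = overrides[pointer];
  }
  return result;
}

// Stand-in for generator output.
const generated: JsonObject = {
  type: "object",
  properties: { name: { type: "string" } },
};

// Fill in what the generator could not express.
const patched = applyOverrides(generated, {
  "/properties/name/maxLength": 20,
  "/additionalProperties": false,
});
```

Keeping the overrides in a plain pointer-to-value map means they can live next to the types and be reapplied after every regeneration, which is less brittle than an ad hoc script that mutates the output.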


This means that some configurations work better than others because optimizing all of them is complex.

I'm not sure I understand what you mean by this. Are you trying to say that more options → more complexity → hard to get working correctly? I agree that having more options might increase complexity, but there should always be a sensible default behavior. Having more options to override common sense because its assumptions don't match your use case is a good thing in my opinion.


It is also much more opinionated. I don't want to have options but instead, make particular design decisions and bake them into the code.

I'm not sure I agree with you on this. Having sensible defaults is always a good thing, and making some assumptions when designing something like this is necessary, but I don't see how this would hinder adding options. Could you give an example of an option you've rejected/don't want to add? I'm probably just misunderstanding what you're saying.


I keep fixing this library if it is necessary for Vega-Lite but otherwise don't have the cycles to do anything else.
I am happy to review PRs if they have sufficient tests, don't introduce options, and provide useful additions (I will reject support for functions because there is no clean mapping to JSON schemas).
I consider this library to be in active development and would be more than thrilled to have people help with it.

I'd love to help you maintain this project if you don't have the time for it! Again I'm not sure why you'd want to reject new options, could you clarify what you mean by this?
And out of curiosity: why do people want to send functions over JSON? There's no function type in the JSON spec, nor can I think of a usecase where one would want to. If you really wanted to put functions in your JSON you can use Function.toString with eval on the other side, but this is pretty unsafe in almost all cases.


Ideally, it would be a bit leaner than vega/ts-json-schema-generator and not have as many files. However, I don't have the cycles to do it and I'm not convinced that it will be much cleaner.

This is the main reason why I think a new generator isn't the best option. Is there a reason why we can't make vega/ts-json-schema-generator leaner without rewriting the whole thing? Also, what's wrong with having more files? It makes it a lot easier to find what you're looking for, as well as leaving less room for weird global state anti-patterns.


Overall, I think investing resources into vega/ts-json-schema-generator would be the best way forward. I am happy to review PRs and make timely releases for this library. However, my number one priority is to support Vega-Lite and anything that is in conflict with that goal will be rejected (I don't see this as an issue, though).

I also think this is the best way forward. I don't see a reason why there would be conflicts with Vega, since the goal of a general purpose library is to support most (if not all) usecases, Vega included of course.


If we decide to adopt this strategy there's one more issue remaining: uniting the modules/forks to fix the current forkmania. I think YousefED/typescript-json-schema could just be deprecated with a nice forward to this module (as soon as it's ready of course, mainly looking at conditional types).

xiag-ag/typescript-to-json-schema is a different story though. I found this PR by you that aims to merge the 2 repositories together, but as you already know there hasn't been any response in 2 years. It seems like @mrix has disappeared from the community, which of course doesn't help our case.

The main reason why I want the forks to be united is that the current situation is really confusing. Last week I basically went like this:

  • typescript-json-schema, has the most downloads and looks pretty solid
  • typescript-to-json-schema, mentions that it uses a superior approach in the README which apparently fixes type aliases? Seems like this is the best choice then
  • ts-json-schema-generator, is a fork of the previous superior one? What's the difference? There's like 200+ additional commits so I think it's better? Why are all these names so similar?!
    And it only got worse when I discovered that you have contributed to both and recommend both to users of the other.

Removing YousefED/typescript-json-schema from the equation would help a lot. We may never get a response from @mrix, but we can remove this repository's forked status or add a clear explanation in the README on what the differences are and why you should probably use this module instead of the upstream one.


On a completely different note, maybe it's a good idea to create an issue in YousefED/typescript-json-schema referencing this issue and pin it there too. Since that module has more users we're probably going to get more responses then.

@domoritz
Member Author

One fundamental issue is that JSON schema and Types are not equivalent. Some things are not expressible in the other.

This is of course true. I think the best way to handle this would be to have a section in the README that clearly lists all constructs that are not or only partially supported. It's annoying to discover something you need is unsupported, but discovering it after you've already started using an automated generator setup is way worse (I can unfortunately speak from experience).

The issue here really is JSON schema and not Typescript. I have found good ways around missing things in typescript such as maxLength but the other way around is much messier. For example, JSON schema supports neither union types nor inheritance properly.
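
A minimal sketch of the annotation style referred to here, assuming the generator reads JSDoc tags such as `@minLength` and `@maxLength` and copies them into the schema. The `Account` interface is made up, and the schema below is a hand-written approximation of the expected result, not captured generator output.

```typescript
// Source type: the JSDoc tags carry constraints that TypeScript itself cannot express.
interface Account {
  /**
   * Login name shown to other users.
   * @minLength 1
   * @maxLength 32
   */
  username: string;
}

// Hand-written approximation of the schema one would expect for Account.
const expectedAccountSchema = {
  type: "object",
  properties: {
    username: {
      type: "string",
      description: "Login name shown to other users.",
      minLength: 1,
      maxLength: 32,
    },
  },
  required: ["username"],
  additionalProperties: false,
};

// The type checker accepts this value; only the schema constraint rejects it.
const tooLong: Account = { username: new Array(41).join("x") };
const withinLimit =
  tooLong.username.length <= expectedAccountSchema.properties.username.maxLength;
```

The point is the asymmetry: a constraint that TypeScript lacks can ride along in a comment, whereas a TypeScript construct that JSON schema lacks has nowhere comparable to go.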

This means that some configurations work better than others because optimizing all of them is complex.

I'm not sure I understand what you mean by this. Are you trying to say that more options → more complexity → hard to get working correctly? I agree that having more options might increase complexity, but there should always be a sensible default behavior. Having more options to override common sense because its assumptions don't match your use case is a good thing in my opinion.

More options increase the number of paths through the code and make it harder to get right. I am definitely against adding more code paths just to support another use case. Instead, the defaults should be good.

As an example, including or not including aliases or not using a top-level reference have implications on many parts of the code. I can speak from experience that not having the config options avoided a bunch of headaches.

I am not against the ability to configure things that have only implications on localized pieces of the code.

And out of curiosity: why do people want to send functions over JSON? There's no function type in the JSON spec, nor can I think of a usecase where one would want to.

I agree, and still we got PRs and issues for it: https://github.com/YousefED/typescript-json-schema/issues?utf8=%E2%9C%93&q=functions+

Is there a reason why we can't make vega/ts-json-schema-generator leaner without rewriting the whole thing?

No. I think the current design is good enough and I don't see any reason to rewrite.

I think YousefED/typescript-json-schema could just be deprecated with a nice forward to this module

It's already in maintenance mode and I think that's what it should be. A forward link sounds good to me but then I want support with issues that people report ;-)

xiag-ag/typescript-to-json-schema has a few features that we should get working in this fork. See #63

On a completely different note, maybe it's a good idea to create an issue in YousefED/typescript-json-schema referencing this issue and pin it there too.

Go ahead. I will pin it. I already added a note to https://github.com/YousefED/typescript-json-schema#background. Before we can deprecate the other library, we need support for conditionals here.

@HoldYourWaffle

The issue here really is JSON schema and not Typescript. I have found good ways around missing things in typescript such as maxLength but the other way around is much messier. For example, JSON schema supports neither union types nor inheritance properly.

You really did a good job expressing stuff like minLength! It's very fluent and it just makes a lot of sense. It's also self-documenting by definition, which is always a good thing.
I'm not sure why union types would be a problem, isn't that just oneOf? Inheritance is a common issue with JSON schema, but since this is an automated tool it's probably not that bad to have some duplication in the output schema (perhaps adding a description field with the original information would be good for clarity/readability?).

It should also be noted that without additionalProperties: false it's perfectly possible to express inheritance using allOf, so if we really wanted to support a form of inheritance preservation this could be an option, but I think this will get needlessly complex very quickly. Since this is an issue with the JSON schema spec itself and not with this module I think a clear explanation on why this isn't possible in the README would suffice for now.


More options increase the number of paths through the code and make it harder to get right. I am definitely against adding more code paths just to support another use case. Instead, the defaults should be good.
As an example, including or not including aliases or not using a top-level reference have implications on many parts of the code. I can speak from experience that not having the config options avoided a bunch of headaches.
I am not against the ability to configure things that have only implications on localized pieces of the code.

That makes a lot of sense. So something like --strictTuples (is it the default yet?) or --strictNullChecks (however misleading it may be) would be fine? I agree that configuring the overall shape of the schema will get messy very quickly, but as far as I can see there's no reason why options for individual constructs should be rejected.


I think YousefED/typescript-json-schema could just be deprecated with a nice forward to this module

It's already in maintenance mode and I think that's what it should be. A forward link sounds good to me but then I want support with issues that people report ;-)

That makes sense, but is there really a reason why someone would want to use the other module once we include all missing features here? It's always good to keep providing support for something, but is it really worth the effort if this were to be a (practically) drop-in replacement?


xiag-ag/typescript-to-json-schema has a few features that we should get working in this fork. See #63

I'd love to help, but I can't figure out what new features we're missing since there are so many changes in the vega version. If you can give me some kind of list I'd be more than happy to take a look.


Go ahead. I will pin it.

Done. I also created an issue in @XriM's repository in case there are more lost souls like I was.

@HoldYourWaffle HoldYourWaffle mentioned this issue Jun 1, 2019
7 tasks
@domoritz
Member Author

domoritz commented Jun 1, 2019

I'm not sure why union types would be a problem, isn't that just oneOf

I meant intersection types. allOf does not work because of additionalProperties: false as you noted in inheritance as well.
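
To make the problem concrete, here is a small demonstration. It uses a deliberately tiny validator written for this example (it handles only `type`, `properties`, `required`, `additionalProperties` and `allOf`); it is not any real validator library, and the schema names are made up. The behaviour it mimics is that additionalProperties only looks at the `properties` declared in the same schema object.

```typescript
// A deliberately tiny validator covering only the keywords used below.
interface MiniSchema {
  type?: "object" | "string" | "number";
  properties?: { [name: string]: MiniSchema };
  required?: string[];
  additionalProperties?: boolean;
  allOf?: MiniSchema[];
}

function validate(schema: MiniSchema, value: unknown): boolean {
  // Every allOf branch is checked independently against the whole value.
  if (schema.allOf && !schema.allOf.every((sub) => validate(sub, value))) return false;
  if (schema.type === "string") return typeof value === "string";
  if (schema.type === "number") return typeof value === "number";
  if (schema.type === "object") {
    if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
    const record = value as { [key: string]: unknown };
    const properties = schema.properties || {};
    for (const key of schema.required || []) {
      if (!(key in record)) return false;
    }
    for (const key of Object.keys(record)) {
      const propertySchema = properties[key];
      if (propertySchema === undefined) {
        // Unknown to THIS schema object, even if a sibling branch declares it.
        if (schema.additionalProperties === false) return false;
      } else if (!validate(propertySchema, record[key])) {
        return false;
      }
    }
  }
  return true;
}

const named: MiniSchema = {
  type: "object",
  properties: { name: { type: "string" } },
  required: ["name"],
  additionalProperties: false,
};
const aged: MiniSchema = {
  type: "object",
  properties: { age: { type: "number" } },
  required: ["age"],
  additionalProperties: false,
};

// Naive translation of the intersection type `Named & Aged`.
const viaAllOf: MiniSchema = { allOf: [named, aged] };

const person = { name: "Ada", age: 36 };
const acceptedByAllOf = validate(viaAllOf, person); // false: each branch rejects the other's key
```

With both branches closed, no object that has both `name` and `age` can satisfy the allOf, which is why the naive translation of an intersection type is unusable.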

So something like --strictTuples (is it the default yet?) or --strictNullChecks (however misleading it may be) would be fine?

Yep. I think my request to keep the number of code paths low is reasonable and I think you agree.

That makes sense, but is there really a reason why someone would want to use the other module once we include all missing features here?

In the future, yes.

I'd love to help, but I can't figure out what new features we're missing since there are so many changes in the vega version. If you can give me some kind of list I'd be more than happy to take a look.

See master...xiag-ag:master.

@HoldYourWaffle

Yep. I think my request to keep the number of code paths low is reasonable and I think you agree.

Of course! I was just wondering where your "line" was on what's too complicated, glad to hear it's in a very reasonable place.

See master...xiag-ag:master.

I tried to look through it, but I fear I'm just not well versed enough in the codebase to know what changed, what hasn't been done here already and what is even applicable to our version. Maybe we could copy over the tests that were added, see which ones fail and go from there? I'd love to try it but I'll have to figure out the test infrastructure first, which is of course going to take some time.

I meant intersection types. allOf does not work because of additionalProperties: false as you noted in inheritance as well.

The more I use JSON schema the more I think "How have they not solved this yet?". I think from a spec perspective there are 2 logical solutions:

  1. Ignore additionalProperties in an allOf clause. I can't think of a use-case where this would be actual useful behavior.
  2. Allow additional keys on a $ref "schema" to supplement/override the reference.

These "ideas" aren't very useful from a generator perspective of course since no validator supports them. The only solution I see is to duplicate & merge the inherited and inheriting schemas but this would probably get really messy really quickly (both in the code and in the output). Maybe adding a description to the frankensteined output would help? I feel like there should be a better solution though...
How is this currently handled?

@domoritz
Member Author

domoritz commented Jun 1, 2019

How is this currently handled?

I merge the objects in the allOf into one big object.
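
A rough sketch of what such a merge can look like. This is not the library's actual implementation; `ObjectSchema` and `mergeObjectSchemas` are hypothetical names, and the sketch simplifies by always emitting a closed object and by letting later parts win when two parts declare the same property.

```typescript
// Simplified shape of an object schema; property schemas are kept opaque.
interface ObjectSchema {
  type: "object";
  properties: { [name: string]: unknown };
  required: string[];
  additionalProperties: boolean;
}

/**
 * Flatten the members of an intersection into one object schema.
 * Simplifications: the result is always closed, and on a name clash the
 * later part wins (a faithful version would intersect the property types).
 */
function mergeObjectSchemas(parts: ObjectSchema[]): ObjectSchema {
  const merged: ObjectSchema = {
    type: "object",
    properties: {},
    required: [],
    additionalProperties: false,
  };
  for (const part of parts) {
    for (const name of Object.keys(part.properties)) {
      merged.properties[name] = part.properties[name];
    }
    for (const name of part.required) {
      if (merged.required.indexOf(name) === -1) merged.required.push(name);
    }
  }
  return merged;
}

const namedPart: ObjectSchema = {
  type: "object",
  properties: { name: { type: "string" } },
  required: ["name"],
  additionalProperties: false,
};
const agedPart: ObjectSchema = {
  type: "object",
  properties: { age: { type: "number" } },
  required: ["age"],
  additionalProperties: false,
};

const mergedSchema = mergeObjectSchemas([namedPart, agedPart]);
```

The merged schema validates the combined object correctly, but it no longer records that it came from two separate types.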

@HoldYourWaffle

Are there any issues with this approach apart from messy output?

@domoritz
Member Author

domoritz commented Jun 1, 2019

If you want to generate code from the schema again, the information about intersections and inheritance is lost.

@HoldYourWaffle

That makes sense. Maybe adding some kind of note to the schema could help with that?

@ForbesLindesay

The problem with deprecating YousefED/typescript-json-schema is that it's the only one that handles conditional types properly. Supporting them requires doing type inference that is on a par with TypeScript itself. Without using getTypeAtLocation, it is incredibly difficult to keep up with the fast pace of TypeScript language improvement.

I think there could be enormous value in refactoring typescript-json-schema to be more modular, and paring down the list of options to reduce the complexity of the many code paths. There also could be merit to using an AST first approach - i.e. do what we can via traversing the AST, where alias refs will work really well, and only fall back to getTypeAtLocation for complex/generic types.

I would be quite interested in taking on this task (I already maintain typescript-json-validator, which is a wrapper around typescript-json-schema), but I have held off so far for two reasons:

  1. I don't want to contribute to there being "yet another fork".
  2. I don't think the AST based approach is likely to bear a lot of fruit.

Having said that, I have also started work on typeconvert, which aims to do type inference on babel ASTs to convert between TypeScript and Flow (and generate documentation, JSON Schema etc.) Unfortunately it's nowhere near ready for release yet though as I keep realising I've made a mistake and need to fundamentally refactor.

@domoritz
Member Author

Thank you for your message. I agree that a hybrid approach might work well but it's hard to say until we have a working implementation. For me personally, my concern is that I can generate schemas for Vega-Lite. Maybe you can use that as a test case for another implementation of a hybrid schema generator?

@HoldYourWaffle

I think a hybrid approach could definitely be a good solution. I don't see how this would contribute to the 'yet another fork problem', since I think this can just be integrated into one of the existing ones (preferably this one of course)? I don't know much about the internals of either code base though so I might be completely wrong here.

Supporting them requires doing type inference that is on a par with TypeScript itself

Perhaps it's possible to reuse the logic TypeScript itself is using? Visual Studio Code has (in my experience) near perfect "reflection" on TypeScript code so it should be possible. I think VS Code uses something like this, maybe that's something worth looking into? Again I'm really not qualified to make any well founded argument about this but I try to help as much as I can.

@sparebytes

sparebytes commented Jun 26, 2019

Maybe we can do something like this: ast -> io-ts -> json schema.
io-ts does a good job representing types at runtime. Looks like their v2.0 milestone includes generating json schema.

Excuse me if this is way off base, I can't read the whole thread right this minute.

@kayahr
Contributor

kayahr commented Jun 26, 2019

According to this table mapping types, conditional types or even specific types like Exclude or Omit are not supported in io-ts. So for converting ast to io-ts we still have to do all the complex mapping/condition resolving as we do it now. So nothing gained here in my opinion.

And what about annotations? Currently it is very easy to pass them from typescript to the JSON schema. With io-ts in between this will probably be more difficult.

I think adding io-ts into the chain just adds more complexity and slows down the project.

@codler

codler commented Nov 7, 2019

I was about to try out TypeScript to JSON Schema and then I saw this thread. I am now confused about which library I should use. Where are we at today and which one do you recommend?

@domoritz
Member Author

domoritz commented Nov 7, 2019

@codler I added a tldr to the first message. Does that help?

@cspotcode

Thoughts on using typedoc's type extraction engine to power schema generation? Seems like typedoc has more active maintenance and has better kept pace with typescript's type system, using the typechecker for more of the extraction work than this library. We are fairly active on Discord.

@domoritz
Member Author

I don't care what we use under the hood as long as we can generate a good schema. I love the idea of using something well maintained as the basis.

What's the benefit over using the typescript compiler?

@cspotcode

A few reasons I'm thinking about: motivation, ability to chat with other developers, and avoiding duplicated effort.

I talk to the maintainer and a few other team members pretty regularly on Discord, so it's easier to discuss design decisions, get feedback, we avoid duplicated effort, and it's more motivating to work on typedoc. It's fun to chat and share progress.

Using the compiler API isn't free: it requires understanding the correct way to use it, understanding the quirks, understanding the gotchas to avoid. Typedoc already does that. It uses the typechecker as much as possible except for a few places where it can't.
And the typedoc team is available to explain it.

Typedoc handles exports as you would expect, as opposed to this library which exposes internal types, cannot handle multiple types with the same name, and does not handle alias exports. Typedoc needs to accurately document typescript modules, meaning that the extracted type information is more intuitive: it matches the code one-to-one.

Typedoc has some extra features which might be useful in the future. It can extract type info to a JSON dump, then use that export to render docs multiple times. It might improve performance being able to extract types once and then render multiple schemas.

@domoritz
Member Author

That all sounds fantastic. Do you want to try making a POC?

@cspotcode

I want to ask about this specifically, since it may be a breaking change if we go with typedoc:

Typedoc handles exports as you would expect, as opposed to this library which exposes internal types, cannot handle multiple types with the same name, and does not handle alias exports.

This library exposes non-exported types in the schema, using their internal, non-exported names. But I'm not sure why that is necessary, and I'm not sure it is compatible with typedoc's extracted reflections. Do you know if the requirement to expose internals is documented anywhere? Do you know if the identifiers assigned to "definitions" are specced anywhere?

I'd like to propose a simpler way to tell the schema extractor which schemas to extract:
The user writes a single file, schemas.ts, which exports one or more types using named exports. The schema generator is given this file as an entrypoint and will emit a single schema containing a "definition" for every type in schemas.ts, with a name matching the named export. All other "definition"s will have verbose, fully-qualified names that include their sourcefile's path to avoid name collisions. If the user wants a top ref, they add a default export.

This pushes configuration into the TypeScript language. schemas.ts can re-export types from elsewhere in a codebase, so it gives you full control over the included schemas. There is an intuitive one-to-one between identifiers in TS and "definition" names in the schema. And name collisions are handled by the TS language.

@domoritz
Member Author

I think it can be nice to use internal aliases to name definitions in the schema if we need them (e.g. when we have a recursive data structure). I don't think we usually expose internal aliases otherwise, do we?

I'd like to propose a simpler way to tell the schema extractor which schemas to extract:
The user writes a single file, schemas.ts, which exports one or more types using named exports. The schema generator is given this file as an entrypoint and will emit a single schema containing a "definition" for every type in schemas.ts, with a name matching the named export. All other "definition"s will have verbose, fully-qualified names that include their sourcefile's path to avoid name collisions. If the user wants a top ref, they add a default export.

Where would they add a default export? In the source file or in schemas.ts?

I do like that we would avoid the challenge of duplicate types and make it very explicit what types get exported. However, it's not how typedoc works and some users may get confused why they don't see some types. Maybe it's okay if people already have the relevant types exported in their index.ts but I worry that some types might get very deep.

My only use case for this library right now is https://github.com/vega/vega-lite and it would be a lot of work to export all the relevant types again. Do you have a suggestion for how I could reduce the necessary work?

@cspotcode

I don't think we usually expose internal aliases otherwise, do we?

I'm thinking of this flag:

-e, --expose <all|none|export>
    all: Create shared $ref definitions for all types.
    none: Do not create shared $ref definitions.
    export (default): Create shared $ref definitions only for exported types (not tagged as `@internal`).

It's tough to tell how that flag is meant to behave, but does it fail if 2x non-exported types in different files have the same identifier? I'm fine with creating shared ref definitions, but the tool should ensure they do not have naming conflicts, and the names do not need to follow any particular rules. They can be made human-readable, but there do not need to be any guarantees about their name. If the user wants a type to have a guaranteed "definition" name, they can achieve that by exporting it from schemas.ts

Where would they add a default export? In the source file or in schemas.ts?

In schemas.ts

@domoritz
Member Author

Where would they add a default export? In the source file or in schemas.ts?

In schemas.ts

I think you meant a normal export then, not a default export (as there can be only one). Right?

@cspotcode

The idea is that the root $ref is determined by the default export, if it exists. Any other named exports will be guaranteed to reside at their name in "definitions". For example:

export {Foo as default};
export {Bar, Baz, Biff};

In the emitted JSON schema:

The root "$ref" is guaranteed to refer to Foo. You can use the emitted JSON schema as-is to validate that an object matches Foo.

In "definitions", the "Bar" entry is guaranteed to be the definition for Bar. There will likely be many other definitions in the emitted JSON, but their names are generated by the tool and may be more verbose, may include filenames encoded in some form, and naming collisions will be avoided.

This means that you can trivially modify the root $ref to use a single schema to validate against Bar, Baz, or Biff because their definitions are guaranteed to reside at those names in "definitions"
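
To make the retargeting step concrete, here is a minimal sketch. The schema literal, the `EmittedSchema` type, the `retarget` helper, and the generated name `internal_ts__Helper` are all hypothetical illustrations, not the output or API of any existing generator. It also assumes the default export is stored under its own name, which the proposal does not require.

```typescript
// Hypothetical shape of an emitted schema; only the parts relevant here.
interface EmittedSchema {
  $ref?: string;
  definitions: Record<string, unknown>;
}

// Roughly what a generator might emit for:
//   export {Foo as default};
//   export {Bar, Baz, Biff};
const emitted: EmittedSchema = {
  $ref: "#/definitions/Foo",
  definitions: {
    Foo: { type: "object", properties: { bar: { $ref: "#/definitions/Bar" } } },
    Bar: { type: "string" },
    Baz: { type: "number" },
    Biff: { type: "boolean" },
    // Non-exported type: the name is chosen by the tool and is not guaranteed.
    internal_ts__Helper: { type: "null" },
  },
};

// Return a copy of the schema whose root $ref points at another named export.
function retarget(schema: EmittedSchema, exportName: string): EmittedSchema {
  if (!Object.prototype.hasOwnProperty.call(schema.definitions, exportName)) {
    throw new Error(`No definition named "${exportName}"`);
  }
  return { ...schema, $ref: `#/definitions/${exportName}` };
}

// Same definitions, but the schema now validates against Bar.
const barSchema = retarget(emitted, "Bar");
```

Only names that were exported from schemas.ts are safe to pass to `retarget`; generated names such as `internal_ts__Helper` could change between runs of the tool.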

@domoritz
Member Author

There will likely be many other definitions in the emitted JSON, but their names are generated by the tool and may be more verbose, may include filenames encoded in some form, and naming collisions will be avoided.

Oh, I missed the part about other definitions being included potentially. I see. So definitions that are exported will have a predictable name and any other aliases or types could have arbitrary names (e.g. to include the path). That's a good idea. I think it makes sense overall.
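
One possible way to implement that naming rule is sketched below. Everything here is an assumption made for illustration: the `Declaration` type, the `assignDefinitionNames` function, and the particular path encoding are hypothetical and do not describe what any of the generators does today.

```typescript
// A declaration the generator discovered while walking the program.
interface Declaration {
  identifier: string; // e.g. "Options"
  sourceFile: string; // e.g. "src/axis.ts"
}

// Assign a definition name to every declaration.
// - Types exported from the entrypoint keep their export name (guaranteed).
// - All other types get a path-qualified name (no stability guarantee).
function assignDefinitionNames(
  exported: Map<string, Declaration>,
  others: Declaration[],
): Map<string, Declaration> {
  const names = new Map<string, Declaration>(exported);
  for (const decl of others) {
    const encodedPath = decl.sourceFile.replace(/[^A-Za-z0-9]+/g, "_");
    const base = `${encodedPath}__${decl.identifier}`;
    let name = base;
    // Guard against a clash with an exported or previously generated name.
    for (let i = 2; names.has(name); i++) {
      name = `${base}_${i}`;
    }
    names.set(name, decl);
  }
  return names;
}

const spec: Declaration = { identifier: "Spec", sourceFile: "src/spec.ts" };
const axisOptions: Declaration = { identifier: "Options", sourceFile: "src/axis.ts" };
const legendOptions: Declaration = { identifier: "Options", sourceFile: "src/legend.ts" };

const names = assignDefinitionNames(
  new Map<string, Declaration>([["Spec", spec]]),
  [axisOptions, legendOptions],
);
const definitionNames = Array.from(names.keys());
// definitionNames: "Spec", "src_axis_ts__Options", "src_legend_ts__Options"
```

The two `Options` types no longer collide, and `Spec` keeps the predictable name that users can rely on.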

@maneetgoyal
Contributor

Hi all, I have been using Zod for input validation in some of my projects. Since the discussion here seems to be about producing JSON Schema from TS definitions, I thought of sharing the ts-to-zod and zod-to-json-schema libraries. Combining both, I think we can do TS --> Zod --> JSON Schema. I would love to know what the developers on this forum think about the limitations of these tools.

@domoritz
Member Author

That's interesting. I wonder how flexible Zod is, though. Does any information get lost in the intermediate representation?

@cspotcode

My immediate thought: does it support customization of the schema with JSDoc @tags? Currently we can customize the JSON schema:

/** @minimum 10 */ foo: number; to set a JSON Schema "minimum"
/** @TJS-default {10} */ foo: number; to set JSON schema default
/** @TJS-type {integer} */ foo: number; to override the schema type to integer instead of number
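
As a rough illustration of the mapping those three tags imply, the toy function below turns tag text into schema keywords. It is not how typescript-json-schema actually parses comments; `applyTags` and its rules are hypothetical and cover only the tags listed above.

```typescript
type JsonSchema = Record<string, unknown>;

// Toy mapping from JSDoc tags to JSON Schema keywords (hypothetical helper).
function applyTags(base: JsonSchema, jsDoc: string): JsonSchema {
  const schema: JsonSchema = { ...base };
  // Matches "@tag value" as well as "@tag {value}".
  const tagPattern = /@([\w-]+)\s+\{?([^}\s*]+)\}?/g;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(jsDoc)) !== null) {
    const tag = match[1];
    const raw = match[2];
    switch (tag) {
      case "minimum":
        schema["minimum"] = Number(raw);
        break;
      case "TJS-default":
        schema["default"] = JSON.parse(raw);
        break;
      case "TJS-type":
        // Overrides the type inferred from the TypeScript annotation.
        schema["type"] = raw;
        break;
    }
  }
  return schema;
}

// `foo: number` would start out as { type: "number" } before tags are applied.
const fooSchema = applyTags(
  { type: "number" },
  "/** @minimum 10 @TJS-type {integer} @TJS-default {10} */",
);
// fooSchema: { type: "integer", minimum: 10, default: 10 }
```

Any replacement pipeline would need an equivalent hook for each of these tags, which is the concern behind the question.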

@maneetgoyal
Contributor

Does any information get lost in the intermediate representation?

Running some tests currently. So far, it seems like it can't handle circular dependencies in type definitions. Getting the following warning while running npx ts-to-zod some-src.ts some-dest.ts:

›   Warning: Some schemas can't be generated due to circular dependencies:

@maneetgoyal
Contributor

maneetgoyal commented Sep 27, 2021

does it support customization of the schema with JSDoc tags?

As per their docs, they support only 6 of the JSDoc keywords.

From their docs:

// source.ts
export interface HeroContact {
  /**
   * The email of the hero.
   *
   * @format email
   */
  email: string;

  /**
   * The name of the hero.
   *
   * @minLength 2
   * @maxLength 50
   */
  name: string;

  /**
   * The phone number of the hero.
   *
   * @pattern ^([+]?d{1,2}[-s]?|)d{3}[-s]?d{3}[-s]?d{4}$
   */
  phoneNumber: string;

  /**
   * Does the hero has super power?
   *
   * @default true
   */
  hasSuperPower?: boolean;

  /**
   * The age of the hero
   *
   * @minimum 0
   * @maximum 500
   */
  age: number;
}

// output.ts
export const heroContactSchema = z.object({
  /**
   * The email of the hero.
   *
   * @format email
   */
  email: z.string().email(),

  /**
   * The name of the hero.
   *
   * @minLength 2
   * @maxLength 50
   */
  name: z.string().min(2).max(50),

  /**
   * The phone number of the hero.
   *
   * @pattern ^([+]?d{1,2}[-s]?|)d{3}[-s]?d{3}[-s]?d{4}$
   */
  phoneNumber: z.string().regex(/^([+]?d{1,2}[-s]?|)d{3}[-s]?d{3}[-s]?d{4}$/),

  /**
   * Does the hero has super power?
   *
   * @default true
   */
  hasSuperPower: z.boolean().default(true),

  /**
   * The age of the hero
   *
   * @minimum 0
   * @maximum 500
   */
  age: z.number().min(0).max(500),
});

@M-jerez

M-jerez commented Nov 30, 2021

Hi, first of all, thanks to all the authors of these libraries for the good work.

Secondly, if there is going to be a new version with breaking changes, the JSDoc annotations could/should be ditched in favour of decorators.

@domoritz
Member Author

Can you explain why?

@M-jerez

M-jerez commented Nov 30, 2021

@domoritz
Decorators are the official way to add metadata to TypeScript and ES6. jsDoc annotations are a workaround used before decorators existed; jsDoc annotations should be used only/mostly for documentation.

Also, most libraries related to data mapping, ORMs, etc. use decorators now.
Working with decorators might also be easier, more flexible, and more extensible than jsDoc annotations.

@cspotcode

Decorators add runtime metadata to TypeScript and JavaScript classes, but not to other types. This is not a proper analogue for what we want to accomplish here: adding design-time metadata to TypeScript types, including non-classes.

jsDoc annotations are a workaround

This is not true if the metadata is descriptions and deprecation status. That information is best put in JSDoc; moving it elsewhere removes some of its utility.

jsDoc annotations should be used only/mostly for documentation.

This is, I believe, an impractical rule to follow. "Mostly" acknowledges that it is used for more than documentation.

Additionally, JSON schemas extracted by this tool serve as documentation. For example, they can be used to generate OpenAPI specifications. In this use-case, the "metadata" is documentation and should be kept in JSDoc where it simultaneously serves other purposes.

I can imagine scenarios where decorator metadata and JSDoc metadata are used in tandem, but outlawing the use of JSDoc annotations would be counter-productive.

Descriptions should reside in JSDoc so they appear in tooltips and for compatibility with tools such as typedoc. Types must still be annotated using TypeScript syntax. That's a good reason for a schema generator to parse JSDoc comments and type information, even if it combines that information with decorator metadata. If it is valuable for the schema generator to maintain its ability to parse JSDoc and types, then we know that retrieving other JSDoc tags is straightforward. Removing the ability to do that wouldn't simplify the code enough to justify the loss of utility.

Note that emitDecoratorMetadata does not emit sufficient information, in case anyone wanted to bring that up.

@marcj

marcj commented Mar 23, 2022

I've read this thread and saw that there is still no sufficient solution available for reading computed type information. Just wanted to let you know that we released a TypeScript runtime type system that provides, via reflection, all the information you need to generate a JSON schema. More information can be found here: https://deepkit.io/blog/introducing-deepkit-framework. The concrete feature to support this use case is here: https://deepkit.io/documentation/type/reflection

@4LT

4LT commented May 8, 2022

I think I'd like to see TypeScript - or at least a subset of TypeScript - become a first-class format for the purposes of JSON validation and e.g. comparison against OpenAPI specs. It is the most succinct format for expressing JavaScript types that I have come across. The one thing I do like about generating an intermediate schema file, though, is that it produces a single artifact, so only that file needs to be included in a compiled project.

Update

Just learned that the latest OpenAPI Spec is based on JSON Schema, huh.

@qwelias

qwelias commented Aug 5, 2022

Support for the toJSON method on objects is required for accurate JS-to-JSON translation.
toJSON can define a custom representation of any JS object, which may not match the shape of the object itself.
If JSON schema generators do not respect toJSON, the generated schema may deviate from the actual JSON representation of a JS object.

Example: https://github.com/mongodb/js-bson/blob/main/src/objectid.ts#L202

EDIT: example, docs
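
A small self-contained sketch of the mismatch follows. The `Id` class is a hypothetical stand-in for something like ObjectId, not the real js-bson implementation.

```typescript
// A stand-in for a class like ObjectId: rich in memory, a plain string in JSON.
class Id {
  private readonly bytes: number[];

  constructor(bytes: number[]) {
    this.bytes = bytes;
  }

  toHexString(): string {
    // Two hex digits per byte, zero-padded.
    return this.bytes.map((b) => (b < 16 ? "0" : "") + b.toString(16)).join("");
  }

  // JSON.stringify calls this instead of serializing the object's own fields.
  toJSON(): string {
    return this.toHexString();
  }
}

interface User {
  id: Id;
  name: string;
}

const user: User = { id: new Id([0x0a, 0xff]), name: "Ada" };
const json = JSON.stringify(user);
// json === '{"id":"0aff","name":"Ada"}'
```

A schema derived only from the static `User` type would describe `id` as an object, while the serialized document carries a string. A generator that respects toJSON would use the method's return type for `id` instead.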

@marcj

marcj commented Aug 5, 2022

It's now possible to use only TypeScript types to describe and generate OpenAPI documents and thus JSON Schema, without code generation whatsoever: https://github.com/hanayashiki/deepkit-openapi

Code like this:

import { MinLength } from '@deepkit/type';

interface User {
  id: number;
  name: string & MinLength<3>;
  password: string;
}

type ReadUser = Omit<User, 'password'>;

type CreateUser = Omit<User, 'id'>;

translates to

  schemas:
    ReadUser:
      type: object
      properties:
        id:
          type: number
        name:
          type: string
          minLength: 3
      required:
        - id
        - name
    CreateUser:
      type: object
      properties:
        name:
          type: string
          minLength: 3
        password:
          type: string
      required:
        - name
        - password
    User:
      type: object
      properties:
        id:
          type: number
        name:
          type: string
          minLength: 3
        password:
          type: string
      required:
        - id
        - name
        - password

@tommedema

@marcj does that require using the DeepKit framework though?

@marcj

marcj commented Jun 24, 2023

@tommedema no

@4LT

4LT commented Jun 25, 2023

From the linked project's own README

Warning: This package is intended to only work with deepkit

@marcj

marcj commented Jun 25, 2023

Deepkit Framework is the full thing, like Laravel. But Deepkit is also a collection of standalone packages that you can use separately.
