[RFC] Changes to the existing API and workflow #115

Open · danielrearden opened this issue Jun 15, 2020 · 12 comments

@danielrearden (Owner)

There are a number of issues with the library in its current form:

  • Reliance on schema directives means the library is incompatible with code-first libraries like nexus, type-graphql or graphql-compose
  • Transforming the schema, whether done through schema directives or some other mechanism (like annotations), makes it hard to work with other tools like GraphQL Code Generator or IDE plugins.
  • Exposing model details in the type definitions tightly couples the schema with the underlying data sources and the library itself. Migrating away from Sqlmancer would require not only changing the resolvers but changing all the type definitions as well.
  • Exposing model details in the type definitions means we’re forced to utilize code generation -- the client’s type cannot just be inferred by TypeScript
  • A schema-first approach means that if additional data models are needed for internal use, they have to be exposed first as GraphQL types that are later removed from the schema using the @private directive

What we could do differently:

  • Remove all schema directives and use a single, standalone configuration object that includes all data model details (see the sketch after this list).
  • The data models defined inside the configuration object would resemble more traditional ORM data models and become the “source of truth” for the application. “Base” type definitions could be generated from the models as a convenience but type definitions could also be written by hand. Similarly, migration files could also be generated to keep the database in sync with the models.
  • The data models could still include information specific to their respective GraphQL types, like “virtual” and paginated fields
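To make the first point concrete, here is a rough sketch of what such a configuration object might look like (hypothetical only; none of these option names are final):

// Hypothetical shape of the standalone configuration object; the option
// names here are illustrative, not an actual Sqlmancer API.
const config = {
  dialect: 'postgres',
  models: {
    Film: {
      tableName: 'film',
      fields: {
        id: { type: 'ID', primaryKey: true },
        title: { type: 'String' },
      },
      // GraphQL-specific details, like virtual and paginated fields,
      // could live alongside the column mappings
      virtualFields: {
        displayTitle: { type: 'String', dependencies: ['title'] },
      },
    },
  },
}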

The new workflow would look something like this:

  • Write data models
  • (Optionally) generate and run migrations based on the models
  • (Optionally) generate type definitions from the models
  • Write any remaining type definitions for the schema
  • Instantiate client using only config object/data models and use it to write the resolvers
  • Build the schema however you normally build your schema, with no extra steps

The API of the client itself would stay the same. When the data models change, the base type definitions can be regenerated and a new migration file can be created to sync the database.
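For example, a resolver might continue to look something like this. This is a sketch that assumes the current query-builder-style methods (findMany, resolveInfo, execute); the config-object signature of createSqlmancerClient and the config import path are hypothetical:

import { knex } from 'knex'
import { GraphQLResolveInfo } from 'graphql'
import { createSqlmancerClient } from 'sqlmancer'
import config from './sqlmancer.config' // hypothetical path to the config object sketched above

const db = knex({ client: 'pg', connection: process.env.DATABASE_URL })

// Hypothetical signature: today the client is created from annotated type
// definitions, not a config object
const client = createSqlmancerClient(config, db)

const resolvers = {
  Query: {
    films: (_root: unknown, _args: unknown, _ctx: unknown, info: GraphQLResolveInfo) =>
      // findMany/resolveInfo/execute mirror the current query-builder API;
      // treat the exact shape here as illustrative
      client.models.Film.findMany().resolveInfo(info).execute(),
  },
}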

In some ways, this workflow would potentially be more complicated than the existing one. Adding a where or orderBy argument to a field would require more steps than just sticking a directive on it. On the other hand, utilizing a configuration object means we can leverage TypeScript for type checking and autocompletion, making it less error-prone than stumbling through trying to add directives with the correct arguments.

Any "compliant" schema will work with the client, regardless of how it's created, opening up the possibility to use Sqlmancer with code-first schema libraries or even writing schemas using the core graphql module. The code generation feature could be expanded in the future to include generating code for these libraries without having to create additional, Sqlmancer-specific plugins for them.

@tsiege (Contributor) commented Jun 16, 2020

I'm not sure how helpful this is, but I figured I'd share why I started using SqlMancer, what I'm using it for, and what I do and don't like about it so far.

So originally I came to SqlMancer because I was having issues getting JoinMonster to play well with TypeScript and Apollo. What I like about SqlMancer is that, like JoinMonster, it doesn't really get in the way of writing my API. All I do is add my definitions to the GraphQL schema via directives and a little bit of code, and my resolvers are auto-generated for me. I also like that it prevents N+1 queries. I can still easily generate TS definitions of queries on the front end via graphql-tools.

I'm not sure exactly what limitations there might be surrounding mutations, as I've yet to use SqlMancer to write any. My biggest concern about the library is that, from a security standpoint, I don't like that my database schema is exposed through my GraphQL API, but I feel like this could be stripped out in the createSqlmancerClient function.

I think this is a great project that has helped me quickly bootstrap an API, and I'm glad to see you're putting so much thought into it. I think a lot of these proposed changes make sense, but I just wanted to share my experience. Keep up the good work!

@danielrearden (Owner, Author)

@tsiege Thanks for the feedback; it's very helpful.

My biggest concern about the library is that, from a security standpoint, I don't like that my database schema is exposed through my GraphQL API

FWIW, any schema directives you use are never exposed through introspection, so none of the metadata you provide about your database is leaked by your GraphQL service -- it's all internal.

@yanickrochon

Reference #122

@wtrocki commented Jun 19, 2020

I have a very opinionated take on this.
A wide range of people already have databases full of data and just want to expose them over GraphQL in the most secure way possible. Yet the market totally ignores them and instead offers ways to start by creating tables and managing them. In most companies, none of these tools, whether they come from an individual or a larger company, will be allowed to touch the database.

A database forms naturally, and it is usually part of larger subsystems.
People very rarely have just a GraphQL API, and very, very rarely want GraphQL to dictate their database schemas. That does happen in startups and single-dev projects that want to move fast, but this area of GraphQL is absolutely saturated, sadly with projects that bring more and more confusion, offering code-first and schema-first approaches that do not resolve the core issue.

GraphQL appeared as a new thing, and many people thought it would be like OData: you get CRUD capabilities out of the box. The overall challenges with GraphQL, and the amount of knowledge required to build a properly scalable, secure GraphQL API, exceed the time a regular team can spend building it. They need extra tools, but none of them are available to real people.

People who have been involved in GraphQL for years are a little bit like a church: they forget how easy it was to build REST APIs and push them to production without extensive tooling.

That is why we are taking a different approach. Instead of building yet another tool, we are trying to resolve the problem in the most comprehensive way, following the path of OData (though hopefully not its fate) by building a spec based on the commercially available GraphQL offerings:

https://graphqlcrud.org

This spec is being developed with a number of companies that want to know how to build their APIs with querying capabilities. The spec abstracts away the approach: whether you build schema first, database first, or code first, you will still need to conform to it, and it will be done in the open-source spirit.

Thanks to the spec, developers not only get much less vendor lock-in but also a consistent and well-documented approach on the client/query side (where most frameworks focus on the server side of things).

We are trying to adopt this spec and prove it out in the most popular languages (Java, Golang, and JS).
It will be hard to get the big players to conform to this schema, but some of them already do, and we are confident that we can bring this to the ecosystem.

We need to bring this into the ecosystem to get some interoperability between the various clients.

Transforming the schema, whether done through schema directives or some other mechanism (like annotations), makes it hard to work with other tools like GraphQL Code Generator or IDE plugins.

I do not think this is actually an issue. This is what people actually want!
You can treat the output schema as the schema you work with, and the tool as a preprocessor that reduces the boilerplate of writing a schema that conforms to some standard, rather than something that generates the schema somewhere deep in the code. Security rules in most companies will not allow people to dynamically generate a public API: it needs to be assessed and controlled, especially when small bugs (and maybe new features) in the library can suddenly expose things they should not.

People can work with the output as well. GraphQL Config can pick up the schema from a running server. You can also have a CLI that dumps the schema. We have both in one of our libraries and they work very well with Codegen.
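With plain graphql-js, for example, dumping a schema to SDL takes only a couple of lines (a sketch; the helper name and output path are just for illustration):

import { writeFileSync } from 'fs'
import { GraphQLSchema, printSchema } from 'graphql'

// Write the schema the server actually serves to a file that tools like
// GraphQL Code Generator and IDE plugins can consume
export function dumpSchema(schema: GraphQLSchema, path = 'schema.graphql'): void {
  writeFileSync(path, printSchema(schema))
}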

A schema-first approach means that if additional data models are needed for internal use, they have to be exposed first as GraphQL types that are later removed from the schema using the @private directive

When processing the schema, all types that are not reachable from the root Query and Mutation types will be removed.
No need for @private.
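For reference, graphql-tools ships a pruneSchema utility that performs this kind of reachability-based cleanup (a sketch; not necessarily the exact mechanism being described here):

import { makeExecutableSchema } from '@graphql-tools/schema'
import { pruneSchema } from '@graphql-tools/utils'

const schema = makeExecutableSchema({
  typeDefs: /* GraphQL */ `
    type Query {
      hello: String
    }

    # Unreachable from Query/Mutation, so pruning drops it
    type InternalOnly {
      secret: String
    }
  `,
})

// publicSchema no longer contains InternalOnly
const publicSchema = pruneSchema(schema)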

Remove all schema directives and use a single, standalone configuration object that includes all data model details.

Yep, that makes sense, but maybe not for all of them. Where we failed in my previous libraries is that we provided too many annotations/directives, and it totally obfuscated the schema. Directives were not designed to be thrown all around the place, especially when adding them can cause serious side effects.


TL;DR I like the current architecture of SQLMancer. It provides what people want: smaller schemas, better control, and the flexibility that small teams expect.

The proposed approach seems like a step backward, discarding what differentiates SQLMancer on the market.


I think that you should take the best features of your library and protect them as much as possible.
Based on my experience building CRUD APIs, you just need to pull some of the annotations/directives into config (and that was my first thought when I tried SQLMancer), but please do not replicate the (questionable) architecture decisions of your competition in order to be compatible.

Let's work on https://graphqlcrud.org, which is annotation agnostic, and together we can make others want to be compatible with us :)

@tsiege (Contributor) commented Jun 22, 2020

I think @wtrocki makes a lot of good points, specifically about the problem of allowing a GraphQL schema to dictate your database schema. I'm currently running into that issue now, and the direction I'm leaning is similar to that of my previous project. There we had a well-defined GraphQL layer that generated TS types for custom resolvers or just found fields based on convention. For the DB we were using Redis with a custom ORM, but they were designed to work together.

Overall I liked that system, but where it struggled was in finding a common interface between the GraphQL schema/resolvers and the separate concerns of how the database needs to store information and, at times, grow independently. I think sqlmancer could really set itself apart from the other, more "feature complete" GraphQL Postgres libraries by focusing on this issue.

I'm not trying to propose a solution, just floating the idea of the kind of system I would like as my system grows in complexity. The biggest problem with a GraphQL API is having to constantly describe your data twice: my GraphQL API looks like this, my database looks like this. It'd be nice if there were a common interface between the two that abstracts away these complexities. I'm not sure if that means one place to describe the common schema and another to describe the unique characteristics of each layer, or a GraphQL schema that follows some conventions for fields, like how sqlmancer currently works, but with a separate schema for your data.

@danielrearden (Owner, Author) commented Jun 23, 2020

That's really the crux of the issue -- a lot of the time there's significant overlap between your data models and your GraphQL types... except for the times when there's not. And this cuts both ways -- your GraphQL types may deviate from your underlying data models to incorporate different sources or simply to accommodate the needs of the clients consuming the API. At the same time, your data models may include information you don't want to expose in your API, whether that's private data like passwords or internal details like foreign keys.

I really like the idea of just shoving everything into SDL. It's nice to be able to use that as the source of truth, but doing so also feels like you're working backwards. If we go the model-first route, then we can still avoid the "double declaration" through code generation.

If we start with a config file like this:

databases:
  pg:
    dialect: postgres
    models:
      film:
        fields:
          title:
            type: String
          ...
generate: 
  client:
    output: ./src/lib/sqlmancer.js
  typeDefs:
    output: ./src/graphql/base.graphql

(Note: Realistically, I would also provide some option to look for model definitions across multiple files. Stuffing everything inside a single config file would get unwieldy very quickly).

Then minimally we can generate a base set of typeDefs for each model: Film, FilmWhere, FilmOrderBy, etc. Building the rest of your schema using these base types would then be a breeze. If your model changes, the type definitions can just be generated again. Type extension syntax could still be used if additional fields needed to be added to any particular model type only on the API side.
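As a rough illustration only, the generated base typeDefs for the film model above might come out like this (StringFilter and SortDirection are placeholder names, not a committed API):

// Illustrative output only; the real generated shapes are not settled
const baseTypeDefs = /* GraphQL */ `
  type Film {
    title: String
  }

  input FilmWhere {
    title: StringFilter
  }

  input FilmOrderBy {
    title: SortDirection
  }
`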

We can take this a step further and generate GraphQL CRUD-compliant queries and mutations for each model. And going even further, we could also generate resolvers for those queries and mutations.

If the config includes some additional database-specific information, then we can also generate knex migration files by introspecting the existing database and comparing that to the models (a la graphql-migrate). We also open the door to generating other things, like OpenAPI endpoints or Protocol Buffer messages.
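For example, a generated knex migration for the film model might look roughly like this (the generation step itself is hypothetical; the knex schema-builder calls are real):

import { Knex } from 'knex'

// Hypothetical generated migration keeping the database in sync with the
// film model defined in the config above
export async function up(knex: Knex): Promise<void> {
  await knex.schema.createTable('film', (table) => {
    table.increments('id')
    table.string('title')
  })
}

export async function down(knex: Knex): Promise<void> {
  await knex.schema.dropTable('film')
}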

I think this provides an equivalent, if not greater, level of convenience compared to the current schema-first approach. Using YAML or JSON to define the models instead of SDL might be a bit more verbose. And even though some of your type definitions are generated for you, you still have separate files to maintain for both type defs and models. Outside of that, though, I'm struggling to come up with additional cons to this approach.

@tsiege (Contributor) commented Jun 23, 2020

Sounds like a new kind of schema :) but whatever you call it, I think that's probably the right approach

@tsiege (Contributor) commented Jun 26, 2020

I've been giving this some thought, and I think it'd actually be best to have a TypeScript schema instead of something like a YAML file. To me it'd be nice to have a schema written in TypeScript using, perhaps, interfaces. You would then have a base interface that all the others inherit from, which supports a set of types that can be translated into generated GraphQL types and Postgres types. A custom type could then be extended, following some sort of convention, to add specific fields onto a specific layer of the API, GraphQL or the database. This has the benefits of using a native part of a language people are familiar with, the ability to add runtime checking with something like typescript-is, and compile-time checking/hints for the end user building out their schema. For example:

// a stand-in union of the supported scalar types
type ID = string
type SupportedType = ID | string | number | boolean

interface SQLMancerBase {
  [k: string]: SupportedType
}

interface Post extends SQLMancerBase {
  id: ID
  title: string
  slug: string
  text: string
  authorId: ID
}

interface GraphQLPost extends Post {
  excerpt: Post['text']
}

interface PGPost extends Post {
  internalTrackingId: ID
}

@danielrearden (Owner, Author) commented Jun 27, 2020

Thanks @tsiege. I like the idea of using TypeScript types too, especially if it means we don't have to rely on graphql-codegen for typing resolvers. But there's still the question of how to include all the necessary metadata in that context -- presumably we'd have to use classes so we could use class decorators. And we'd either end up excluding non-TS users or end up maintaining a separate way for JS users to build the models -- neither of which I'm particularly keen on doing.
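To illustrate why classes would be necessary, a decorator-based model might look something like this (purely hypothetical; neither @Model nor @Field exists in Sqlmancer):

import 'reflect-metadata'

// Hypothetical decorators for attaching model metadata
// (requires the experimentalDecorators compiler option)
function Model(options: { table: string }): ClassDecorator {
  return (target) => {
    Reflect.defineMetadata('model:table', options.table, target)
  }
}

function Field(options: { column?: string } = {}): PropertyDecorator {
  return (target, propertyKey) => {
    Reflect.defineMetadata('model:column', options.column ?? String(propertyKey), target, propertyKey)
  }
}

@Model({ table: 'film' })
class Film {
  @Field()
  title!: string
}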

I've been working this week on implementing the models in SDL (similar to what Prisma 1 does). This has the benefit of being a lot less verbose than YAML or TOML and utilizes a familiar syntax. Here's a gist demonstrating what that would look like: https://gist.github.com/danielrearden/9b968643eb3aac684183db6dcbdf6fd4

@danielrearden (Owner, Author)

The more I think about it, though, the more redundant and inconvenient this sort of approach looks. A middle-of-the-road approach might be to keep the library mostly as-is, but either explicitly compile the type definitions into a new file or just add a way for the CLI to dump the generated schema, as Wojtek suggested.

@JeffML commented Jun 27, 2020

It sounds to me like talk of a Universal Data Model Definition Language, something that can generate (or reverse-engineer) SQL schemas, GraphQL schemas and resolvers, interfaces and implementing PODO classes, and perhaps Swagger-y stuff as well. And why not a form-based React app while we're at it?

Has this been attempted before? Yep. UML was your first 'no-code' generator of all things, and Rational Rose is still around as far as I know. I myself worked on an application generator with a team of 50 people using something called Semantic Object Modeling many, many years ago. You would visually design a model; it would generate a database, an app to populate it, and later a website to host it. It also handled data migration on schema changes.

I don't think you're trying to be that ambitious, but I do wonder if the missing piece is the ability to describe a data model declaratively, one that is not constrained by one type system or another (in other words, SQL data types vs. GraphQL data types vs. TypeScript types) and is, as much as possible, implementation-language independent. If you're going to generate something like SQL schemas or GraphQL, you'll need that type information, but in many cases you could make a best guess and leave the fine-tuning to the user.

So I guess my take (perhaps not helpful) is: how can I describe a data model in a way where 1) I can generate a schema or interface from it, along with reasonable implementations of operations that the user can build upon without lock-in; and 2) I can take the reverse approach and build the UDML from a set of supported sources (like SQL schemas, GraphQL defs, or TypeScript interfaces)?

You also mentioned data migration based on schema deltas. That's a very interesting problem, and I don't mean that in a good way. But it is doable.

If I am missing the point of the discussion, forgive me.

@tsiege (Contributor) commented Jul 1, 2020

Having spent some more time with sqlmancer and the current API, as well as expanding the data side of my app, I'm now more in @wtrocki's camp. Right now what I like about sqlmancer is that it ingests my schema, generates all the trivial resolvers I need, and gives me the tools to generate the intermediate ones via directives, plus the query builder for the more robust ones that directives can't cover. I avoided Prisma because it felt way too heavy-handed, and I came here as an alternative to Join Monster.
