Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhance(stitching): increase key flexibility when type merging #1888

Merged
merged 12 commits into from Aug 11, 2020

Conversation

yaacovCR
Copy link
Collaborator

@yaacovCR yaacovCR commented Aug 5, 2020

@theguild-bot
Copy link
Collaborator

theguild-bot commented Aug 5, 2020

The latest changes of this PR are available as alpha in npm: 6.0.18-alpha-640de854.0

Quickly update your package.json by running:

npx match-version @graphql-tools 6.0.18-alpha-640de854.0

Copy link
Contributor

@gmac gmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Now I see what you mean about resolvers contributing a field-level selection set. That works, although it makes for some confusing config. The fact that merge config builds args using keys not established in merge config is really indirect, and the unimplemented resolver is equally confusing. Would it be reasonable to make a merge config option that effectively serves as an alias for field resolvers? something like:

Product: {
  selectionSet: { upc },
  fieldSelectionSets: {
    shippingEstimate: { weight price }
  },
  args: ...
}

I guess a field selection set declaration there would need to be merged with a resolver declaration under the hood if they both existed. While that’s mostly just config sugar, it’d make the 95% use case a lot more intuitive.

Then a couple other thoughts here:

  • The _productByRepresentation method doesn’t really look like federation, because it’s a single-return service. To look like an equivalent of the federation spec “_entities”, it should really accept an array of representations and return an array of products. The merge config would then be updated to demonstrate using the “key” function to map an array, and “args” function would plug that array into an argument called “representations”, like the spec. Oh, and the schema needs to be updated to match that method.

  • Then, if you’re shying away from using federation’s freeform _Any scalar — why go half way? Might as well just make representations an input object type and send those? That’d be the strongest way to implement the federation _entities pattern for a single known type.

@yaacovCR
Copy link
Collaborator Author

yaacovCR commented Aug 6, 2020

But it didn't work :(

@gmac
Copy link
Contributor

gmac commented Aug 6, 2020

Was the resolver the issue? Doesn’t look like the schema matches the new resolver name, which would be an error

@yaacovCR
Copy link
Collaborator Author

yaacovCR commented Aug 6, 2020

No, that didn't fix it. :( Thought it should, though. In terms of the rest of your comments, I basically agree with all of them, this was sadly just a proof of concept that actually proved the opposite, at least for now...

@yaacovCR yaacovCR force-pushed the merge-federated-schemas branch 2 times, most recently from c46531c to d1b26de Compare August 10, 2020 04:57
@yaacovCR
Copy link
Collaborator Author

This PR allows specification of a static key based on the presence of a particular field, paving the way for addition of a function to include possibly more complex behavior (presence of a particular sub-sub-subfield, for example).

This requires dynamic calculation of proxiability from schema to schema at merge time (since now query-dependent) rather than at build-time/start-up. Given that, we now employ merge time assessment of presence of key within the collected subschemas, which allows the key fields to be spread across subschemas.

For instance, if subschema A contains field keyPartA which is sufficient for subschema A and subschemaB contains keyPartB sufficient for subschemaB, no subschema contains both keyPartA and keyPartB, but subschema X requires both keyPartA and keyPartB to retrieve the type, merging is now possible. Previously, subschemas A and B would have been deemed nonproxiable to subschema X at buildtime, and the combination of the different subschemas at merge time would not have been considered.

@gmac
Copy link
Contributor

gmac commented Aug 10, 2020

I've got to go by your comment above (the code extends way beyond my knowledge of internals)...

For instance, if subschema A contains field keyPartA which is sufficient for subschema A and subschemaB contains keyPartB sufficient for subschemaB, no subschema contains both keyPartA and keyPartB, but subschema X requires both keyPartA and keyPartB to retrieve the type, merging is now possible.

Perfect. That is exactly what I'd expect based on how federation does it.

Previously, subschemas A and B would have been deemed nonproxiable to subschema X at buildtime, and the combination of the different subschemas at merge time would not have been considered.

Was this a known limitation that produced a buildtime error, or was there an alternative strategy for making a request to subschema X with a composite key?

@yaacovCR
Copy link
Collaborator Author

The composite key previously had to be entirely present in at least one subschema.

also consolidates equivalent selections

add memoization

This could be optimized differently: if there was a mapping of fields within the gateway schema to the subschemas containing those fields, we could instead iterate through the document and collect the originating subschemas and then add the keys for each of those subschemas. This method seems possibly about as good.
allow keys whose source data is spread across multiple schemas
key fields for a given subschema by definition must be resolved outside that subschema, and so for any given key field, there is no need to add the other key fields to the selectionSet or to consider the key field as proxiable via that subschema.

The key field is the "key" to the remaining fields, but -- for the purposes of type merging -- will never be obtained from the given subschema.

This becomes more complex with non-leaf key fields, and so this pruning is enabled only for leaf key fields.
to favor input object type over scalar

see #1888 (review)
@yaacovCR yaacovCR changed the title demonstrate merging of federated-like schemas enhance(stitching): allow keys only partially present in each subschema Aug 11, 2020
@yaacovCR yaacovCR changed the title enhance(stitching): allow keys only partially present in each subschema enhance(stitching): increase key flexibility when type merging Aug 11, 2020
@yaacovCR yaacovCR merged commit 640de85 into master Aug 11, 2020
@yaacovCR yaacovCR deleted the merge-federated-schemas branch August 11, 2020 10:54
@yaacovCR
Copy link
Collaborator Author

@gmac I again am not super familiar with federation, but I thought with respect to federation that key fields were "owned" by a particular subschema, so that the above situation would never arise. Was it possible for a key for a type to be owned by different sub schemas?

My understanding was that prior to this change there was feature parity between federation and type merging, and now there is even greater flexibility with stitching/type merging. Hence, the lack of documented limitation.

I may have been wrong, however, in which case I guess feature parity has now been restored?

@gmac
Copy link
Contributor

gmac commented Aug 11, 2020

@yaacovCR I certainly haven't read all through the the federation internals (or tools internals for that matter), but anecdotally, yes, federation allows keys to be owned by multiple services. All of the mapping is established via schema directives, for better or worse. Let's go by this example here: #1874 (reply in thread) (which is actually a more "pure" demo than Apollo's official docs).

The way the federation schema would be written for that is (going by memory):

# Products service...

type Product @key(fields: "upc") {
  upc: String!
  name: String!
  price: Int!
  weight: Int!
}

# Inventory service...

type Carrier {
  id: Int!
  name: String!
}

extend type Product @key(fields: "upc") {
  upc: String! @external
  price: Int! @external
  weight: Int! @external
  shippingEstimate: Int! @requires(fields: "price weight")
  carrier: Carrier! @requires(fields: "weight")
}

So there's a lot of strange stuff going on there, which is one of my biggest complaints about federation. Anywho, breaking it down:

  • Product type originates in the products service, thus the non-extended type definition is there. It cites a @key directive for the field(s) that it can actually be looked up by. That generally makes sense.

  • Product type is extended in inventory service, hence the extend keyword. Here's where things get really strange though: the extension also requires a @key field, not because the key actually does anything here, but because a common key needs to follow a representation (virtual record) through the entire service resolution chain. In this case, all the @key is really doing is opting the extension into the magic federation entity. So, @key is roughly synonymous with TypeMergeConfig.

  • Then inventory Products have upc, price, and weight fields marked as @external, which means they are fields that don't originate in this service. They've been selected by the gateway from other services and injected as fields of a virtual record. Type merging omits this concept in favor of only defining fields that actually exist in the service, which makes more sense to me.

  • Then shippingEstimate and carrier use the @requires directive which is most relevant to this PR here... those @requires assign field-level dependencies for each external field that they require. A service doesn't know where required fields actually come from – the gateway loads all these SDLs, compiles the directive mapping, and then will build a query plan to select fields from across services in the order they are required. In the case of shippingEstimate: Int! @requires(fields: "price weight"), if the gateway needs to get price and weight from different services, so be it. This gets at the origins of this whole discussion, which is that it'd be comparable to Federation if type merging could define a configuration such as:

Product: {
  selectionSet: selectionSetByField({
    shippingEstimate: '{ price weight }',
    carrier: '{ weight }'
  }),
  fieldName: '_entities',
  key: ({ weight, price }) => ({ weight, price, __typename: 'Product' }),
  args: (representations) => ({ representations }),
}

All that is to say – while I like what federation does, I personally think the SDL is stupidly complex, so I was quite surprised to realize that type merging is like a bare-metal/common sense version of federation. Feature parity is getting pretty good! I guess the main feature that Federation wears on its sleeve is that all concerns are encapsulated in subservices with no logic in the gateway–but that comes at the cost of a really complex SDL. Depending on viewpoint, one could argue that Type Merging's thin layer of gateway config is the superior feature in this regard.

@yaacovCR
Copy link
Collaborator Author

The change we are discussing is if one schema used upc as the key and did not contain name, another used the name and did not have upc, and a third (let's say containing shipping estimate) required both.

I don't think it's supported by a federation or a common pattern, but now we support it...

@gmac
Copy link
Contributor

gmac commented Aug 11, 2020

Yeah, wow, I have no idea how Federation would handle that scenario. Awesome. In general terms, how high are the additional computational costs associated with that dynamic assessment?

@yaacovCR
Copy link
Collaborator Author

O(n) with n being number of subschemas with the merged type?

I think?

@gmac
Copy link
Contributor

gmac commented Aug 12, 2020

I only ask because Federation acknowledges overhead with their query planning, to the point that they actually cache plans up to a set storage limit. Sounds slightly ominous.

@gmac
Copy link
Contributor

gmac commented Aug 12, 2020

Oh, also – I've got an expanded merging guide in the works that'll document the knowledge about these new options!

@yaacovCR
Copy link
Collaborator Author

It looks like the link you have to caching is for the nautilus gateway written in Go. Not sure whether the Apollo JS gateway also includes caching.

We can add caching to speed up type merging logic. They have a benchmark that can be used to measure just the time it takes to do query planning. We have here a GraphQL schema that at each step of execution figures out what types it has and what it needs -- it would be nice if we could add logging to see how long that takes or to have a nice benchmark with comparable schemas/queries that can be used to do end-to-end testing.

I was thinking of pre-calculating whether the different combinations of subschemas contain each selectionSet, but I do believe that the necessary pre-calculation would be quite prohibitive consider the number of possible combinations as there is no upper bound on the number of possible subschemas with types to be merged. Another possibility is to use a bounded cache, which I like better. We could expose a property with an optional custom function to possibly replaces typesContainsSelectionSet that can determine whether it is possible to proxy to a given subschema given the types as defined from the subschemas gotten so far, and the subschema trying to get to....

@gmac gmac mentioned this pull request Aug 14, 2020
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants