Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor selection set implementation to be immutable #2387

Merged
merged 5 commits into from
Mar 15, 2023

Conversation

pcmanus
Copy link
Contributor

@pcmanus pcmanus commented Feb 9, 2023

The current (before this commit) implementation of SelectionSet is inherently mutable: its SelectionSet.add method mutate the selection set it's called on. Which is at odds with the general query planner algorithm, which is the main user of SelectionSet, but is fundamentally based on dealing with immutable data (as we explore various "path" options and the possible plans, most things are shared (at least partly)).

This mismatch between the query planner that really want immutable selection sets but an implementation that is mutable leads to less than optimal code complexity and performance. More specificially, if we want to use a SelectionSet into 2 separate branch of the algorithm, we usually have to do defensive deep copies, which is inefficient. And to mitigate those inefficiencies, we sometimes don't do those copies, but this lead to code that fragile and easy to break and has lead to bug previously (where we save a copy, but then a branch mistakenly mutated and thus impacted another branch mistakenly). A few tricks had been added to the implementation to help mitigate the risks (the freeze behaviour, and the copy-on-write in FetchGroup), but they add complexity and aren't always optimal performance wise.

This commit thus rework the implementation/api of SelectionSet to make it an immutable data structure. Doing so makes the code a lot safer and easy to reason about. And it also provide some performance benefits: compared to current main over a set of production schema/query, this seem to provide around 20% improvement on average and up to almost 50% improvment on some specific cases for the computation of query plans (disclaimer: those benchmark are fairly unscientific so those numbers should be taken with a grain of salt, but the numbers are clear enough that this is a net measurable gain).

@netlify
Copy link

netlify bot commented Feb 9, 2023

👷 Deploy request for apollo-federation-docs pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit ae71a71

@changeset-bot
Copy link

changeset-bot bot commented Feb 9, 2023

🦋 Changeset detected

Latest commit: 11954af

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages
Name Type
@apollo/composition Patch
@apollo/gateway Patch
@apollo/federation-internals Patch
@apollo/query-graphs Patch
@apollo/query-planner Patch
@apollo/subgraph Patch
apollo-federation-integration-testsuite Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@codesandbox-ci
Copy link

codesandbox-ci bot commented Feb 9, 2023

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

The current (before this commit) implementation of `SelectionSet` is
inherently mutable: its `SelectionSet.add` method mutate the selection
set it's called on. Which is at odds with the general query planner
algorithm, which is the main user of `SelectionSet`, but is
fundamentally based on dealing with immutable data (as we explore
various "path" options and the possible plans, most things are shared
(at least partly)).

This mismatch between the query planner that really want immutable
selection sets but an implementation that is mutable leads to
less than optimal code complexity and performance. More specificially,
if we want to use a `SelectionSet` into 2 separate branch of the
algorithm, we usually have to do defensive deep copies, which is
inefficient. And to mitigate those inefficiencies, we sometimes
don't do those copies, but this lead to code that fragile and easy
to break and has lead to bug previously (where we save a copy, but
then a branch mistakenly mutated and thus impacted another branch
mistakenly). A few tricks had been added to the implementation to
help mitigate the risks (the `freeze` behaviour, and the copy-on-write
in `FetchGroup`), but they add complexity and aren't always optimal
performance wise.

This commit thus rework the implementation/api of `SelectionSet` to make
it an immutable data structure. Doing so makes the code a lot safer
and easy to reason about. And it also provide some performance benefits:
compared to current `main` over a set of production schema/query, this
seem to provide around 20% improvement on average and up to almost 50%
improvment on some specific cases (disclaimer: those benchmark are
fairly unscientific so those numbers should be taken with a grain of
salt, but the numbers are clear enough that this is a net measurable
gain).
@pcmanus pcmanus changed the base branch from main to next March 10, 2023 14:39
@pcmanus pcmanus added this to the 2.4 milestone Mar 10, 2023
@pcmanus pcmanus mentioned this pull request Mar 10, 2023
Copy link
Member

@trevor-scheer trevor-scheer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review still in progress

internals-js/src/definitions.ts Show resolved Hide resolved
internals-js/src/definitions.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Show resolved Hide resolved
internals-js/src/operations.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Show resolved Hide resolved
internals-js/src/operations.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Show resolved Hide resolved
internals-js/src/operations.ts Show resolved Hide resolved
pcmanus and others added 3 commits March 14, 2023 10:42
Co-authored-by: Trevor Scheer <trevor.scheer@gmail.com>
Co-authored-by: Trevor Scheer <trevor.scheer@gmail.com>
composition-js/src/validate.ts Outdated Show resolved Hide resolved
@@ -672,10 +631,14 @@ class FetchGroup {
readonly rootKind: SchemaRootKind,
readonly parentType: CompositeType,
readonly isEntityFetch: boolean,
private _selection: LazySelectionSet,
private _inputs?: LazySelectionSet,
private _selection: MutableSelectionSet<{ conditions: Conditions}>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably past time to change this to named parameters, especially when there are so many optionals.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a private constructor and the public methods that calls it do use named parameters. Moving to named parameters here means separately declaring all those field and having trivial assignments for each. That'd be fine but it's a lot of noise and I like it the way it is.

query-planner-js/src/buildPlan.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Outdated Show resolved Hide resolved
internals-js/src/operations.ts Outdated Show resolved Hide resolved
}


static of(selectionSet: SelectionSet): MutableSelectionSet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's indeed not used anymore, but I'd like to keep it both for symmetry with empty and because it make sense API wise and has a high change of being use soon enough (and it really isn't, it's hardly a maintenance burden).

internals-js/src/operations.ts Show resolved Hide resolved
query-graphs-js/src/querygraph.ts Outdated Show resolved Hide resolved
query-planner-js/src/buildPlan.ts Outdated Show resolved Hide resolved
Outside of minor typos/updates, the bulk of this change is switching
how we collect used variables to be more efficient/avoid generating
useless garbage.
@pcmanus pcmanus merged commit 260c357 into apollographql:next Mar 15, 2023
pcmanus pushed a commit to pcmanus/federation that referenced this pull request Mar 16, 2023
It turns out that we have had (since fed 2.0 at least, maybe earlier) a
slightly weird behaviour (a bug really) in the validation of selection
sets which impacts what we accept for the `fields` arg of `@key`,
`@requires` and `@provides`. Namely, assuming a field `t` returnin an
object type, we were accepting a selection like:
```
{
  t {
    a
    b
  }
  t
}
```
even though the 2nd occurence of `t` is kind of incorrect according to
the graphQL spec since `t` should always have a sub-selection. But
the reason it works is that the code was merging the 1st and 2nd
occurrence of `t` before any validation was run, so internally the
selection is handled as just:
```
{
  t {
    a
    b
  }
}
```

Now, an assertion was added in apollographql#2387 that is triggered by the example
above, and that means that some `@key`, `@provides` or `@requires` that
were accepted (and were mostly correctly working) in currently released
versions would start erroring in 2.4 because of this.

To be clear, the historical behaviour is kind of wrong here, and we
should consider fixing it at some point. However, hard-failing on
upgrades is not very nice: we should probably introduce a warning
for a few versions before genuinely making this an error. Further,
the current assertion does not provide a very user friendly message.
In the meantime, this PR restore the status quo.
pcmanus pushed a commit to pcmanus/federation that referenced this pull request Mar 16, 2023
It turns out that we have had (since fed 2.0 at least, maybe earlier) a
slightly weird behaviour (a bug really) in the validation of selection
sets which impacts what we accept for the `fields` arg of `@key`,
`@requires` and `@provides`. Namely, assuming a field `t` returnin an
object type, we were accepting a selection like:
```
{
  t {
    a
    b
  }
  t
}
```
even though the 2nd occurence of `t` is kind of incorrect according to
the graphQL spec since `t` should always have a sub-selection. But
the reason it works is that the code was merging the 1st and 2nd
occurrence of `t` before any validation was run, so internally the
selection is handled as just:
```
{
  t {
    a
    b
  }
}
```

Now, an assertion was added in apollographql#2387 that is triggered by the example
above, and that means that some `@key`, `@provides` or `@requires` that
were accepted (and were mostly correctly working) in currently released
versions would start erroring in 2.4 because of this.

To be clear, the historical behaviour is kind of wrong here, and we
should consider fixing it at some point. However, hard-failing on
upgrades is not very nice: we should probably introduce a warning
for a few versions before genuinely making this an error. Further,
the current assertion does not provide a very user friendly message.
In the meantime, this PR restore the status quo.
clenfest pushed a commit that referenced this pull request Mar 16, 2023
It turns out that we have had (since fed 2.0 at least, maybe earlier) a
slightly weird behaviour (a bug really) in the validation of selection
sets which impacts what we accept for the `fields` arg of `@key`,
`@requires` and `@provides`. Namely, assuming a field `t` returnin an
object type, we were accepting a selection like:
```
{
  t {
    a
    b
  }
  t
}
```
even though the 2nd occurence of `t` is kind of incorrect according to
the graphQL spec since `t` should always have a sub-selection. But
the reason it works is that the code was merging the 1st and 2nd
occurrence of `t` before any validation was run, so internally the
selection is handled as just:
```
{
  t {
    a
    b
  }
}
```

Now, an assertion was added in #2387 that is triggered by the example
above, and that means that some `@key`, `@provides` or `@requires` that
were accepted (and were mostly correctly working) in currently released
versions would start erroring in 2.4 because of this.

To be clear, the historical behaviour is kind of wrong here, and we
should consider fixing it at some point. However, hard-failing on
upgrades is not very nice: we should probably introduce a warning
for a few versions before genuinely making this an error. Further,
the current assertion does not provide a very user friendly message.
In the meantime, this PR restore the status quo.
pcmanus added a commit to pcmanus/federation that referenced this pull request Aug 7, 2023
The previously committed [apollographql#2713](apollographql#2713) fixed an issue introduced by
[apollographql#2387](apollographql#2387), ensuring that querying the same field with different
directives applications was not merged, similar to what was/is done for fragments. But the exact behaviour slightly
differs between fields and fragments when it comes to `@defer` in that for fragments, we never merge 2 similar fragments
where both have `@defer`, which we do merge for fields. Or to put it more concretely, in the following query:
```graphq
query Test($skipField: Boolean!) {
  x {
    ... on X @defer {
      a
    }
    ... on X @defer {
      b
    }
  }
}
```
the 2 `... on X @defer` are not merged, resulting in 2 deferred sections that can run in parallel. But following
[apollographql#2713](apollographql#2713), query:
```graphq
query Test($skipField: Boolean!) {
  x @defer {
    a
  }
  x @defer {
    b
  }
}
```
_will_ merge both `x @defer`, resulting in a single deferred section.

This fix changes that later behaviour so that the 2 `x @defer` are not merged and result in 2 deferred sections,
consistently with both 1) the case of fragments and 2) the behaviour prior to
[apollographql#2387](apollographql#2387).
pcmanus added a commit to pcmanus/federation that referenced this pull request Aug 7, 2023
The previously committed [apollographql#2713](apollographql#2713) fixed an issue introduced by
[apollographql#2387](apollographql#2387), ensuring that querying the same field with different
directives applications was not merged, similar to what was/is done for fragments. But the exact behaviour slightly
differs between fields and fragments when it comes to `@defer` in that for fragments, we never merge 2 similar fragments
where both have `@defer`, which we do merge for fields. Or to put it more concretely, in the following query:
```graphq
query Test($skipField: Boolean!) {
  x {
    ... on X @defer {
      a
    }
    ... on X @defer {
      b
    }
  }
}
```
the 2 `... on X @defer` are not merged, resulting in 2 deferred sections that can run in parallel. But following
[apollographql#2713](apollographql#2713), query:
```graphq
query Test($skipField: Boolean!) {
  x @defer {
    a
  }
  x @defer {
    b
  }
}
```
_will_ merge both `x @defer`, resulting in a single deferred section.

This fix changes that later behaviour so that the 2 `x @defer` are not merged and result in 2 deferred sections,
consistently with both 1) the case of fragments and 2) the behaviour prior to
[apollographql#2387](apollographql#2387).
pcmanus added a commit to pcmanus/federation that referenced this pull request Aug 10, 2023
The previously committed [apollographql#2713](apollographql#2713) fixed an issue introduced by
[apollographql#2387](apollographql#2387), ensuring that querying the same field with different
directives applications was not merged, similar to what was/is done for fragments. But the exact behaviour slightly
differs between fields and fragments when it comes to `@defer` in that for fragments, we never merge 2 similar fragments
where both have `@defer`, which we do merge for fields. Or to put it more concretely, in the following query:
```graphq
query Test($skipField: Boolean!) {
  x {
    ... on X @defer {
      a
    }
    ... on X @defer {
      b
    }
  }
}
```
the 2 `... on X @defer` are not merged, resulting in 2 deferred sections that can run in parallel. But following
[apollographql#2713](apollographql#2713), query:
```graphq
query Test($skipField: Boolean!) {
  x @defer {
    a
  }
  x @defer {
    b
  }
}
```
_will_ merge both `x @defer`, resulting in a single deferred section.

This fix changes that later behaviour so that the 2 `x @defer` are not merged and result in 2 deferred sections,
consistently with both 1) the case of fragments and 2) the behaviour prior to
[apollographql#2387](apollographql#2387).
jeffjakub pushed a commit that referenced this pull request Aug 15, 2023
…ly (#2720)

The previously committed [#2713](#2713) fixed an issue introduced by
[#2387](#2387), ensuring that querying the same field with different
directives applications was not merged, similar to what was/is done for fragments. But the exact behaviour slightly
differs between fields and fragments when it comes to `@defer` in that for fragments, we never merge 2 similar fragments
where both have `@defer`, which we do merge for fields. Or to put it more concretely, in the following query:
```graphq
query Test($skipField: Boolean!) {
  x {
    ... on X @defer {
      a
    }
    ... on X @defer {
      b
    }
  }
}
```
the 2 `... on X @defer` are not merged, resulting in 2 deferred sections that can run in parallel. But following
[#2713](#2713), query:
```graphq
query Test($skipField: Boolean!) {
  x @defer {
    a
  }
  x @defer {
    b
  }
}
```
_will_ merge both `x @defer`, resulting in a single deferred section.

This fix changes that later behaviour so that the 2 `x @defer` are not merged and result in 2 deferred sections,
consistently with both 1) the case of fragments and 2) the behaviour prior to
[#2387](#2387).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants