Exposes debug.maxEvaluatedPlans planning config (#2593)
* Exposes `debug.maxEvaluatedPlans` planning config

So far, the maximum number of query plans evaluated (above which query
planning eliminates choices to evaluate, thus potentially reducing
the quality of the generated plan) has only been hard-coded. This exposes
a config option to set that cap, mostly to help debug query planning
runtime issues.

* Make the  type recursive
Sylvain Lebresne committed May 24, 2023
1 parent e136ad8 commit 8ca107a
Showing 5 changed files with 219 additions and 15 deletions.
9 changes: 9 additions & 0 deletions .changeset/nasty-panthers-chew.md
@@ -0,0 +1,9 @@
---
"@apollo/query-planner": minor
---

Adds the `debug.maxEvaluatedPlans` query planning configuration option. This option limits the maximum number of query plans
that may have to be evaluated during a query planning phase, thus capping the maximum query planning runtime, but at the
price of potentially reducing the optimality of the generated query plan (which may mean slower query executions). This
option is exposed for debugging purposes, but it is recommended to rely on the default in production.

2 changes: 1 addition & 1 deletion internals-js/src/utils.ts
@@ -401,7 +401,7 @@ export function printHumanReadableList(
}

export type Concrete<Type> = {
[Property in keyof Type]-?: Type[Property];
[Property in keyof Type]-?: Concrete<Type[Property]>;
};

// for use with Array.filter
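
To illustrate why the recursion matters for nested options such as `debug.maxEvaluatedPlans`, here is a small sketch. `ShallowConcrete` reproduces the previous definition; `ExampleConfig`, `readCapShallow` and `readCap` are hypothetical names used only for this illustration and are not part of the codebase.

```typescript
// Previous, shallow definition: only top-level properties become required.
type ShallowConcrete<Type> = {
  [Property in keyof Type]-?: Type[Property];
};

// New, recursive definition from this commit.
type Concrete<Type> = {
  [Property in keyof Type]-?: Concrete<Type[Property]>;
};

// Hypothetical config shape, loosely modelled on `QueryPlannerConfig`.
type ExampleConfig = {
  debug?: {
    maxEvaluatedPlans?: number;
  };
};

// With the shallow version, `debug` is required but `debug.maxEvaluatedPlans`
// is still optional, so the return type has to allow `undefined`.
function readCapShallow(config: ShallowConcrete<ExampleConfig>): number | undefined {
  return config.debug.maxEvaluatedPlans;
}

// With the recursive version, the nested property is required too, so it can
// be used directly as a `number`.
function readCap(config: Concrete<ExampleConfig>): number {
  return config.debug.maxEvaluatedPlans;
}
```

This is presumably what lets code such as `config.debug.maxEvaluatedPlans < 1` compare against a plain `number` once the config has gone through `enforceQueryPlannerConfigDefaults`.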
171 changes: 168 additions & 3 deletions query-planner-js/src/__tests__/buildPlan.test.ts
@@ -1,10 +1,10 @@
import { QueryPlanner } from '@apollo/query-planner';
import { assert, buildSchema, operationFromDocument, ServiceDefinition } from '@apollo/federation-internals';
import gql from 'graphql-tag';
import { MAX_COMPUTED_PLANS } from '../buildPlan';
import { FetchNode, FlattenNode, SequenceNode } from '../QueryPlan';
import { FieldNode, OperationDefinitionNode, parse } from 'graphql';
import { composeAndCreatePlanner, composeAndCreatePlannerWithOptions } from './testHelper';
import { enforceQueryPlannerConfigDefaults } from '../config';

describe('shareable root fields', () => {
test('can use same root operation from multiple subgraphs in parallel', () => {
@@ -3032,7 +3032,8 @@ test('Correctly handle case where there is too many plans to consider', () => {
// gets very large very quickly). Obviously, there is no reason to do this in practice.

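// Each leaf field is reachable from 2 subgraphs, so each one doubles the number of plans.
// For example, with the default cap of 10000: ceil(log2(10000)) + 1 = 15 fields, i.e. 2^15 = 32768 plans.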
const fieldCount = Math.ceil(Math.log2(MAX_COMPUTED_PLANS)) + 1;
const defaultMaxComputedPlans = enforceQueryPlannerConfigDefaults().debug.maxEvaluatedPlans!;
const fieldCount = Math.ceil(Math.log2(defaultMaxComputedPlans)) + 1;
const fields = [...Array(fieldCount).keys()].map((i) => `f${i}`);

const typeDefs = gql`
@@ -5749,7 +5750,6 @@ test('does not error out handling fragments when interface subtyping is involved
`);
});


test('handles mix of fragments indirection and unions', () => {
const subgraph1 = {
name: 'Subgraph1',
@@ -5821,3 +5821,168 @@ test('handles mix of fragments indirection and unions', () => {
}
`);
});

describe('`debug.maxEvaluatedPlans` configuration', () => {
// Simple schema, created to force the query planner to have multiple choices. We'll build
// a supergraph from 2 subgraphs that both have this exact same schema. In practice,
// for every field `v_i`, the planner will consider the option of fetching it from either
// the 1st or 2nd subgraph (note that in theory, there are more choices than this; we could
// get `t.id` from the 1st subgraph and then jump to the 2nd subgraph, but some heuristics
// in the query planner recognize this is not useful). Also note that we currently
// need both the `@key` on `T` and to have `Query.t` shareable for the query planner to consider
// those choices.
const typeDefs = gql`
type Query {
t: T @shareable
}
type T @key(fields: "id") @shareable {
id: ID!
v1: Int
v2: Int
v3: Int
v4: Int
}
`;

const subgraphs = [
{
name: 'Subgraph1',
typeDefs
}, {
name: 'Subgraph2',
typeDefs }
];

test('works when unset', () => {
// This test is mostly a sanity check to make sure that "by default", we do have 16 plans
// (all combinations of the 2 choices for 4 fields). It's not entirely impossible that
// some future smarter heuristic is added to the planner so that it recognizes it could
// cut the choices earlier, and if that's the case, this test will fail (showing that fewer
// plans are considered) and we'll have to adapt the example (find a better way to force
// choices).

const config = { debug : { maxEvaluatedPlans : undefined } };
const [api, queryPlanner] = composeAndCreatePlannerWithOptions(subgraphs, config);
const operation = operationFromDocument(api, gql`
{
t {
v1
v2
v3
v4
}
}
`);

const plan = queryPlanner.buildQueryPlan(operation);
expect(plan).toMatchInlineSnapshot(`
QueryPlan {
Fetch(service: "Subgraph1") {
{
t {
v1
v2
v3
v4
}
}
},
}
`);

const stats = queryPlanner.lastGeneratedPlanStatistics();
expect(stats?.evaluatedPlanCount).toBe(16);
});

test('allows setting down to 1', () => {
const config = { debug : { maxEvaluatedPlans : 1 } };
const [api, queryPlanner] = composeAndCreatePlannerWithOptions(subgraphs, config);
const operation = operationFromDocument(api, gql`
{
t {
v1
v2
v3
v4
}
}
`);

const plan = queryPlanner.buildQueryPlan(operation);
// Note that in theory, the planner would be excused if it didn't generate this
// (optimal in this case) plan. But we kind of want it in this simple example so
// we still assert this is the plan we get.
// Note2: `v1` ends up reordered in this case due to reordering of branches that
// happens as a by-product of cutting out choices. This is completely harmless and
// the plan is still fine and optimal, but if we someday find the time to update
// the code to keep the order more consistent (say, if we ever rewrite said code :)),
// then this wouldn't be the worst thing either.
expect(plan).toMatchInlineSnapshot(`
QueryPlan {
Fetch(service: "Subgraph1") {
{
t {
v2
v3
v4
v1
}
}
},
}
`);

const stats = queryPlanner.lastGeneratedPlanStatistics();
expect(stats?.evaluatedPlanCount).toBe(1);
});

test('can be set to an arbitrary number', () => {
const config = { debug : { maxEvaluatedPlans : 10 } };
const [api, queryPlanner] = composeAndCreatePlannerWithOptions(subgraphs, config);
const operation = operationFromDocument(api, gql`
{
t {
v1
v2
v3
v4
}
}
`);

const plan = queryPlanner.buildQueryPlan(operation);
expect(plan).toMatchInlineSnapshot(`
QueryPlan {
Fetch(service: "Subgraph1") {
{
t {
v1
v4
v2
v3
}
}
},
}
`);

const stats = queryPlanner.lastGeneratedPlanStatistics();
// Note that in this particular example, since we have binary choices only and due to the way
// we cut branches when we're above the max, the number of evaluated plans can only be a power
// of 2. Here, we just want it to be the nearest power of 2 below our limit.
expect(stats?.evaluatedPlanCount).toBe(8);
});

test('cannot be set to 0 or a negative number', () => {
let config = { debug : { maxEvaluatedPlans : 0 } };
expect(() => composeAndCreatePlannerWithOptions(subgraphs, config)).toThrow(
'Invalid value for query planning configuration "debug.maxEvaluatedPlans"; expected a number >= 1 but got 0'
);

config = { debug : { maxEvaluatedPlans : -1 } };
expect(() => composeAndCreatePlannerWithOptions(subgraphs, config)).toThrow(
'Invalid value for query planning configuration "debug.maxEvaluatedPlans"; expected a number >= 1 but got -1'
);
});
});
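
The counts asserted in these tests (16, 1 and 8) can be reproduced with a simplified model of the capping loop. This is only a sketch under stated assumptions: `evaluatedPlanCountSketch` is a hypothetical helper, branches are modelled as plain option counts kept sorted with the largest first, and dropping one option from a branch of size `n` is assumed to remove `planCount / n` plans. The real loop in `buildPlan.ts` operates on actual path options and may differ in detail.

```typescript
// Hypothetical stand-in for the planner's capping loop. Each entry of
// `optionCounts` is the number of options still available for one branch.
function evaluatedPlanCountSketch(optionCounts: number[], maxEvaluatedPlans: number): number {
  // Assumption: branches are kept sorted with the most options first.
  const counts = [...optionCounts].sort((a, b) => b - a);
  let planCount = counts.reduce((product, n) => product * n, 1);

  while (planCount > maxEvaluatedPlans) {
    const prevSize = counts[0] ?? 0;
    if (prevSize <= 1) {
      break; // Nothing left to cut: every branch is down to a single option.
    }
    // Drop one option from the first branch, then move it to its new place.
    counts[0] = prevSize - 1;
    planCount -= planCount / prevSize;
    counts.sort((a, b) => b - a);
  }
  return planCount;
}

// Four fields, each fetchable from 2 subgraphs, as in the tests above.
const binaryChoices = [2, 2, 2, 2];
console.log(evaluatedPlanCountSketch(binaryChoices, 10000)); // 16
console.log(evaluatedPlanCountSketch(binaryChoices, 10)); // 8
console.log(evaluatedPlanCountSketch(binaryChoices, 1)); // 1
```

With only binary choices, each cut halves the plan count, which is why a cap of 10 yields 8 evaluated plans in this model.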
20 changes: 9 additions & 11 deletions query-planner-js/src/buildPlan.ts
@@ -97,7 +97,7 @@ import {
import { stripIgnoredCharacters, print, OperationTypeNode, SelectionSetNode, Kind } from "graphql";
import { DeferredNode, FetchDataRewrite } from ".";
import { Conditions, conditionsOfSelectionSet, isConstantCondition, mergeConditions, removeConditionsFromSelectionSet, updatedConditions } from "./conditions";
import { enforceQueryPlannerConfigDefaults, QueryPlannerConfig } from "./config";
import { enforceQueryPlannerConfigDefaults, QueryPlannerConfig, validateQueryPlannerConfig } from "./config";
import { generateAllPlansAndFindBest } from "./generateAllPlans";
import { QueryPlan, ResponsePath, SequenceNode, PlanNode, ParallelNode, FetchNode, SubscriptionNode, trimSelectionNodes } from "./QueryPlan";

@@ -107,13 +107,6 @@ const debug = newDebugLogger('plan');
// has no particular significance.
const SIBLING_TYPENAME_KEY = 'sibling_typename';

// If a query can be resolved by more than this number of plans, we'll try to reduce the possible options we'll look
// at to get it below this number to void query planning running forever.
// Note that this number is a tad arbitrary: it's a nice round number that, on my laptop, ensure query planning don't
// take more than a handful of seconds.
// Note: exported so we can have a test that explicitly requires more than this number.
export const MAX_COMPUTED_PLANS = 10000;

type CostFunction = FetchGroupProcessor<number, number>;

/**
@@ -565,7 +558,8 @@ class QueryPlanningTraversal<RV extends Vertex> {
debug.log(() => `Query has ${planCount} possible plans`);

let firstBranch = this.closedBranches[0];
while (planCount > MAX_COMPUTED_PLANS && firstBranch.length > 1) {
const maxPlansToCompute = this.parameters.config.debug.maxEvaluatedPlans;
while (planCount > maxPlansToCompute && firstBranch.length > 1) {
// we remove the right-most option of the first branch, and then move that branch to its new place.
const prevSize = firstBranch.length;
firstBranch.pop();
@@ -1335,7 +1329,7 @@ class FetchGroup {
}

toPlanNode(
queryPlannerConfig: QueryPlannerConfig,
queryPlannerConfig: Concrete<QueryPlannerConfig>,
handledConditions: Conditions,
variableDefinitions: VariableDefinitions,
fragments?: RebasedFragments,
@@ -2685,6 +2679,7 @@ type PlanningParameters<RV extends Vertex> = {
processor: FetchGroupProcessor<PlanNode | undefined, DeferredNode>
root: RV,
inconsistentAbstractTypesRuntimes: Set<string>,
config: Concrete<QueryPlannerConfig>,
}

export class QueryPlanner {
@@ -2704,6 +2699,8 @@ export class QueryPlanner {
config?: QueryPlannerConfig
) {
this.config = enforceQueryPlannerConfigDefaults(config);
// Validating post default-setting to catch any fat-fingering of the defaults themselves.
validateQueryPlannerConfig(this.config);
this.federatedQueryGraph = buildFederatedQueryGraph(supergraphSchema, true);
this.collectInterfaceTypesWithInterfaceObjects();
this.collectInconsistentAbstractTypesRuntimes();
@@ -2852,6 +2849,7 @@ export class QueryPlanner {
root,
statistics,
inconsistentAbstractTypesRuntimes: this.inconsistentAbstractTypesRuntimes,
config: this.config,
}

let rootNode: PlanNode | SubscriptionNode | undefined;
@@ -3320,7 +3318,7 @@ function fetchGroupToPlanProcessor({
operationName,
assignedDeferLabels,
}: {
config: QueryPlannerConfig,
config: Concrete<QueryPlannerConfig>,
variableDefinitions: VariableDefinitions,
fragments?: RebasedFragments,
operationName?: string,
32 changes: 32 additions & 0 deletions query-planner-js/src/config.ts
@@ -63,6 +63,26 @@ export type QueryPlannerConfig = {
* normal query planning and instead a fetch to the one subgraph is built directly from the input query.
*/
bypassPlannerForSingleSubgraph?: boolean,

/**
* Query planning is an exploratory process. Depending on the specificities and features used by
* subgraphs, there could exist many different theoretically valid (if not always efficient) plans
* for a given query, and at a high level, the query planner generates those possible choices,
* evaluates them, and returns the best one. In some complex cases however, the number of
* theoretically possible plans can be very large, and to keep query planning time acceptable,
* the query planner caps the maximum number of plans it evaluates. This config allows configuring
* that cap. Note that if planning a query hits that cap, then the planner will still always return a
* "correct" plan, but it may not return _the_ optimal one, so this config can be considered a
* trade-off between the worst-case time for query planning computation, and the risk of
* having non-optimal query plans (impacting query runtimes).
*
* This value currently defaults to 10 000, but this default is considered an implementation
* detail and is subject to change. We do not recommend setting this value unless it is to
* debug a specific issue (with unexpectedly slow query planning for instance). Remember that
* setting this value too low can negatively affect query runtime (due to the use of sub-optimal
* query plans).
*/
maxEvaluatedPlans?: number,
},
}

@@ -80,7 +100,19 @@ export function enforceQueryPlannerConfigDefaults(
},
debug: {
bypassPlannerForSingleSubgraph: false,
// Note that this number is a tad arbitrary: it's a nice round number that, on my laptop, ensures query planning
// doesn't take more than a handful of seconds. It might be worth running a few more experiments on more environments
// to see if it's such a good default.
maxEvaluatedPlans: 10000,
...config?.debug,
},
};
}

export function validateQueryPlannerConfig(
config: Concrete<QueryPlannerConfig>,
) {
if (config.debug.maxEvaluatedPlans < 1) {
throw new Error(`Invalid value for query planning configuration "debug.maxEvaluatedPlans"; expected a number >= 1 but got ${config.debug.maxEvaluatedPlans}`);
}
}
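
The "apply defaults, then validate" order used by the `QueryPlanner` constructor can be sketched in isolation as follows. All names ending in `Sketch` are hypothetical stand-ins, not the real exports, and the sketch fills the default with `??` rather than the object spread used in `enforceQueryPlannerConfigDefaults`, so it is an approximation of the flow and not the actual implementation.

```typescript
// Hypothetical, reduced version of the `debug` section of the config.
type DebugConfigSketch = { maxEvaluatedPlans?: number };
type ResolvedDebugConfigSketch = { maxEvaluatedPlans: number };

// Step 1: fill in the default cap (10000 in this commit) when none is given.
function withDefaultsSketch(debug?: DebugConfigSketch): ResolvedDebugConfigSketch {
  return { maxEvaluatedPlans: debug?.maxEvaluatedPlans ?? 10000 };
}

// Step 2: reject caps below 1, using the error message from the commit.
function validateSketch(debug: ResolvedDebugConfigSketch): void {
  if (debug.maxEvaluatedPlans < 1) {
    throw new Error(`Invalid value for query planning configuration "debug.maxEvaluatedPlans"; expected a number >= 1 but got ${debug.maxEvaluatedPlans}`);
  }
}

// Validation runs after defaults are applied, so a bad default would be caught too.
function resolveDebugConfigSketch(debug?: DebugConfigSketch): ResolvedDebugConfigSketch {
  const resolved = withDefaultsSketch(debug);
  validateSketch(resolved);
  return resolved;
}

console.log(resolveDebugConfigSketch().maxEvaluatedPlans); // 10000
console.log(resolveDebugConfigSketch({ maxEvaluatedPlans: 10 }).maxEvaluatedPlans); // 10
```

Calling `resolveDebugConfigSketch({ maxEvaluatedPlans: 0 })` throws, mirroring the `cannot be set to 0 or a negative number` test.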
