Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assorted bug fixes in new QP, found by a newly-ported test #5157

Merged
merged 12 commits into from
May 14, 2024

Conversation

SimonSapin
Copy link
Contributor

Part of FED-221.


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Tests added and passing3
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

Copy link
Contributor

@SimonSapin, please consider creating a changeset entry in /.changesets/. These instructions describe the process and tooling.

@router-perf
Copy link

router-perf bot commented May 13, 2024

CI performance tests

  • step - Basic stress test that steps up the number of users over time
  • events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • large-request - Stress test with a 1 MB request payload
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • xxlarge-request - Stress test with 100 MB request payload
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • xlarge-request - Stress test with 10 MB request payload
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • no-graphos - Basic stress test, no GraphOS.
  • reload - Reload test over a long period of time at a constant rate of users
  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • const - Basic stress test that runs with a constant number of users

@SimonSapin SimonSapin requested a review from duckki May 13, 2024 18:45
Copy link
Contributor

@duckki duckki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Nice assert_plan! tests and refactoring. Doubly nice for catching/fixing bugs. 👍

@SimonSapin SimonSapin merged commit 6b5d3d0 into dev May 14, 2024
13 of 14 checks passed
@SimonSapin SimonSapin deleted the simon/qp-tests branch May 14, 2024 06:08
SimonSapin added a commit that referenced this pull request May 29, 2024
Reproduction: `cargo test -p apollo-federation -- -q`

## Background

#5157 introduced a new kind of
snapshot test for the Rust query planner. Test schemas are specified as sets
of subgraph schemas, so composing them is needed. Since we don’t have
composition in Rust yet, the tests rely on JS composition through Rover.
To avoid a dependency on Rover on CI and for most contributors,
the composed supergraph are cached in the repository. Rover use is opt-in
and only required when a cached file is missing or out of date.
(Composition inputs are hashed.)

The file name is derived (through a macro) from the test function name.
To detect name conflicts, a static `OnceLock<Mutex<HashSet<&'static str>>>`
is used to ensure no name is used more than once.

## The problem

The static hash set relies on all snapshot tests running in the same process.
This is the case with `cargo test`, but `cargo nextest run` as used on CI
isolates every test in its own process. This breaks conflict detection
of cache file names for composed supergraph schemas, since each test only
"sees" itself in the static hash set.

#5240 introduced a name conflict:
composition is used in a function called twice with different arguments.
Because nextest was used both locally and on CI, the conflict went undectected.
As a result, running `cargo test` on dev fails because the conflict is detected.

## This PR

This PR fixes this case of cache file name conflict, but conflict detection
is still broken with nextest.

As a result it’s possible that this kind of conflict could happen again
and be merged undectected.

## Non-solutions tried

* Nextest has a notion of [test groups](https://nexte.st/book/test-groups),
  but they don’t appear to let multiple tests run in the same process

* Instead of relying on the runtime side effect of tests, could conflict
  detection rely on enumerating tests at compile time?
  The [`linkme` crate](https://crates.io/crates/linkme) is a building block
  Router used to register Rust plugins. It could be used here in all
  composition inputs can be made const, but `std::any::type_name` used
  in a macro to extract the current function name is not `const fn`
  [yet](rust-lang/rust#63084)

## Potential solutions

* Remove the cache and accept the dependency on Rover for testing.
  This impacts CI and all contributors.

* Require every `planner!` macro invocation to specify an explicit
  cache file name instead of relying on the function name.
  Then conflict detection can use `linkme`.

* Move conflict detection to a separate test that something something
  parses Rust source files of other tests something
SimonSapin added a commit that referenced this pull request May 29, 2024
Reproduction: `cargo test -p apollo-federation -- -q`

## Background

#5157 introduced a new kind of snapshot test for the Rust query planner. Test schemas are specified as sets of subgraph schemas, so composing them is needed. Since we don’t have composition in Rust yet, the tests rely on JS composition through Rover. To avoid a dependency on Rover on CI and for most contributors, the composed supergraph are cached in the repository. Rover use is opt-in and only required when a cached file is missing or out of date. (Composition inputs are hashed.)

The file name is derived (through a macro) from the test function name. To detect name conflicts, a static `OnceLock<Mutex<HashSet<&'static str>>>` is used to ensure no name is used more than once.

## The problem

The static hash set relies on all snapshot tests running in the same process. This is the case with `cargo test`, but `cargo nextest run` as used on CI isolates every test in its own process. This breaks conflict detection of cache file names for composed supergraph schemas, since each test only "sees" itself in the static hash set.

#5240 introduced a name conflict: composition is used in a function called twice with different arguments. Because nextest was used both locally and on CI, the conflict went undectected. As a result, running `cargo test` on dev fails because the conflict is detected.

## This PR

This PR fixes this case of cache file name conflict, but conflict detection is still broken with nextest.

As a result it’s possible that this kind of conflict could happen again and be merged undectected.

## Non-solutions tried

* Nextest has a notion of [test groups](https://nexte.st/book/test-groups), but they don’t appear to let multiple tests run in the same process

* Instead of relying on the runtime side effect of tests, could conflict detection rely on enumerating tests at compile time? The [`linkme` crate](https://crates.io/crates/linkme) is a building block Router used to register Rust plugins. It could be used here in all composition inputs can be made const, but `std::any::type_name` used in a macro to extract the current function name is not `const fn` [yet](rust-lang/rust#63084)

## Potential solutions

* Remove the cache and accept the dependency on Rover for testing. This impacts CI and all contributors.

* Require every `planner!` macro invocation to specify an explicit cache file name instead of relying on the function name. Then conflict detection can use `linkme`.

* Move conflict detection to a separate test that something something parses Rust source files of other tests something
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants