Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release: v1.43.0 #4821

Merged
merged 25 commits into from
Mar 22, 2024
Merged

release: v1.43.0 #4821

merged 25 commits into from
Mar 22, 2024

Conversation

abernix
Copy link
Member

@abernix abernix commented Mar 19, 2024

Note
This particular PR must be true-merged to main.

  • This PR is only ready to review when it is marked as "Ready for Review". It represents the merge to the main branch of an upcoming release (version number in the title).
  • It will act as a staging branch until we are ready to finalize the release.
  • We may cut any number of alpha and release candidate (RC) versions off this branch prior to formalizing it.
  • This PR is primarily a merge commit, so reviewing every individual commit shown below is not necessary since those have been reviewed in their own PR. However, things important to review on this PR once it's marked "Ready for Review":
    • Does this PR target the right branch? (usually, main)
    • Are the appropriate version bumps and release note edits in the end of the commit list (or within the last few commits). In other words, "Did the 'release prep' PR actually land on this branch?"
    • If those things look good, this PR is good to merge!

nicholascioli and others added 20 commits March 8, 2024 10:00
This commit ups the timeout used when verifying that the router instance
started by the integration test works as expected.

This is a temporary solution to fix some flakiness in the tests as a
better solution would not depend so heavily on non-deterministic timings
of the router.
Document Redis TLS configuration

Resolves https://apollographql.atlassian.net/browse/DOC-62

---------

Co-authored-by: Geoffroy Couprie <apollo@geoffroycouprie.com>
Follow-up to the v1.41.0 being officially released, bringing version
bumps and changelog updates into the `dev` branch.
Previously, the integration test would spin up the router with its
listener always binding to port 4000, which might cause tests to fail if
a different process were to bind to the same port.

This commit makes the test router always bind to an available port and
corrects all calls to the router to use this new port.

Fixes #4720
Follow-up to the v1.41.1 being officially released, bringing version
bumps and changelog updates into the `dev` branch.
This PR changes the way we're sending chunks in the stream. Instead of
finishing the chunk with `\r\n` we don't send this at the end of our
current chunk but instead at the beginning of the next one. For the end
users nothing changes but it let us to close the stream with the right
final boundary by appending `--\r\n` directly to the last chunk.

Fixes #4634 

---------

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Otel internally does a case-insensitive compare, so it's just a minor
code improvement that doesn't affect existing tests for spans in any
way.

<!-- start metadata -->
---

**Checklist**

Complete the checklist (and note appropriate exceptions) before the PR
is marked ready-for-review.

- [ ] Changes are compatible[^1]
- [ ] Documentation[^2] completed
- [ ] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [ ] Unit Tests
    - [ ] Integration Tests
    - [ ] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]: It may be appropriate to bring upcoming changes to the attention
of other (impacted) groups. Please endeavour to do this before seeking
PR approval. The mechanism for doing this will vary considerably, so use
your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where
applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding
Manual Tests, please document the manual testing (extensively) in the
Exceptions.

Co-authored-by: bryn <bryn@apollographql.com>
Update the helm chart to allow specifying container resource HPA targets
(https://kubernetes.io/blog/2023/05/02/hpa-container-resource-metric/).
This functionality is only available in Kubernetes 1.27+ by default.
Multiple containers and criteria types can be configured.

This supports the use case when the router is running with extra
containers that have independent scaling constraints from the overall
pod usage.

<!-- start metadata -->
---

**Checklist**

Complete the checklist (and note appropriate exceptions) before the PR
is marked ready-for-review.

- [X] Changes are compatible[^1]
- [x] Documentation[^2] completed
- [x] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [ ] Unit Tests
    - [ ] Integration Tests
    - [ ] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]: It may be appropriate to bring upcoming changes to the attention
of other (impacted) groups. Please endeavour to do this before seeking
PR approval. The mechanism for doing this will vary considerably, so use
your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where
applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding
Manual Tests, please document the manual testing (extensively) in the
Exceptions.
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
This adds the standardisation code that I used successfully for harness
runs, so we should get quite robust results.

I tested this manually by adding a `panic!()` in the mismatch case and
intentionally adding a bug in the fed-next implementation locally.

Ref #4649

<!-- start metadata -->
---

**Checklist**

Complete the checklist (and note appropriate exceptions) before the PR
is marked ready-for-review.

- [ ] Changes are compatible[^1]
- [x] Documentation[^2] completed
- [ ] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [ ] Unit Tests
    - [ ] Integration Tests
    - [x] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]: It may be appropriate to bring upcoming changes to the attention
of other (impacted) groups. Please endeavour to do this before seeking
PR approval. The mechanism for doing this will vary considerably, so use
your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where
applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding
Manual Tests, please document the manual testing (extensively) in the
Exceptions.
Fix #4473

We want the query plan cache to act like a LRU, so if a TTL is set in
its Redis configuration, we will reset the plan cache key's current
expiration when reading it.

This is now part of the common redis configuration and is enabled by default. For entity caching, this option is
disabled, as the cache will manage TTL directly

Integration tests now require Redis 7. Otherwise you will get the error "ERR unknown command `EXPIRETIME`"
…4775 (#4797)

This preallocates output object size in response formatting, bringing a
small performance improvement

Co-authored-by: Marc-Andre Giroux <mgiroux@netflix.com>
Cloning the UsageReporting structure to use in extensions
(https://github.com/apollographql/router/blob/dev/apollo-router/src/query_planner/caching_query_planner.rs#L390)
has a high cost. This wraps the structure in an `Arc` to prevent deep
clones

Co-authored-by: Marc-Andre Giroux <mgiroux@netflix.com>
Co-authored-by: Geoffroy Couprie <apollo@geoffroycouprie.com>
Fix #3504

This implements support for Unix sockets in subgraph connections by
specifying URLs in the `unix:///path/to/router.sock` format.

Implementation details:
- the Uri type used throughout the router does not actually support the
`unix://` scheme, and there is no real standard for it, so the library
we use here hides the socket path by hex encoding it in the authority
part of the Uri
- this only supports stream sockets
- decompression is supported
- TLS is not supported

Co-authored-by: Lenny Burdette <lenny@apollographql.com>
Follow-up to the v1.42.0 being officially released, bringing version
bumps and changelog updates into the `dev` branch.
This unbreaks e.g. `cargo test -- -q`

<!-- start metadata -->
---

**Checklist**

Complete the checklist (and note appropriate exceptions) before the PR
is marked ready-for-review.

- [x] Changes are compatible[^1]
- [ ] Documentation[^2] completed
- [ ] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [ ] Unit Tests
    - [ ] Integration Tests
    - [ ] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]: It may be appropriate to bring upcoming changes to the attention
of other (impacted) groups. Please endeavour to do this before seeking
PR approval. The mechanism for doing this will vary considerably, so use
your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where
applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding
Manual Tests, please document the manual testing (extensively) in the
Exceptions.
Now that UsageReporting is stored in an Arc, we can just clone it and
drop the lock, then insert what we want in the context outside of the
lock

This fixes a bug often triggerd in CI: 

```
thread 'test_plugin_ordering' panicked at apollo-router/src/context/extensions/sync.rs:66:21:
ExtensionsGuard held for 11ms. This is probably a bug that will stall the Router and cause performance problems. Run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
Zipkin exporter ignores service-name from the trace config. We have to
explicitly set it.

Added integration test for zipkin

Fixes #4807


<!-- start metadata -->
---

**Checklist**

Complete the checklist (and note appropriate exceptions) before the PR
is marked ready-for-review.

- [ ] Changes are compatible[^1]
- [ ] Documentation[^2] completed
- [ ] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [ ] Unit Tests
    - [ ] Integration Tests
    - [ ] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]: It may be appropriate to bring upcoming changes to the attention
of other (impacted) groups. Please endeavour to do this before seeking
PR approval. The mechanism for doing this will vary considerably, so use
your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where
applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding
Manual Tests, please document the manual testing (extensively) in the
Exceptions.

Co-authored-by: bryn <bryn@apollographql.com>
…figuring experimental_http2 (#4814)

Add table to clarify the resulting HTTP protocol of a subgraph
connection when configuring `experimental_http2`.

Resolves [DOC-11](https://apollographql.atlassian.net/browse/DOC-11)

[DOC-11]:
https://apollographql.atlassian.net/browse/DOC-11?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
@router-perf
Copy link

router-perf bot commented Mar 19, 2024

CI performance tests

  • reload - Reload test over a long period of time at a constant rate of users
  • events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • large-request - Stress test with a 1 MB request payload
  • const - Basic stress test that runs with a constant number of users
  • no-graphos - Basic stress test, no GraphOS.
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • xxlarge-request - Stress test with 100 MB request payload
  • xlarge-request - Stress test with 10 MB request payload
  • step - Basic stress test that steps up the number of users over time

BrynCooke and others added 4 commits March 19, 2024 11:46
Clippy is complaining

Co-authored-by: bryn <bryn@apollographql.com>
…#4802)

Add the ability to enable heartbeat for cases where the subgraph drops
idle connections.
For example, https://netflix.github.io/dgs/

Example of configuration:

```yaml
subscription:
  mode:
    passthrough:
      all:
        path: /graphql
        heartbeat_interval: enable #Optional
 ```

Fixes #4621

<!-- start metadata -->
---

**Checklist**

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

- [x] Changes are compatible[^1]
- [x] Documentation[^2] completed
- [x] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [x] Unit Tests
    - [ ] Integration Tests
    - [x] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]: It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

---------

Co-authored-by: Jesse Rosenberger <git@jro.cc>
.changesets/feat_add_container_scaling_config.md Outdated Show resolved Hide resolved
.changesets/feat_geal_jwks_headers.md Outdated Show resolved Hide resolved
.changesets/feat_geal_jwks_headers.md Outdated Show resolved Hide resolved
.changesets/feat_geal_jwt_source.md Outdated Show resolved Hide resolved
.changesets/feat_geal_query_plan_refresh_ttl.md Outdated Show resolved Hide resolved
.changesets/feat_geal_subgraph_unix_socket.md Outdated Show resolved Hide resolved
.changesets/feat_subscription_websocket_heartbeat.md Outdated Show resolved Hide resolved
.changesets/fix_bnjjj_multipart_test.md Outdated Show resolved Hide resolved
.changesets/fix_bryn_zipkin_service_name.md Outdated Show resolved Hide resolved
.changesets/fix_format_response_index_map_with_capacity.md Outdated Show resolved Hide resolved
abernix added a commit that referenced this pull request Mar 21, 2024
Including my best attempt as a human to merge the suggestions from @shorgi
on #4821 into my own.
Copy link
Member Author

@abernix abernix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved all of @shorgi 's edits to #4831 and applied them as cleanly as I could, so marking them all as resolved here.

@abernix abernix marked this pull request as ready for review March 22, 2024 09:01
@abernix abernix requested a review from a team as a code owner March 22, 2024 09:01
@abernix abernix enabled auto-merge March 22, 2024 09:04
@abernix abernix merged commit 4b84858 into main Mar 22, 2024
13 checks passed
@abernix abernix deleted the 1.43.0 branch March 22, 2024 10:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet