Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

very big stack when optimisation disabled #1144

Open
trinity-1686a opened this issue May 10, 2024 · 2 comments
Open

very big stack when optimisation disabled #1144

trinity-1686a opened this issue May 10, 2024 · 2 comments
Labels
bug This issue is a bug. p3 This is a minor priority issue

Comments

@trinity-1686a
Copy link

trinity-1686a commented May 10, 2024

Describe the bug

after upgrading aws sdk from 0.x to 1.x, we started seeing stack overflows in our tests. After investigation, it seems aws_sdk_s3::config::endpoint::internals::resolve_endpoint creates a rather large frame, a bit over 1MiB. I haven't measured in release, but we don't see a stack overflow there, so it's likely optimizations improves that considerably

Expected Behavior

a more tame usage of the stack

Current Behavior

depending on how deep the stack is already, when calling into s3 sdk, possibly a stack overflow. No problem if there is enough place on the stack

Reproduction Steps

a minimal reproducer is hard to come by given you need an already fairly large stack to get a stack overflow. In https://github.com/quickwit-oss/quickwit/ before e8978dff7e9fe831f721785cf88f5ff03f0359bc, you can run make test-all to get the same error we got

Possible Solution

we increased RUST_MIN_STACK as a workaround.
Splitting aws_sdk_s3::config::endpoint::internals::resolve_endpoint (currently over 4300 lines) into smaller functions would likely improve its stack usage

Additional Information/Context

looks similar to #1082

Version

quickwit-aws v0.8.0 (/home/trinity/dev/quickwit/quickwit/quickwit-aws)
├── aws-config v1.2.1
│   ├── aws-credential-types v1.2.0
│   │   ├── aws-smithy-async v1.2.1
│   │   ├── aws-smithy-runtime-api v1.4.0
│   │   │   ├── aws-smithy-async v1.2.1 (*)
│   │   │   ├── aws-smithy-types v1.1.8
│   │   ├── aws-smithy-types v1.1.8 (*)
│   ├── aws-runtime v1.2.0
│   │   ├── aws-credential-types v1.2.0 (*)
│   │   ├── aws-sigv4 v1.2.1
│   │   │   ├── aws-credential-types v1.2.0 (*)
│   │   │   ├── aws-smithy-eventstream v0.60.4
│   │   │   │   ├── aws-smithy-types v1.1.8 (*)
│   │   │   ├── aws-smithy-http v0.60.8
│   │   │   │   ├── aws-smithy-eventstream v0.60.4 (*)
│   │   │   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   │   │   ├── aws-smithy-types v1.1.8 (*)
│   │   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   │   ├── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-smithy-async v1.2.1 (*)
│   │   ├── aws-smithy-eventstream v0.60.4 (*)
│   │   ├── aws-smithy-http v0.60.8 (*)
│   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   ├── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-types v1.2.0
│   │   │   ├── aws-credential-types v1.2.0 (*)
│   │   │   ├── aws-smithy-async v1.2.1 (*)
│   │   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   │   ├── aws-smithy-types v1.1.8 (*)
│   ├── aws-sdk-sso v1.21.0
│   │   ├── aws-credential-types v1.2.0 (*)
│   │   ├── aws-runtime v1.2.0 (*)
│   │   ├── aws-smithy-async v1.2.1 (*)
│   │   ├── aws-smithy-http v0.60.8 (*)
│   │   ├── aws-smithy-json v0.60.7
│   │   │   └── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-smithy-runtime v1.3.1
│   │   │   ├── aws-smithy-async v1.2.1 (*)
│   │   │   ├── aws-smithy-http v0.60.8 (*)
│   │   │   ├── aws-smithy-protocol-test v0.60.7
│   │   │   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   │   ├── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   ├── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-types v1.2.0 (*)
│   ├── aws-sdk-ssooidc v1.21.0
│   │   ├── aws-credential-types v1.2.0 (*)
│   │   ├── aws-runtime v1.2.0 (*)
│   │   ├── aws-smithy-async v1.2.1 (*)
│   │   ├── aws-smithy-http v0.60.8 (*)
│   │   ├── aws-smithy-json v0.60.7 (*)
│   │   ├── aws-smithy-runtime v1.3.1 (*)
│   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   ├── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-types v1.2.0 (*)
│   ├── aws-sdk-sts v1.21.0
│   │   ├── aws-credential-types v1.2.0 (*)
│   │   ├── aws-runtime v1.2.0 (*)
│   │   ├── aws-smithy-async v1.2.1 (*)
│   │   ├── aws-smithy-http v0.60.8 (*)
│   │   ├── aws-smithy-json v0.60.7 (*)
│   │   ├── aws-smithy-query v0.60.7
│   │   │   ├── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-smithy-runtime v1.3.1 (*)
│   │   ├── aws-smithy-runtime-api v1.4.0 (*)
│   │   ├── aws-smithy-types v1.1.8 (*)
│   │   ├── aws-smithy-xml v0.60.8
│   │   ├── aws-types v1.2.0 (*)
│   ├── aws-smithy-async v1.2.1 (*)
│   ├── aws-smithy-http v0.60.8 (*)
│   ├── aws-smithy-json v0.60.7 (*)
│   ├── aws-smithy-runtime v1.3.1 (*)
│   ├── aws-smithy-runtime-api v1.4.0 (*)
│   ├── aws-smithy-types v1.1.8 (*)
│   ├── aws-types v1.2.0 (*)
├── aws-sdk-s3 v1.24.0
│   ├── aws-credential-types v1.2.0 (*)
│   ├── aws-runtime v1.2.0 (*)
│   ├── aws-sigv4 v1.2.1 (*)
│   ├── aws-smithy-async v1.2.1 (*)
│   ├── aws-smithy-checksums v0.60.7
│   │   ├── aws-smithy-http v0.60.8 (*)
│   │   ├── aws-smithy-types v1.1.8 (*)
│   ├── aws-smithy-eventstream v0.60.4 (*)
│   ├── aws-smithy-http v0.60.8 (*)
│   ├── aws-smithy-json v0.60.7 (*)
│   ├── aws-smithy-runtime v1.3.1 (*)
│   ├── aws-smithy-runtime-api v1.4.0 (*)
│   ├── aws-smithy-types v1.1.8 (*)
│   ├── aws-smithy-xml v0.60.8 (*)
│   ├── aws-types v1.2.0 (*)
├── aws-smithy-async v1.2.1 (*)
├── aws-smithy-runtime v1.3.1 (*)
├── aws-types v1.2.0 (*)
│   │   ├── quickwit-aws v0.8.0 (/home/trinity/dev/quickwit/quickwit/quickwit-aws) (*)
│   │   │   │   ├── aws-config v1.2.1 (*)
│   │   │   │   ├── aws-credential-types v1.2.0 (*)
│   │   │   │   ├── aws-sdk-s3 v1.24.0 (*)
│   │   │   │   ├── aws-smithy-types v1.1.8 (*)
│   │   │   │   ├── quickwit-aws v0.8.0 (/home/trinity/dev/quickwit/quickwit/quickwit-aws) (*)
│   │   │   │   ├── aws-sdk-s3 v1.24.0 (*)
│   │   │   │   ├── aws-smithy-runtime v1.3.1 (*)

Environment details (OS name and version, etc.)

archlinux, github-actions...

Logs

thread 'test_all_with_s3_localstack_cli' has overflowed its stack
fatal runtime error: stack overflow

   Canceling due to test failure: 7 tests still running
@trinity-1686a trinity-1686a added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 10, 2024
@Velfi Velfi added p3 This is a minor priority issue and removed needs-triage This issue or PR still needs to be triaged. labels May 10, 2024
@Velfi
Copy link
Contributor

Velfi commented May 10, 2024

Hey @trinity-1686a, thanks for submitting this issue.

Endpoint resolution logic is generated from a set of rules defined by the service team. S3 is a particularly complex example. My team is essentially writing a compiler but we aren't doing a good job when it comes to optimizing the output of that compiler.

The codegen code is located here. At a glance, I can see we're disabling some Clippy lints. Perhaps we could re-enable those and then run clippy --fix at the end of generation to get rid of some of the cruft.

I don't expect us to get to this issue soon, but I do think it's something that we should put in our backlog.

@Velfi
Copy link
Contributor

Velfi commented May 10, 2024

From a quick test, running cargo clippy --tests --all-features --fix on the S3 endpoint resolver brought the total lines down from 4,258 to 3,810 lines.

However, I also tested whether that solves this problem and the answer is no, it still overflows. 😢

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. p3 This is a minor priority issue
Projects
None yet
Development

No branches or pull requests

2 participants