Argon2 password hashing leads to increased Major GC's in Keycloak's JVM during load tests #29033

Closed
2 tasks done
kami619 opened this issue Apr 23, 2024 · 3 comments · Fixed by #29369 or #29521
Labels
area/authentication Indicates an issue on Authentication area kind/bug Categorizes a PR related to a bug kind/performance Issue identified as a performance issues kind/regression priority/blocker Highest Priority. Has a deadline and it blocks other tasks release/25.0.0 status/reopened team/cross-dc

@kami619
Contributor

kami619 commented Apr 23, 2024

Before reporting an issue

  • I have read and understood the above terms for submitting issues, and I understand that my issue may be closed without action if I do not follow them.

Area

dist/quarkus

Describe the bug

Argon2 password hashing leads to increased major GCs in Keycloak's JVM during load tests.

Argon2 occupies larger amounts of heap memory, and with the existing GC (ParallelGC) this results in continuous major GCs, leading to higher CPU contention and lower performance across the board.

We looked into the tuning options and ran several series of tests; the findings are below.
We understand that the change of password hashing algorithm requires an intervention, and this ticket suggests a couple of recommendations based on our tests.
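As a rough illustration of why the heap fills so quickly: every Argon2 hash allocates its own working memory, so allocation pressure scales with the login rate times the Argon2 memory parameter. The sketch below does that arithmetic for the 150 users per second used in the reproducer further down. The function name and the 7168 KiB memory value are assumptions for illustration and are not taken from our measurements, so check the realm's actual hashing settings. It also assumes one password hash per simulated user.

```shell
# Rough Argon2 allocation pressure in MiB per second.
# $1: password hashes per second
# $2: Argon2 memory parameter in KiB (assumed value, verify against your setup)
argon2_alloc_mib_per_sec() {
  hashes_per_sec=$1
  memory_kib=$2
  echo $(( hashes_per_sec * memory_kib / 1024 ))
}

argon2_alloc_mib_per_sec 150 7168   # prints 1050
```

Under these assumptions the hashing alone would churn through roughly 1 GiB of short-lived allocations per second, which is consistent with the collector being kept busy.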

Version

nightly

Regression

  • The issue is a regression

Expected behavior

We want Keycloak's JVM heap to behave consistently in terms of both throughput and endpoint performance.

Actual behavior

With the Argon2 hashing change, the application behaves abnormally under medium to high load, with increased JVM GC overhead and CPU utilization.

With the current Keycloak JVM defaults, a load test run shows high major GC counts and higher JVM GC overhead:

[Screenshot: Grafana "basic Keycloak dashboard by namespace" for keycloak-perf-tests, captured 2024-04-11 10:18:32]
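For anyone without the Grafana dashboard, the same two signals can be approximated from the command line. This is a minimal sketch, assuming a JDK with jstat on the PATH. The function names are hypothetical, and the overhead formula (GC time divided by elapsed time) is an assumed definition, not necessarily what the dashboard computes.

```shell
# Print GC utilization of a running JVM every 5 seconds.
# In the output, FGC is the number of full GC events and GCT the total GC time in seconds.
# $1: process ID of the Keycloak JVM
watch_gc() {
  jstat -gcutil "$1" 5000
}

# GC overhead as a percentage of wall-clock time.
# $1: GC time in seconds over the measurement window
# $2: length of the measurement window in seconds
gc_overhead_pct() {
  awk -v gct="$1" -v elapsed="$2" 'BEGIN { printf "%.1f\n", 100 * gct / elapsed }'
}
```

For example, `gc_overhead_pct 45 900` prints `5.0`, meaning 45 seconds of GC time during a 900 second measurement window.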

How to Reproduce?

Using keycloak-benchmark and the default Keycloak JVM heap settings, run the following command:

./kcb.sh --scenario=keycloak.scenario.authentication.AuthorizationCode \
--server-url=<keycloak-url> --users-per-sec=150 --ramp-up=20 \
--logout-percentage=100 --measurement=900 --users-per-realm=20000

Anything else?

We then disabled UseAdaptiveSizePolicy, which stabilized the load test runs. We observed that the adaptive size policy tries aggressively to clear the heap, resulting in frequent heap resizing events; with the policy disabled, we were able to control that better. But disabling a key attribute is not a long-term solution, so we started to look into G1GC, which suits our use case better, and ran a few experiments. The results of those experiments are below.

[Screenshot: results of the GC experiments, captured 2024-04-23 16:04:19]

Based on this data, it seems that G1GC works better than ParallelGC even with the current Keycloak JVM defaults, and it offers more tuning headroom through GCTimeRatio and AdaptiveSizePolicyWeight. We would like to hear how the Cloud Native team interprets this data so that we can take appropriate next steps.
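The two variants we compared can be expressed as plain HotSpot flags. The helper below is only a sketch: the function name and variant labels are hypothetical, and it deliberately omits values for GCTimeRatio and AdaptiveSizePolicyWeight, since suitable values depend on the test results.

```shell
# Build the JVM GC options for one of the variants discussed above.
# $1: "parallel-fixed" (ParallelGC without adaptive sizing) or "g1" (G1GC)
gc_opts() {
  case "$1" in
    parallel-fixed) echo "-XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy" ;;
    g1)             echo "-XX:+UseG1GC" ;;
    *)              echo "unknown variant: $1" >&2; return 1 ;;
  esac
}
```

Assuming the start script uses JAVA_OPTS in place of its built-in JVM defaults, a run could then look like `JAVA_OPTS="$(gc_opts g1)" bin/kc.sh start`. Because this replaces the defaults, any heap sizing options would have to be added to the string as well. Only one collector should be selected at a time, which is why the default collector flag is replaced instead of appending a second one.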

@kami619 kami619 added kind/bug Categorizes a PR related to a bug status/triage labels Apr 23, 2024
@kami619 kami619 added kind/performance Issue identified as a performance issues and removed area/dist/quarkus team/cloud-native labels Apr 23, 2024
@kami619 kami619 added this to the 25.0.0 milestone Apr 23, 2024
@sschu
Contributor

sschu commented Apr 24, 2024

@kami619 As a side note: we have been using G1GC in production, without problems, since we started using Keycloak.

@keycloak-github-bot keycloak-github-bot bot added kind/regression priority/blocker Highest Priority. Has a deadline and it blocks other tasks and removed status/triage action/priority-regression labels Apr 25, 2024
@stianst stianst added team/cross-dc area/authentication Indicates an issue on Authentication area labels Apr 25, 2024
@ahus1
Contributor

ahus1 commented May 7, 2024

Steven created an issue in the upstream Bouncy Castle project to optimize the memory usage of Argon2. We don't know if and when this will be implemented. Therefore, we will need to continue changing the GC settings.

kami619 added a commit to kami619/keycloak that referenced this issue May 7, 2024
Update the default GC from ParallelGC to G1GC

Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
kami619 added a commit to kami619/keycloak that referenced this issue May 8, 2024
Update the default GC from ParallelGC to G1GC

Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
ahus1 added a commit to kami619/keycloak that referenced this issue May 8, 2024
Closes keycloak#29033

Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
Co-authored-by: Alexander Schwartz <aschwart@redhat.com>
kami619 added a commit to kami619/keycloak that referenced this issue May 8, 2024
Closes keycloak#29033

Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
Co-authored-by: Alexander Schwartz <aschwart@redhat.com>
ahus1 added a commit that referenced this issue May 8, 2024
Closes #29033

Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
Co-authored-by: Alexander Schwartz <aschwart@redhat.com>
@ahus1 ahus1 reopened this May 8, 2024
@ahus1
Contributor

ahus1 commented May 8, 2024

Re-opened as we need release notes and an updated sizing guide (the sizing guide doesn't reflect Argon2 yet).

ahus1 added a commit to kami619/keycloak that referenced this issue May 14, 2024
Closes keycloak#29033

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
ahus1 added a commit that referenced this issue May 14, 2024
Closes #29033

Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Co-authored-by: Alexander Schwartz <alexander.schwartz@gmx.net>
Co-authored-by: Václav Muzikář <vaclav@muzikari.cz>
Co-authored-by: Alexander Schwartz <aschwart@redhat.com>
4 participants