Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High CPU usage on startup #8995

Open
ruslansuprun opened this issue Apr 16, 2024 · 13 comments
Open

High CPU usage on startup #8995

ruslansuprun opened this issue Apr 16, 2024 · 13 comments
Assignees
Labels
Profiling Continous Profling

Comments

@ruslansuprun
Copy link

ruslansuprun commented Apr 16, 2024

Summary of problem

During start of our applications CPU usage is going 3x from usual utilization, after few mins it returns back to normal state, but it does affect our autoscaling.
When we are starting app without ddtrace-run command CPU usage spikes on start does not appear.

Python version: 3.12

Which version of dd-trace-py are you using?

2.8.1

Which version of pip are you using?

24.0

Which libraries and their versions are you using?

`pip freeze`
aiohttp==3.9.3
aiosignal==1.3.1
amqp==5.2.0
annotated-types==0.6.0
anyio==4.3.0
asgiref==3.7.2
attrs==23.2.0
Babel==2.14.0
beautifulsoup4==4.12.3
billiard==4.2.0
boto3==1.34.71
botocore==1.34.71
bytecode==0.15.1
cachetools==5.3.3
cattrs==23.2.3
celery==5.3.6
certifi==2024.2.2
cffi==1.16.0
channels==4.0.0
channels-demultiplexer==2.2.0
channels-redis==4.1.0
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.3.0
clickhouse-driver==0.2.7
cron-descriptor==1.4.3
cryptography==42.0.5
curlify==2.2.1
ddsketch==3.0.1
ddtrace==2.8.1
defusedxml==0.7.1
Deprecated==1.2.14
dicttoxml==1.7.16
Django==5.0.3
django-admin-autocomplete-filter==0.7.1
django-annoying==0.10.6
django-appconf==1.0.6
django-celery-beat==2.6.0
django-celery-results==2.3.1
django-cors-headers==4.3.1
django-debug-toolbar==4.3.0
django-deprecate-fields==0.1.2
django-dirtyfields==1.9.2
django-elasticsearch-dsl==7.4
django-elasticsearch-dsl-drf==0.22.5
django-environ==0.11.2
django-extensions==3.2.3
django-filter==24.1
django-fsm==2.8.1
django-health-check==3.18.1
django-model-utils==4.4.0
django-nine==0.2.7
django-redis==5.4.0
django-redis-sessions==0.6.2
django-redis-sessions-fork==0.7.0
django-split-settings==1.3.0
django-storages==1.14.2
django-timezone-field==6.1.0
djangorestframework==3.14.0
djangorestframework-simplejwt==5.3.1
drf-extensions==0.7.1
drf-spectacular==0.27.1
elasticsearch==7.13.4
elasticsearch-dsl==7.4.1
envier==0.5.1
facebook_business==19.0.2
factory-boy==3.3.0
Faker==23.3.0
fakeredis==2.21.3
frozenlist==1.4.1
ga4mp==2.0.4
google-auth==2.29.0
google-measurement-protocol==1.1.0
growthbook==1.0.0
gunicorn==21.2.0
h11==0.14.0
hiredis==2.3.2
httptools==0.6.1
hurry.filesize==0.9
idna==3.6
importlib-metadata==6.11.0
inflection==0.5.1
jmespath==0.10.0
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
kombu==5.3.5
lupa==2.0
maxminddb==2.6.0
maxminddb-geolite2==2018.703
mongoengine==0.27.0
mongomock==4.1.2
msgpack==1.0.8
msgspec==0.18.6
multidict==6.0.5
oauthlib==3.2.2
opentelemetry-api==1.23.0
packaging==24.0
paypalrestsdk==1.13.3
phonenumbers==8.13.33
pillow==10.2.0
prices==1.1.1
prompt-toolkit==3.0.43
protobuf==5.26.0
psutil==5.9.8
psycopg==3.1.18
psycopg-binary==3.1.18
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycountry==23.12.11
pycparser==2.21
pycryptodome==3.20.0
pydantic==2.6.4
pydantic_core==2.16.3
PyJWT==2.8.0
pymongo==3.13.0
pyOpenSSL==24.1.0
pyotp==2.9.0
python-crontab==3.0.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-http-client==3.3.7
python-json-logger==2.0.7
python3-openid==3.2.0
pytz==2024.1
PyYAML==6.0.1
redis==5.0.3
referencing==0.34.0
reportlab==4.1.0
requests==2.31.0
requests-oauthlib==1.4.0
rpds-py==0.18.0
rsa==4.9
s3transfer==0.10.1
sendgrid==6.11.0
sentinels==1.0.0
sentry-sdk==1.43.0
setuptools==69.2.0
six==1.16.0
sniffio==1.3.1
social-auth-app-django==5.4.0
social-auth-core==4.5.3
solidgate-card-sdk==0.1.0
sortedcontainers==2.4.0
soupsieve==2.5
sqlparse==0.4.4
starkbank-ecdsa==2.2.0
stripe==8.8.0
twilio==7.17.0
typing_extensions==4.10.0
tzdata==2024.1
tzlocal==5.2
ua-parser==0.18.0
ujson==5.9.0
uritemplate==4.1.1
urllib3==1.26.18
user-agents==2.2.0
uvicorn==0.27.1
uvloop==0.19.0
vine==5.1.0
watchfiles==0.21.0
wcwidth==0.2.13
websockets==12.0
wrapt==1.16.0
xmltodict==0.13.0
yarl==1.9.4
zipp==3.18.1

How can we reproduce your problem?

Run big app with ddtrace-run

What is the result that you get?

High CPU usage on application start

What is the result that you expected?

CPU usage spike on start is not happening or at least decreased as much as possible

@gnufede
Copy link
Member

gnufede commented Apr 16, 2024

Hello @ruslansuprun thank you for reporting this issue.

Can you tell us what products or environment variables you have enabled? For example DD_IAST_ENABLED or DD_PROFILING_ENABLED

@ruslansuprun
Copy link
Author

Sure
Here is set of DD_* env variables that we are using:

DD_AGENT_HOST: xxx
DD_SERVICE: xxx
DD_VERSION: xxx
DD_ENV: xxx
DD_LOGS_INJECTION: true
DD_PROFILING_ENABLED: true
DD_SERVICE_MAPPING: xxx
DD_TRACE_ENABLED: true
DD_TRACE_SAMPLE_RATE: 0.1

@gnufede
Copy link
Member

gnufede commented Apr 16, 2024

Thank you,

This CPU spike, does it also happen if you disable DD_LOGS_INJECTION or DD_PROFILING_ENABLED?

The spike does not happen in 2.8.0 with the same env variables?

@ruslansuprun
Copy link
Author

It's happening when DD_PROFILING_ENABLED is true, when it's false CPU usage spikes not happening

DD_LOGS_INJECTION value does not affect anything related to CPU usage

@ruslansuprun
Copy link
Author

We've upgraded from 2.7.4 and issue was present there too
So we've decided to upgrade to latest version to see if issue is still there

@gnufede gnufede added the Profiling Continous Profling label Apr 16, 2024
@sanchda sanchda self-assigned this Apr 16, 2024
@sanchda
Copy link
Contributor

sanchda commented Apr 16, 2024

@ruslansuprun Definitely looks like a profiling issue. Since you're on a recent version of 2.7/2.8, could you try enabling the DD_PROFILING_EXPORT_LIBDD_ENABLED=true environment variable? Other customers with similar issues on 3.12 have seen improvement with this. Thank you!

@ruslansuprun
Copy link
Author

@sanchda so this new env variable should be added along with DD_PROFILING_ENABLED=true or replacing this env variable with DD_PROFILING_EXPORT_LIBDD_ENABLED=true?

@sanchda
Copy link
Contributor

sanchda commented Apr 16, 2024

@ruslansuprun It's in addition to. It modifies the behavior of the profiler, but the profiler won't be enabled without specifying it explicitly.

@ruslansuprun
Copy link
Author

@sanchda tried this, small improvements are in place, but still CPU usage on start is too high 😢
Disabled profiling for now

@sanchda
Copy link
Contributor

sanchda commented Apr 22, 2024

Did some head-scratching over this. @ruslansuprun, if you're up for another experiment, I see that in some cases the profiler startup time can be negatively impacted by code provenance ("My Code" in the UI). If you set DD_PROFILING_ENABLE_CODE_PROVENANCE=false, this might help startup times.

If it doesn't help, then I'd be very interested in trying to characterize how the profiler is wasting time. You may not be the only one with this issue.

@ruslansuprun
Copy link
Author

@sanchda sorry for a long response, will test it in few days and will return back with feedback

@ruslansuprun
Copy link
Author

@sanchda hi
I've tried enabling profiling with DD_PROFILING_ENABLE_CODE_PROVENANCE=false option but still no luck with decrease of CPU usage on start

@ruslansuprun
Copy link
Author

We are running 2 services
First one using gunicorn to run with 12 workers per container, workers amount is set by env variable
Options for gunicorn: gunicorn <app> -b 0.0.0.0:8000 -t 300 --max-requests 100000 --max-requests-jitter 20000 --keep-alive 60 --graceful-timeout 60 --preload

Second on is using uvicorn with 10 workers per container, workers amount is set by env variable
Options for uvicorn: uvicorn <app> --host 0.0.0.0 --port 5000 --no-access-log --proxy-headers --forwarded-allow-ips * --lifespan off --timeout-keep-alive 60 --timeout-graceful-shutdown 60

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Profiling Continous Profling
Projects
None yet
Development

No branches or pull requests

3 participants