Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gevent patching order is documented incorrectly #8763

Open
matjam opened this issue Mar 25, 2024 · 4 comments
Open

gevent patching order is documented incorrectly #8763

matjam opened this issue Mar 25, 2024 · 4 comments

Comments

@matjam
Copy link

matjam commented Mar 25, 2024

In an earlier issue

#3595 (comment)

it is noted that

The correct one is to patch gevent prior to ddtrace as there are conflicts with the threads ddtrace starts to submit traces/profiles.

yet there is a note in the documentation that states

If ddtrace-run is not being used then be sure to import ddtrace.auto before calling gevent.monkey.patch_all. If ddtrace-run is being used then no additional configuration is required.

However, when we do this, we get this error:

/Users/matjam/src/redacted/src/redacted/app/cli.py:43: MonkeyPatchWarning: Monkey-patching ssl after ssl has already
 been imported may lead to errors, including RecursionError on Python 3.6. It may also silently lead to incorrect behaviour 
on Python 3.7. Please monkey-patch earlier. See https://github.com/gevent/gevent/issues/1016. Modules that had direct 
imports (NOT patched): ['urllib3.contrib.pyopenssl (/Users/matjam/src/redacted/build/venv/redacted_macosx-14.2-
arm64_cp3.9/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py)', 'urllib3.util 
(/Users/matjam/src/redacted/build/venv/redacted_macosx-14.2-arm64_cp3.9/lib/python3.9/site-
packages/urllib3/util/__init__.py)', 'urllib3.util.ssl_ (/Users/matjam/src/redacted/build/venv/redacted_macosx-14.2-
arm64_cp3.9/lib/python3.9/site-packages/urllib3/util/ssl_.py)', 'botocore.httpsession 
(/Users/matjam/src/redacted/build/venv/redacted_macosx-14.2-arm64_cp3.9/lib/python3.9/site-
packages/botocore/httpsession.py)']. 
  monkey.patch_all(thread=False, select=False)

clearly, patching ddtrace BEFORE gevent WILL NOT WORK as there are dependencies inside ddtrace that are importing the ssl library.

If the correct order is indeed supposed to be ddtrace before gevent, then that means ddtrace needs to be smarter.

    if datadog_apm_enabled:
        log.info("datadog APM enabled, doing datadog monkey patching ...")
        from ddtrace import patch_all

        patch_all(gevent=gevent_enabled, psycopg=True, elasticsearch=True, urllib3=True, botocore=True)

    if gevent_enabled:
        log.info("gevent worker enabled, doing gevent monkey patching...")
        from gevent import monkey

        monkey.patch_all(thread=False)

        try:
            import psycogreen.gevent
        except ImportError:
            log.info("Could not find psycogreen dependency; psycopg2 calls will be blocking.")
        else:
            log.info("Found psycogreen dependency; automatically monkey patching psycopg2.")
            psycogreen.gevent.patch_psycopg()

if I reverse the above to patch gevent first, the above error goes away. Additionally the app starts.

@P403n1x87
Copy link
Contributor

@matjam As pointed out in the documentation, we recommend importing ddtrace.auto as the very first thing to ensure that modules are patched correctly, when ddtrace-run is not being used. In your excerpt above I cannot see the import ddtrace.auto, so could you please try something like

    if datadog_apm_enabled:
        import ddtrace.auto. # noqa

        log.info("datadog APM enabled, doing datadog monkey patching ...")
        from ddtrace import patch_all

        patch_all(gevent=gevent_enabled, psycopg=True, elasticsearch=True, urllib3=True, botocore=True)

and report whether that solved your problem? Thanks 🙏 .

@matjam
Copy link
Author

matjam commented Mar 26, 2024

@P403n1x87 What you're saying is, we need to do the following?

import ddtrace.auto
from ddtrace import patch_all

patch_all(gevent=True, psycopg=True, elasticsearch=True, urllib3=True, botocore=True)

from gevent import monkey

monkey.patch_all()

If so, the docs are not clear on this at all. Patching in this order results in the following startup errors:

/Users/matjam/src/appname/src/appname/app/cli.py:53: MonkeyPatchWarning: Monkey-patching ssl after ssl 
has already been imported may lead to errors, including RecursionError on Python 3.6. It may also silently lead to 
incorrect behaviour on Python 3.7. Please monkey-patch earlier. See https://github.com/gevent/gevent/issues/1016. 
Modules that had direct imports (NOT patched): [
'urllib3.contrib.pyopenssl (/Users/matjam/src/appname/build/venv/appname_macosx-14.2-arm64_cp3.9/lib/python3.9/site-packages/urllib3/contrib/pyopenssl.py)', 
'urllib3.util.ssl_ (/Users/matjam/src/appname/build/venv/appname_macosx-14.2-arm64_cp3.9/lib/python3.9/site-packages/urllib3/util/ssl_.py)', 
'urllib3.util (/Users/matjam/src/appname/build/venv/appname_macosx-14.2-arm64_cp3.9/lib/python3.9/site-packages/urllib3/util/__init__.py)', 
'botocore.httpsession (/Users/matjam/src/appname/build/venv/appname_macosx-14.2-arm64_cp3.9/lib/python3.9/site-packages/botocore/httpsession.py)'].

And then later when our config class tries to access KMS we get

context.options |= options
  File "/Users/matjam/.pyenv/versions/3.9.18/lib/python3.9/ssl.py", line 603, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/Users/matjam/.pyenv/versions/3.9.18/lib/python3.9/ssl.py", line 603, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/Users/matjam/.pyenv/versions/3.9.18/lib/python3.9/ssl.py", line 603, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  [Previous line repeated 459 more times]
RecursionError: maximum recursion depth exceeded

We're using 3.9.18. The gevent docs are clear in that if you see these errors its because you're patching things too late.

The above happens with the following startup code:

@click.command()
@click.option("--entry-name", default="app", type=click.STRING)
@click.argument("gunicorn_args", nargs=-1, type=click.UNPROCESSED)
def run_app(entry_name, gunicorn_args):
    # we need to patch gevent before anything at all is imported
    log.info("Starting Redacted ...")
    config = SimpleConfig(log)

    gevent_enabled: bool = config.get("GUNICORN.worker_class") == "gevent"
    datadog_apm_enabled: bool = config.get("DATADOG.APM_ENABLED")
    datadog_profiling_enabled: bool = config.get("DATADOG.PROFILING_ENABLED")

    if datadog_apm_enabled:
        log.info("datadog APM enabled, doing datadog monkey patching ...")
        import ddtrace.auto  # noqa
        from ddtrace import patch_all

        if datadog_profiling_enabled:
            log.info("Enabling Datadog Profiling...")
            import ddtrace.profiling.auto  # noqa

        patch_all(gevent=gevent_enabled, psycopg=True, elasticsearch=True, urllib3=True, botocore=True)

    if gevent_enabled:
        log.info("gevent worker enabled, doing gevent monkey patching...")
        from gevent import monkey
        import psycogreen.gevent

        monkey.patch_all()
        log.info("Using psycogreen to monkey patch psycopg2.")
        psycogreen.gevent.patch_psycopg()

    try:
        from . import RedactedApplication

        application = RedactedApplication(
            config.get("_meta.environment"),
            entry_name=entry_name,
            gunicorn_args=gunicorn_args,
        )
    except Exception as e:
        print(f"ERROR: exception starting RedactedAppliction: {e}")
        sys.exit(1)

    application.run()

note the SimpleConfig class is a simple JSON file reader, and does no IO other than filesystem IO to read the JSON config. Logging is configured to log to stdout.

I've tried moving the patch_all call after the gevent monkey.patch_all but still get the same behaviour.

@matjam
Copy link
Author

matjam commented Mar 26, 2024

So looks like there were some importsin __init__.py which was loading ssl and other things. I moved that code out into a separate package, and now we have the following startup code:

import logging
import os
import sys

import click

from .simpleconfig import SimpleConfig

# in our quest to ensure NOTHING but gevent gets to patch things first,
# we implement a very simple logger that only logs to stdout. This is
# replaced by the custom logger later.

formatter = logging.Formatter(
    fmt="[%(asctime)s] [%(process)d] [%(levelname)s] %(message)s", datefmt="%Y-%m-%d %H:%M:%S %z"
)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(formatter)

log = logging.getLogger(__name__)
log.addHandler(handler)
log.setLevel(logging.INFO)


@click.command()
@click.option("--entry-name", default="app", type=click.STRING)
@click.argument("gunicorn_args", nargs=-1, type=click.UNPROCESSED)
def run_app(entry_name, gunicorn_args):
    # based on a discussion in https://github.com/DataDog/dd-trace-py/issues/8763
    #
    # We import and patch with ddtrace.auto first, then we patch with the
    # ddtrace.patch function. This is because ddtrace.auto does not patch
    # urllib3 and botocore, which are used by the application.

    log.info("Starting Redacted ...")
    config = SimpleConfig(log)

    gevent_enabled: bool = config.get("GUNICORN.worker_class") == "gevent"
    datadog_apm_enabled: bool = config.get("DATADOG.APM_ENABLED")
    datadog_profiling_enabled: bool = config.get("DATADOG.PROFILING_ENABLED")

    if datadog_apm_enabled:
        log.info("Datadog APM enabled, doing datadog monkey patching ...")
        import ddtrace.auto  # noqa
        from ddtrace import patch

        patch(urllib3=True, botocore=True)

    if gevent_enabled:
        log.info("gevent worker enabled, doing gevent monkey patching...")
        import gevent.monkey
        import psycogreen.gevent

        gevent.monkey.patch_all(thread=False, socket=False)

        log.info("gevent monkey patching psycopg2 ...")
        psycogreen.gevent.patch_psycopg()

    if datadog_profiling_enabled:
        log.info("Enabling Datadog Profiling...")
        import ddtrace.profiling.auto  # noqa

    try:
        from .application import RedactedApplication

        application = RedactedApplication(
            config.get("_meta.environment"),
            entry_name=entry_name,
            gunicorn_args=gunicorn_args,
        )
    except Exception as e:
        print(f"ERROR: exception starting RedactedAppliction: {e}")
        sys.exit(1)

    log.info("Running application.")
    application.run()

This seems to work, though we do see one warning now:

/Users/matjam/src/redacted/build/venv/redacted_macosx-14.2-arm64_cp3.9/lib/python3.9
  /site-packages/gunicorn/workers/ggevent.py:39: MonkeyPatchWarning: Patching more than 
  once will result in the union of all True parameters being patched

When I added tracing to the gevent patch_all function, I see that it is called twice, once from the code above and a second time when the gunicorn worker is started in ggevent.py:

class GeventWorker(AsyncWorker):

    server_class = None
    wsgi_handler = None

    def patch(self):
        monkey.patch_all()

I believe this is harmless.

This change does remove an exception we were seeing at startup:

[2024-03-26 12:37:06 -0700] [309] [INFO] Using worker: gevent
Traceback (most recent call last):
  File "src/gevent/_abstract_linkable.py", line 287, in gevent._gevent_c_abstract_linkable.AbstractLinkable._notify_links
  File "src/gevent/_abstract_linkable.py", line 333, in gevent._gevent_c_abstract_linkable.AbstractLinkable._notify_links
AssertionError: (None, <callback at 0x121ea5340 args=([],)>)
2024-03-26T19:37:06Z <callback at 0x121ea5340 args=([],)> failed with AssertionError

@matjam
Copy link
Author

matjam commented Mar 26, 2024

Oh, its also important to note

        gevent.monkey.patch_all(thread=False, socket=False)

this was required, or the application would hang on startup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants