feat(llmobs): support in-code config for llmobs #9172

Merged
lievan merged 48 commits into main from evan.li/in-app-config on May 17, 2024

Conversation

@lievan lievan commented May 6, 2024

Support in-code configuration for LLMObs users: enable LLMObs and specify the following settings, which currently can only be set through environment variables.

  • ml_app
  • list of integrations to patch (will patch all LLMObs integrations by default)
  • dd_llmobs_no_apm (turn off APM, telemetry, remote config, metrics)
  • DD site, DD env, DD service (will override config/env vars)
from ddtrace.llmobs import LLMObs

LLMObs.enable(
    ml_app="comms/langchain",
    integrations=["openai"],
    llmobs_agentless_enabled=True,
    # api_key=...
    # site=...
    # env=...
    # service=...
    # _tracer=None
)

Allowing in-code setup also improves the dev experience for people tracing experimental apps with LLMObs. It also abstracts away the long list of environment variables that non-APM customers must set to turn off all APM-related features.

This PR should not break any previous way of setting up the Python SDK (e.g. using env vars and ddtrace-run).

Arguments passed to enable() should take precedence over environment variables, with the exception of DD_LLMOBS_ENABLED.
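
To make the precedence rule concrete, here is a minimal sketch of one way it could be read. It does not show the actual ddtrace implementation: `resolve_setting` and `should_enable` are hypothetical helpers, the environment is modelled as a plain dict, and treating a falsy `DD_LLMOBS_ENABLED` as a hard off-switch is an assumption about what "the exception" means.

```python
from typing import Mapping, Optional


def resolve_setting(
    arg_value: Optional[str],
    env_name: str,
    env: Mapping[str, str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper: an explicit enable() argument beats the env var."""
    if arg_value is not None:
        return arg_value
    return env.get(env_name, default)


def should_enable(env: Mapping[str, str]) -> bool:
    """Assumed reading of the exception: an explicitly falsy
    DD_LLMOBS_ENABLED wins even when enable() is called in code."""
    raw = env.get("DD_LLMOBS_ENABLED")
    return raw is None or raw.strip().lower() not in ("0", "false")


# Stand-in for os.environ so the sketch is self-contained.
env = {"DD_LLMOBS_APP_NAME": "from-env", "DD_LLMOBS_ENABLED": "0"}

ml_app = resolve_setting("comms/langchain", "DD_LLMOBS_APP_NAME", env)  # argument wins
fallback = resolve_setting(None, "DD_LLMOBS_APP_NAME", env)  # falls back to the env var
enabled = should_enable(env)  # env var wins for this one setting
```

Under this reading, every setting except the on/off switch can be overridden from code, while an operator can still disable LLMObs from the environment without touching the application.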

This PR also makes a few smaller changes:

  • If the DD_LLMOBS_NO_APM env var is detected, or the equivalent option is configured through LLMObs.enable(), the OpenAI and LangChain integrations stop submitting metrics unless the corresponding env var DD_{OPENAI,LANGCHAIN}_METRICS_ENABLED is set to True.
  • The telemetry writer and remote config pollers are also disabled automatically if DD_LLMOBS_NO_APM is detected or configured through LLMObs.enable().
  • The LLMObs integrations are patched automatically on LLMObs.enable().
  • All LLMObs.enable() references are removed from the individual integration patch code (openai, botocore, langchain).
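
The metrics rule in the first bullet can be sketched as a small decision function. This is an illustration only, not the code in this PR: `metrics_enabled` and `_as_bool` are hypothetical names, the environment is passed in as a dict, and the assumption that metrics are on by default when APM is in use is mine.

```python
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Hypothetical parser for boolean-like env var values."""
    return value.strip().lower() in ("1", "true")


def metrics_enabled(integration: str, no_apm: bool, env: Mapping[str, str]) -> bool:
    """Decide whether an integration should submit metrics.

    An explicit DD_<INTEGRATION>_METRICS_ENABLED value always wins.
    Otherwise metrics are off in no-APM mode and (by assumption) on elsewhere.
    """
    raw = env.get(f"DD_{integration.upper()}_METRICS_ENABLED")
    if raw is not None:
        return _as_bool(raw)
    return not no_apm


no_apm_default = metrics_enabled("openai", True, {})
no_apm_opt_in = metrics_enabled("openai", True, {"DD_OPENAI_METRICS_ENABLED": "True"})
apm_default = metrics_enabled("langchain", False, {})
```

Here `no_apm_default` is False, `no_apm_opt_in` is True, and `apm_default` is True, which matches the behaviour described in the bullet for users who have turned APM off but still want integration metrics.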

Note:

  • This change (which affects LLMObs users only) overrides config.service and config.env if those values are passed to LLMObs.enable().
  • If a user runs via ddtrace-run, they cannot use LLMObs.enable() to configure their settings.

Checklist

  • Change(s) are motivated and described in the PR description
  • Testing strategy is described if automated tests are not included in the PR
  • Risks are described (performance impact, potential for breakage, maintainability)
  • Change is maintainable (easy to change, telemetry, documentation)
  • Library release note guidelines are followed or label changelog/no-changelog is set
  • Documentation is included (in-code, generated user docs, public corp docs)
  • Backport labels are set (if applicable)
  • If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Description motivates each change
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Change is maintainable (easy to change, telemetry, documentation)
  • Release note makes sense to a user of the library
  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy


datadog-dd-trace-py-rkomorn bot commented May 6, 2024

Datadog Report

Branch report: evan.li/in-app-config
Commit report: edb3cd4
Test service: dd-trace-py

✅ 0 Failed, 405 Passed, 3511 Skipped, 20m 58.35s Total duration (1h 5m 38.25s time saved)

@lievan lievan added the changelog/no-changelog A changelog entry is not required for this PR. label May 8, 2024
@lievan lievan marked this pull request as ready for review May 8, 2024 14:06
@lievan lievan requested a review from a team as a code owner May 8, 2024 14:06
@lievan lievan requested review from Yun-Kim and sabrenner May 8, 2024 14:07

@sabrenner sabrenner left a comment

Some non-blocking comments, plus one important one: let's verify that we're OK with prioritizing environment variables over config options. This is on me, as I had it backwards when we discussed it last week. Node, for example, prioritizes config options over environment variables. I think the rationale is that config options can always be populated from environment variables, i.e.:

LLMObs.enable(
  ml_app=os.environ["DD_LLMOBS_APP_NAME"],
  # etc...
)

So grabbing the config option first could be seen as more relevant (depends on the developer practices of the end user too I guess).

I do think that what we currently have (env var over config) could be fine, but only as long as we're consistent with it. Otherwise, I think swapping the order of precedence could be considered a breaking change.

It's just difficult to interpret intention 😅

@Yun-Kim Yun-Kim left a comment

Some comments

lievan and others added 3 commits May 16, 2024 15:35

@Yun-Kim Yun-Kim left a comment

LGTM, great work!

@emmettbutler emmettbutler self-requested a review May 17, 2024 13:21
@lievan lievan enabled auto-merge (squash) May 17, 2024 13:58
@lievan lievan disabled auto-merge May 17, 2024 14:06
@lievan lievan enabled auto-merge (squash) May 17, 2024 16:33
@lievan lievan merged commit bf858f7 into main May 17, 2024
119 of 121 checks passed
@lievan lievan deleted the evan.li/in-app-config branch May 17, 2024 17:40

The backport to 2.9 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.9 2.9
# Navigate to the new working tree
cd .worktrees/backport-2.9
# Create a new branch
git switch --create backport-9172-to-2.9
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 bf858f73768263ea7ed9743fff87d67d213afc11
# Push it to GitHub
git push --set-upstream origin backport-9172-to-2.9
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.9

Then, create a pull request where the base branch is 2.9 and the compare/head branch is backport-9172-to-2.9.

github-actions bot pushed a commit that referenced this pull request May 23, 2024
Co-authored-by: lievan <evan.li@datadoqhq.com>
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
Co-authored-by: Yun Kim <yun.kim@datadoghq.com>
(cherry picked from commit bf858f7)
Yun-Kim pushed a commit that referenced this pull request May 23, 2024
Backport bf858f7 from #9172 to 2.9.

Co-authored-by: lievan <42917263+lievan@users.noreply.github.com>
Labels
backport 2.9 changelog/no-changelog A changelog entry is not required for this PR.
4 participants