Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up code relating to scope/event transaction value #5718

Closed
lobsterkatie opened this issue Sep 12, 2022 · 0 comments
Closed

Clean up code relating to scope/event transaction value #5718

lobsterkatie opened this issue Sep 12, 2022 · 0 comments
Milestone

Comments

@lobsterkatie
Copy link
Member

lobsterkatie commented Sep 12, 2022

Long ago, Sentry events had the idea of a culprit, which was a string approximately meaning, "the context in which this error happened." A number of years ago, a(n only-partially-successful) attempt was made to replace culprit with transaction, which remained a string with the same basic meaning. Then we got into the APM business, and began to use the word transaction to refer to the root span of a tracing span tree, while still retaining it in some places as a string meaning "the name of the active (tracing-type) transaction," or keeping its original meaning if tracing isn't enabled. This caused confusion, so we changed some APIs to refer to the string as transactionName, so that transaction could mean "root span object." We've also made various changes throughout the years (both recently and in the more distant past) to improve our ability to figure out the right transaction value.

So where are we now? Kind of in a mess, unfortunately.

  • Scope.setTransactionName and Scope.getTransaction.setName change two different, independent pieces of data.

  • In the requestData module in @sentry/utils we have extractTransaction and extractPathForTransaction, which do slightly different things and which are called by both addRequestDataToEvent (called for both errors and transactions) and addRequestDataToTransaction (called only for transactions, but not in every SDK). extractPathForTransaction is also called directly by other parts of the SDK, at various points of the transaction lifecycle. This makes it almost prohibitively hard to make changes, because reasoning about the effects of those changes across all of the various code paths is so difficult.

  • An event's transaction value can come from at least six different places: event creation, from the scope, from the addRequestDataToEvent function called by event processors in various SDKs, from the addRequestDataToTransaction function called by our Express tracing handler, from an error event's stackframes, and soon from the new RequestData integration. Which (and how many) of these various options apply depends on the type of the event, which SDK is running, the settings passed to our Express request handler, and possibly other things I'm forgetting. This means, among other things, that an error event and its associated transaction event might have different transaction values. (See screenshot below.)

  • Events also have a transaction tag, which comes exclusively from the name of the active Transaction object, but which gets overwritten by the transaction value during ingest, regardless of whether it's set or what it's set to.

  • We still have a Transaction integration dating from the pre-APM-attempt-to-replace-culprit-with-transaction days, which uses stackframes to find an event's transaction (with the result that people end up with minified function names as their transaction values).

Before v8, let's take a hard look at this.

Related issues:
#5768
#5660


Example of second point above:

image

@AbhiPrasad AbhiPrasad added this to the 8.0.0 milestone Sep 12, 2022
lobsterkatie added a commit that referenced this issue Sep 19, 2022
#5703)

In most of our Node-based SDKs, we use domains to prevent scope bleed between requests, running the entire wrapped event-handling process inside a single domain. This works because in those cases, there is only one request-handling entry point for us to wrap, so we can stick the entire thing in a single `domain.run()` call and (more or less) rely on the fact that we've got a single, consistent, unique-to-that-request `Scope` object that we're dealing with. We've followed this pattern in the nextjs SDK as well, both in `instrumentServer` and `withSentry`. The new auto-wrapping of data-fetching functions presents a challenge, though, because a single request may involve a number of our wrapped functions running at different points in the request handling process, with no shared (little S) scope between them. While we still use domains when wrapping the data fetchers, a single request may pass through multiple domains during its lifecycle, preventing us from using any given domain as a universal `Scope`-carrier for a single `Scope` instance.

One place where this becomes is problem is our use of `addRequestDataToEvent`, which up until now we've just been calling inside a single-request event processor added to the current `Scope`. In this system, each event processor holds a reference to its particular `req` object. In the case of the data-fetchers, however, the `Scope` instance to which we might add such an event processor isn't the same as the one which will be active when the event is being processed, so the current way doesn't work. But given that the only way in which our current, single-request-specific event processors differ is by the request to which they hold a reference, they can be replaced by a single, and universal, event processor, as long as we can access `req` a different way besides keeping it in a closure as we do now.

This PR does just that (that is, switches to using a single event processor) for transaction events. First, a reference to `req` is stored in the transaction's metadata (which is then available to event processors as `sdkProcessingMetadata`). Then a new default integration, `RequestData`, pulls `req` out of the metadata and uses it to add a `request` property to the event. 

Notes:

- The options API for the new integration is inspired by, but different from, the options API for our Express request handler. (When we work on cleaning up the request data utility functions as part of fixing #5718, we might consider bringing those options in line with these.) The primary differences are:
  - The options have been (almost entirely) flattened. (It never made sense that inside of options for request data, you had a `request` key holding a subset of the options.) Now everything is at the same level and only takes a boolean. The one exception is `user`, which can still take a boolean or a list of attributes.
  - In the handler options, `transaction` can either be a boolean or a `TransactionNamingScheme`. In the integration, it can no longer be a boolean - events are going to have transactions, one way or another, so we shouldn't let people set it to `false`. Further, since it's now not about whether `transaction` is or isn't included, it's been moved out of `include` into its own `transactionNamingScheme` option. (Note that the new option is not yet used, but will be in future PRs.)
  - `method` has also been moved out of include, and has been hardcoded to `true`, since it's an integral part of naming the request. We currently include it in the transaction name, regardless of the setting, so again here, letting people set it to `false` makes no sense.

- Though `req` has been added to transaction metadata everywhere we start transactions, the existing domain/`Scope`-based event processors haven't yet been removed, because this new method only works for transactions, not errors. (Solving that will be the subject of a future PR.) The existing processors have been modified to not apply to transaction events, however.

- Though we may at some point use the `RequestData` integration added here in browser-based SDKs as well, both in this PR and in near-term future PRs it will only be used in a Node context. It's therefore been added to `@sentry/node`, to prevent a repeat of the dependency injection mess [we just undid](#5759).

- No integration tests were added specifically for this change, because the existing integration tests already test that transaction events include a `request` property, so as long as they continue to pass, we know that using the integration instead of an event processor is working.

Ref: #5505
lobsterkatie added a commit that referenced this issue Sep 21, 2022
In #5703, a new integration, `RequestData`, was added to take the place of the request-specific event processors we've been using to add request data to transaction events in the nextjs SDK. This builds on that work by making the same switch for error events.

Notes:

- Because of #5718, it's hard to reason about the potential side effects of making major changes to the logic in `@sentry/utils/requestData`. Therefore, the majority of the new logic in this PR has been added to the integration, and just overwrites the `transaction` value added by the functions in `requestData`. Once we've cleaned up the request data code, we can consolidate the logic.
@HazAT HazAT closed this as completed Jan 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants