Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logs: Allow duplicate keys #3931

Closed
pellared opened this issue Mar 8, 2024 · 41 comments · Fixed by #3987
Closed

logs: Allow duplicate keys #3931

pellared opened this issue Mar 8, 2024 · 41 comments · Fixed by #3987
Labels
area:data-model For issues related to data model spec:logs Related to the specification/logs directory triage:deciding:community-feedback

Comments

@pellared
Copy link
Member

pellared commented Mar 8, 2024

What

https://github.com/open-telemetry/opentelemetry-specification/blob/e8d5e1416a6a491f0fc26df60a5a2bd41535a8ca/specification/logs/data-model.md#type-mapstring-any should allow having duplicate keys.

Why

Performance

Checking for duplicates introduces performance overhead which can be undesirable for systems where logging should be high-performant. Changing the data model definition would allow postponing the deduplication to the exporters which require it (e.g. OTLP). Other exporters (e.g. console exporter) can use the key-values as provided.

Better bridging of existing logging libraries

Some logging libraries allow duplicate keys. The change is needed for faithfully bridging existing logging systems.

For instance slog (the structured logging package from Go the standard library) allows it. See: https://go.dev/play/p/fzIswL0Le7j.

@pellared
Copy link
Member Author

pellared commented Mar 11, 2024

This would could to be changed to something like:

Type []keyval<string, any>

Value of type []keyval<string, any> is a collection of key-value pairs with string keys to any values.

Arbitrary deep nesting of values for arrays and maps is allowed (essentially
allows to represent an equivalent of a JSON object).

Such change can be considered breaking as it changes the behavior.

However, I believe it can be acceptable given it is needed to correctly bridge existing logging systems. Maybe it could be left to the SDK depending on what is more idiomatic in given language.

@pellared
Copy link
Member Author

PTAL @open-telemetry/specs-logs-approvers

@arminru arminru added spec:logs Related to the specification/logs directory area:data-model For issues related to data model labels Mar 12, 2024
@cijothomas
Copy link
Member

OTel Rust Logging chose to not-do-dedup for perf reasons! Thanks for raising this.

(Similar for metrics, traces too, but will save it for another day)

@jack-berg
Copy link
Member

Discussed in the 3/12/24 spec SIG that evolving the log data model attributes representation from being a map<string, any> (i.e. no duplicate keys) to []keyval<string, any> (i.e. duplicate keys allowed) seems to be an allowable change because its loosening the restriction.

We did find language in the proto that states that keys in the KeyValueList type MUST be unique. Not sure how we could change that "MUST" without it being considered breaking to consumers.

@pellared
Copy link
Member Author

pellared commented Mar 12, 2024

Note: This is also about log record body representation (not only log attributes).

@lalitb
Copy link
Member

lalitb commented Mar 12, 2024

otel-cpp decided against keys de-duplication to prioritize performance, under the assumption that backend/platform-specific de-duplication needs would be addressed within their exporters.

@pellared
Copy link
Member Author

We did find language in the proto that states that keys in the KeyValueList type MUST be unique. Not sure how we could change that "MUST" without it being considered breaking to consumers.

I opened open-telemetry/opentelemetry-proto#533

@mx-psi
Copy link
Member

mx-psi commented Mar 12, 2024

As mentioned here I can't think of a good backwards-compatible solution to address this change on pdata: open-telemetry/opentelemetry-proto#533 (comment) we implicitly assume keys are unique in our API

@MrAlias
Copy link
Contributor

MrAlias commented Mar 12, 2024

@mx-psi how does the collector handle the logs from C++ given what @lalitb said?

@jsuereth
Copy link
Contributor

This would be a huge breaking change for OTLP proto handling. Right now, we explicitly disallow duplicates and allow implementations to treat the list as a map<string, AnyValue>.

The explicit reason for not choosing map... I beleive had to do with its newness and unavailability in tooling (GoGoProto). I think if we were to re-evaluate that decision today, I'd personally have pushed hard for map.

If Logging needs multiple values per-key, the OTLP they should use is a single key with value of List<.. value type..>. If they wish to define the sdk such that they can build this up with multiple values specified individually, fine.

@pellared
Copy link
Member Author

pellared commented Mar 12, 2024

@jsuereth The point is that doing anything like

If Logging needs multiple values per-key, the OTLP they should use is a single key with value of List<.. value type..>. If they wish to define the sdk such that they can build this up with multiple values specified individually, fine.

would also increase the performance overhead.

See:

@jsuereth
Copy link
Contributor

I understand, but someone has to pay that cost at some point. Right now this would be a large breaking change for OTLP, in my opinion unacceptably large.

@pellared
Copy link
Member Author

pellared commented Mar 12, 2024

I understand, but someone has to pay that cost at some point. Right now this would be a large breaking change for OTLP, in my opinion unacceptably large.

If I understand correctly, nothing would have to be changed in the collector. See open-telemetry/opentelemetry-proto#533 (comment)

@mx-psi
Copy link
Member

mx-psi commented Mar 12, 2024

@mx-psi how does the collector handle the logs from C++ given what @lalitb said?

@MrAlias The behavior is unspecified in this situation, since it is not allowed by the current wording of the spec. I can summarize what the current behavior is, but there are no guarantees that it will keep working the same way in the future since it is an invalid situation

@djaglowski
Copy link
Member

Both OTLP and JSON allows passing such data.

While JSON technically allows it, it also says: When the names within an object are not unique, the behavior of software that receives such an object is unpredictable. I think this is a decent summary of the notion of "duplicate keys".

I understand why logging libraries would optimize for performance, but how much value does it actually contribute towards describing a log in a semantically meaningful way? I think at best we should consider this "deferred deduplication", and clearly apply the similar disclaimer as the JSON spec.

nothing would have to be changed in the collector

It depends whether we need to support duplicate keys throughout the collector, or just be able to handle deduplication during ingest. If the former, we would basically have to rework everything from the syntax used to reference values within structured data to the optimization strategies for parsing and transposing data, because it's all built on the assumption that keys are unique.

@pellared
Copy link
Member Author

pellared commented Mar 12, 2024

just be able to handle deduplication during ingest.

This is good to me.

PS. I did my best to update open-telemetry/opentelemetry-proto#533 to reflect it. Also see: open-telemetry/opentelemetry-proto#533 (comment)

EDIT:

clearly apply the similar disclaimer as the JSON spec.

This is already in the PR.

@pellared
Copy link
Member Author

.NET represents logs attributes as key-value pairs.

Related issue: open-telemetry/opentelemetry-dotnet#4324

@andrzej-stencel
Copy link
Member

Discussed in the 3/12/24 spec SIG that evolving the log data model attributes representation from being a map<string, any> (i.e. no duplicate keys) to []keyval<string, any> (i.e. duplicate keys allowed) seems to be an allowable change because its loosening the restriction.

I have listened to the recording of the SIG meeting. I cannot understand how this change can be viewed as "allowed". It's obviously a breaking change to a stable spec that would affect any implementation of the spec and of the OTLP protocol.

The logs spec is clear: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.30.0/specification/logs/data-model.md#type-mapstring-any

The keys in the map are unique (duplicate keys are not allowed).

This change would affect not only the Otel collector, but any implementation of the OTLP protocol. For instance, Sumo Logic accepts OTLP and assumes that the keys are unique.

There's a reason the spec is stable. There was a time to decide this and this time has passed.

I surely understand the need for logging bridges to be performant and I'm all for preserving the data that slog or other libraries emit with duplicate keys. We should look at other ways to handle this. Jack's proposal during the SIG meeting of emitting one array field instead of multiple fields sounds reasonable to me.

@pellared
Copy link
Member Author

pellared commented Mar 13, 2024

Jack's proposal during the SIG meeting of emitting one array field instead of multiple fields sounds reasonable to me.

This would cause performance overhead as well.

Sumo Logic accepts OTLP and assumes that the keys are unique

What does it mean? The proto technically allows passing duplicates. Are you ignoring data that contain duplicated keys? Passing such data was always possible.

I propose to specify it as undefined behavior. The OTLP exporters can have a configuration to deduplicate key-val pairs before sending the data. (I added this proposal to the issue description).

@andrzej-stencel
Copy link
Member

What does it mean? The proto technically allows passing duplicates. Are you ignoring data that contain duplicated keys?

The behavior is unspecified, as this is currently not a valid payload 🙂

The fact that Sumo Logic or any other implementation behaves this way or another doesn't make it less of a breaking change. Given that the OTLP definition also explicitly states:

The keys MUST be unique (it is not allowed to have more than one value with the same key).

This means that it's acceptable for an implementation to assume the keys are unique and reject a payload with duplicate keys as invalid.

@pellared
Copy link
Member Author

Even if the change would be seen as not acceptable for Collector:

  • other exporters (e.g. console exporter) can take benefit from it
  • OTLP exporters could have a configuration to opt-out from deduplicating key-values with the same key (trade-off: performance over predictable handling of duplicated key-values)

@pellared
Copy link
Member Author

pellared commented Mar 13, 2024

This means that it's acceptable for an implementation to assume the keys are unique and reject a payload with duplicate keys as invalid.

I agree. I updated open-telemetry/opentelemetry-proto#533 to reflect this.

@pellared
Copy link
Member Author

pellared commented Mar 13, 2024

I changed #3931 to draft as changes in OTLP are undesired.

Still, updating the definition of on model level can be helpful as it would postpone the deduplication logic to the exporters which require it (e.g. OTLP) where other can use the key-values as provided.

EDIT: I updated the issue description.

@pellared
Copy link
Member Author

I opened #3938

@MrAlias
Copy link
Contributor

MrAlias commented Mar 13, 2024

@open-telemetry/java-approvers @open-telemetry/php-approvers do either of your implementations de-duplicate keys?

@TylerHelmuth
Copy link
Member

I can't pretend to know the pain any language sigs are feeling from this restriction, but I want to caution us from making a decision based on our languages implementations alone. Since we expected other communities/vendors to implement OTLP and adhere to the Logs Data Model, this is a breaking change in my eyes regardless of how many OpenTelemetry Language SIGs have already chosen to not adhere to this specific restriction.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 13, 2024

I can't pretend to know the pain any language sigs are feeling from this restriction, but I want to caution us from making a decision based on our languages implementations alone. Since we expected other communities/vendors to implement OTLP and adhere to the Logs Data Model, this is a breaking change in my eyes regardless of how many OpenTelemetry Language SIGs have already chosen to not adhere to this specific restriction.

I mean, if every source of telemetry is already sending duplicate keys to vendors, then they are already handling duplicate keys. Whether intentional or not. By updating the proto to reflect reality it would better communicate to these vendors the actual state of the world.

@TylerHelmuth
Copy link
Member

then they are already handling duplicate keys

Depends on what you mean by handling. True support of duplicate keys feature would mean allowing both instances of the key to be retrievable/settable.

I can only speak for the Collector and Honeycomb, but for the collector only 1 instance of the key will be able to be interacted with and for Honeycomb only 1 instance of the key/value will be kept. So both instances handle it in the sense that they don't crash, but neither allow a user to actually take advantage of duplicate keys as a feature.

Since the spec/proto currently define uniqueness as a requirement I don't think the Collector and Honeycomb are alone in they way they handle any existing incompatibility with the specification.

Despite some existing implementations allowing duplicate keys, I don't like introducing a breaking change without a major version bump. I agree with @astencel-sumo that we attempt to solve this problem another way (although I recognize we're between a rock and a hard place).

@cijothomas
Copy link
Member

Given the amount of concerns raised (very valid ones!), it maybe better to find a balance - the OTel API (bridge)/SDK can avoid de-duplication logic, and that'd meet the perf goal. And the Exporter(s) can do the dedup (since they run in separate thread typically, this should be acceptable from perf standpoint) to comply with expectation of no-duplicate. The dedup logic could be to ignore duplicates (no guarantee which value ultimately wins, as it depends on implementation), or to make it list of values as suggested in earlier comments.

(OTel .NET was also planning to do this in the exporter thread anyway - open-telemetry/opentelemetry-dotnet#4324 . But I am not sure if it makes (or is already) OTel .NET uncompliant with the spec.)

@jsuereth
Copy link
Contributor

Given the amount of concerns raised (very valid ones!), it maybe better to find a balance - the OTel API (bridge)/SDK can avoid de-duplication logic, and that'd meet the perf goal. And the Exporter(s) can do the dedup (since they run in separate thread typically, this should be acceptable from perf standpoint) to comply with expectation of no-duplicate. The dedup logic could be to ignore duplicates (no guarantee which value ultimately wins, as it depends on implementation), or to make it list of values as suggested in earlier comments.

Yes - I think we should NOT update the specification to allow this. However if an API/SDK decided to move where de-duplciation logic happens to optimise performance, that's entirely reasonable.

What we should ensure (due to MUST in OTLP specification) is duplicate keys don't get sent over the wire / consumers of OTLP don't need to deal with the issue.

Depends on what you mean by handling. True support of duplicate keys feature would mean allowing both instances of the key to be retrievable/settable.

I wasn't even suggesting this extreme. I simply mean someone will have to merge or drop the duplicate key at some point.

I don't think the problem of potential duplicates is new to logs, but also exists for Spans.

Regarding Go slog: https://go.dev/play/p/fzIswL0Le7j - We could declare, as OpenTelemetry, that duplicated keys are unsupported. In fact, I'd be highly curious how any logging backend handles that scenario today. I'd also be curious what the friction would be asking users to change those duplicate-label scenarios to have different labels or write a combined value.

I guess another way of saying this: Sure, slog supports it as a possibility but do people actually do it, and if so, how widespread is it. If it's a very esoteric minority, I don't think we need to support it. If it's a huge number of Go applications, we should find a way to support it that doesn't break all OpenTelemetry protocol consumers.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 13, 2024

Regarding Go slog: https://go.dev/play/p/fzIswL0Le7j - We could declare, as OpenTelemetry, that duplicated keys are unsupported. In fact, I'd be highly curious how any logging backend handles that scenario today. I'd also be curious what the friction would be asking users to change those duplicate-label scenarios to have different labels or write a combined value.

I guess another way of saying this: Sure, slog supports it as a possibility but do people actually do it, and if so, how widespread is it. If it's a very esoteric minority, I don't think we need to support it. If it's a huge number of Go applications, we should find a way to support it that doesn't break all OpenTelemetry protocol consumers.

@jsuereth, as pointed out above, this is not limited to slog. There are other logging libraries in Go that support this and C++, Rust, and .NET also support this. It is unclear whether Java and PHP also support this. Inventory of all those sources is needed as well.

@pellared
Copy link
Member Author

pellared commented Mar 13, 2024

If I understand @jsuereth correctly, then #3938 should be inline with his comment.

API/SDK decided to move where de-duplciation logic happens to optimise performance, that's entirely reasonable.

What we should ensure (due to MUST in OTLP specification) is duplicate keys don't get sent over the wire / consumers of OTLP don't need to deal with the issue.

@brettmc
Copy link
Contributor

brettmc commented Mar 13, 2024

@open-telemetry/java-approvers @open-telemetry/php-approvers do either of your implementations de-duplicate keys?

@MrAlias PHP doesn't explicitly de-dup attribute keys or log body. Keys are implicitly de-duped because they're stored in an array which doesn't allow duplicates (later duplicates would clobber earlier).

@pellared
Copy link
Member Author

The current proposal is to allow SDKs to have accept a configuration that would disable key-value deduplication. This configuration option should have a clear warning that the caller is responsible for ensuring there are no duplicates, otherwise the behavior is unpredictable. This option should be used only if there is a need for high performance.

More: #3938 (comment).

@pellared
Copy link
Member Author

pellared commented Apr 9, 2024

I think it is fine to do the deduplication anywhere you want as long as externally observable data complies with this document.

I think we should answer these two questions to avoid possible confusion (and make clarifications in the specification):

  1. Can the SDKs have the key-value deduplication of log records as opt-in?
  2. Is the deduplication required for all log exporters or only OTLP log exporters?

For reference, I have 3 proposals how we can implement deduplication in Go Logs SDK:

  • Proposal A - OTLP exporters deduplicate key-values
    Then deduplication enabled by default for OTLP exporters.
  • Proposal B - SDK provides a processor decorator which deduplicates key-values
    Then deduplication is opt-in as a decorator (disabled by default).
  • Proposal C - simple and batch processor deduplicate key-values
    Then deduplication is enabled by default for all exporters; given they use the simple or batch processor provided by SDK.

Personally, I lean towards Proposal B (answers: Yes and No) as I prefer giving flexibility for the SIGs on how deduplication implemented so that it fits their SDKs' design and user expectations based on the experience with existing logging libraries. Minimal performance overhead is highly-demanded trait from OTel Go Logs. Allowing duplicates also follows the behavior of all popular Go logging libraries.

@MSNev
Copy link

MSNev commented Apr 9, 2024

As this "seems" to be talking about log based bridges, then isn't this a case where the logging bridge needs to perform the de-duplication and either "merge" or drop any duplicate key...

And I don't think we SHOULD support duplicate keys as not all runtimes have threads and some use a map to store the values in memory.

@pellared
Copy link
Member Author

pellared commented Apr 9, 2024

Based on the SIG meeting (feel free to edit this comment):

The SDKs should handle the key-value deduplication by default. It is acceptable to add an option to disable deduplication.

This is what the specification can say:

The SDK MUST by default pass log records to exporters with unique key-values.

The SDK MAY have an option to allow duplicated key-values
(e.g. for better performance or for receivers that accepts duplicated key-values).
If such option is provided, it MUST be documented that for many log record receivers,
handling of duplicated key-values is unpredictable.

Therefore, Proposal C (simple and batch processor deduplicate key-values) looks like the only acceptable solution (from the ones which where listed).

@cijothomas
Copy link
Member

Therefore, open-telemetry/opentelemetry-go#5086 (comment) (simple and batch processor deduplicate key-values) looks like the only acceptable solution (from the ones which where listed).

For folks who want to opt-out of de-duplication to gain perf, the expectation is they use their own custom export processor or simple/batch provides configuration to change its behavior?

@pellared
Copy link
Member Author

pellared commented Apr 9, 2024

For folks who want to opt-out of de-duplication to gain perf, the expectation is they use their own custom export processor or simple/batch provides configuration to change its behavior?

In my understanding simple/batch processor MAY provide configuration to change its behavior. The configuration documentation MUST say that for many log record receivers, handling of duplicated key-values is unpredictable.

EDIT: I just want to point out that this is one of possible implementations. I want to specify it in a way that allows some flexibility.

@joaopgrassi
Copy link
Member

@pellared In the PR discussion you mentioned:

Notice that it is not frequent that users add duplicated keys. Because of this, some languages may prefer to have it as opt-in

Is this based on some real-world usage data?

Anyway I'm in agreement with others that whatever we do, duplicated attributes must not leave SDKs, and even further, IMO in whatever exporter it may be, not just OTLP.

I think this is a no-issue and we don't need to change anything in the spec, as it already contains this (as @cijothomas mentioned)

The keys in each such collection are unique, i.e. there MUST NOT exist more than one key-value pair with the same key. The enforcement of uniqueness may be performed in a variety of ways as it best fits the limitations of the particular implementation.

https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/common#attribute-collections

--

To me, that means for Go, given your comment above that it's not a common thing to happen, having it as an opt-in feature makes sense.

@pellared
Copy link
Member Author

PR is ready to review: #3987

tigrannajaryan pushed a commit that referenced this issue Apr 26, 2024
…map as opt-in (#3987)

Fixes #3931

Per agreement: #3931 (comment)

> The SDKs should handle the key-value deduplication by default. It is acceptable to add an option to disable deduplication.

Previous PR: #3938


> I think it is fine to do the deduplication anywhere you want as long as externally observable data complies with this document.

The main purpose of this PR is to have an agreement for following questions (and update the specification to to make it more clear):
1. Is the deduplication required for all log exporters or only OTLP log exporters? Answer: It is required for all exporters.
2. Can the key-value deduplication for log records be opt-in? Answer: Yes, it is OK as long as it is documented that it can cause problems in case maps duplicated keys are exported.

Related to:
- open-telemetry/opentelemetry-go#5086
- open-telemetry/opentelemetry-dotnet#4324
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:data-model For issues related to data model spec:logs Related to the specification/logs directory triage:deciding:community-feedback
Projects
None yet