Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

baggage: Accept non-ASCII keys #4946

Open
pellared opened this issue Feb 19, 2024 · 8 comments · May be fixed by #5132
Open

baggage: Accept non-ASCII keys #4946

pellared opened this issue Feb 19, 2024 · 8 comments · May be fixed by #5132
Assignees
Labels
area:baggage Part of OpenTelemetry baggage area:propagators Part of OpenTelemetry context propagation

Comments

@pellared
Copy link
Member

pellared commented Feb 19, 2024

Per open-telemetry/opentelemetry-specification#3801

The new wording makes Go implementation non-compliant, but it's not a breaking change for its API, you will just need to relax the checks (i.e. never return errors from baggage.New)

Originally posted by @yurishkuro in open-telemetry/opentelemetry-specification#3801 (comment)

@pellared pellared added blocked:specification Waiting on clarification of the OpenTelemetry specification before progress can be made area:baggage Part of OpenTelemetry baggage area:propagators Part of OpenTelemetry context propagation and removed blocked:specification Waiting on clarification of the OpenTelemetry specification before progress can be made labels Feb 19, 2024
@XSAM XSAM self-assigned this Mar 15, 2024
@XSAM
Copy link
Member

XSAM commented Mar 25, 2024

I need help here.

The blank space should be considered as a valid UTF-8 string, but it also means would be a valid key.

p, err := NewKeyProperty(" ")
assert.NoError(t, err)

The following case also became available that ąść is a valid key that will be encoded as %C4%85%C5%9B%C4%87.

		{
			name: "encoded UTF-8 string in key",
			in:   "%C4%85%C5%9B%C4%87=%C4%85%C5%9B%C4%87",
			want: baggage.List{
				"ąść": {Value: "ąść"},
			},
		},

Do you think these cases should be considered valid cases? @pellared

@pellared
Copy link
Member Author

p, err := NewKeyProperty(" ")
assert.NoError(t, err)

This is a valid correct test as it touches the Baggage API (not propagator).

The following case also became available that ąść is a valid key that will be encoded as %C4%85%C5%9B%C4%87.

Right now, the specification for W3C baggage propagator does NOT support percent encoding of the key. From https://www.w3.org/TR/baggage/#key:

3.2.1.2 key
A token which identifies a value in the baggage. token is defined in RFC7230, Section 3.2.6.

func (b Baggage) String() string should skip encoding of the list members and properties which contain invalid keys (according to https://www.w3.org/TR/baggage/#key).

Alternatively, the W3C baggage specification could allow percent encoding of the keys. CC @yurishkuro @dyladan

@dyladan
Copy link
Member

dyladan commented Mar 25, 2024

Percent encoding of keys is something that has been considered and rejected by W3C. Worth noting that there is nothing stopping an implementation from doing it; all the required characters are allowed. If the w3c specification added it though, it creates a lot of questions. For example, if you have an entry for key1 and an entry for key%31 is this a duplicate entry? It forces additional processing requirements on receivers that we prefer to avoid.

@pellared
Copy link
Member Author

Worth noting that there is nothing stopping an implementation from doing it; all the required characters are allowed.

I think it is better (safer) not doing as the implementations (e.g. in other languages) may not percent-decode the keys. This may lead to unpredictable behavior. @dyladan, does it make sense?

@dyladan
Copy link
Member

dyladan commented Mar 25, 2024

I didn't mean go should do it. I meant the otel spec could do it if desired then all languages would do it consistently

@XSAM
Copy link
Member

XSAM commented Mar 25, 2024

From baggage spec:

Baggage names are any valid UTF-8 strings. Language API SHOULD NOT restrict which strings are used as baggage names. However, the specific Propagators that are used to transmit baggage entries across component boundaries may impose their own restrictions on baggage names. For example, the W3C Baggage specification restricts the baggage keys to strings that satisfy the token definition from RFC7230, Section 3.2.6. For maximum compatibility, alpha-numeric names are strongly recommended to be used as baggage names.

It mentions the baggage could accept any valid UTF-8 as a key but not for the W3C baggage propagator. So, I guess we can implement UTF-8 in baggage API but drop the key with UTF-8 in the W3C baggage propagator.

@yurishkuro
Copy link
Member

The spec allows any UTF8 strings as baggage keys, case sensitive. It already recommends these unit tests for Baggage API:

baggage.Set('a', 'B% 💼');
baggage.Set('A', 'c');
baggage.Get('a'); // returns "B% 💼"
baggage.Get('A'); // returns "c"

I would add two more tests where the key itself contains an emoji or the key is a space - both are valid.

For the W3C propagator, there is already a written compatibility test: https://github.com/w3c/baggage/blob/main/test/test_baggage.py. One thing that's seems missing there is the handling of malformed inputs (I booked w3c/baggage#131)

@pellared
Copy link
Member Author

pellared commented Mar 27, 2024

I didn't mean go should do it. I meant the otel spec could do it if desired then all languages would do it consistently

I do not think it should be done especially that % is an valid value in the token. See: https://datatracker.ietf.org/doc/html/rfc7230#section-3.2.6

@XSAM XSAM linked a pull request Apr 2, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:baggage Part of OpenTelemetry baggage area:propagators Part of OpenTelemetry context propagation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants