-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UTF-8: Add partial support for parsing UTF-8 metric and label names #13271
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. Mostly nitpicking in the comment and requesting more test cases.
I don't really feel I'm the right person to review this because I am not familiar at all with the generated parser. I might also ask dumb questions in my comments.
Let's try to find a reviewer that knows those things better.
model/textparse/openmetricslex.l
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warning: I'm not familiar with the lexer syntax. I'll just rubberstamp this one if the results look OK to me. Experts, please chime in in time to stop me.
{"http.status",q="0.9",a="b"} 8.3835e-05 | ||
{"http.status",q="0.9",a="b"} 8.3835e-05 | ||
{q="0.9","http.status",a="b"} 8.3835e-05 | ||
{"go.gc_duration_seconds_sum"} 0.004304266` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a few test cases with "weirder" characters:
- Multibyte characters like "ö" or "€".
- Whitespace in the name.
- An escaped
"
in the name. {
and}
as part of the name.
Also, what about quoted label names? Including label names with all the weird characters above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oo, found some bugs, thanks :)
model/textparse/promlex.l
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same flag here as for the OpenMetrics lexer file: I don't really know the lexer syntax.
model/textparse/openmetricsparse.go
Outdated
@@ -476,3 +519,11 @@ func (p *OpenMetricsParser) getFloatValue(t token, after string) (float64, error | |||
} | |||
return val, nil | |||
} | |||
|
|||
func (p *OpenMetricsParser) verifyMetricName(start, end int) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here as below for the corresponding method in promparse.go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently you removed verifyMetricName
from promparse.go, but left it in here. But this should go, too, shouldn't it?
I believe the Go tests should pass if you run |
fixed |
yes, another parser that will have to be fixed is the typescript one -- there is a task for that but I will create a proper issue to raise visibilitiy |
@@ -582,7 +582,11 @@ label_match_list: label_match_list COMMA label_matcher | |||
; | |||
|
|||
label_matcher : IDENTIFIER match_op STRING | |||
{ $$ = yylex.(*parser).newLabelMatcher($1, $2, $3); } | |||
{ $$ = yylex.(*parser).newLabelMatcher($1, $2, $3); } | |||
| string_identifier match_op STRING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need string_identifier match_op error
rule? or is the parser error useful without it already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would that be like
{"x"=}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok yes, adding that makes the error a little better
test failure is I think unrelated but I'll dig in |
Hm this test failure is a bad sign -- there have been multiple places where people are testing the value of the negotiated format against the string constants. This is Bad because there are now many different ways the formats can look and be equivalent. How do we prevent people from doing these bad string comparisons? make the string constants non-exported? |
fuzz error looks concerning, I'm trying to figure out how to run that test myself |
I was able to debug the fuzz error, fixing |
got it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now with fixed fuzzing and extended tests, as far as I can say.
See comments for minor things, otherwise I'd say this is good to go. Last chance for others to chime in!
go.mod
Outdated
@@ -52,7 +52,7 @@ require ( | |||
github.com/prometheus/alertmanager v0.26.0 | |||
github.com/prometheus/client_golang v1.18.0 | |||
github.com/prometheus/client_model v0.5.0 | |||
github.com/prometheus/common v0.46.0 | |||
github.com/prometheus/common v0.46.1-0.20240206181200-773d5664eb8d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not pin to specific commits here. I have just tagged 0.47.0 in common, so you can use it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the tag on github, do you need to push it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pushed it an hour ago, and even created a nice release.
Maybe Github is once more very eventually consistent?
https://github.com/prometheus/common/releases/tag/0.47.0
model/textparse/openmetricsparse.go
Outdated
@@ -476,3 +519,11 @@ func (p *OpenMetricsParser) getFloatValue(t token, after string) (float64, error | |||
} | |||
return val, nil | |||
} | |||
|
|||
func (p *OpenMetricsParser) verifyMetricName(start, end int) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently you removed verifyMetricName
from promparse.go, but left it in here. But this should go, too, shouldn't it?
ah yes sorry |
Thank you. I think you can do the squashing and DCO'ing now, and then we can merge (barring any new objections from others). |
promql/promql_test.go
Outdated
func TestEvaluations(t *testing.T) { | ||
RunBuiltinTests(t, newTestEngine()) | ||
} | ||
// func TestEvaluations(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops
This adds support for the new grammar of `{"metric_name", "l1"="val"}` to promql and some of the exposition formats. This grammar will also be valid for non-UTF-8 names. UTF-8 names will not be considered valid unless model.NameValidationScheme is changed. This does not update the go expfmt parser in text_parse.go, which will be addressed by prometheus/common#554. Part of prometheus#13095 Signed-off-by: Owen Williams <owen.williams@grafana.com>
🎉 |
This PR adds support for the new grammar of
{"metric_name", "l1"="val"}
to promql and some of the exposition formats. This grammar will also be valid for non-UTF-8 names. UTF-8 names will not be considered valid unless model.NameValidationScheme is changed.This PR does not update the go expfmt parser in text_parse.go, which will be addressed by prometheus/common#554.
Correction, this PR does update the promql parser.
Part of #13095