Quantization of Elasticsearch URLs does not match that of the Datadog agent and also does not work well, in some cases producing unhelpful resource names with high cardinality.
https://github.com/DataDog/dd-trace-py/blob/main/ddtrace/contrib/elasticsearch/quantize.py
The current implementation turns a URL like `/en_search/_doc/fa1117d4-7917-5a2d-b001-76e6e4ca83b2` into `/en_search/_doc/fa?d4-?-5a2d-b?-?e6e4ca?b2` before using it for the resource name. We end up with resource names like `GET /en_search/_doc/fa?d4-?-5a2d-b?-?e6e4ca?b2` where we would instead want something like `/en_search/_doc/?`.
Currently we can work around this with a trace filter. However, in the next major release (3.0) we should change the quantization to match the Datadog agent's, which changes this into `/en_search/_doc/{guid}`. In the meantime, we could put the new quantization behind a flag.
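To make the proposal concrete, here is a minimal sketch of segment-based quantization. It is not the Datadog agent's implementation; the function name `quantize_url`, both regexes, and the placeholder arguments are assumptions made for illustration. The idea is to replace whole path segments that look like a GUID or a numeric ID, rather than rewriting digit runs inside a segment.

```python
import re

# Hypothetical pattern: a whole path segment shaped like an 8-4-4-4-12 hex GUID.
GUID_SEGMENT = re.compile(
    r"(?<=/)[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}(?=/|\?|$)"
)
# Hypothetical pattern: a whole path segment made only of digits.
NUMERIC_SEGMENT = re.compile(r"(?<=/)[0-9]+(?=/|\?|$)")


def quantize_url(url, guid_placeholder="{guid}", id_placeholder="?"):
    """Replace GUID-like and all-digit path segments with placeholders."""
    url = GUID_SEGMENT.sub(guid_placeholder, url)
    return NUMERIC_SEGMENT.sub(id_placeholder, url)


print(quantize_url("/en_search/_doc/fa1117d4-7917-5a2d-b001-76e6e4ca83b2"))
print(quantize_url("/en_search/_doc/12345/_update"))
print(quantize_url("/en_search/_search"))
```

Because only complete segments are replaced, index names and API endpoints such as `_doc` or `_search` are left untouched, which keeps cardinality low without mangling the rest of the path. A function like this could sit behind an opt-in flag until 3.0.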
Trace filter example:
import re
from typing import List, Optional

from ddtrace import Span, tracer
from ddtrace.filters import TraceFilter

# chops off everything after the last "/" if that segment contains a digit and replaces it with "/?"
REMOVE_AFTER_LAST_SLASH = re.compile(r"/[^/]*\d[^/]*$")


class CorrectESResourceNameFilter(TraceFilter):
    """example input: '/en_search/_doc/fa?d4-?-5a2d-b?-?e6e4ca?b2' output: '/en_search/_doc/?'"""

    def process_trace(self, trace):
        # type: (List[Span]) -> Optional[List[Span]]
        for span in trace:
            # if you're changing the service name of elasticsearch, use that instead
            if span.service == "elasticsearch":
                url = span.get_tag("elasticsearch.url")
                method = span.get_tag("elasticsearch.method")
                # skip spans missing either tag; re.sub would raise on None
                if url is None or method is None:
                    continue
                span.resource = "{method} {url}".format(method=method, url=REMOVE_AFTER_LAST_SLASH.sub(r"/?", url))
        return trace


# And then configure it with
tracer.configure(settings={'FILTERS': [CorrectESResourceNameFilter()]})
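The regex in the filter can be checked on its own, without ddtrace installed. The snippet below is a small sketch that applies `REMOVE_AFTER_LAST_SLASH` to a few URLs; the helper name `clean` and the sample URLs other than the one from this issue are made up for illustration.

```python
import re

# Same pattern as in the trace filter above.
REMOVE_AFTER_LAST_SLASH = re.compile(r"/[^/]*\d[^/]*$")


def clean(url):
    """Replace the last path segment with '?' when it contains a digit."""
    return REMOVE_AFTER_LAST_SLASH.sub("/?", url)


samples = [
    "/en_search/_doc/fa?d4-?-5a2d-b?-?e6e4ca?b2",  # already quantized by the tracer
    "/en_search/_doc/fa1117d4-7917-5a2d-b001-76e6e4ca83b2",  # raw URL
    "/en_search/_search",  # no digits in the last segment
    "/en_search/_doc/42/_update",  # ID is not in the last segment
]
for url in samples:
    print(url, "->", clean(url))
```

Note the limits of the workaround: it only looks at the final path segment, so an ID in the middle of the path (the last sample) or an ID without any digits is left as-is.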
Which version of dd-trace-py are you using?
2.8.3
Elastic search
How can we reproduce your problem?
Try the current quantize method on `"/en_search/_doc/fa1117d4-7917-5a2d-b001-76e6e4ca83b2"`
What is the result that you get?
'"/en_search/_doc/fa?d4-?-5a2d-b?-?e6e4ca?b2"'
What is the result that you expected?
'/en_search/_doc/?'
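For a reproduction that does not need the tracer, the reported output is consistent with a rule that replaces every run of two or more digits with `?`. The sketch below assumes that rule; the names `TWO_PLUS_DIGITS` and `replace_digit_runs` are made up here, and this is not a copy of `quantize.py`.

```python
import re

# Assumed rule: any run of two or more digits becomes "?".
TWO_PLUS_DIGITS = re.compile(r"[0-9]{2,}")


def replace_digit_runs(url):
    """Mimic the observed behaviour by replacing multi-digit runs with '?'."""
    return TWO_PLUS_DIGITS.sub("?", url)


url = "/en_search/_doc/fa1117d4-7917-5a2d-b001-76e6e4ca83b2"
print(replace_digit_runs(url))
```

Under that assumption, a hex GUID keeps its letters and isolated digits, so nearly every document ID still yields a distinct resource name, which is where the high cardinality comes from.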