Enhanced Troubleshooting Capabilities for Request and Response Lifecycle #5052
Labels
NeedsDecision
Feedback is required from experts, contributors, and/or the community before a change can be made.
Feature Request
I've encountered situations where diagnosing issues related to requests and responses has proven challenging.
To enhance visibility, I propose the implementation of advanced troubleshooting features.
Ideally, we would have insight into the entire lifecycle of a request and response, including:
Original Client Request:
Provide visibility into the exact request sent by the client to the Envoy proxy.
Request Sent to Upstream Service:
Offer detailed information about the request forwarded by Envoy to the upstream service.
This should include any modifications or additions Envoy made to the request.
Original Response returned from Upstream Service:
Enable access to the original and unmodified response generated by the upstream service.
Response Sent Back to the Client:
Provide insight into the response sent back to the client by Envoy.
Providing visibility into each stage of the request-response lifecycle would reduce the time and effort required to diagnose problems.
Implementation
I propose two options for implementation, which could also be combined:
Tap Filter (see: Traffic Tapping). I have already prepared a PR for this; see here.
Tap filtering through the admin endpoint provides a highly flexible solution, because the tap config (i.e. filters) can be supplied at subscription time. Currently, tap filters can only be used as envoy.filters.http and not as envoy.filters.http.upstream. That means that with this feature we currently only have visibility into the "Request Sent to Upstream Service" and the "Response Sent Back to the Client". Maybe we can contribute an "envoy.filters.http.upstream" tap filter as a feature to Envoy?
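To illustrate what "configured during subscription time" could look like, here is a minimal Go sketch that builds the JSON body for a POST to Envoy's admin /tap endpoint. It assumes the tap filter was set up with an admin_config whose config_id matches the one sent here. The config ID pomerium_tap, the header x-request-id, and the value debug-123 are hypothetical placeholders, and the struct types are local helpers, not Pomerium or Envoy APIs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tapRequest mirrors the body accepted by Envoy's admin /tap endpoint.
// These types are local helpers written for this sketch only.
type tapRequest struct {
	ConfigID  string    `json:"config_id"`
	TapConfig tapConfig `json:"tap_config"`
}

type tapConfig struct {
	Match        matchPredicate `json:"match"`
	OutputConfig outputConfig   `json:"output_config"`
}

// matchPredicate only covers request-header matching here; Envoy's
// predicate supports more variants (and_match, any_match, ...).
type matchPredicate struct {
	HTTPRequestHeadersMatch headersMatch `json:"http_request_headers_match"`
}

type headersMatch struct {
	Headers []headerMatcher `json:"headers"`
}

type headerMatcher struct {
	Name        string      `json:"name"`
	StringMatch stringMatch `json:"string_match"`
}

type stringMatch struct {
	Exact string `json:"exact"`
}

type outputConfig struct {
	Sinks []sink `json:"sinks"`
}

// sink selects the streaming admin sink, so matched traffic is streamed
// back on the same admin connection that posted the tap config.
type sink struct {
	StreamingAdmin struct{} `json:"streaming_admin"`
}

// buildTapRequest returns the JSON body that taps only requests carrying
// the given header with exactly the given value.
func buildTapRequest(configID, header, value string) ([]byte, error) {
	req := tapRequest{
		ConfigID: configID,
		TapConfig: tapConfig{
			Match: matchPredicate{
				HTTPRequestHeadersMatch: headersMatch{
					Headers: []headerMatcher{
						{Name: header, StringMatch: stringMatch{Exact: value}},
					},
				},
			},
			OutputConfig: outputConfig{Sinks: []sink{{}}},
		},
	}
	return json.Marshal(req)
}

func main() {
	// All three arguments are hypothetical example values.
	body, err := buildTapRequest("pomerium_tap", "x-request-id", "debug-123")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The resulting body would be posted to the admin listener's /tap path, and the connection then stays open while matching traces are streamed back. Because the match is part of the request rather than the static config, each troubleshooting session can narrow the tap to a single route, header, or request ID.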
AccessLogs
Currently, the Request Sent to Upstream Service and the Response Sent Back to the Client are logged as
{"service":"envoy","message":"http-request", ... }
with a configurable amount of additional detail. If we set AccessLogOptions.FlushAccessLogOnNewRequest: true on HttpConnectionManager, we additionally get a log entry that represents the Original Client Request. If we set the UpstreamLog on the envoy.filters.http.router with the option FlushUpstreamLogOnUpstreamStream: true, we also get access logs for the Request Sent to Upstream Service and the Original Response returned from Upstream Service. We can already filter these AccessLogTypes in the Envoy config, so that no unnecessary logs are sent to the Pomerium gRPC service if you have not opted in to the troubleshooting feature flag or config option.
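As a rough sketch of the opt-in filtering idea, the Go snippet below maps Envoy access log type names to the four lifecycle stages listed above and decides which entries to keep. The mapping is my reading of the options described here (DownstreamStart for the flush on new request, UpstreamPoolReady and UpstreamEnd for the upstream log), not confirmed Pomerium behavior. The Stage type and the keep function are hypothetical names. In a real deployment the same decision would more likely be expressed as an access log filter in the Envoy config rather than in Go.

```go
package main

import "fmt"

// Stage names one of the four points in the request/response lifecycle.
type Stage string

const (
	StageClientRequest    Stage = "Original Client Request"
	StageUpstreamRequest  Stage = "Request Sent to Upstream Service"
	StageUpstreamResponse Stage = "Original Response returned from Upstream Service"
	StageClientResponse   Stage = "Response Sent Back to the Client"
)

// stageByLogType is an assumed mapping from Envoy AccessLogType names
// to lifecycle stages; it is illustrative, not authoritative.
var stageByLogType = map[string]Stage{
	"DownstreamStart":   StageClientRequest,
	"UpstreamPoolReady": StageUpstreamRequest,
	"UpstreamEnd":       StageUpstreamResponse,
	"DownstreamEnd":     StageClientResponse,
}

// keep reports whether a log entry of the given type should be forwarded.
// DownstreamEnd corresponds to the existing "http-request" log and is
// always kept; the other mapped types are kept only when the
// (hypothetical) troubleshooting option is enabled.
func keep(logType string, troubleshooting bool) bool {
	if logType == "DownstreamEnd" {
		return true
	}
	_, known := stageByLogType[logType]
	return troubleshooting && known
}

func main() {
	for _, t := range []string{"DownstreamStart", "UpstreamPoolReady", "UpstreamEnd", "DownstreamEnd"} {
		fmt.Printf("%-18s stage=%q default=%v troubleshooting=%v\n",
			t, stageByLogType[t], keep(t, false), keep(t, true))
	}
}
```

With the option disabled, only the DownstreamEnd entry passes, which matches today's single http-request log line. Enabling it adds the three extra entries per request, so the additional log volume is paid only by users who opt in.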