v0.13.4 - Improved gRPC and external probes, HTTP latency breakdown and more
New Features / Enhancements
Latency Breakdown for HTTP Probes
Provide a way to report a latency breakdown for HTTP probes (#699).
gRPC Probe Enhancements
- Wait for a successful connection before sending the request (#726, #729). By default, gRPC's DialContext returns immediately, even for non-retryable errors. That behavior works poorly for probing: no error message describes the connection problem, and the connection simply sits in the TRANSIENT_FAILURE state. We now use grpcurl's BlockingDial, which takes care of these issues.
- Use client TLS by default for encryption (#727).
- Capture request error messages better (#731).
Streaming Metrics for external probes
Parse and export external probes' metrics as soon as they are available. (#708, #712, #713, #715, #716, #722)
This feature enables use cases where an external probe runs infrequently (say, once every 60s) but performs many tasks per run (for performance measurement, for example) and exports results many times (say, every 5s) within that interval. See the discussion in #691 and #689 for more background.
Bulk writes in postgres surfacer
Batch postgres surfacer writes to improve performance (#717).
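One common way to batch writes is to fold many rows into a single multi-row INSERT, so that a batch costs one round trip instead of one per row. The sketch below shows that technique only; it is not necessarily how the surfacer batches, and the `Row` type, the column names, and `buildBatchInsert` are hypothetical.

```go
package main

import (
	"strconv"
	"strings"
)

// Row is one metric sample to write. The columns are hypothetical and
// chosen only for this example.
type Row struct {
	MetricName string
	Value      string
}

// buildBatchInsert returns a single multi-row INSERT statement with
// PostgreSQL-style numbered placeholders, plus the flattened arguments.
// The table name is concatenated into the query, so it must be a trusted
// identifier. An empty batch yields an empty query and nil arguments.
func buildBatchInsert(table string, rows []Row) (string, []interface{}) {
	if len(rows) == 0 {
		return "", nil
	}
	var sb strings.Builder
	args := make([]interface{}, 0, len(rows)*2)
	sb.WriteString("INSERT INTO " + table + " (metric_name, value) VALUES ")
	for i, r := range rows {
		if i > 0 {
			sb.WriteString(", ")
		}
		// Each row consumes two placeholders: $1,$2 then $3,$4 and so on.
		sb.WriteString("($" + strconv.Itoa(2*i+1) + ", $" + strconv.Itoa(2*i+2) + ")")
		args = append(args, r.MetricName, r.Value)
	}
	return sb.String(), args
}
```

The returned query and arguments can be handed to a `database/sql` call such as `db.Exec(query, args...)`.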
Allow DNS Overrides
Allow overriding DNS server (#707).
Other Changes
- Add a command line flag to control the prometheus metrics prefix: --prometheus_metrics_prefix (#732).
- Fix postgres surfacer metric filter (#711).
- [prober.saveconfig] Write probe configs to disk in order (#721).
- [tls] Reload server certificates as well (#719).
- [build.tools] Fix where python proto stub is copied to (#700).
- [build] Update net dependency to fix security alert (#728).
- [build] Update protobuf package to fix security alert (#714).
Contributors
- @manugarg
- @markoposavec made their first contribution in #711. Thank you and welcome to the community!
- @sfc-gh-raram made their first contribution in #707. Thank you and welcome to the community!
Full Changelog: v0.13.3...v0.13.4