Option to expose metrics endpoint of Patroni on the container #3813

Open
PaulVerhoeven1 opened this issue Dec 29, 2023 · 2 comments


@PaulVerhoeven1

PaulVerhoeven1 commented Dec 29, 2023

Overview

Patroni listens on port 8008 on localhost inside the container. On this port you can find metrics on the /metrics endpoint and the status of the cluster on /cluster (I just tested this inside the pod with curl -k https://localhost:8008/metrics). I don't see a way to expose this port on the container.

Use Case

The endpoint https://127.0.0.1:8008/cluster gives an overview of the current status of the cluster, including the replication lag. We can use this endpoint to monitor whether the cluster is healthy: if the lag is too high and the master node goes down, Patroni can't switch over to the replica with the high lag. That is a possible scenario in which the whole cluster goes down. Based on that endpoint we can also create alerting rules for when the lag is too high.
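
A rough sketch of what such a lag check could look like, not a finished implementation. It assumes the /cluster response is JSON with a `members` list in which each member has `name` and `role`, and replicas carry a `lag` value in bytes (or a non-numeric value such as "unknown"). The function names and the threshold are made up for illustration, and TLS verification is skipped the same way `curl -k` does.

```python
import json
import ssl
import urllib.request


def lagging_members(cluster, max_lag_bytes):
    """Return names of non-leader members whose lag exceeds max_lag_bytes.

    A member whose lag is missing or not an integer (for example "unknown")
    is reported as well, because it cannot be confirmed as healthy.
    """
    flagged = []
    for member in cluster.get("members", []):
        # Assumption: leaders are marked with one of these role names.
        if member.get("role") in ("leader", "standby_leader"):
            continue
        lag = member.get("lag")
        if not isinstance(lag, int) or lag > max_lag_bytes:
            flagged.append(member.get("name"))
    return flagged


def fetch_cluster(url="https://127.0.0.1:8008/cluster"):
    """Fetch and decode the /cluster document without verifying the certificate."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(url, context=context, timeout=5) as response:
        return json.load(response)
```

Inside the pod this could be called as `lagging_members(fetch_cluster(), 16 * 1024 * 1024)`; an alerting rule would fire whenever the returned list is not empty.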

Besides that, we can scrape the metrics endpoint and use a Grafana dashboard to give clear insight into the status of Patroni. For example, this dashboard: https://grafana.com/grafana/dashboards/18870-postgresql-patroni/.
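
To give an idea of what a scraper would read from /metrics, here is a minimal sketch that parses Prometheus-style text output into a dictionary. It assumes one `name{labels} value` sample per line without trailing timestamps. `parse_metrics` is a hypothetical helper, and the actual metric names and labels should be taken from the real /metrics output.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into {metric_with_labels: float}.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    Assumption: no timestamp follows the sample value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        key, value = line.rsplit(None, 1)
        samples[key] = float(value)
    return samples
```

In practice Prometheus would scrape the endpoint directly; the parser only shows which data becomes available once the port is reachable from outside the container.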

Desired Behavior

A possibility to enable exposure of the port on the postgrescluster.

Something like this (or something similar):
postgrescluster.spec.patroni.enableport: true

Or enable this port by default in every Postgrescluster.

@dsessler7
Contributor

Hello @PaulVerhoeven1!

I can create a feature request to have this port exposed in our backlog. Can you tell me a bit more about your use case?

We actually have alerting and trending of replication lag in our monitoring stack, so you are already covered there. Are there other metrics that these endpoints provide that you are particularly interested in?

@PaulVerhoeven1
Author

I want to monitor the total current lag of Patroni. As far as I can see, the replication lag in the monitoring stack doesn't show the total lag; I only saw spikes in that dashboard at times when there was a bit of lag.
