monitoring: create prometheus rules with helm chart
Previously, the prometheus rules were created only when the cephcluster CR
setting monitoring.enabled was true. Those rules could not be customized.
The rules are now installed by the helm chart instead and are enabled with
the chart value monitoring.createPrometheusRules. To customize them, apply
a post-processor to the helm chart.

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
travisn committed Mar 9, 2022
1 parent 235781e commit 9469a77
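
The commit message says the rules can be customized by applying a post-processor to the helm chart. One possible approach, sketched here as an assumption rather than something this commit ships, is Helm's `--post-renderer` option with a wrapper script that saves the rendered manifests to a file and runs kustomize over them. The file name `all.yaml`, the wrapper script, and the label `team: storage` are hypothetical; only the PrometheusRule name `prometheus-ceph-rules` and its API group come from the chart template in this commit.

```yaml
# kustomization.yaml (hypothetical), consumed by a post-renderer wrapper script
# that writes helm's rendered output to all.yaml and then runs `kustomize build`.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - all.yaml
patches:
  - target:
      group: monitoring.coreos.com
      version: v1
      kind: PrometheusRule
      name: prometheus-ceph-rules
    patch: |-
      # JSON 6902 patch: add an illustrative label to the rules object
      - op: add
        path: /metadata/labels/team
        value: storage
```

The same patch mechanism could in principle replace or remove individual alert rules, but the exact paths depend on the contents of the rules file, which are not shown here.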
Showing 23 changed files with 933 additions and 497 deletions.
2 changes: 2 additions & 0 deletions PendingReleaseNotes.md
@@ -3,10 +3,12 @@
 ## Breaking Changes
 
 * The mds liveness and startup probes are now configured by the filesystem CR instead of the cluster CR. To apply the mds probes, they need to be specified in the filesystem CR. See the [filesystem CR doc](Documentation/ceph-filesystem-crd.md#metadata-server-settings) for more details. See #9550
+* Prometheus rules are installed by the helm chart. If you were relying on the cephcluster setting `monitoring.enabled` to create the prometheus rules, they instead need to be enabled by setting `monitoring.createPrometheusRules` in the helm chart values.
 
 ## Features
 
 * The number of mgr daemons for example clusters is increased to 2 from 1, resulting in a standby mgr daemon.
   If the active mgr goes down, Ceph will update the passive mgr to be active, and rook will update all the services
   with the label app=rook-ceph-mgr to direct traffic to the new active mgr.
 * Add support for custom ceph.conf for csi pods. See #9567
+* Added and updated many Ceph prometheus rules, picked up from the ceph repo
@@ -1,13 +1,4 @@
-apiVersion: monitoring.coreos.com/v1
-kind: PrometheusRule
-metadata:
-  labels:
-    prometheus: rook-prometheus
-    role: alert-rules
-  name: prometheus-ceph-rules
-  namespace: rook-ceph
-spec:
-  groups:
+groups:
   - name: persistent-volume-alert.rules
     rules:
     - alert: PersistentVolumeUsageNearFull
@@ -32,4 +23,3 @@ spec:
       for: 5s
       labels:
         severity: critical
-
890 changes: 890 additions & 0 deletions deploy/charts/rook-ceph-cluster/prometheus/localrules.yaml

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion deploy/charts/rook-ceph-cluster/templates/cephcluster.yaml
@@ -5,7 +5,6 @@ metadata:
   name: {{ default .Release.Namespace .Values.clusterName }}
 spec:
   monitoring:
-    rulesNamespace: {{ default .Release.Namespace .Values.monitoring.rulesNamespaceOverride }}
 {{ toYaml .Values.monitoring | indent 4 }}
 
 {{ toYaml .Values.cephClusterSpec | indent 2 }}
23 changes: 23 additions & 0 deletions deploy/charts/rook-ceph-cluster/templates/prometheusrules.yaml
@@ -0,0 +1,23 @@
+{{- if .Values.monitoring.createPrometheusRules }}
+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  labels:
+    prometheus: rook-prometheus
+    role: alert-rules
+  name: prometheus-ceph-rules
+  namespace: {{ default .Release.Namespace .Values.monitoring.rulesNamespaceOverride }}
+spec:
+# Import the raw prometheus rules since they have descriptions that should not be processed with the helm templates
+{{- $root := . }}
+{{- if .Values.cephClusterSpec.external.enable }}
+{{- range $path, $bytes := .Files.Glob "prometheus/externalrules.yaml" }}
+{{ $root.Files.Get $path }}
+{{- end }}
+{{- else }}
+{{- range $path, $bytes := .Files.Glob "prometheus/localrules.yaml" }}
+{{ $root.Files.Get $path }}
+{{- end }}
+{{- end }}
+{{- end }}
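
The template above picks one of two rule files based on `cephClusterSpec.external.enable`, and renders nothing unless `monitoring.createPrometheusRules` is set. A minimal values sketch for the external-cluster branch might look like the following; the file name is hypothetical, and the keys are the ones the template reads.

```yaml
# external-values.yaml (hypothetical file name)
cephClusterSpec:
  external:
    enable: true                # template then loads prometheus/externalrules.yaml
monitoring:
  createPrometheusRules: true   # outer guard of templates/prometheusrules.yaml
```

With `external.enable` left false, the same guard would load `prometheus/localrules.yaml` instead.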
5 changes: 5 additions & 0 deletions deploy/charts/rook-ceph-cluster/values.yaml
@@ -31,7 +31,12 @@ monitoring:
   # requires Prometheus to be pre-installed
   # enabling will also create RBAC rules to allow Operator to create ServiceMonitors
   enabled: false
+  # the namespace in which to create the prometheus rules, if different from the rook cluster namespace
+  # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
+  # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
   rulesNamespaceOverride:
+  # whether to create the prometheus rules, first requires a separate install of Prometheus
+  createPrometheusRules: false
 
 # If true, create & use PSP resources. Set this to the same value as the rook-ceph chart.
 pspEnable: true
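
For users who previously relied on `monitoring.enabled` alone, the new values can be combined roughly as follows. This is a sketch under stated assumptions: the file name `my-values.yaml` and the namespace `monitoring` are hypothetical, while the three keys are the ones defined in the chart's `values.yaml` above.

```yaml
# my-values.yaml (hypothetical file name)
monitoring:
  # requires Prometheus to be pre-installed (per the chart's own comment)
  enabled: true
  # new in this commit: have the chart create the PrometheusRule object
  createPrometheusRules: true
  # optional; "monitoring" is a hypothetical namespace where Prometheus runs.
  # When left empty, the template falls back to the release namespace.
  rulesNamespaceOverride: monitoring
```

Following the chart comment, multiple rook-ceph clusters in one Kubernetes cluster would each set the same `rulesNamespaceOverride` to avoid duplicate alert definitions.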
3 changes: 0 additions & 3 deletions deploy/charts/rook-ceph/templates/resources.yaml
@@ -1645,9 +1645,6 @@ spec:
                       maximum: 65535
                       minimum: 0
                       type: integer
-                    rulesNamespace:
-                      description: RulesNamespace is the namespace where the prometheus rules and alerts should be created. If empty, the same namespace as the cluster will be used.
-                      type: string
                   type: object
                 network:
                   description: Network related configuration
6 changes: 0 additions & 6 deletions deploy/examples/cluster.yaml
@@ -74,12 +74,6 @@ spec:
   monitoring:
     # requires Prometheus to be pre-installed
     enabled: false
-    # namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
-    # Recommended:
-    # If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
-    # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
-    # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
-    rulesNamespace: rook-ceph
   network:
     # enable host networking
     #provider: host
3 changes: 0 additions & 3 deletions deploy/examples/crds.yaml
@@ -1644,9 +1644,6 @@ spec:
                       maximum: 65535
                       minimum: 0
                       type: integer
-                    rulesNamespace:
-                      description: RulesNamespace is the namespace where the prometheus rules and alerts should be created. If empty, the same namespace as the cluster will be used.
-                      type: string
                   type: object
                 network:
                   description: Network related configuration
