
Prometheus metrics for ZFS pools

The zpool_prometheus program produces prometheus-compatible metrics from zpools. In the UNIX tradition, zpool_prometheus does one thing: read statistics from a pool and print them to stdout. In many ways, this is a metrics-friendly output of statistics normally observed via the zpool command.

ZFS Versions

There are many implementations of ZFS on many OSes. The current version is tested to work on:

This should compile and run on other ZFS versions, though many do not have the latency histograms. Pull requests are welcome.

Metrics Categories

The following metric types are collected:

type              description                              recurse?  zpool equivalent
----------------  ---------------------------------------  --------  ----------------
zpool_stats       general size and data                    yes       zpool list
zpool_scan_stats  scrub, rebuild, and resilver statistics  n/a       zpool status
zpool_latency     latency histograms for vdev              yes       zpool iostat -w
zpool_vdev        per-vdev stats, currently queues         no        zpool iostat -q
zpool_req         per-vdev request size stats              yes       zpool iostat -r

To be consistent with other prometheus collectors, each metric has HELP and TYPE comments.

Metric Names

Metric names are a mashup of:
<type as above>_<ZFS internal name>_<units>

For example, the pool's size metric is:
zpool_stats_size_bytes

Labels

The following labels are added to the metrics:

label  metric                                  description
-----  --------------------------------------  ------------------------------------
name   all                                     pool name
state  zpool_stats                             pool state, as shown by zpool status
state  zpool_scan_stats                        scan state, as shown by zpool status
vdev   zpool_stats, zpool_latency, zpool_vdev  vdev name
path   zpool_latency                           device path name, if available
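
Putting the name format and labels together, scraped output looks roughly like the following. The HELP text, TYPE, label values, and sample value are illustrative, not verbatim program output:

# HELP zpool_stats_size_bytes pool size in bytes
# TYPE zpool_stats_size_bytes gauge
zpool_stats_size_bytes{name="testpool",state="ONLINE",vdev="root"} 21474836480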

vdev names

The vdev names represent the hierarchy of the pool configuration. The top of the pool is "root" and the pool configuration follows beneath. A slash '/' is used to separate the levels.

For example, a simple pool with a single disk can have a zpool status of:


	NAME       STATE     READ WRITE CKSUM
	testpool   ONLINE       0     0     0
	  sdb      ONLINE       0     0     0

where the internal vdev hierarchy is:

root
root/disk-0

A more complex pool can have logs and redundancy. For example:

	NAME         STATE     READ WRITE CKSUM
	testpool     ONLINE       0     0     0
	  sda        ONLINE       0     0     0
	  sdb        ONLINE       0     0     0
	special
	  mirror-2   ONLINE       0     0     0
	    sdc      ONLINE       0     0     0
	    sde      ONLINE       0     0     0

where the internal vdev hierarchy is:

root
root/disk-0
root/disk-1
root/mirror-2
root/mirror-2/disk-0
root/mirror-2/disk-1

Note that the special vdev is not marked as such in the hierarchy names. Log, cache, and spare devices are likewise not distinguished in the hierarchy.

In some cases, the hierarchy can change over time. For example, if a vdev is removed, replaced, or attached, then the hierarchy can grow or shrink as vdevs come and go. Thus, to determine the stats for a specific physical device, use the path label.

path names

When a vdev has an associated path, the path name is placed in the path label value. For example:

path="/dev/sde1"

For brevity, the zpool status command often simplifies and truncates the path name. Also, the path name can change upon reboot. Care should be taken to properly match the path of the desired device when creating the pool or when querying in PromQL.
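
For example, a bare PromQL label matcher such as the following selects every series that carries a given path label (the path shown is illustrative):

{path="/dev/sde1"}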

In an ideal world, the devid would be a better way to uniquely identify a device, as it is in Solaris-derived OSes. On Linux, however, the devid is even less reliable than the path.

Values

Currently, prometheus values must be type float64. This is unfortunate because many ZFS metrics are 64-bit unsigned ints. When an actual metric value exceeds the significand of a float64 (52 bits), the least significant bits are dropped during the conversion, so large values lose resolution and counters can appear to jump.

Pro tip: use PromQL rate(), irate() or some sort of non-negative derivative (influxdb or graphite) for these counters.
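
A minimal sketch of such a query, using zpool_stats_read_bytes as a purely hypothetical counter name that follows the naming scheme above:

rate(zpool_stats_read_bytes{name="testpool"}[5m])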

Building

Building is simplified by using cmake. It is as simple as possible, but no simpler. By default, ZFSonLinux installs the necessary header and library files in /usr/local. If you place those files elsewhere, edit CMakeLists.txt and change CMAKE_INSTALL_PREFIX accordingly.

# generic ZFSonLinux build
cmake .
make

Ubuntu versions 16+ include ZFS packages, but not all of them are installed by default. In particular, the required header files are in the libzfslinux-dev package. This changes the build process slightly:

# Ubuntu 16+ build
apt install libzfslinux-dev
mv CMakeLists.ubuntu.txt CMakeLists.txt
cmake .
make 

If successful, the zpool_prometheus executable is created.

Using Containers

You can also build it using containers by running:

docker build -v "${PWD}":/zpool_prometheus -f Dockerfile.ubuntu .

The build files will be placed in the build.container directory. If the build succeeds, you can find the zpool_prometheus executable there.

Note that the zpool_prometheus executable must be built in an environment that matches the desired deployment environment (i.e. distro and version). You can adapt the given Dockerfile to match your environment. This can be as simple as changing the FROM ... line within distro families.
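
For example, within the Debian/Ubuntu family this can be a one-line change in the Dockerfile; the image tag below is an assumption, pick one that matches your deployment hosts:

FROM ubuntu:20.04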

Installing

Installation is left as an exercise for the reader because there are many different methods that can be used. Ultimately the method depends on how the local metrics collection is implemented and the local access policies.

There are two basic methods known to work:

  1. Run an HTTP server that invokes zpool_prometheus. A simple python+flask example server is included as serve_zpool_prometheus.py.
  2. Run a scheduled (e.g. cron) job that redirects the output to a file that is subsequently read by node_exporter (a sketch follows this list).
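
For the second method, a minimal sketch of a cron entry is shown below. The binary path, output directory, and cron.d format are assumptions; node_exporter must be started with --collector.textfile.directory pointing at the chosen directory:

# /etc/cron.d/zpool_prometheus (assumed paths)
# write to a temporary file, then rename, so node_exporter never reads a partial file
* * * * * root /usr/local/bin/zpool_prometheus > /var/lib/node_exporter/textfile/zpool.prom.tmp && mv /var/lib/node_exporter/textfile/zpool.prom.tmp /var/lib/node_exporter/textfile/zpool.prom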

Helpful comments are available in the source code.

To install the zpool_prometheus executable in CMAKE_INSTALL_PREFIX, use

make install

Caveat Emptor

  • Like the zpool command, zpool_prometheus takes a reader lock on spa_config for each imported pool. If this lock blocks, then the command will also block indefinitely and might be unkillable. This is not a normal condition, but can occur if there are bugs in the kernel modules. For this reason, care should be taken:

    • avoid spawning many of these commands hoping that one might finish
    • avoid frequent updates or short prometheus scrape time intervals, because the locks can interfere with the performance of other instances of zpool or zpool_prometheus
  • Metric values can lose precision because the internal ZFS unsigned 64-bit int values do not convert to float64 exactly once they exceed 52 significant bits.

  • Histogram sum values are always zero. This is because ZFS does not record that data currently. For most histogram uses this isn't a problem, but be aware of prometheus histogram queries that expect a non-zero histogram sum.

Feedback Encouraged

Pull requests and issues are greatly appreciated. Visit https://github.com/richardelling/zpool_prometheus
