Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Occasional ZFS collector errors on kstat.zfs.misc.arcstats.memory_available_bytes #2766

Open
siebenmann opened this issue Aug 1, 2023 · 5 comments · May be fixed by #2961
Open

Occasional ZFS collector errors on kstat.zfs.misc.arcstats.memory_available_bytes #2766

siebenmann opened this issue Aug 1, 2023 · 5 comments · May be fixed by #2961

Comments

@siebenmann
Copy link

Periodically I see 'cannot parse expected integer value' errors logged for the ZFS collector trying to collect memory_available_bytes, which comes from /proc/spl/kstat/zfs/arcstat. Support for this was added to deal with #2656, because this kstat is a type 3 (signed int64) instead of the more usual type 4 (unsigned int64).

Looking at the code involved, I believe this is most likely happening when memory_available_bytes goes negative, which it can; this possibility is why it's a signed int64 kstat instead of an unsigned in64. The current code in collector/zfs_linux.go parses both kstatDataUint64 and kstatDataInt64 data with strconv.ParseUint(), which is not going to accept negative numbers.

As a small RFE, it would be nice if this specific error message logged either or both of the specific error from ParseUint() and the actual value that failed to parse. As it is, I'm guessing at the actual problem involved from my knowledge of the range of ``memory_available_bytes` and the code.

Host operating system: output of uname -a

Linux hawkwind.cs 6.4.6-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Jul 24 20:51:12 UTC 2023 x86_64 GNU/Linux

node_exporter version: output of node_exporter --version

node_exporter, version 1.6.1 (branch: HEAD, revision: 4a1b77600c1873a8233f3ffb55afcedbb63b8d84)

node_exporter log output

ts=2023-08-01T14:58:27.022Z caller=collector.go:169 level=error msg="collector failed" name=zfs duration_seconds=0.098266305 err="could not parse expected integer value for \"kstat.zfs.misc.arcstats.memory_available_bytes\""

Are you running node_exporter in Docker?

No.

@dswarbrick
Copy link
Contributor

The uint64 types seem to run fairly deep in zfs_linux.go. The parseProcfsFile function, which is used to parse other files under /proc/spl/zfs, expects to be passed a handler func(zfsSysctl, uint64). It seems that it was designed from the ground up to only support unsigned ints.

The obvious solution would be to change the handler function signature to func(zfsSysctl, float64) so that it can handle both positive and negative values. I don't think we would lose anything by casting to float64 early.

The TestArcstatsParsing looks pretty superficial, and actually only verifies the value of kstat.zfs.misc.arcstats.hits. This looks like it could be made a bit more thorough. The test fixtures look outdated too, as they don't even have a line for memory_available_bytes (or a few other memory_* items).

@discordianfish
Copy link
Member

Why float64 not int64?

@dswarbrick
Copy link
Contributor

dswarbrick commented Aug 8, 2023

Why float64 not int64?

Because the kernel uses uint64_t for some values, which would potentially overflow an int64.

rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
@rexagod
Copy link
Contributor

rexagod commented Mar 18, 2024

float64 is only accurate for integers constrained within it's mantissa range (52-bit). Expressing an integer that goes beyond (2^52, or 2^53 if we consider the implicit additional bit) will lead to loss of precision (rounding off).

rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
@rexagod rexagod linked a pull request Mar 18, 2024 that will close this issue
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
@dswarbrick
Copy link
Contributor

float64 is only accurate for integers constrained within it's mantissa range (52-bit). Expressing an integer that goes beyond (2^52, or 2^53 if we consider the implicit additional bit) will lead to loss of precision (rounding off).

This is true, and it is why snmp_exporter wraps Counter64 values at the float64 mantissa 2^53. Perhaps this should be a general approach for all 64-bit counters.

rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 18, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 19, 2024
Prevent integer underflow when parsing the `procfs` file as it used a
`ParseUint` to parse signed values.

Fixes: prometheus#2766
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants