collector process failed - counter not found on Windows Server 2022 #1313

Closed
Mario-Hofstaetter opened this issue Oct 23, 2023 · 13 comments · Fixed by #1477

Comments

@Mario-Hofstaetter
Contributor

Mario-Hofstaetter commented Oct 23, 2023

Using version 0.24.0 on a new Windows Server 2022 machine, the exporter service was not starting.
Investigating the logs, I found this error:

... caller=prometheus.go:188 level=error msg="collector process failed after 0.000000s" err="counter not found"
... caller=prometheus.go:188 level=error msg="collector process failed after 0.000000s" err="counter not found"
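
For anyone triaging similar logs: the message appears to embed the name of the failing collector, so "collector process failed" would point at the process collector. A minimal Python sketch that extracts it, assuming the message format is exactly what the two lines above show (the pattern is inferred from this issue, and `parse_collector_failure` is a hypothetical helper, not part of windows_exporter):

```python
import re

# Pattern inferred from the log lines in this issue (an assumption,
# not a documented log format).
FAILURE_RE = re.compile(
    r'msg="collector (?P<name>\S+) failed after (?P<seconds>[\d.]+)s" '
    r'err="(?P<err>[^"]*)"'
)


def parse_collector_failure(line):
    """Return (collector, seconds, error) for a failure line, else None."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    return m.group("name"), float(m.group("seconds")), m.group("err")


line = (
    '... caller=prometheus.go:188 level=error '
    'msg="collector process failed after 0.000000s" err="counter not found"'
)
print(parse_collector_failure(line))  # ('process', 0.0, 'counter not found')
```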

Related issue found:

I tried running

lodctr /R

as originally posted by @carlpett in #580 (comment)

But that did not resolve the issue (I have not yet tried rebooting). We use these collectors by default:

enabled: "cpu,cs,logical_disk,memory,net,os,process,service,system,tcp,textfile,time"

Enabling them one by one, only process seems to have an issue.
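
The one-by-one test can be scripted. A minimal Python sketch that splits the list above into single-collector values, each of which would replace the `enabled:` setting for one test run (`one_at_a_time` is a hypothetical helper name; restarting the exporter between runs is left out):

```python
def one_at_a_time(enabled):
    """Split a comma-separated collector list into single-collector values."""
    return [name.strip() for name in enabled.split(",") if name.strip()]


enabled = "cpu,cs,logical_disk,memory,net,os,process,service,system,tcp,textfile,time"

for name in one_at_a_time(enabled):
    # One config value per test run; only the failing collector logs the error.
    print(f'enabled: "{name}"')
```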

Looking into perfmon.exe I noticed the following:

On the machine with the error, process performance counters are listed under the category Process V2.
A working machine (Windows Server 2019) displays them under Process.

Has there been a breaking change by Microsoft?

[Screenshots: Server-2019, Server-2022]

Edit: Querying with PowerShell, it seems the Process counter set has been renamed?!

# Windows Server 2019 (1809 Build 17763.4974)

PS C:\> Get-Counter -ListSet Proces* | Format-Table

CounterSetName        MachineName CounterSetType Description Paths
--------------        ----------- -------------- ----------- -----
Processor Information .            MultiInstance             {\Processor Information(*)\Performance Limit Flags, \Processor Information(*)\% Performance Limit, \Processor …
Processor             .            MultiInstance             {\Processor(*)\% Processor Time, \Processor(*)\% User Time, \Processor(*)\% Privileged Time, \Processor(*)\Int…
Process               .            MultiInstance             {\Process(*)\% Processor Time, \Process(*)\% User Time, \Process(*)\% Privileged Time, \Process(*)\Virtual Byt…

# Windows Server 2022 (21H2 Build 20348.2031)

PS C:\> Get-Counter -ListSet Proces* | Format-Table

CounterSetName        MachineName CounterSetType Description Paths
--------------        ----------- -------------- ----------- -----
Process V2            .            MultiInstance             {\Process V2(*)\Working Set - Private, \Process V2(*)\IO Other Bytes/sec, \Process V2(*)\IO Data Bytes/sec, …
Processor Information .            MultiInstance             {\Processor Information(*)\Performance Limit Flags, \Processor Information(*)\% Performance Limit, \Processo…
Processor             .            MultiInstance             {\Processor(*)\% Processor Time, \Processor(*)\% User Time, \Processor(*)\% Privileged Time, \Processor(*)\I…
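
To compare machines without reading the tables by eye, the `Get-Counter -ListSet` output can be checked for a counter set named exactly `Process`. A minimal Python sketch, assuming the table layout shown above with `.` in the MachineName column (local machine); the function names are hypothetical and the sample rows are shortened copies of the Server 2022 output:

```python
import re

# A data row looks like: "<CounterSetName>   .   <CounterSetType>   {paths..."
# Assumes MachineName is "." as in the output above.
ROW_RE = re.compile(r"^(?P<name>.+?)\s+\.\s+\S+")


def counter_set_names(table_text):
    """Extract the CounterSetName column from Format-Table output."""
    names = []
    for line in table_text.splitlines():
        m = ROW_RE.match(line)
        if m:
            names.append(m.group("name"))
    return names


def legacy_process_visible(names):
    """True if a counter set named exactly 'Process' is listed."""
    return "Process" in names


sample_2022 = r"""
CounterSetName        MachineName CounterSetType Description Paths
--------------        ----------- -------------- ----------- -----
Process V2            .            MultiInstance             {\Process V2(*)\Working Set - Private, ...
Processor Information .            MultiInstance             {\Processor Information(*)\Performance Limit Flags, ...
Processor             .            MultiInstance             {\Processor(*)\% Processor Time, ...
"""

names = counter_set_names(sample_2022)
print(names)
print(legacy_process_visible(names))
```

On the Server 2022 sample this reports that `Process` is not listed, which matches the "counter not found" error.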
@SupraOva

I assume that one of the required counters is disabled.

Can you run this command from an elevated (admin) prompt and post the result:
lodctr.exe /Q | findstr /i "disable"

@Mario-Hofstaetter
Contributor Author

> I assume that one of the required counters is disabled.
>
> Can you try this command (admin mode) and post the result: lodctr.exe /Q | findstr /i "disable"

PS C:\> lodctr.exe /Q | findstr /i "disable"
[Lsa] Performance Counters (Disabled)
[PerfProc] Performance Counters (Disabled)
PS C:\>

So it seems something is disabled. Does this include the Process counters the collector requires?
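
The same check can be done on captured `lodctr.exe /Q` output. A minimal Python sketch, assuming disabled providers are always reported in the `[Name] Performance Counters (Disabled)` form seen above (`disabled_providers` is a hypothetical helper, and the line format is inferred from this output only):

```python
import re

# Line format inferred from the output above (an assumption).
DISABLED_RE = re.compile(
    r"^\[(?P<name>[^\]]+)\] Performance Counters \(Disabled\)$"
)


def disabled_providers(lodctr_output):
    """Return the provider names that lodctr /Q reports as disabled."""
    names = []
    for line in lodctr_output.splitlines():
        m = DISABLED_RE.match(line.strip())
        if m:
            names.append(m.group("name"))
    return names


output = """\
[Lsa] Performance Counters (Disabled)
[PerfProc] Performance Counters (Disabled)
"""
print(disabled_providers(output))  # ['Lsa', 'PerfProc']
```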

@SupraOva

Yes. Now try to enable these counters by running the commands below, and check whether you still get any errors.

lodctr.exe /E:Lsa
lodctr.exe /E:PerfProc
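
If several machines need the same fix, the commands can be generated from a list of provider names. A minimal Python sketch that only builds the argument lists; `enable_commands` is a hypothetical helper, and the name check is a conservative assumption rather than a rule taken from the lodctr documentation:

```python
import re


def enable_commands(provider_names):
    """Build one `lodctr.exe /E:<name>` argument list per provider."""
    commands = []
    for name in provider_names:
        # Reject anything that does not look like a plain provider name.
        if not re.fullmatch(r"[A-Za-z0-9_.\- ]+", name):
            raise ValueError(f"unexpected provider name: {name!r}")
        commands.append(["lodctr.exe", f"/E:{name}"])
    return commands


for cmd in enable_commands(["Lsa", "PerfProc"]):
    # On Windows, from an elevated prompt, each list could be passed to
    # subprocess.run(cmd, check=True). Here it is only printed.
    print(" ".join(cmd))
```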

@Mario-Hofstaetter
Contributor Author

> Yes, so now try to activate these counters by running the command below and check if you still have any errors.
>
> lodctr.exe /E:Lsa
> lodctr.exe /E:PerfProc

Thank you very much ❤️ @SupraOva, that does seem to do the trick. I followed those commands with lodctr.exe /R (not sure if that was necessary),
but the process collector is now able to return metrics again.

Where do we go from here? I guess this should at least be added to the process collector docs.
Currently I don't have time for a quick PR, my apologies.

The setup installer cannot run this automatically, though, because not everybody uses the process collector (for us, it is one of the most important ones).

@JDA88
Contributor

JDA88 commented Jan 11, 2024

Found the information there:

For backwards-compatibility reasons, the "Process" counterset returns non-unique instance names based on the EXE filename. This can cause confusing results, especially when a process with a non-unique name starts up or shuts down, as this will typically result in data glitches due to incorrect matching of instance names between samples. Consumers of the "Process" counterset must be able to tolerate these non-unique instance names and the resulting data glitches. In Windows 11 and later, you can use the Process V2 counterset to avoid this problem.

It might be worth using Process V2 when available, since Process is disabled by default, no?
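
To illustrate the "data glitches" the quoted documentation describes, here is a minimal Python sketch with made-up sample data (two `svchost` instances, one of which exits between samples). It is not how Windows or windows_exporter matches instances; it only shows why pairing by a non-unique name can attribute a value to the wrong process, while pairing by a unique key (here the PID) does not:

```python
from collections import defaultdict


def deltas_by_name(prev, curr):
    """Pair instances by (name, position among same-named instances)."""
    def keyed(sample):
        seen = defaultdict(int)
        out = {}
        for name, _pid, value in sample:
            out[(name, seen[name])] = value
            seen[name] += 1
        return out

    p, c = keyed(prev), keyed(curr)
    return {key: c[key] - p[key] for key in c if key in p}


def deltas_by_pid(prev, curr):
    """Pair instances by a unique key (the PID)."""
    p = {pid: value for _name, pid, value in prev}
    return {pid: value - p[pid] for _name, pid, value in curr if pid in p}


# (name, pid, cumulative CPU seconds); hypothetical numbers.
sample_1 = [("svchost", 100, 50.0), ("svchost", 200, 10.0)]
sample_2 = [("svchost", 200, 11.0)]  # PID 100 exited between samples

print(deltas_by_name(sample_1, sample_2))  # {('svchost', 0): -39.0}
print(deltas_by_pid(sample_1, sample_2))   # {200: 1.0}
```

Matching by name compares PID 200's new value against PID 100's old one and yields a negative CPU delta; matching by PID gives the real 1.0 second.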

@jkroepke
Member

jkroepke commented Jan 16, 2024

> It might be interesting to use Process V2 if available as Process is disabled by default no?

[image]

The issue here is that windows_exporter currently uses the Registry API, which does not support any V2 providers.

See also: #1350

@JDA88
Contributor

JDA88 commented Jan 16, 2024

> > It might be interesting to use Process V2 if available as Process is disabled by default no?
>
> [image]
>
> The issue here is that windows_exporter currently uses the Registry API, which does not support any V2 providers.

Oh, I didn't know that, but it answers the question 😅


This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.

@V1TA5

V1TA5 commented Apr 24, 2024

For me, the above workaround didn't fix it.
The same setup works on Windows Server 2019.
I came here via Grafana Alloy: grafana/alloy#658

@jkroepke
Member

I'm currently working on a new collector that uses the Windows performance data libraries. They are able to use V2 counters.

But I expect that it might take weeks to develop the new feature.

@github-actions github-actions bot removed the Stale label Apr 25, 2024
@V1TA5

V1TA5 commented May 3, 2024

Will this collector be committed to Prometheus, or is it a separate thing?
I'm currently in the testing phase for a monitoring setup, so I have some time until I need to decide on a solution.

@jkroepke
Member

jkroepke commented May 3, 2024

here - #1459

The PR is not fully completed yet.

@rossi-fi

rossi-fi commented May 4, 2024

Had a similar issue on Windows 10. None of os,cpu,memory,system,logical_disk etc. worked. Another symptom was getting an "Unable to add these counters" message when starting PerfMon.

Rebuilding the counters exactly as instructed here (Administrator prompt, cd to the correct directories) solved it for me.
