Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use OsString for process names #1196

Merged
merged 1 commit into from Mar 18, 2024

Conversation

CryZe
Copy link
Contributor

@CryZe CryZe commented Jan 4, 2024

Previously the stat file on Linux was parsed under the assumption that it contains valid UTF-8. This is not necessarily the case for two reasons:

  1. Process names don't need to be valid UTF-8 at all on Linux. They can contain invalid bytes.
  2. Even if the process name was valid UTF-8, the assumption breaks because the process name is limited to 15 characters, which can cut a valid code point in half.

This causes the stat file to not be parseable, causing the entire process not to show up in the list. So even if you aren't using the name of the process at all, you can't find it as part of the list.

One solution is to lossily convert the process name to a string. However this means that the Unicode replacement character is now part of those process names. The character when encoded as UTF-8 is 3 bytes. This means that process names can now be longer than 15 bytes, or in other words, doing a name based comparison on the first 15 bytes, such as suggested in the documentation of the crate, is no longer easily possible.

The solution is to just provide the name as an OsString instead, keeping the bytes around as is.

Resolves #1190

@CryZe
Copy link
Contributor Author

CryZe commented Jan 7, 2024

As previously mentioned in the other PR, the ergonomics of OsStr[ing] aren't perfect yet, and I wanted to advocate for a display method. Turns out there's an ACP now: rust-lang/libs-team#326

@GuillaumeGomez
Copy link
Owner

Just so you know: didn't forget, just very busy. I'll try to come back to your PR in the next weeks or so.

@GuillaumeGomez
Copy link
Owner

Sorry for the delay, looks all good to me! Can you fix the merge conflicts please? Then I'll merge. If I remember correctly you wanted to make similar changes in other parts of the API? If so after this PR is merged, please send what's next. :)

Previously the stat file on Linux was parsed under the assumption that
it contains valid UTF-8. This is not necessarily the case for two
reasons:
1. Process names don't need to be valid UTF-8 at all on Linux. They can
   contain invalid bytes.
2. Even if the process name was valid UTF-8, the assumption breaks
   because the process name is limited to 15 characters, which can cut a
   valid code point in half.

This causes the stat file to not be parseable, causing the entire
process not to show up in the list. So even if you aren't using the name
of the process at all, you can't find it as part of the list.

One solution is to lossily convert the process name to a string. However
this means that the [Unicode replacement character
`�`](https://www.fileformat.info/info/unicode/char/fffd/index.htm) is
now part of those process names. The character when encoded as UTF-8 is
3 bytes. This means that process names can now be longer than 15 bytes,
or in other words, doing a name based comparison on the first 15 bytes,
such as suggested in the documentation of the crate, is no longer easily
possible.

The solution is to just provide the name as an `OsString` instead,
keeping the bytes around as is.
@CryZe
Copy link
Contributor Author

CryZe commented Mar 18, 2024

Alright, rebased and CI is green.

@GuillaumeGomez
Copy link
Owner

Thanks!

@GuillaumeGomez GuillaumeGomez merged commit 0da2f40 into GuillaumeGomez:master Mar 18, 2024
67 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Process List incomplete on long names
2 participants