PSR1 doesn't check for file encoding #3841

Open
3 tasks done
lucraraujo opened this issue Jun 12, 2023 · 2 comments

Comments


lucraraujo commented Jun 12, 2023

The PSR1 standard states that "Files MUST use only UTF-8 without BOM for PHP code".
There is no check for files using encodings other than UTF-8. The existing sniff only checks for a BOM in the files.
If a file is encoded as, for example, windows-1252 and doesn't have a BOM, the file check passes.

Steps to reproduce the behavior:

  1. Create a file called test.php containing any code, saved in an encoding other than UTF-8 and without a BOM
  2. Run phpcs --standard=PSR1 test.php
  3. No errors are shown regarding the file encoding
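
For anyone who wants to reproduce step 1 without relying on an editor's "save as" dialog, here is a minimal sketch in Python (used purely for illustration; it is not part of PHP_CodeSniffer). The helper names `write_cp1252_fixture` and `is_valid_utf8` and the sample PHP source are assumptions made up for this example. It writes a windows-1252 encoded test.php into a temporary directory and confirms that the bytes have no UTF-8 BOM and are not well-formed UTF-8.

```python
import tempfile
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def write_cp1252_fixture(directory: Path) -> Path:
    """Write a test.php encoded as windows-1252, without a BOM (hypothetical helper)."""
    path = directory / "test.php"
    source = '<?php\n$name = "caf\u00e9";\n'  # contains one non-ASCII character
    # In windows-1252 the accented "e" becomes the single byte 0xE9.
    path.write_bytes(source.encode("cp1252"))
    return path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    fixture = write_cp1252_fixture(Path(tmp))
    raw = fixture.read_bytes()
    has_bom = raw.startswith(UTF8_BOM)
    valid_utf8 = is_valid_utf8(raw)
    print(f"BOM: {has_bom}, valid UTF-8: {valid_utf8}")
    # To continue with step 2, run phpcs against the fixture path here,
    # before the temporary directory is cleaned up.
```

Note that the fixture needs at least one non-ASCII character: a file containing only ASCII is byte-for-byte identical in UTF-8 and windows-1252, so no tool could tell the two apart.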

Expected behavior

There should be an error regarding the file encoding.

Operating System: Debian 11.7 Bullseye
PHP version: 8.2.6
PHP_CodeSniffer version: 3.7.2
Standard: PSR1, PSR2, PSR12
Install type: Composer local
  • I have searched the issue list and am not opening a duplicate issue.
  • I confirm that this bug is a bug in PHP_CodeSniffer and not in one of the external standards.
  • I have verified the issue still exists in the master branch of PHP_CodeSniffer.
jrfnl changed the title from "Problem with Generic.Files.ByteOrderMark sniff" to "PSR1 doesn't check for file encoding" on Jun 12, 2023
@jrfnl
Contributor

jrfnl commented Jun 12, 2023

The Generic.Files.ByteOrderMark sniff is only intended to check for the byte order mark; it does not check the file encoding, so that sniff is working correctly.

What I believe you are trying to report is that there is no sniff checking if files are encoded as UTF-8.

While I do believe it is possible to check what encoding a file claims to use, I do not believe it is possible to reliably verify that the claim is actually correct. I may well be wrong though, and/or reality may have superseded the research I did in the distant past when I looked into something like this.

I'll mark this as a feature request for now and would be interested to hear if someone has found a way to do this.
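
One way to make the reliability concern concrete is the sketch below, written in Python for illustration only. The helper `is_valid_utf8` and the three sample byte strings are assumptions invented for this example, not anything taken from PHP_CodeSniffer. It shows that a "is this well-formed UTF-8?" test can catch some non-UTF-8 files, but a file that passes the test is not thereby proven to have been written as UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Case 1: "e with acute" saved as windows-1252 is the lone byte 0xE9,
# which is malformed as UTF-8, so this mismatch can be detected.
detectable = "\u00e9".encode("cp1252")

# Case 2: the two characters U+00C3 U+00A9 saved as windows-1252 are the
# bytes 0xC3 0xA9, which is exactly how UTF-8 encodes "e with acute".
# The bytes alone cannot reveal which encoding the author intended.
ambiguous = "\u00c3\u00a9".encode("cp1252")

# Case 3: ASCII-only source is byte-identical in both encodings.
ascii_only = "<?php echo 1;\n".encode("cp1252")

results = {
    "detectable": is_valid_utf8(detectable),
    "ambiguous": is_valid_utf8(ambiguous),
    "ascii_only": is_valid_utf8(ascii_only),
}
print(results)
```

Under these assumptions, a validity check could flag files like case 1 but would stay silent on cases 2 and 3. PHP's mbstring extension offers `mb_check_encoding()` for a comparable well-formedness test; whether that would be suitable as the basis for a sniff is not something this thread establishes.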

@lucraraujo
Author

You're right. It's more of a feature request than a bug.
