Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Introduce baseline system to suppress existing errors #119

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

softius
Copy link

@softius softius commented Apr 22, 2024

Summary

Declare currently reported errors as the "baseline", so that they are not being reported in subsequent runs. It allows developers to enable one or more rules as error and be notified when new ones show up.

Related Issues

Copy link

linux-foundation-easycla bot commented Apr 22, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

used. Be sure to define any new terms in this section.
-->

To keep track of the all the errors that we would like to ignore, we are introducing the concept of the baseline file; A JSON file containing the number of errors that must be ignored for each rule in each file. By design, the baseline is disabled by default and it doesn't affect existing or new projects, unless the baseline file is generated.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue with using a count is that it's possible to change the file without changing the count - i.e. you add as many new reports as you remove.

EG in this example the code has meaningfully changed, but the total number of reports is unchanged. So the file would not violate the baseline.

/// before

/* eslint no-console: error */
export function test(condition) {
  if (condition) {
    console.log();
  } else {
    return 1;
  }
}

/// after

/* eslint no-console: error */
export function test(condition) {
  if (condition) {
    return 1;
  } else {
    console.log();
  }
}

personally I've preferred more granular checks - eg the line + col as it's stricter. It's more likely to false-positive, but it's less likely to false-negative (which is more valuable, IMO).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, PHPStan creator here (where the baseline is implemented for the past almost 5 years). It's the most popular feature by far (yeah, the most popular feature of a static analyser is ignoring errors).

Turns out that the count is the right amount of granularity. Expecting "having this error in this file this many times is okay" is really useful. If you included the line number in the baseline entry, it'd get obsolete really fast. It'd essentially be an inline ignore, but in an external file. Not really practical.

If someone manages to remove an error but add it in a different place in the same file, good for them. Having a baseline means you're ignoring some errors on purpose, and this situation most likely means it's the same error, it just moved.

PHPStan is a little bit more granular becauses the baseline includes exact error messages being ignored, but this proposal only includes error types. Maybe that's something to consider.

An alternative approach that preserves line numbers would be to also include the git commit hash from the moment the baseline was generated. So "this error on line 12 in commit 34afddac". This would allow shifting expected errors around based on git diff. So three commits later, if there was a new line added on line 5, the baseline would expect the error to occur on line 13.

But that can become computationally expensive as the diff between the original commit and HEAD grows bigger (if the baseline is not regenerated for a long time).

Anyway, I'm really excited other programming tools to adopt the baseline concept, it's really useful!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my point view, the baseline is an "agreement" between the developers and eslint that it is acceptable to have X number of errors, from a particular rule, for a particular file.

You are both right @ondrejmirtes @bradzacher , someone could fix one error and create another one or move it to a different line and this won't be reported. Personally, I don't see that as a problem. If that is necessary, maybe it can be resolved at the git / ci/cd level similarly to the suggestion provided by @bradzacher below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the error count is fine for a first pass at this feature.


Furthermore, we are adding one more reason to exit with an error code (see "Maintaining a lean baseline"). This might have some negative side-effects to wrapper scripts that assume that error messages are available when that happens. We could introduce a different exit code, to differentiate between exiting due to unresolved errors or ignored errors that do not occur anymore.

## Alternatives

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the alternative we've introduced at Canva is that we designate specific rules as "in migration" and we only consider reports from those rules if they exist in changed files (according to git comparison against the main branch).

With this system developers must address lint errors if they touch a file but otherwise they can be ignored.

This does require integration with the relevant source control system - though we've found it works quite well.

@nzakas nzakas added the Initial Commenting This RFC is in the initial feedback stage label Apr 22, 2024
@fasttime
Copy link
Member

Thanks for the RFC @softius. Can you please sign the CLA as requested in this comment?


### Generating the baseline

A new option `--generate-baseline` can be added to ESLint CLI. When provided, the baseline is generated and saved in `.eslint-baseline.json`. If the file already exists, it gets over-written. Note that this is a boolean flag option (no values are accepted). For example:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of PHPStan, this flag optionally accepts a filename. In some cases people generate multiple different baselines. It allows you to have a different baseline from the points in time where you enabled different rules.

It'd also mean that these baseline files would have to be referenced from the main eslint config because they couldn't be auto-discovered anymore. That's a potential downside.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. I was planning to include this initially in the RFC but I decided to leave it out eventually, mostly to avoid expanding the scope.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This baseline approach is also used by rubocop. The problem with these approaches is the baseline files have to be managed by humans and kept up to date.

As we built - our hold-the-line feature built into trunk check. Effectively does this dynamically. So nothing has to be maintained by humans as rules get enabled. Everything get's grandfathered in by any tool that integrates with our runner.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • @EliSchleifer It's impressive but I don't like it. Means the user doesn't have insight into pre-existing issues and cannot work to make the issues go away (make the baseline smaller, pay technical debt). Unless I'm missing something you either make the tool to have to look at diffs for the rest of your life, or turn the feature off one day and fix all the pre-existing issues at once (which will never happen).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've learned from --cache that people will want to have the ability to rename this or place it in a different location. I'd say let's include that capability from the start.

Copy link
Author

@softius softius May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nzakas I will make the necessary changes to the RFC to introduce two new flags --baseline and --baseline-location (instead of one flag called --generate-baseline), so that we maintain consistency with cache flags.

This should allow developers also to skip baseline if needed.

Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together. It sound good in theory, but you haven't provided much in the way of implementation details. Please dig into the code a bit to see how this might be implemented.


To keep track of the all the errors that we would like to ignore, we are introducing the concept of the baseline file; A JSON file containing the number of errors that must be ignored for each rule in each file. By design, the baseline is disabled by default and it doesn't affect existing or new projects, unless the baseline file is generated.

Here is what the baseline file looks like. This indicates that the file `"src/app/components/foobar/foobar.component.ts"` has one error for the rule `@typescript-eslint/no-explicit-any` that is acceptable to be ignored.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a relative path. What is it relative to?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Having a better look at this, it seems that it will better to convert this to a full path.


### Generating the baseline

A new option `--generate-baseline` can be added to ESLint CLI. When provided, the baseline is generated and saved in `.eslint-baseline.json`. If the file already exists, it gets over-written. Note that this is a boolean flag option (no values are accepted). For example:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've learned from --cache that people will want to have the ability to rename this or place it in a different location. I'd say let's include that capability from the start.


### Caching

Caching must contain the full list of detected errors, even those matched against the baseline. This approach has the following benefits:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What caching does this refer to? The --cache-generated file?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it refers to the cache created/used when --cache is specified.

Comment on lines +28 to +31
Explain the design with enough detail that someone familiar with ESLint
can implement it by reading this document. Please get into specifics
of your approach, corner cases, and examples of how the change will be
used. Be sure to define any new terms in this section.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like you missed the details this comment is asking for. Please look into the code to see how you'd implement such a feature.

used. Be sure to define any new terms in this section.
-->

To keep track of the all the errors that we would like to ignore, we are introducing the concept of the baseline file; A JSON file containing the number of errors that must be ignored for each rule in each file. By design, the baseline is disabled by default and it doesn't affect existing or new projects, unless the baseline file is generated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the error count is fine for a first pass at this feature.

```
{
"src/app/components/foobar/foobar.component.ts": {
"@typescript-eslint/no-explicit-any": 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To allow for extension in the future, I'd recommend using an object:

Suggested change
"@typescript-eslint/no-explicit-any": 1
"@typescript-eslint/no-explicit-any": {
count: 1
}


The suggested solution always compares against the baseline, given that the baseline file `.eslint-baseline.json` exists. This makes it easier for existing and new projects to adopt the baseline without the need to adjust scripts in `package.json` and CI/CD workflows.

This will go through each result item and message, and check each rule giving an error (`severity == 2`) against the baseline. If the file and rule are part of the baseline, means that we can remove and ignore the result message. This needs to take place after the fixes are applied so that we compare the non-fixable issues only. It also needs to take place before the error counting, so that the remaining errors are counted correctly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify: we are only tracking problems set to the "error" severity and not anything set to "warn"? If so, please add this into the FAQ section.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, we are only counting errors when generating the baseline. Also only errors are considered when checking against the baseline.

@@ -33,26 +33,32 @@ This can be counterintuitive for enabling new rules as `error`, since the develo

To keep track of the all the errors that we would like to ignore, we are introducing the concept of the baseline file; A JSON file containing the number of errors that must be ignored for each rule in each file. By design, the baseline is disabled by default and it doesn't affect existing or new projects, unless the baseline file is generated.

Here is what the baseline file looks like. This indicates that the file `"src/app/components/foobar/foobar.component.ts"` has one error for the rule `@typescript-eslint/no-explicit-any` that is acceptable to be ignored.
Here is what the baseline file looks like. This indicates that the file `"/home/user/project/src/app/components/foobar/foobar.component.ts"` has one error for the rule `@typescript-eslint/no-explicit-any` that is acceptable to be ignored.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use an absolute path, then that means the file can't be checked in because it won't necessarily line up on other systems. Is that okay?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that would be problematic. Reverting back to the concept of relative paths for portability reasons, where the paths are relative to the CWD. This should cover well cases where multiple paths/globs are provided and seem consistent with other parts.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps relative to the flat config file is better?
The CWD can be somewhat problematic given that you may run eslint from any CWD and it'll figure things out as best as it can.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think defaulting to CWD-relative makes sense. People are probably going to run ESLint from the same location each time (the project root).

The flat config file may not always be in the CWD ancestry (if they use -c for instance), so that's unreliable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Initial Commenting This RFC is in the initial feedback stage
Projects
None yet
6 participants