Ignore some packages ? #657

Open
bodinsamuel opened this issue Jul 12, 2021 · 7 comments

@bodinsamuel
Contributor

Just a proposal, I'm not even convinced this is a good idea.
The only thing I want is to save some time during indexing, and the least relevant packages are usually the slowest to process because they are not cached anywhere.
We could:

  • exclude them entirely
  • add them to the index with a flag, but without additional data
  • process them like every other package, but with a flag
  • do nothing

Security-held packages

e.g. https://www.npmjs.com/package/kamonetucbp
They do not provide any value and, obviously, miss the cache everywhere.

Test users and test packages

I don't have an exhaustive list, but for example Ryan (TS maintainer) has created a lot of packages to test DefinitelyTyped releases.
https://www.npmjs.com/~ryancavanaugh

I also saw a lot of test packages and repetitive patterns, but I'm not sure how to catch them without ignoring legitimate packages too.

Old packages

It is probably not a good idea to ignore old packages (> 6 years old): some well-appreciated packages do not receive updates often.

@Haroenv
Collaborator

Haroenv commented Jul 12, 2021

For packages that are security held, we can indeed probably bail out on most processing, and stop after formatPkg, possibly with a special flag in searchInternal to mention that we stopped processing.
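The early bail-out described above could look roughly like the sketch below. Everything in it is hypothetical: the `FormattedPkg` shape, the `securityHeld` field (assumed to be set by some earlier detection step), the `processingStopped` flag name, and the `Step` type standing in for the slow enrichment work that would normally run after formatPkg. It is not the project's actual pipeline.

```typescript
// Hypothetical, trimmed-down record; the real one has many more fields.
interface FormattedPkg {
  name: string;
  securityHeld: boolean; // assumption: set by an earlier detection step
  searchInternal: { processingStopped?: boolean };
  extra?: Record<string, unknown>;
}

// Stand-in for the slow, cache-dependent steps that run after formatting.
type Step = (pkg: FormattedPkg) => FormattedPkg;

function processPkg(formatted: FormattedPkg, steps: Step[]): FormattedPkg {
  if (formatted.securityHeld) {
    // Bail out: keep the bare record and note that processing stopped.
    return {
      ...formatted,
      searchInternal: { ...formatted.searchInternal, processingStopped: true },
    };
  }
  // Regular packages go through every step in order.
  return steps.reduce((pkg, step) => step(pkg), formatted);
}
```

A held package is still indexed, only without the extra data, which corresponds to the second option in the original proposal.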

For older packages, I wonder if we could maintain a cache ourselves (aka the bootstrap index)? However, this means that downloads, for example, don't get updated, which I don't think is a good idea 🤔

@MartinKolarik
Collaborator

I was just about to file an issue for the security-held packages since it's also related to #941. Those don't have any extra flags (unlike e.g. deprecated packages), so they don't get any search penalization. I would say that if a package points to https://github.com/npm/security-holder/, it should get a flag and be pushed down in the search results.
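A minimal sketch of that check, assuming the package's repository URL is available as a plain string. The helper name `isSecurityHeld` and the accepted URL forms are assumptions for illustration; SSH-style URLs (`git@github.com:...`) are not handled.

```typescript
// Forms accepted after normalization: the full host path and the
// "owner/repo" shorthand.
const SECURITY_HOLDER_FORMS = new Set([
  'github.com/npm/security-holder',
  'npm/security-holder',
]);

// Hypothetical helper: true when the repository URL points at the
// npm/security-holder repository.
function isSecurityHeld(repositoryUrl: string | undefined): boolean {
  if (!repositoryUrl) return false;
  const normalized = repositoryUrl
    .trim()
    .toLowerCase()
    .replace(/^git\+/, '') // "git+https://..." -> "https://..."
    .replace(/^[a-z]+:\/\/(www\.)?/, '') // drop protocol and optional "www."
    .replace(/\/+$/, '') // drop trailing slashes
    .replace(/\.git$/, ''); // drop ".git" suffix
  return SECURITY_HOLDER_FORMS.has(normalized);
}
```

Comparing the whole normalized string, rather than using a substring match, avoids flagging forks or similarly named repositories.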

@Haroenv
Collaborator

Haroenv commented May 13, 2022

Good call @MartinKolarik, it can be marked as deprecated: "held by security" so that it gets the same penalty as a deprecated package. Maybe its alternativeNames could also be left uncalculated, so you're less likely to receive it in results. Would you be willing to make a PR?

@MartinKolarik
Collaborator

@Haroenv sure. With the alternative names condition we're getting into #951 territory, so I can add the other conditions too (e.g. downloads) and make a PR for an overall relevance improvement.

@bodinsamuel
Contributor Author

A dedicated boolean would be great ☺️

@MartinKolarik
Collaborator

So a separate boolean, or deprecated with an explanation in deprecatedReason as @Haroenv suggested? A boolean seems cleaner, but then we'll have to add it to the index config too.

@Haroenv
Collaborator

Haroenv commented May 13, 2022

Either is fine really, a dedicated key makes sense indeed. Maybe in _searchMetadata
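To make the two options from this thread concrete, here is a rough sketch of both. The `IndexRecord` shape and the key name `isSecurityHeld` are assumptions for illustration, not the project's actual schema; only `alternativeNames`, `deprecatedReason`, `_searchMetadata` and the "held by security" text come from the discussion.

```typescript
// Hypothetical, trimmed-down index record.
interface IndexRecord {
  name: string;
  deprecated: boolean;
  deprecatedReason: string | null;
  alternativeNames: string[];
  _searchMetadata: { isSecurityHeld?: boolean };
}

// Option 1: dedicated boolean under _searchMetadata, with alternativeNames
// left empty so the package matches fewer queries.
function flagAsSecurityHeld(record: IndexRecord): IndexRecord {
  return {
    ...record,
    alternativeNames: [],
    _searchMetadata: { ...record._searchMetadata, isSecurityHeld: true },
  };
}

// Option 2: reuse the existing deprecation penalty.
function flagAsDeprecated(record: IndexRecord): IndexRecord {
  return { ...record, deprecated: true, deprecatedReason: 'held by security' };
}
```

Option 2 needs no new attribute, whereas option 1 keeps deprecation and security holds distinct at the cost of also referencing the new key in the index configuration.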
