Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for ranges in the --accept option / config field #1167

Merged
merged 23 commits into from
Sep 17, 2023
Merged

feat: Add support for ranges in the --accept option / config field #1167

merged 23 commits into from
Sep 17, 2023

Conversation

Techassi
Copy link
Contributor

Adds support for accept ranges discussed in #1157. This allows the user to specify custom HTTP status codes accepted during checking and thus will report as valid (not broken). The accept option only supports specifying status codes as a comma-separated list. With this PR, the option will accept a list of status code ranges formatted like this:

accept = ["100..=103", "200..=299", "403"]

These combinations will be supported: ..<end>, ..=<end>, <start>..<end> and <start>..=<end>. The behaviour is copied from the Rust Range like concepts:

  • ..<end>, includes 0 to <end> (exclusive)
  • ..=<end>, includes 0 to <end> (inclusive)
  • <start>..<end>, includes <start> to <end> (exclusive)
  • <start>..=<end>, includes <start> to <end> (inclusive)

This draft PR only implements some foundation work for this feature. Once all functionality is added, the PR will be ready to review.

Copy link
Member

@mre mre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a draft, but I thought I might add some early feedback. Keep it up! 😃

lychee-lib/src/types/accept/range.rs Outdated Show resolved Hide resolved
lychee-lib/src/types/accept/range.rs Outdated Show resolved Hide resolved
lychee-lib/src/types/accept/range.rs Show resolved Hide resolved
lychee-lib/src/types/accept/selector.rs Outdated Show resolved Hide resolved
lychee-lib/src/types/accept/selector.rs Outdated Show resolved Hide resolved
lychee-lib/src/types/accept/selector.rs Outdated Show resolved Hide resolved
lychee-lib/src/types/accept/selector.rs Show resolved Hide resolved
@Techassi
Copy link
Contributor Author

Techassi commented Aug 1, 2023

The new accept syntax is now included in the CLI. I opted to use a string of comma-separated ranges instead of a Vec of ranges. This allows users to specify ranges like --accept "100..=199,200,404" instead of

--accept "100..=199" --accept 200 --accept 404

One of the tests keeps failing, which boggles my mind because the test case should be unaffected by my changes. Maybe you can take a look @mre

fixtures/configs/smoketest.toml Outdated Show resolved Hide resolved
@Techassi Techassi requested a review from mre August 6, 2023 13:11
@mre
Copy link
Member

mre commented Aug 17, 2023

Works for me.
BTW, it also works without quotes, it seems:

lychee --accept 100..=199,200,404 -- README.md

One thing that I would love to support would be a mix of both styles:

lychee --accept 100..=199,200 --accept 404 -- README.md

But I guess that's a trickier change.
We'd have to accept a Vec of AcceptSelector in the options and merge them. 🤔

The reason is, that I'm a bit afraid of breaking changes, but then again I don't know many people who set multiple accept args right now, so it might be fine?

@Techassi
Copy link
Contributor Author

Techassi commented Aug 18, 2023

Works for me. BTW, it also works without quotes, it seems:

lychee --accept 100..=199,200,404 -- README.md

Ah that's neat! I usually include quotes just to be sure...

One thing that I would love to support would be a mix of both styles:

lychee --accept 100..=199,200 --accept 404 -- README.md

But I guess that's a trickier change. We'd have to accept a Vec of AcceptSelector in the options and merge them. 🤔

The reason is, that I'm a bit afraid of breaking changes, but then again I don't know many people who set multiple accept args right now, so it might be fine?

Yeah that's definitely a harder change. However, just like you mentioned, the chance people use multiple --accept parameters is slim. So yes, we should be good to go.


I can also open a separate PR, which explores further improvements, like merging multiple AcceptSelectors and improving the internal merging of AcceptRanges in AcceptSelector.

@mre
Copy link
Member

mre commented Aug 26, 2023

We didn't add 403 to the range of accepted status codes yet, or?
That was at least my intention when I opened #1157.
Should we add 403 in this pull request?

@Techassi
Copy link
Contributor Author

No, not yet - but we can add it. How would we go about doing that? I think there are two ways:

  • Providing a Default impl for AcceptSelector and changing the accept argument to AcceptSelector instead of Option<AcceptSelector>. This also requires us to use the clap default value attribute.
  • We could provide a set of pre-defined accepted status codes here
    None => None,

@mre
Copy link
Member

mre commented Aug 27, 2023

Yeah, Default is the idiomatic way to go imho.

@Techassi
Copy link
Contributor Author

Okay, I will add this. Then we should be ready to merge!

@Techassi
Copy link
Contributor Author

Techassi commented Sep 3, 2023

Added Default and Display for AcceptSelector. The CLI now accepts AcceptSelector and uses default_value_t when the user doesn't provide a custom value.

@mre
Copy link
Member

mre commented Sep 5, 2023

Some failing unit tests. Apart from that looks great.

Okay this was quite a dig. Clap uses the default value not as is, but will
use `to_string` (from the Display impl) to convert the default value into
a string and then re-parses it via the `FromStr` trait. The `Display` impl
from `AcceptSelector` included surrounding square brackets `[ ]` which failed
to parse.

cargo expand really helped here, as I had to dig into the code generated by the Clap `Parser` derive macro.

I'm still unsure why Clap handles default values in this way, but there surely is a reason for it. I'm just too lazy to do further research into that.
@Techassi
Copy link
Contributor Author

Techassi commented Sep 9, 2023

Okay, this was quite a dig. Clap uses the default value not as is but will use to_string (from the Display impl) to convert it into a string and then re-parses it via the FromStr trait. See generated code here:

let arg = arg
    .help("A List of accepted status codes for valid links")
    .long_help(None)
    .short('a')
    .long("accept")
    .default_value({
        static DEFAULT_VALUE: clap::__derive_refs::once_cell::sync::Lazy<String> =
            clap::__derive_refs::once_cell::sync::Lazy::new(|| {
                let val: AcceptSelector =
                    <AcceptSelector as ::std::default::Default>::default();
                ::std::string::ToString::to_string(&val)
            });
        let s: &'static str = &*DEFAULT_VALUE;
        s
    })
    // ...

The Display impl from AcceptSelector included surrounding square brackets [ ], which failed to parse.

@mre
Copy link
Member

mre commented Sep 9, 2023

OMG, I would never have found that. Kudos!

@mre
Copy link
Member

mre commented Sep 9, 2023

The config test failed because fixtures/configs/cache.toml didn't have an accept entry.
I've extended the error message to make that easier to troubleshoot in the future:

Running `target/debug/lychee --config fixtures/configs/cache.toml .`
[ERROR] Error while loading config file "fixtures/configs/cache.toml": Failed to parse configuration file: TOML parse error at line 1, column 1
  |
1 | cache = true
  | ^^^^^^^^^^^^
missing field `accept`

The default doesn't get picked up, because there is no default value for serde deserialization.
Notably, this is different from clap's default for the command-line; it has to be set separately.

I removed accept from pub(crate) fn merge(&mut self, toml: Config) { ... } and added the serde default.
My guess is, that we could also remove some other stuff from merge, but I didn't touch that for this PR.

The final failing test will be a satisfying one, so I left it up to you. ❤️
It is about adding the new default to the config file. Check the failing test to see what I mean. 😉 🚀
Good job!

@Techassi
Copy link
Contributor Author

Techassi commented Sep 9, 2023

I've extended the error message to make that easier to troubleshoot in the future

Yeah, that's nice. I was kind of clueless about what exactly it complained about.

The final failing test will be a satisfying one, so I left it up to you. ❤️
It is about adding the new default to the config file. Check the failing test to see what I mean. 😉 🚀

Ha! Thanks for saving me the last error 😏

I will look into it!

@Techassi
Copy link
Contributor Author

All tests should now run without issues. However, the failing tests (3) succeeded perfectly on my local machine. They are all related to Wayback testing. The test cases even state that they are flaky.

Let me know how we want to proceed.

Copy link
Member

@mre mre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turned out really great!

The flaky tests are a bit of an issue, but let's tackle that at a later point.
I tried to fix that once with multiple retries, but it seems that doesn't always do the trick.
We might have to disable these checks or put them into a different, non-default group. 🤷‍♀️

The changes in this PR make a lot of sense and the code quality is great, so I don't mind merging and releasing it. 🚀 Thanks for the contribution!

@mre mre merged commit 1b1fd0c into lycheeverse:master Sep 17, 2023
6 of 7 checks passed
@Techassi
Copy link
Contributor Author

Thanks! I loved working on this. Can't wait to tackle future improvements :)

@nvuillam
Copy link

nvuillam commented Jan 7, 2024

This PR creates a breaking change between lychee 0.13.0 and 0.14.0 :(

#1338

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants