Link check fails for sites with Cloudflare WAF #72

Open
andrewvaughan opened this issue Jan 22, 2024 · 0 comments

Per a lengthy discovery session over at https://github.com/oxsecurity/megalinter:

oxsecurity/megalinter#3304

I believe I've discovered an issue where requests that negotiate ALPN during the TLS handshake receive 403 responses from some servers (a pattern I've seen with sites using Let's Encrypt/Cloudflare, though I can't say for certain that is the cause). Given the popularity of those services, this may limit the usefulness of the tool.

Per my comments in the linked issue, I did some pretty deep digging, but the only thing I found was that I could resolve the invalid 403 results on these servers (specifically, stackoverflow.com) by disabling ALPN (which naturally falls back to HTTP/1.1):

# (You can use any Alpine-based container for this, including the official `markdown-link-check` image)
docker run --entrypoint /bin/bash -it --rm oxsecurity/megalinter-python:v7.7.0

USER_AGENT="Mozilla/5.0 (compatible; markdown-link-check/3.11.2; MegaLinter/7.7.0; +https://megalinter.io)"
URL="https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem/66378#66378"

# Fails using HTTP/2
$ curl -I -A "$USER_AGENT" "$URL"
HTTP/2 403
# ...

# Still fails when only forcing HTTP/1.1 (ALPN is still offered)
$ curl --http1.1 -I -A "$USER_AGENT" "$URL"
HTTP/1.1 403 Forbidden
# ...

# Succeeds disabling ALPN
$ curl --no-alpn -I -A "$USER_AGENT" "$URL"
HTTP/1.1 200 OK
# ...

There are a few hints in the response, though:

cf-mitigated: challenge appears as a response header in both the first and second examples above, meaning that the Cloudflare WAF has served a challenge in place of the requested page because it believes it has detected a bot.
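
As a rough sketch of how that header could be detected from a script, the helper below scans a raw header dump (for example, the output of `curl -sI`) for `cf-mitigated: challenge`. The function name `is_cf_challenge` is hypothetical and is not part of markdown-link-check. Matching is case-insensitive because header-name casing can differ between HTTP/2 and HTTP/1.1 responses.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of markdown-link-check): exits 0 when the
# header text on stdin contains a "cf-mitigated: challenge" response header.
is_cf_challenge() {
  # Strip carriage returns (HTTP header lines end in CRLF), then match the
  # header name and value case-insensitively, tolerating extra whitespace.
  tr -d '\r' | grep -qiE '^cf-mitigated:[[:space:]]*challenge[[:space:]]*$'
}
```

It could be used as `curl -sI -A "$USER_AGENT" "$URL" | is_cf_challenge && echo "Cloudflare challenge detected"`.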

One option may simply be to give the user a configuration option to "ignore Cloudflare challenges" if this response header appears. That said, I did notice that, for whatever reason, the challenge was often bypassed when ALPN was disabled.

It's not clear to me how or why disabling ALPN prevents Cloudflare's WAF from kicking in, but it seems like an easy (if temporary) fix.

I'd recommend checking for a cf-mitigated: challenge response header (or, potentially, for any 403 response) and retrying the check with ALPN disabled. I'm not 100% sure how the needle HTTP client might support this, but it's definitely worth looking into, as WAFs are becoming more and more common. Keying off the header would be the more future-proof part of the solution, as I think Cloudflare missing bots on non-ALPN requests is probably more of a bug than a feature.
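
The retry idea can be sketched with curl, using the same flags as the examples earlier in this issue. This is only an illustration under assumptions: the function name `check_link` and its default user agent are hypothetical, and the real tool would need an equivalent in needle/Node.js rather than a shell script.

```shell
#!/usr/bin/env bash

# Hypothetical sketch: print the final HTTP status code for a URL, retrying
# once with ALPN disabled when the first attempt looks like a WAF challenge.
check_link() {
  local url="$1"
  local agent="${2:-markdown-link-check}"  # assumed default, for illustration
  local headers status

  # First attempt: default curl behavior (ALPN offered during the handshake).
  headers="$(curl -sI -A "$agent" "$url" | tr -d '\r')"
  status="$(printf '%s\n' "$headers" | awk 'NR == 1 { print $2 }')"

  # Retry without ALPN on a 403 or an explicit Cloudflare challenge header.
  if [ "$status" = "403" ] ||
     printf '%s\n' "$headers" | grep -qiE '^cf-mitigated:[[:space:]]*challenge'; then
    headers="$(curl -sI --no-alpn -A "$agent" "$url" | tr -d '\r')"
    status="$(printf '%s\n' "$headers" | awk 'NR == 1 { print $2 }')"
  fi

  printf '%s\n' "$status"
}
```

Calling `check_link "$URL" "$USER_AGENT"` prints the status of the second attempt whenever the first one was blocked, so a link is only reported as failing if it fails both with and without ALPN.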

That said, retrying without ALPN wouldn't be a bad idea to implement, given the popularity of Cloudflare.

It does seem that some options can be managed:
https://www.npmjs.com/package/needle#nodejs-tls-options

I know that one of the first things that happens in a TLS connection is that the client offers its ALPN protocols, so my hope is that there is a way to disable that. That way, the server is not able to select one.

Per the TLS documentation:
https://nodejs.org/docs/latest/api/tls.html#alpn-and-sni

It does seem like there is some level of control after v5.0.0:
https://nodejs.org/docs/latest/api/tls.html#new-tlstlssocketsocket-options

Hope this helps.
