Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

github linkcheck anchor change in 4.1.0 break some usage #9435

Closed
alex opened this issue Jul 12, 2021 · 11 comments
Closed

github linkcheck anchor change in 4.1.0 break some usage #9435

alex opened this issue Jul 12, 2021 · 11 comments

Comments

@alex
Copy link
Contributor

alex commented Jul 12, 2021

Describe the bug

Given a link like:

.. _`OpenSSL's test vectors`: https://github.com/openssl/openssl/blob/97cf1f6c2854a3a955fd7dd3a1f113deba00c9ef/crypto/evp/evptests.txt#L232 

in a github doc, with release 4.1.0 this will fail with linkcheck, while previously it worked.

How to Reproduce

$ git clone https://github.com/pyca/cryptography
$ cd cryptography
$ tox -e docs-linkcheck

Expected behavior

It passes.

Your project

https://github.com/pyca/cryptography

Screenshots

No response

OS

Linux

Python version

3.9.5

Sphinx version

4.1.0

Sphinx extensions

No response

Extra tools

No response

Additional context

The relevant change happened in 92335bd

Failing test logs: https://github.com/pyca/cryptography/runs/3046691393

@alex
Copy link
Contributor Author

alex commented Jul 14, 2021

A simple fix here would be for the linkcheck logic to simply exclude links of the form L\d+.

But of course really the current logic is broken for any link that's generated by github itself and thus doesn't have this prefix.

@sbesson
Copy link
Contributor

sbesson commented Jul 15, 2021

But of course really the current logic is broken for any link that's generated by github itself and thus doesn't have this prefix.

Agreed. Here's a subset of native GitHub anchor prefixes that are currently broken in Sphinx 4.1.0: #L, #issuecomment-, #pullrequestreview-, #commits-pushed-, #event-, #ref-commit-, #ref-pullrequest. My feeling is that it's going to be hard to maintain an exhaustive list of valid prefixes especially since there does not seem to be any reference authoritative GitHub page listing all of these.

@alex
Copy link
Contributor Author

alex commented Jul 15, 2021

Yes, it's a pretty unfortunate circumstance. It's not at all surprising github doesn't make a list of these, I haven't seen any website with a list of their anchor refs!

ATM I'm interested in whatever solution will get this back to a working state as quickly as possible.

@sbesson
Copy link
Contributor

sbesson commented Jul 15, 2021

Completely untested but my understanding of the original PR is that it attempts to fix the linkchecking of anchors in the scenario of a rendered rST/Markdown file. As an immediate patch, would it help to add an additional check to the rewrite_github_anchor method of the like of the parsed.path.endswith('.rst') or parsed.path.endswith('.md')? Happy to open a PR to that effect.

@tk0miya tk0miya added this to the 4.1.2 milestone Jul 15, 2021
@tk0miya
Copy link
Member

tk0miya commented Jul 15, 2021

OMG! I must admit our approach in #9016 is wrong. So we have to reconsider the way to check the anchors in reST/MD files. Until then, we have to fix the broken linkcheck-builder. I think it's better to disable rewrite_github_anchor handler for a while.

@alex
Copy link
Contributor Author

alex commented Jul 15, 2021

That'd be fine with me :-)

@astrojuanlu
Copy link
Contributor

In #9260 a new linkcheck-process-uri event was added, which I think it was a good idea. The problem here is that the current rewrite_github_anchor is insufficient - perhaps it should be moved to an extension that folks can install separately?

tk0miya added a commit to tk0miya/sphinx that referenced this issue Jul 18, 2021
The approach of `rewrite_github_anchor` makes some anchors valid.  But
it also makes other kind of anchors invalid.  This disables the handler
to make them valid again (while 4.1.x release).
@tk0miya
Copy link
Member

tk0miya commented Jul 18, 2021

I don't think so. The rewrite_github_anchor's approach is rewriting the anchor of the URLs to correct one. It means the handler must know all kinds of the form of anchors. But the format is not in public. So I think it's difficult to maintain the module. Another idea to resolve the anchors is 1) checking the original anchor exists, and 2) retry the modified anchor when not found.

tk0miya added a commit that referenced this issue Jul 19, 2021
Fix #9435: linkcheck: Failed to check anchors in github.com
@tk0miya tk0miya closed this as completed Jul 19, 2021
@alex
Copy link
Contributor Author

alex commented Jul 19, 2021

Looks like everything in the 4.1.2 milestone is complete. Should we expect a release soon?

@tk0miya
Copy link
Member

tk0miya commented Jul 20, 2021

@alex Please wait a moment. I did not read other issues yet. Some issues might be added to the milestone after that. But I understand this issue is also important. So I'll ship 4.1.2 this weekend even if some issues will not be resolved.

@alex
Copy link
Contributor Author

alex commented Jul 20, 2021 via email

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants