Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support deep links #8

Open
wooorm opened this issue Aug 23, 2018 · 6 comments
Open

Support deep links #8

wooorm opened this issue Aug 23, 2018 · 6 comments

Comments

@wooorm
Copy link
Member

wooorm commented Aug 23, 2018

I’m not sure if it should be in this project (or in check-links), but maybe it’s of interest for something to support URLs like:

https://github.com/davidtheclark/remark-lint-no-dead-urls#readme

As in, parse the HTML and check if there’s an element with an id attribute set to readme.

Wondering what you think, @davidtheclark / @transitive-bullshit!

@transitive-bullshit
Copy link
Contributor

We should definitely try to support this, as it seems like a common use case, though I'm not sure specifically what's being proposed by "deep links".

If your example URL is included in a markdown link, definition, or image node, it will be passed to check-links, which I just tested and it returns successfully as expected:

{ 'https://github.com/davidtheclark/remark-lint-no-dead-urls#readme': { status: 'alive', statusCode: 200 } }

Can you clarify what you mean by parsing the HTML?

@wooorm
Copy link
Member Author

wooorm commented Aug 23, 2018

@transitive-bullshit #hash in URLs is not sent to servers. So you can’t test, with HTTP, whether it works. Before sending the request the #hash is stripped off.

However, I’m sure the #foobarbaz instead of #readme will also report as alive.
Even though when browsing to that URL it would show the top of the screen (where I would previously, say the thing with that ID used to exist, expect it to scroll down).

Use case: I’m writing some specs in markdown, on GitHub. So automatic IDs are generated for each heading. I’m changing headings (so other IDs are going to be used)
I’d like other things depending on this spec to be able to lint for unavailable IDs that used to work, but don’t anymore.

@transitive-bullshit
Copy link
Contributor

Ahhh, that make sense :)

@transitive-bullshit
Copy link
Contributor

This definitely seems like useful functionality. With that being said, I'm not sure how one would do this robustly without adding something like puppeteer to the mix since the target page could be JS rendered. That seems like a pretty heavy-weight solution for a remark lint plugin.

A few ideas on how we could implement this:

  1. We support this functionality in this plugin via an optional param that defaults to false and an optional peer dep on puppeteer.
  2. We support this functionality in a separate plugin, something like remark-lint-valid-deep-anchors.
  3. We punt on this and rely on the end-user to perform additional validation outside the scope of the input document.

In all 3 options, we could encapsulate this functionality into a separate module, check-deep-links or check-deep-anchors.

Thoughts?

@wooorm
Copy link
Member Author

wooorm commented Aug 23, 2018

Yeah, I think it would be pretty involved, hence my idea for another project.
Another thing is that we’d support some caching option (e.g., a week, or so, for this stuff). I think it’s going to take a while.

My use case is remark (or rehype), of course, but it makes sense to build an underlying thing of course, like check-links.

All your proposals make sense to me!

@davidtheclark
Copy link
Contributor

I agree that this would be very useful. Using rehype to parse HTML and look for matching headings seems reasonable to me, and check-links seems like the place to do it, right? (But definitely behind a flag, since it would slow things down significantly.)

Parsing static HTML would work great for the use cases I'd have in mind and would avoid all the complexity of puppeteer. If the rationale for puppeteer is that sometimes JS might insert headings, doesn't that situation introduce other problems — like, the browser won't end up deep linking anyway because the heading is not present in the static HTML?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants