Support deep links #8

wooorm · 2018-08-23T18:16:36Z

I’m not sure if it should be in this project (or in check-links), but maybe it’s of interest for something to support URLs like:

https://github.com/davidtheclark/remark-lint-no-dead-urls#readme

As in, parse the HTML and check if there’s an element with an id attribute set to readme.

Wondering what you think, @davidtheclark / @transitive-bullshit!

The text was updated successfully, but these errors were encountered:

transitive-bullshit · 2018-08-23T18:26:47Z

We should definitely try to support this, as it seems like a common use case, though I'm not sure specifically what's being proposed by "deep links".

If your example URL is included in a markdown link, definition, or image node, it will be passed to check-links, which I just tested and it returns successfully as expected:

{ 'https://github.com/davidtheclark/remark-lint-no-dead-urls#readme': { status: 'alive', statusCode: 200 } }

Can you clarify what you mean by parsing the HTML?

wooorm · 2018-08-23T18:32:03Z

@transitive-bullshit #hash in URLs is not sent to servers. So you can’t test, with HTTP, whether it works. Before sending the request the #hash is stripped off.

However, I’m sure the #foobarbaz instead of #readme will also report as alive.
Even though when browsing to that URL it would show the top of the screen (where I would previously, say the thing with that ID used to exist, expect it to scroll down).

Use case: I’m writing some specs in markdown, on GitHub. So automatic IDs are generated for each heading. I’m changing headings (so other IDs are going to be used)
I’d like other things depending on this spec to be able to lint for unavailable IDs that used to work, but don’t anymore.

transitive-bullshit · 2018-08-23T18:32:38Z

Ahhh, that make sense :)

transitive-bullshit · 2018-08-23T18:42:06Z

This definitely seems like useful functionality. With that being said, I'm not sure how one would do this robustly without adding something like puppeteer to the mix since the target page could be JS rendered. That seems like a pretty heavy-weight solution for a remark lint plugin.

A few ideas on how we could implement this:

We support this functionality in this plugin via an optional param that defaults to false and an optional peer dep on puppeteer.
We support this functionality in a separate plugin, something like remark-lint-valid-deep-anchors.
We punt on this and rely on the end-user to perform additional validation outside the scope of the input document.

In all 3 options, we could encapsulate this functionality into a separate module, check-deep-links or check-deep-anchors.

Thoughts?

wooorm · 2018-08-23T18:48:47Z

Yeah, I think it would be pretty involved, hence my idea for another project.
Another thing is that we’d support some caching option (e.g., a week, or so, for this stuff). I think it’s going to take a while.

My use case is remark (or rehype), of course, but it makes sense to build an underlying thing of course, like check-links.

All your proposals make sense to me!

davidtheclark · 2018-09-02T18:56:19Z

I agree that this would be very useful. Using rehype to parse HTML and look for matching headings seems reasonable to me, and check-links seems like the place to do it, right? (But definitely behind a flag, since it would slow things down significantly.)

Parsing static HTML would work great for the use cases I'd have in mind and would avoid all the complexity of puppeteer. If the rationale for puppeteer is that sometimes JS might insert headings, doesn't that situation introduce other problems — like, the browser won't end up deep linking anyway because the heading is not present in the static HTML?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support deep links #8

Support deep links #8

wooorm commented Aug 23, 2018

transitive-bullshit commented Aug 23, 2018

wooorm commented Aug 23, 2018

transitive-bullshit commented Aug 23, 2018

transitive-bullshit commented Aug 23, 2018

wooorm commented Aug 23, 2018

davidtheclark commented Sep 2, 2018

Support deep links #8

Support deep links #8

Comments

wooorm commented Aug 23, 2018

transitive-bullshit commented Aug 23, 2018

wooorm commented Aug 23, 2018

transitive-bullshit commented Aug 23, 2018

transitive-bullshit commented Aug 23, 2018

wooorm commented Aug 23, 2018

davidtheclark commented Sep 2, 2018