Publish list of known fragment identifiers #1198

Open
foolip opened this issue Mar 27, 2024 · 4 comments

@foolip
Member

foolip commented Mar 27, 2024

This would be helpful for web-platform-dx/web-features#84, to be able to create a spec URL validator that checks if a URL like https://w3c.github.io/webrtc-pc/#dom-datachannel-binarytype is a good spec URL.

A similar problem is solved in Bikeshed by downloading the data directly from GitHub:
https://github.com/speced/bikeshed/blob/584813e6380533a19c6656594c810bf974854e68/bikeshed/update/updateCrossRefs.py#L236

For something that should go into a CI check, that's not good though, since the build could break at any time.

@tidoust
Member

tidoust commented Mar 28, 2024

> This would be helpful for web-platform-dx/web-features#84, to be able to create a spec URL validator that checks if a URL like https://w3c.github.io/webrtc-pc/#dom-datachannel-binarytype is a good spec URL.

The list of fragment identifiers already appears in the ids extracts. For example, the URL you give as an example appears in the WebRTC ids extract.
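
A validator along these lines could be sketched as follows. This is only an illustration: it assumes an ids extract has already been loaded as a flat list of absolute URLs that include the fragment, and both `SAMPLE_IDS` and `is_known_spec_url` are hypothetical names, not part of Webref.

```python
from urllib.parse import urldefrag

# Hypothetical sample of an ids extract. Assumption: the extract boils down
# to a flat list of absolute URLs, each including its fragment identifier.
SAMPLE_IDS = [
    "https://w3c.github.io/webrtc-pc/#dom-datachannel-binarytype",
]


def is_known_spec_url(url: str, known_ids: list[str]) -> bool:
    """Return True if `url` has a fragment and appears in the known ids."""
    _, fragment = urldefrag(url)
    if not fragment:
        # A URL without a fragment cannot be checked against an ids list.
        return False
    return url in set(known_ids)
```

A CI check would then call `is_known_spec_url` on each spec URL and fail on the first `False`.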

> For something that should go into a CI check, that's not good though, since the build could break at any time.

We could create an NPM package but I'm wondering how that would solve "could break at any time". Could you clarify?

If we go ahead with a package, I wonder about the frequency of releases and about guarantees. We don't do any data curation on fragment identifiers (and if we could avoid doing additional curation, I think we wouldn't mind ;)). We could automate the publication of the package but the list of fragment identifiers changes frequently. Should we publish a package one or more times per day? Or should we restrict publications to, say, once per week?

@foolip
Member Author

foolip commented Mar 28, 2024

These are good questions. The important part to avoid sudden breakage of CI is that the IDs are pinned in some way. An NPM package makes that easy and allows relying on Dependabot for updates. But it can also be done by pointing to a specific webref commit, perhaps using it as a submodule.
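
As a rough sketch of the "specific commit" option: the helper below builds a raw GitHub URL only when given a full commit SHA, so a moving branch name cannot slip in. The `ed/ids/<shortname>.json` path and the `pinned_extract_url` name are assumptions made for illustration.

```python
import re

# Assumed layout, for illustration only: one JSON ids extract per spec under
# ed/ids/ in the w3c/webref repository.
RAW_BASE = "https://raw.githubusercontent.com/w3c/webref"
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def pinned_extract_url(commit: str, shortname: str) -> str:
    """Build a raw URL for an ids extract at an exact commit."""
    if not FULL_SHA.fullmatch(commit):
        # Branch names such as "main" move over time, which is exactly the
        # sudden-breakage risk that pinning is meant to avoid.
        raise ValueError(f"not a full commit SHA: {commit!r}")
    return f"{RAW_BASE}/{commit}/ed/ids/{shortname}.json"
```

Updating the pinned SHA then becomes an explicit, reviewable change rather than something that happens behind the CI's back.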

The release cadence is a good question. I guess roughly weekly would be OK. And I agree that it would be fantastic to not have to review changes to identifiers at all or make many guarantees, just expose the same stuff that Bikeshed uses.

This isn't urgent at all BTW, it's a nice-to-have.

@tidoust
Member

tidoust commented Mar 28, 2024

It suddenly occurs to me that looking at the full list of fragment identifiers is probably not a good idea in any case: the "pinning" mechanism you describe is also the sort of stability that specs need when they reference some other spec. This is what led to exported definitions. Ideally, features would only link to exported definitions... and likely section headings. In any case, links to internal definitions and other IDs should be discouraged.

The data's already in Webref too, in dfns and headings extracts.
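
Restricting the list to stable identifiers might look roughly like this. The field names (`href`, `access`) and the sample entries are assumptions about the shape of the dfns and headings extracts, and `stable_urls` is a hypothetical helper.

```python
# Hypothetical entries shaped like dfns and headings extracts. The field
# names ("href", "access") are assumptions about the extract format.
SAMPLE_DFNS = [
    {"href": "https://example.org/spec/#dfn-exported", "access": "public"},
    {"href": "https://example.org/spec/#dfn-internal", "access": "private"},
]
SAMPLE_HEADINGS = [
    {"href": "https://example.org/spec/#introduction"},
]


def stable_urls(dfns: list[dict], headings: list[dict]) -> set[str]:
    """Collect URLs of exported definitions and section headings."""
    exported = {d["href"] for d in dfns if d.get("access") == "public"}
    return exported | {h["href"] for h in headings}
```

With the sample data, the internal definition is dropped while the exported definition and the heading are kept.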

We have tools in place that detect broken links (w3c/strudy) from Webref data and report them automatically. We could also detect changes earlier on in Webref. In the end, we could perhaps create a package that contains stable fragment identifiers (exported definitions and section ids), and use some semver logic to report breaking changes:

  • patch increment: new fragment identifiers added
  • minor increment: some fragment identifiers disappeared
  • major increment: major data structure change
    (or major increment for any fragment identifier change)
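
The first scheme in the list above could be automated along these lines. This is a sketch only: `next_version` is a hypothetical helper, and a major increment for data structure changes would still have to be decided by hand, since a set comparison cannot detect it.

```python
def next_version(version: str, old_ids: set[str], new_ids: set[str]) -> str:
    """Bump `version` by comparing two sets of fragment identifiers."""
    major, minor, patch = (int(part) for part in version.split("."))
    if old_ids - new_ids:
        # Some identifiers disappeared: minor increment, patch resets.
        return f"{major}.{minor + 1}.0"
    if new_ids - old_ids:
        # Only additions: patch increment.
        return f"{major}.{minor}.{patch + 1}"
    # Nothing changed, so there is nothing to publish.
    return version
```

Switching to the stricter alternative (major increment for any removal) would only require changing the first branch.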

@foolip
Member Author

foolip commented Mar 28, 2024

Good point about not all IDs being good feature links, I didn't even consider other linkable things like examples and whatnot.

I strongly suspect that doing this will reveal lots of things that aren't exported but should be, and that it will be a bit of a slog.

But I like the approach!
