Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

self-closing tag report a warning #1433

Closed
pchampin opened this issue Sep 29, 2022 · 22 comments
Closed

self-closing tag report a warning #1433

pchampin opened this issue Sep 29, 2022 · 22 comments

Comments

@pchampin
Copy link

Disclaimer: this issue is about a warning, not an error. But that's the best category I could find for it.

URL being validated or code to reproduce error: https://perso.liris.cnrs.fr/pierre-antoine.champin/

Warning: Self-closing tag syntax in text/html documents is [widely discouraged](https://google.github.io/styleguide/htmlcssguide.html#Document_Type); it’s unnecessary and [interacts badly with other HTML features](https://software.hixie.ch/utilities/js/live-dom-viewer/?saved=10809) (e.g., unquoted attribute values). If you’re using a tool that injects self-closing tag syntax into all void elements, [without any option to prevent it from doing so](https://github.com/prettier/prettier/issues/5246), then consider switching to a different tool.

From line 4, column 5; to line 4, column 28

head>↩    <meta charset="utf-8" />↩    <

As a teacher, I encourage my students to use self-closing tags, because they make the structure of the document clearer for the novice reader. And I feel entitled to do so, given that this is allowed by the HTML5 syntax.
I also encourage my students to use the validator for all their pages, and to aim for no errors and no warnings. Hence my conundrum.

I would argue that this warning is overly opinionated.

Claiming that this syntax is "widely discouraged", and pointing to guidelines of a single browser vendor, is not compelling.

Granted, the bad interaction with unquoted attributes may be a problem. Why not issue a warning in those specific cases instead?

@sideshowbarker
Copy link
Contributor

sideshowbarker commented Sep 29, 2022

Hi Pierre-Antoine,

Thanks for raising this.

I would argue that this warning is overly opinionated.

I’m very open to refining the wording of that warning

Can you suggest some alternative wording that would make it less opinionated?

As a teacher, I encourage my students to use self-closing tags, because they ake the structure of the document clearer for the novice reader.

Please see https://github.com/orgs/mdn/discussions/242#discussioncomment-3749398

And I feel entitled to do so, given that this is allowed by the HTML5 syntax. I also encourage my students to use the validator for all their pages, and to aim for no errors and no warnings.

https://validator.w3.org/nu/about.html#why-validate is something that you might want to have students read and understand. The purpose of this tool is to help them catch unintended mistakes — mistakes they might have otherwise missed — so that they can fix them.

Another possible teaching point to make is that in some cases the checker may emit warnings — or even errors — about some markup cases that they (students) are using intentionally and so that are not unintended mistakes. And in those cases they have the right and the power to ignore those errors — and the checker even gives them the power to persistently filter out those errors so that they’ll never see them again, even when checking any future documents:

screen recording that shows how to use the Message Filtering button to persistently filter out messages

There are other warnings which the checker emits that are also essentially discretionary warnings. And even many of the cases that are defined in the HTML spec as outright errors are essentially judgement calls — there’s not 100% universal agreement that the cases should be considered errors. There are a range of opinions, and none of this is written in stone.

So students should understand this is one tool among many other tools — in particular, among other kinds of static-analysis tools. And as with messages reported by other kinds of static-analysis tools — such as, say compiler warnings — they may sometimes find that the tool is warning them about something that they are already aware of and are using in spite of the fact they know it’s going to cause the tool (compiler, etc.) to generate a warning. And that can be OK.

Claiming that this syntax is "widely discouraged", and pointing to guidelines of a single browser vendor, is not compelling.

It links to only one style guide because — given that the message is already relatively long — it would be unwieldy to try to link to multiple other style guides in that one message, and it seems sufficient to link to one representative guide that’s very widely used.

And it’s worth mentioning that particular guide governs the formatting of all code examples at the https://web.dev/ site — which, along with the MDN Web Docs site — is one of the most widely-used sources of teaching materials and how-to docs for web developers. So all of the examples on the https://web.dev/ site that have void elements don’t use self-closing-tag syntax in the examples. Which implicitly makes the point.

All that said, rather than linking to that style guide, I would have instead made the message link to a similar admonition in the MDN Web Docs style guide:

Don't include XHTML-style trailing slashes for empty elements because they're unnecessary and slow things down.

…but unfortunately, as can be seen from mdn/content#20523, that admonition was removed from that style guide because the project wants to run the Prettier formatting tool on every single HTML snippet in MDN articles — but Prettier is hard-coded to change all void elements to have self-closing-tag syntax, and doesn’t provide any option for users who don’t like that behavior and would prefer to keep their void elements as-is. So rather than keeping that long-standing guidance, a decision was made to remove it so that the project could use Prettier without conflicting with the project’s own existing style guide.

@phil-pickering
Copy link

Hi Michael,

I'm afraid I'm with @pchampin on this one.

Section 13.1.2.1 Start tags of the HTML Living Standard states:

  1. Then, if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS character (/). This character has no effect on void elements, but on foreign elements it marks the start tag as self-closing.

Hence, it is completely valid HTML to use a self-closing tag for all void elements. As the job of the validator is to point out whether the HTML code used on a web page is either valid or invalid, it seems peculiar that the validator would warn against perfectly valid HTML.

Even if the notion of issuing a warning could be justified, I have a number of issues with the warning itself:

  1. You use the phrase "widely discouraged" and link to the Google HTML/CSS Style Guide. I would argue strongly that a style guide produced by a single browser vendor hardly merits the phrase "widely discouraged", especially when said browser vendor doesn't always follow its own style guide (see https://learndigital.withgoogle.com/digitalgarage/ or https://www.grow.google).

  2. You have attempted to justify the above link, by referring to the excellent web.dev site. However, if you look at the Learn HTML course on that site, you will see this in the Overview section:

Optionally, you can include a slash at the end of the tag, which many people find makes markup easier to read. Continuing with this example, we self close the tag with a slash:

<input type="range"/>
<img src="switch.svg" alt="light _switch"/>

The slash at the end is old school: it's a way of indicating that the element is self-closing and there will be no matched end or closing tag.

  1. You then express your disappointment at not being able to link to a recently removed warning on MDN. This is actually a blessing, because Prettier issues aside, that warning contains a dubious assertion and was factually incorrect. The warning said that trailing slashes "slow things down" - if they meant performance-wise, then this would have a negligible effect in the extreme; if they meant that it takes longer to type, then I think they're really clutching at straws. And regarding the issue of "they can break old browsers", the warning is just wrong. A single browser (Netscape 3 in 1996) supported self-closing tags, but a bug meant it did require a space before the closing slash. That's why so many examples of self-closing tags around today still use that space.

  2. Another part of the warning mentions that it "interacts badly with other HTML features (e.g., unquoted attribute values)", yet if you were following the Google HTML/CSS Style Guide (which you appear to hold in high regard) you would already be quoting your attribute values anyway. And like I mentioned above, many developers who continue to use self-closing tags, always add a space before the closing slash which prevents this issue from happening whether the attribute values are quoted or unquoted.

  3. My final issue is the very unfair dig at Prettier. I can understand why you might think the validator has a responsibility to include an element of subjectivity over some topics, but this is a going a bit far. Even the Prettier user who opened the GitHub issue you linked to has subsequently changed their mind. The whole point of Prettier is to allow development teams to adopt a universal coding style without all the usual arguments over subjective topics like the use of self-closing tags. And whilst I don't use it myself, well over half of front-end developers do, and this means that instead of "switching to a different tool" they're more likely to stop using the validator then dump Prettier.

And one final point, which I'll leave it to you as to whether you think it's relevant or not.

XML is still a thing, and if you're working with HTML code and anything XML-related, you still need to use self-closing tags. For example, JSX can be used to write HTML in React. JSX follows XML rules, so every element needs to be closed (including void elements). There are already a great number of React developers out there with only a minimal understanding of HTML. It would be a tremendous shame if the few React developers who actually use the validator are rewarded with constant spurious warnings about their perfectly valid HTML.

@sideshowbarker
Copy link
Contributor

sideshowbarker commented Oct 4, 2022

Section 13.1.2.1 Start tags of the HTML Living Standard states:

  1. Then, if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS character (/). This character has no effect on void elements, but on foreign elements it marks the start tag as self-closing.

Hence, it is completely valid HTML to use a self-closing tag for all void elements.

Per the spec, your statement above is not strictly true, actually. At least not as currently worded.

The reason is, the spec says that the solidus marks the start tag as self-closing only for foreign elements — but for void elements, the spec explicitly does not say that it marks the start tag as self-closing.

And that wording in the spec is not a mistake or oversight. The spec is very intentionally worded like that, to exclude the use of the solidus on void elements to mark them as self-closing.

That said, it is still is actually valid to use the solidus with void-element start tags for literally any purpose except to mark the start tag as self-closing. For example, “I think it looks better to have solidus at the end of void-element start tags” is, per-spec, a valid reason to put a solidus in.

So as far as the checker behavior goes, if it were somehow possible for the checker to determine that a document were using a solidus on a void-element start tag to mark it as self-closing, then the checker could emit an error for that — since per-spec that’s not valid.

But of course the checker, given just a document in isolation from the author, has no way to determine what purpose the author had for putting a solidus into a particular void-element start tag.

And so that’s why the checker emits a warning: To help the author understand that, per-spec, using a solidus in a void-element start tag is a different case than using one in the start tag of an SVG or MathML.

But all that said, your comments have made me realize that the checker message needs to be doing more to help authors — by pointing authors to an explanation along the lines of what I outlined above.

So when I can make time (hopefully later today), I plan to add a page to this repo’s wiki, with a detailed explanation along the lines above — and with any other helpful info I can think to fit into that page — and then update the link in the checker warning to point to that wiki page.

After I’ve gotten that done, I’ll post another comment here to let everybody know.

@pchampin
Copy link
Author

pchampin commented Oct 4, 2022

Thanks @sideshowbarker and @phil-pickering for the constructive discussion.

@sideshowbarker wrote

And so that’s why the checker emits a warning: To help the author understand that, per-spec, using a solidus in a void-element start tag is a different case than using one in the start tag of an SVG or MathML.

Maybe they do understand, but have other valid reason to use it on void-elements (educational, or polyglot HTML as acutely pointed by @phil-pickering)... Having the validator assuming that I don't know what I'm doing is a bit offending :-)

@phil-pickering
Copy link

phil-pickering commented Oct 4, 2022

Sorry if I'm labouring my point here...

But every single one of the following web developers continues to use self-closing tags on their own websites.

I use the term "web developer" very loosely here as it hugely understates their contribution to the development and adoption of HTML/CSS and web standards 😄

@sideshowbarker
Copy link
Contributor

Update: In response to feedback here and elsewhere, I’ve made the following changes:

@pchampin
Copy link
Author

pchampin commented Oct 6, 2022

I can definitely live with that. Thanks @sideshowbarker.

@pchampin pchampin closed this as completed Oct 6, 2022
@pchampin
Copy link
Author

pchampin commented Oct 6, 2022

Ironically, a trailing slash used with an unquoted attribute does not raise a warning 😈
e.g.

<!DOCTYPE html>
<html lang=en>
<title>trailing slash strikes again</title>
<img src=test.png alt=test/>

@sideshowbarker
Copy link
Contributor

Ironically, a trailing slash used with an unquoted attribute does not raise a warning 😈

Yeah, I wish we could readily catch that — but catching it in the current checker architecture would be a lot more work. There are at least four separate conditions that would need to all be true in order to merit a warning:

  1. A start tag has an attribute that’s unquoted.
  2. That attribute is the last attribute in the element’s start tag.
  3. There is no space after the attribute and the closing angle bracket of the start tag.
  4. The attribute value has a trailing slash (or the start tag has a trailing slash — however one wants to look at it).

Of those four, the only condition that can be checked in the post-parsing document tree (essentially the equivalent of the DOM) built by the HTML parser is #4. The other conditions are lexical differences that would all need to be checked in the tokenizer code which first builds that tree.

It would still be possible to check those conditions in the tokenizer — but it would involve bolting on a significant amount of additional code to keep state information that the tokenizing and treebuilding code otherwise have no need for.

@sideshowbarker
Copy link
Contributor

I meant to also note here that the https://github.com/validator/validator/wiki/Markup-»-Void-elements page which is linked to in the checker message also has a Suppressing the “Trailing slash on void elements has no effect” message section that shows exactly how to persistently suppress/filter/ignore the message.

@phil-pickering
Copy link

Michael,

I think this is a very good compromise and I really appreciate the work you've put into it.

The wiki article is fascinating and has certainly made me realise that the use of a trailing slash for void elements is much more involved than simply a "stylistic" issue.

Many thanks again,

Phil

@sideshowbarker
Copy link
Contributor

@phil-pickering Cheers, and thanks for using and caring about the checker — and for helping make it better

@martindholmes
Copy link

@sideshowbarker I don't believe this is the correct solution for all cases. Those of us using XHTML5 (HTML5 which is well-formed XML) can't and shouldn't leave these tags unclosed, so the "INFO" provided by this message is wrong and misleading. I believe the message should not be shown at all for files which are well-formed XML and which bear the XHTML namespace.

There are very good reasons for continuing to use XHTML5, not least because such documents can be transformed with XSLT and queried with XQuery and other tools, so although I know there is some prejudice in the HTML community against XHTML these days, I don't think it's justified, and I don't think the validator should encourage the view that XHTML 5 is somehow not "good" HTML.

@sideshowbarker
Copy link
Contributor

@martindholmes Documents that are well-formed XML and designed to be transformable with XSLT and XQuery and other tools can be served with an XML MIME type, such as application/xhtml+xml.

In that case, browsers and the checker will behave like the other XML tools mentioned (XSLT, XQuery, etc., tools) — that is, they’ll parse the document using an XML parser rather than an HTML parser.

The checker only reports the “Trailing slash on void elements” message when a document is parsed using the HTML parser.

@martindholmes
Copy link

When you're using the validator to validate files on a local filesystem, though -- which should be the commonest use-case, because you want to validate before publishing files to a server -- there's no access to a mime type.

@sideshowbarker
Copy link
Contributor

@martindholmes

When you're using the validator to validate files on a local filesystem, though -- which should be the commonest use-case, because you want to validate before publishing files to a server -- there's no access to a mime type.

OK, thanks, yeah that use case makes sense — so I’ve added a new --xml option to the command-line checker; that option forces the XML parser to be used rather than the HTML parser.

Please try the latest at https://github.com/validator/validator/releases/download/latest/vnu.jar, and if it doesn’t work as expected, please file a new issue with details.

The command-line checker did already automatically always use the XML parser by default for files with .xhtml extensions. But I can see that’s too limiting, because once you actually serve the files over the web, you can configure your web server to send them with an application/xhtml+xml (or whatever) XML MIME type even if they have .html extensions — or really, for documents with whatever other extension you might want to use (or even for documents with no extension at all).

So with the --xml option, when checking some particular file, you can now get the same behavior from the command-line checker that you’d get if you have your web server configured to serve that file with an XML MIME type.

@martindholmes
Copy link

@sideshowbarker Works a treat. The only thing I'm getting now is this:

"Error: Bad value “Content-Type” for attribute “http-equiv” on element “meta”."

That's for this meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

which I guess you would say is actually wrong.

(I wish I could just produce HTML that happens to be well-formed XML in the XHTML namespace and have it be treated as HTML. Oh well.)

@sideshowbarker
Copy link
Contributor

The only thing I'm getting now is this:

"Error: Bad value “Content-Type” for attribute “http-equiv” on element “meta”."

That's for this meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

Yeah, the HTML standard explicitly prohibits http-equiv="content-type" in XML documents.

See https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type

The Encoding declaration state may be used in HTML documents, but elements with an http-equiv attribute in that state must not be used in XML documents.

But what the standard does instead allow both in HTML documents and XML documents is <meta charset="utf-8"/> — though, like <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>, for documents served with an XML MIME type, it doesn’t actually have any effect of any kind in browsers (nor any other tools that use an XML parser).

See https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-charset

The charset attribute on the meta element has no effect in XML documents, but is allowed in XML documents in order to facilitate migration to and from XML.

And for HTML documents,<meta charset="utf-8"/> has exactly the same effect in browsers that
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> does.

@validator validator deleted a comment from martindholmes Oct 22, 2022
@martindholmes
Copy link

@sideshowbarker My testing suggests that serving these files as HTML with the HTML doctype and the <meta charset="utf-8"/> tag works well -- character encoding is honoured, whereas it's not if both the charset meta and the http-equiv meta are omitted, in which case it defaults to 8859-1as expected.

I've never understood why anyone would want to actively discourage XML well-formedness when it provides extra utility with no downside. When empty tags are not closed, the parser must understand the schema in order to interpret them; without knowledge of the content model, it can't determine whether it's an accidentally-unclosed container or not. That dependence on the schema is an additional burden that provides no benefit as far as I can see. And the advantages of having HTML that can be processed with XML tools is a huge benefit.

@Germs2004
Copy link

My $0.02 : turn off showing "Info" level results by default. Or, let filters be configurable by a URL parameter so I can disable this one with a bookmark.

Also, I suggest sorting the results so more important ones appear first. I had a bunch of these Infos surrounding 2 warnings, so I didn't know they were even there until I used the filter to hide the Infos.

@Heptazhou
Copy link

Couldn't agree more

I've never understood why anyone would want to actively discourage XML well-formedness when it provides extra utility with no downside. When empty tags are not closed, the parser must understand the schema in order to interpret them; without knowledge of the content model, it can't determine whether it's an accidentally-unclosed container or not. That dependence on the schema is an additional burden that provides no benefit as far as I can see. And the advantages of having HTML that can be processed with XML tools is a huge benefit.

thestinger added a commit to GrapheneOS/grapheneos.org that referenced this issue Jan 5, 2023
Until there's a new tagged release with the change resolving this issue:
validator/validator#1433.
thestinger added a commit to GrapheneOS/AttestationServer that referenced this issue Jan 8, 2023
Until there's a new tagged release with the change resolving this issue:
validator/validator#1433.
@oliviertassinari
Copy link

My $0.02 : turn off showing "Info" level results by default. Or, let filters be configurable by a URL parameter so I can disable this one with a bookmark. Also, I suggest sorting the results so more important ones appear first. I had a bunch of these Infos surrounding 2 warnings, so I didn't know they were even there until I used the filter to hide the Infos.

Yes, please:

  1. Hide the info level by default, this is noise:
Screenshot 2023-07-16 at 13 55 18
  1. Sort the results to show errors first, this is backward:
Screenshot 2023-07-16 at 13 55 50

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants