Validate lang #5186

Open
2 of 5 tasks
pickfire opened this issue Apr 5, 2023 · 3 comments
Comments

@pickfire

pickfire commented Apr 5, 2023

I noticed that https://www.ihcblog.com/http-framework-design-axum-as-an-example/ and even the examples shown in this repository use zh-CN and zh-TW, as seen in

* zh-CN => zh-cn

and other places. According to the specification, these are not valid options.

This is not the first time I have noticed this mistake on websites, so I guess it is quite common. If the lang value is not one the browser recognizes, the browser renders the page as if it were in a language other than the one specified. In my case, the browser then picks the wrong font from my system configuration (I configured Arch Linux to use a different Chinese font for readability, as described in https://wiki.archlinuxcn.org/wiki/%E5%AD%97%E4%BD%93%E9%85%8D%E7%BD%AE/%E4%B8%AD%E6%96%87).

With the broken lang from the link above, it looks like this:

image

When it is correct, it looks like this (which is what Wikimedia/Wikipedia has done):

image

image

See https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/lang and https://datatracker.ietf.org/doc/html/rfc5646. zh-CN should be either zh-Hans-CN or zh-Hans, and zh-TW should be zh-Hant-TW or zh-Hant. If validation is added, it should ideally suggest the new correct value to users who entered one of the old incorrect values.
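
The suggestion step described above could be sketched roughly as follows. This is only an illustration and not existing Hexo behavior: `SCRIPT_SUGGESTIONS` and `suggestLang` are hypothetical names, and the table holds only the two tags discussed in this issue.

```typescript
// Hypothetical lookup table (not part of Hexo): region-only Chinese tags
// mapped to the script-qualified tags suggested in this issue.
const SCRIPT_SUGGESTIONS = new Map<string, string>([
  ["zh-cn", "zh-Hans-CN"],
  ["zh-tw", "zh-Hant-TW"],
]);

// Returns a suggested replacement for `lang`, or null when there is none.
// Matching is case-insensitive, so "zh-CN" and "zh-cn" behave the same.
function suggestLang(lang: string): string | null {
  const key = lang.trim().toLowerCase();
  return SCRIPT_SUGGESTIONS.get(key) ?? null;
}

// Example: warn about a configured value instead of rejecting it outright.
const configured = "zh-CN";
const suggestion = suggestLang(configured);
if (suggestion !== null) {
  console.warn(`lang "${configured}": consider "${suggestion}" instead`);
}
```

Warning rather than failing keeps existing sites building while still pointing users at the script-qualified tag.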

I checked the same for Japanese, but I don't think ja has this issue, since Japanese has no distinction like the one Chinese has between simplified (zh-Hans) and traditional (zh-Hant).

Check List

Please check the following before submitting a new issue.

Expected behavior

Hexo should prevent users from setting invalid values.
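
A minimal sketch of such a check, assuming only the common `language[-Script][-REGION]` shape needs to be accepted. `SIMPLE_TAG` and `normalizeLangTag` are hypothetical names, not Hexo APIs. The check only looks at the shape of the tag: it does not consult the IANA Language Subtag Registry, so a well-formed but unregistered tag still passes.

```typescript
// Simplified shape check: language[-Script][-REGION].
// Assumption: this covers only a common subset of BCP 47 (RFC 5646);
// variants, extensions and private-use subtags are deliberately rejected.
const SIMPLE_TAG = /^([a-z]{2,3})(?:-([a-z]{4}))?(?:-([a-z]{2}|[0-9]{3}))?$/i;

// Hypothetical helper (not part of Hexo): returns the tag with conventional
// casing (e.g. zh-Hans-CN), or null when the shape is not recognized.
function normalizeLangTag(input: string): string | null {
  const match = SIMPLE_TAG.exec(input.trim());
  if (match === null) return null;

  const language = match[1] ?? "";
  const script = match[2];
  const region = match[3];

  const parts: string[] = [language.toLowerCase()];
  if (script) {
    // Script subtags are conventionally written in title case: "Hans".
    parts.push(script.charAt(0).toUpperCase() + script.slice(1).toLowerCase());
  }
  if (region) {
    // Region subtags are conventionally written in upper case: "CN".
    parts.push(region.toUpperCase());
  }
  return parts.join("-");
}

// Example: report values that do not look like a language tag at all.
for (const value of ["zh-hans-cn", "zh_CN", "nani"]) {
  const normalized = normalizeLangTag(value);
  console.log(value, "=>", normalized ?? "not a recognized tag shape");
}
```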

Actual behavior

Hexo (most likely) just accepts whatever the user enters, since zh-CN works.

How to reproduce?

  • Set zh-CN (or even nani) for lang
  • It (probably) does not say if the lang is valid or not

Is the problem still there under "Safe mode"?

Not sure; I did not try.

Environment & Settings

I am not a hexo user.

Hexo and Plugin version(npm ls --depth 0)

Your package.json package.json

Others

@lorezyra

Language codes are based on the ISO 639-1 standard. HTML and CSS recognize these codes. However, zh-Hans-CN and zh-Hant-TW are not recognized universally by HTML/CSS/JS...

When I wrote the read-time plugin for hexo (https://github.com/AsemAlhaidary/hexo-generator-readtime/), I did find codes for zh-Hans and zh-Hant.

I don't see the need to change the lang-code until the specification for ISO-639 is changed and recognized by all major browsers.

@jonassmedegaard

ISO-639-1 indeed defines languages, but what is needed on the Web is more nuanced.

W3 describes it well here:

Content authors and webmasters also need to know how to use values for languages in a standard way. The current standard approach for W3C specifications is to use the rules expressed in BCP 47. This replaces earlier specifications such as RFC 3066 and RFC 1766, and goes beyond information available in the ISO language and country standards. You should also use the IANA Language Subtag Registry to look up language tags, rather than the ISO specifications.

(emphasis mine)

@lorezyra

@jonassmedegaard,

You are correct that BCP 47 can be used. And technically, there's nothing stopping us from using that in our Hexo projects.

When I wrote the read-time plugin, I used the ISO standard as I didn't see it mentioned in the MDN docs. W3 promotes BCP, but not everyone recognizes that. I know that Google translate supports the BCP-47 specifications. And, there is some overlap between ISO-639 and BCP-47.

I've added aliases in my read-time plugin for support of other lang codes. And, it's trivial to add such support into Hexo (for your own theme).

I'm in the process of building a Hexo theme that supports at least 28 languages. It's designed for professional bloggers that want to give their audiences a custom experience similar to Twitter or Mastodon. I've spent the past 18 months building it and I don't feel it's complete. But feel free to visit the dev version of my theme: https://2022.blog.richiebartlett.com .
