Commit
Markdown: Do not insert spaces between Chinese/Japanese & latin lette…
Showing 6 changed files with 143 additions and 45 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
#### No more inserting space between Chinese or Japanese (e.g. hanzi and kana) and western characters (#11597 by @tats-u) | ||
|
||
<!-- Optional description if it makes sense. --> | ||
|
||
The current behavior of inserting whitespace (U+0020) between Chinese or Japanese (e.g. hanzi / kanji and kana) and western (e.g. alphanumerics) characters is not based on the official layout guidelines for Japanese and Chinese, but on [a non-standard, local guideline for Chinese](https://github.com/ruanyf/document-style-guide/blob/master/docs/text.md). | ||
|
||
Official Japanese guideline (W3C): | ||
|
||
> 3.9.1 Differences in Positioning of Characters and Symbols | ||
> | ||
> The positioning of characters and symbols may vary depending on the following. | ||
> | ||
> d. Are characters and symbols appearing in sequence in solid setting, or will there be a fixed size space between them? For example, sequences of ideographic characters (cl-19) and hiragana (cl-15) are set solid, and for Western characters (cl-27) following hiragana (cl-15) there will be quarter em spacing. | ||
<https://www.w3.org/TR/jlreq/#differences_in_positioning_of_characters_and_symbols> | ||
|
||
> “one quarter em” means one quarter of the full-width size. (JIS Z 8125) | ||
> “one quarter em space” means amount of space that is one quarter size of em space. | ||
<https://www.w3.org/TR/jlreq/#term.quarter-em> | ||
<https://www.w3.org/TR/jlreq/#term.quarter-em-space> | ||
|
||
Official Japanese guideline (JIS X 4051:2004): | ||
|
||
> 4.7 和欧文混植処理 | ||
> | ||
> a) 横書きでは,和文と欧文との間の空き量は,四分アキを原則とする。 | ||
> | ||
> 4.7 Mixed Japanese and Western Text Composition | ||
> | ||
> a) In horizontal writing, the space between Japanese and western text should be one quarter em, as a rule. | ||
> | ||
> Note: Original text is written only in Japanese and translation is based on [DeepL](https://www.deepl.com/translator). | ||
<https://kikakurui.com/x4/X4051-2004-02.html> (Japanese) | ||
|
||
Official Chinese guideline (W3C): | ||
|
||
> 3.2.2 Mixed Text Composition in Horizontal Writing Mode | ||
> | ||
> In principle, there is tracking or spacing between an adjacent Han character and a Western character of up to one quarter of a Han character width, except at the line start or end. | ||
> Another approach is to use a Western word space (U+0020 SPACE), in which case the width depends on the font in use. | ||
<https://www.w3.org/TR/clreq/#mixed_text_composition_in_horizontal_writing_mode> | ||
|
||
Substituting whitespace (U+0020) for a quarter-em space is allowed only in Chinese, even though the two look similar. Even for Chinese, the W3C guideline page does not adopt this rule itself; it only mentions it as one of the options. | ||
|
||
Some renderers (e.g. converting to PDF using Pandoc with a LaTeX backend) can automatically insert a genuine quarter-em space. The width of whitespace differs from one quarter em, so inserting whitespace (U+0020) takes away the option of leaving it to renderers to insert the quarter-em space. Adding space should be left to renderers and should not be done by Prettier. | ||
|
||
Adding whitespace may interfere with searches for text containing both Chinese or Japanese and western characters. For example, you cannot find “第1章” (Chapter 1) in a Markdown document, or in a document derived from it, simply by searching for the string “第1章”. | ||
|
||
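The search problem can be reproduced in a few lines. The sketch below is only an illustration in Python: `insert_spaces` is a hypothetical helper that approximates the old behavior with a simplified regular expression, and the character ranges are an assumption covering only common hanzi / kanji and kana. It is not Prettier's actual implementation.

```python
import re

# Simplified character classes (assumption: only common hanzi/kanji and kana).
CJ = r"[\u3040-\u30ff\u4e00-\u9fff]"
LATIN = r"[A-Za-z0-9]"


def insert_spaces(text: str) -> str:
    """Hypothetical approximation of the old rule:
    put U+0020 between Chinese/Japanese characters and alphanumerics."""
    text = re.sub(f"({CJ})({LATIN})", r"\1 \2", text)
    return re.sub(f"({LATIN})({CJ})", r"\1 \2", text)


original = "第1章"
formatted = insert_spaces(original)

print(formatted)              # 第 1 章
print(original in formatted)  # False: the plain search string no longer matches
```

A reader who searches the formatted document for the string as it is normally written gets no hit, even though the chapter heading is there.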
To make matters worse, once whitespace is inserted, it is difficult to remove it. The following sentence, for example, is not wrong: | ||
|
||
> 作る means make in Japanese. | ||
An overly simple rule that removes whitespace between Chinese or Japanese characters and alphanumerics would also remove the space between “作る” and “means”, unless you modify the sentence, for example by quoting “作る”. It is very difficult to create a general rule that can safely remove whitespace from all documents and that deserves to be included in Prettier. | ||
|
||
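To make the difficulty concrete, here is a minimal sketch of such an overly simple removal rule in Python. `naive_remove_spaces` is a hypothetical function written for this illustration, and its character ranges are a simplifying assumption (common hanzi / kanji and kana only); nothing like it exists in Prettier.

```python
import re

# Simplified character classes (assumption: only common hanzi/kanji and kana).
CJ = r"[\u3040-\u30ff\u4e00-\u9fff]"
LATIN = r"[A-Za-z0-9]"


def naive_remove_spaces(text: str) -> str:
    """Hypothetical cleanup rule: drop spaces between
    Chinese/Japanese characters and alphanumerics."""
    text = re.sub(f"({CJ}) +({LATIN})", r"\1\2", text)
    return re.sub(f"({LATIN}) +({CJ})", r"\1\2", text)


sentence = "作る means make in Japanese."
print(naive_remove_spaces(sentence))  # 作るmeans make in Japanese.
```

The rule cannot tell a space that a formatter inserted from one the author wrote on purpose, so it damages a sentence that was correct to begin with.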
In conclusion, the imposition of this non-standard rule by a mere formatter must end. | ||
|
||
<!-- prettier-ignore --> | ||
```markdown | ||
<!-- Input --> | ||
漢字Alphabetsひらがな12345カタカナ67890한글 | ||
|
||
<!-- Prettier stable --> | ||
漢字 Alphabets ひらがな 12345 カタカナ 67890 한글 | ||
|
||
<!-- Prettier main --> | ||
漢字Alphabetsひらがな12345カタカナ67890한글 | ||
``` |