Out-of-scope remap suggestions for JP/KR to follow faux-traditional forms by using TW/HK glyphs instead of CN glyphs #468

Open
Marcus98T opened this issue Oct 11, 2023 · 0 comments


I know there is an existing issue on this, but this one goes beyond requesting HK glyphs for JP, KR and CN in cases where only a TW/HK glyph exists.

I have gone through the issues on the forks of Source Han Sans/Serif that focus on traditional orthography for Chinese (sometimes termed Inherited Glyphs, which I mention in the tables for reference). I have quite a lot of out-of-scope characters to suggest: they should at least follow the faux-traditional orthography of JP/KR by using TW/HK glyphs, which can look closer to the old traditional forms than CN glyphs do.

I know that some characters have radicals (e.g. 火) or components (e.g. 虍) that can look jarring for JP/KR because of the relatively ugly handwritten nature of the TW/HK glyphs. Unfortunately, there is not enough glyph space, and some characters only have TW/HK forms. I think there is no choice but to accept those TW/HK components for JP/KR anyway: it is more important that certain components follow hyōgaiji rules for Japan (and follow the Korean K2 Unicode reference more closely) than to plainly use CN glyphs as a fallback, which can deviate from the so-called Kangxi forms. I even checked against the v1 JP glyphs in Serif to make sure they match closely.

These are most of the characters I could find; however, I have not had the time to check component by component. I might add more characters in future edits to this issue.

This also applies to Serif, which will get a separate set of tables soon.

However, I would like to note that there may be unreleased or removed JP glyphs (those within Adobe-Japan1 designed by Adobe in-house, and the extras designed by Iwata) that could have been safely used for TW/HK, instead of the TW/HK glyphs designed by Changzhou Sinotype (and later Arphic) that are used now. I also note this in the tables.

This issue also does not take into account any possible major redesign of Sans, meaning component merging, either by having JP and KR follow CN forms, or by having CN, TW and HK glyphs follow JP forms (which I prefer). If either scenario happens, parts of the tables may become invalid.

List of characters that would benefit from remapping

Unicode Character Current Mapping (JP) Current Mapping (KR) Remap to Notes (Reference only) Is there a JP glyph in Serif (removed v1 glyphs or otherwise) for which an unreleased JP Sans glyph could have been safely used for TW/HK?
U+3597 CN CN HK    
U+361A CN CN HK    
U+4B90 CN CN HK    
U+4F62 CN CN TW See this issue on Serif, also Serif is using the TW forms for JP and KR  
U+505D TW TW HK In issue #443  
U+50CB CN CN TW Serif is using the TW forms for JP and KR Yes, but the 人 component is different in Sans.
U+50E0 TW TW HK In issue #443  
U+51B9 CN CN TW Serif is using the TW forms for JP and KR  
U+537C CN CN TW Serif is using the TW forms for JP and KR  
U+539E CN CN TW   Yes, could have been used for TW/HK.
U+5526 CN CN HK    
U+555F CN CN TW In issue #458 Yes, but the 戶 component is different in Sans.
U+55D6 CN CN TW    
U+580E CN CN HK    
U+5814 CN CN TW    
U+5863 CN CN TW   Yes, could have been used for TW/HK.
U+58CF CN CN TW   Yes, could have been used for TW/HK.
U+58FE CN CN TW See this issue on Serif Yes, could have been used for TW/HK.
U+5DB5 CN CN TW    
U+5DB6 CN CN HK    
U+5F05 CN CN TW In issue #414 Yes, could have been used for TW.
U+609C CN CN TW   Yes, could have been used for TW/HK.
U+60FE CN CN HK    
U+618C CN CN TW   Yes, but the 金 component is different in Sans.
U+623A CN CN TW Related to issue #458 Yes, but the 戶 component is different in Sans.
U+6331 CN CN HK   Yes, could have been used for HK.
U+63D9 CN CN TW   Yes, but the 戶 component is different in Sans.
U+63DD JP JP TW (KR only?) The JP glyph is out-of-scope (not in Adobe-Japan1). Seems like the traditional forms prefer that the top part of the 昝 component is ⿺夂人 rather than ⿺夂卜, but the JP glyph shows ⿺夂卜 which is closer to the shape found in the Kangxi Dictionary. However, the Unicode Korean K2 reference shows ⿺夂人, so remap only for KR?
U+6718 CN CN HK    
U+6779 TW TW HK In issue #443  
U+6820 TW TW HK In issue #443  
U+68CE CN CN TW    
U+6944 CN CN TW   Yes, but the 戶 component is different in Sans.
U+6997 CN CN TW    
U+6A8C CN CN TW   Yes, could have been used for TW/HK.
U+6A93 CN CN TW    
U+6AA8 CN CN TW   Yes, could have been used for TW/HK, because it seems the 欠 component is joined at the bottom right for both CN and TW glyphs.
U+6B76 TW TW HK In issue #443  
U+6C8E TW TW HK In issue #443  
U+6E9E CN CN TW   Yes, could have been used for TW/HK.
U+6EA6 CN CN TW    
U+6EBE CN CN TW    
U+6FBF 澿 TW TW HK In issue #443 and #368, also Serif is using the HK forms for JP and KR and CN  
U+7029 CN CN TW Serif is using the TW forms for JP and KR  
U+7128 TW TW HK In issue #443  
U+715F TW TW HK In issue #443, and also see this issue in Serif  
U+7181 TW TW HK In issue #443  
U+7182 TW TW HK In issue #443  
U+7203 TW TW HK In issue #443  
U+752E CN CN HK Unsure for JP. The Unicode Korean K2 reference shows a form closer to HK.
U+7705 CN CN TW   Yes, could have been used for TW.
U+7743 CN CN HK    
U+7759 CN CN TW   Yes, but the 戶 component is different in Sans.
U+7828 CN CN TW   Yes, but the 戶 component is different in Sans.
U+78A5 CN CN TW   Yes, but the 戶 component is different in Sans.
U+7A04 CN CN HK    
U+7A28 CN CN TW   Yes, but the 戶 component is different in Sans.
U+7A68 CN CN TW    
U+7BFB TW TW HK In issue #443  
U+7CD0 TW TW HK In issue #443  
U+7F92 CN CN TW In issue #414  
U+7FA7 CN CN HK    
U+801B CN CN TW Serif is using the TW forms for JP and KR  
U+8029 CN CN TW Serif is using the TW forms for JP and KR  
U+815C TW TW HK In issue #443  
U+8215 TW TW HK In issue #443  
U+86C2 CN CN TW   Yes, could have been used for TW/HK.
U+8727 CN CN TW   Yes, but the 戶 component is different in Sans.
U+8746 CN CN TW? Unsure. The v1 JP glyph in Serif seems to have shown ⿰虫𦬒 (as it appears in Guangyun (廣韻)); however, a similar variant exists as U+45B9 (䖹), with the 卝 component on the top right rather than 艹.
U+876C CN CN HK    
U+8779 CN CN TW Serif is using the TW forms for JP and KR  
U+87A5 CN CN HK   Yes, but the 人 component is different in Sans.
U+87A9 CN CN HK    
U+87B6 CN CN TW See this issue on Serif  
U+89B9 CN CN TW Serif is using the TW forms for JP and KR  
U+89D3 CN CN TW Serif is using the TW forms for JP and KR  
U+8AF2 CN TW TW (JP only) The JP glyph in Serif is similar to the TW glyph in Sans (right side only), so might as well remap to TW for JP.
U+8C43 CN TW TW (JP only)   Yes, but the 人 component is different in Sans.
U+8C5F CN CN TW   Yes, but the 戶 component is different in Sans.
U+8C71 CN CN TW Serif is using the TW forms for JP and KR  
U+8C7D CN CN TW Serif is using the TW forms for JP and KR Yes, could have been used for TW/HK.
U+8CBE CN CN TW Serif is using the TW forms for JP and KR  
U+8DBF 趿 CN TW TW (JP only)   Yes, could have been used for TW.
U+8E02 CN CN TW Serif is using the TW forms for JP and KR  
U+8E1C CN CN HK    
U+8EB9 CN CN HK   Yes, could have been used for TW/HK.
U+8EE1 CN CN TW Serif is using the TW forms for JP and KR Yes, but the 今 component is different in Sans.
U+8EE7 CN CN TW Serif is using the TW forms for JP and KR  
U+8EF6 CN TW TW (JP only)   Yes, but the 戶 component is different in Sans.
U+910B CN CN TW    
U+9111 CN CN TW Serif is using the TW forms for JP and KR  
U+9287 CN CN TW   Yes, but the 金 component is different in Sans.
U+9313 CN CN TW   Yes, but the 金 component is different in Sans.
U+9389 CN TW TW (JP only)    
U+93AA CN CN TW   Yes, probably, but the 金 component is different in Sans.
U+93CF CN CN TW Serif is using the TW forms for JP and KR Yes, but the 金 component is different in Sans.
U+95B7 CN CN TW    
U+966B CN CN TW   Yes, could have been used for TW/HK.
U+981B CN CN TW   Yes, could have been used for TW/HK.
U+985D TW TW HK Serif is using the HK forms for JP and KR Yes, could have been used for HK.
U+99A7 CN CN TW   Yes, could have been used for TW.
U+99CF CN CN TW See this issue on Serif  
U+9A48 CN CN TW Serif is using the TW forms for JP and KR Yes, probably could have been used for TW/HK.
U+9A68 CN CN TW Serif is using the TW forms for JP and KR  
U+9AAB TW TW HK Serif is using the HK forms for JP and KR Yes, could have been used for HK.
U+9AB1 CN CN HK Serif is using the HK forms for JP and KR Yes, but the 人 component is different in Sans.
U+9AB3 TW TW HK Serif is using the HK forms for JP and KR Yes, could have been used for HK.
U+9ABA CN CN HK   Yes, could have been used for HK.
U+9AC7 CN CN HK    
U+9ACA TW TW HK   Yes, could have been used for HK.
U+9ACD CN CN HK While TW 麻 is closer to traditional forms, the 骨 radical looks jarring. HK would look better for stability.  
U+9B3E CN CN TW    
U+9B3F 鬿 CN CN TW    
U+9B40 CN CN HK    
U+9B4A CN CN TW    
U+9B59 CN CN TW Serif is using the TW forms for JP and KR  
U+9BDE CN TW TW (JP only) Serif is using the TW forms for JP and KR  
U+9BE6 CN CN TW Serif is using the TW forms for JP and KR  
U+9D5A CN CN TW Serif is using the TW forms for JP and KR  
U+9DA3 CN CN TW   Yes, but the 戶 component is different in Sans.
U+9DEC CN CN HK Serif is using the HK forms for JP and KR Yes, could have been used for HK.
U+9EB1 CN CN HK? The JP glyph in Serif seems close to the CN form in Sans. It turns out that the JP glyph does not normally use the ⿺ composition for the 麥 radical in 麱 (U+9EB1), instead using the ⿰ composition as seen in the Kangxi Dictionary. Keep in view (KIV) for now.
U+9EC7 CN CN HK   Yes, could have been used for HK.
U+9EE2 CN CN HK    
U+9EEB CN CN TW    
U+9F74 CN CN TW Serif is using the TW forms for JP and KR  
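
For anyone who wants to work with the list above programmatically, here is a minimal sketch of how the rows could be parsed. It assumes each row starts with `U+XXXX`, optionally followed by the literal character, then the current JP mapping, the current KR mapping and the suggested target. The names `RemapRow`, `ROW_RE` and `parse_row` are made up for this illustration and are not part of any Source Han tooling. The character is derived from the code point with `chr()`, which helps when the Character column is not displayed.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class RemapRow(NamedTuple):
    """One row of the remap table (hypothetical structure)."""
    codepoint: int
    char: str
    current_jp: str
    current_kr: str
    target: str


# Code point, optional literal character, JP mapping, KR mapping, target.
ROW_RE = re.compile(
    r"^U\+([0-9A-F]{4,6})\s+"
    r"(?:\S\s+)?"
    r"(CN|TW|HK|JP|KR)\s+"
    r"(CN|TW|HK|JP|KR)\s+"
    r"(CN|TW|HK)\b"
)


def parse_row(line: str) -> Optional[RemapRow]:
    """Parse one table row; return None if the line is not a table row."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    codepoint = int(match.group(1), 16)
    return RemapRow(
        codepoint=codepoint,
        char=chr(codepoint),  # recover the character from the code point
        current_jp=match.group(2),
        current_kr=match.group(3),
        target=match.group(4),
    )


if __name__ == "__main__":
    sample = [
        "U+3597 CN CN HK",
        "U+4F62 CN CN TW See this issue on Serif",
        "U+6FBF 澿 TW TW HK In issue #443 and #368",
        "List of characters that would benefit from remapping",
    ]
    rows = [row for row in map(parse_row, sample) if row is not None]
    for row in rows:
        print(f"U+{row.codepoint:04X} {row.char}: "
              f"JP={row.current_jp} KR={row.current_kr} -> {row.target}")
    # Count how many rows are remapped to each target region.
    print(Counter(row.target for row in rows))
```

The notes and the last column are ignored on purpose, since their boundary cannot be recovered reliably from the flattened text.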

List of characters that would show TW/HK educational forms inappropriate for JP/KR traditional forms after remapping

Unicode Character Current Mapping (JP) Current Mapping (KR) Remap to Notes about the Taiwan Ministry of Education (MOE)/HK Educational forms that are incompatible with traditional orthography Why remap anyway?
U+3D32 CN CN HK The lack of a hook in the middle 七 part of the 虎 component. It's better to have 儿 for the bottom part rather than 几 for the 虎 component for traditional forms.
U+44DF CN CN HK (JP only) The split 艹 component, while acknowledged in Inherited Glyphs, is not normally seen in Japanese and Korean fonts, instead preferring 艹. Japanese traditional forms usually have the upper-middle component connect with the lower-middle 口 component.
U+4504 CN CN HK The split 艹 component, while acknowledged in Inherited Glyphs, is not normally seen in Japanese and Korean fonts, instead preferring 艹. Traditional forms usually go for ⿱𱼀缶, not ⿱爫缶 as per PRC conventions.
U+4561 CN HK HK (JP only) The split 艹 component, while acknowledged in Inherited Glyphs, is not normally seen in Japanese and Korean fonts, instead preferring 艹. It's better to show 呂 rather than simplified 吕 for traditional forms. Also, if the KR locale has the HK glyph, then JP should also remap to the HK glyph.
U+4B19 CN CN HK There is a horizontal stroke rather than a left throw stroke in 風. While Inherited Glyphs recommend the horizontal stroke, Japanese and Korean fonts prefer the left throw stroke. Traditional forms usually go for ⿱𱼀缶, not ⿱爫缶 as per PRC conventions.
U+5061 CN CN TW The bottom 匸 component is ⿱一㇄ rather than ⿱一𠃊 preferred by Japanese and Korean fonts. It's better to have 儿 rather than 八 in the bottom of the 甚 component for traditional forms.
U+595C CN TW TW (JP only) The bottom 大 component is a drop stroke rather than a throw. It's better to have the top 非 component follow the faux-traditional orthography. Also, if the KR locale has the TW glyph, then JP should also remap to the TW glyph.
U+5AF9 CN CN HK The left 女 radical looks jarring. It's better to have traditional 黃 rather than simplified 黄 as preferred in PRC conventions.
U+5CFC CN CN HK (JP only) The 牛 touches the 口 in 吿 (referring to HK educational forms), whereas Japanese kyūjitai forms do not have two parts touch. It's better to show 吿 rather than 告 for Japanese traditional forms. Does not apply to the KR locale.
U+640B CN CN TW The lack of a hook in the middle 七 part of the 虎 component. It's better to have 儿 for the bottom part rather than 几 for the 虎 component for traditional forms.
U+64D9 CN CN HK The bottom-right 大 component is a drop stroke rather than a throw. It's better to show 奧 for traditional forms rather than the simplified 奥 as per PRC conventions.
U+64E8 CN CN TW The lack of a hook in the middle 七 part of the 虎 component. It's better to have 儿 for the bottom part rather than 几 for the 虎 component for traditional forms.
U+6825 CN CN TW The bottom 木 component is showing ホ. Also, slight difference in the way 次 is being written between TW and KR forms. On the left side, TW has a straight 二, while the second stroke in KR is a pickup stroke. It's still better for 次 to follow traditional forms of ⿰二欠 rather than ⿰冫欠 as per PRC conventions.
U+6902 CN CN HK The way 氺 in 彔 is being written. In Japanese and Korean fonts, it ends with a throw stroke, not a drop stroke. It's better to show 彔 rather than simplified 录 for traditional forms.
U+6976 CN CN TW The bottom 木 component is showing ホ. Also, slight difference in the way 次 is being written between TW and KR forms. On the left side, TW has a straight 二, while the second stroke in KR is a pickup stroke. It's still better for 次 to follow traditional forms of ⿰二欠 rather than ⿰冫欠 as per PRC conventions.
U+69B9 CN CN TW The lack of a hook in the middle 七 part of the 虎 component. It's better to have 儿 for the bottom part rather than 几 for the 虎 component for traditional forms.
U+6C2F CN CN TW The way 氺 in 彔 is being written. In Japanese and Korean fonts, it ends with a throw stroke, not a drop stroke. It's better to show 彔 rather than simplified 录 for traditional forms.
U+718E CN CN HK The left 火 radical looks jarring. Traditional forms usually go for ⿱𱼀缶, not ⿱爫缶 as per PRC conventions.
U+7769 CN CN TW The way 氺 in 彔 is being written. In Japanese and Korean fonts, it ends with a throw stroke, not a drop stroke. It's better to show 彔 rather than simplified 录 for traditional forms.
U+7BCE CN CN HK The disconnected ⺮ component, however, it might be unified in the next major version as per issue #398 The 少 component must have a hook as per traditional forms.
U+7BEB CN CN TW The disconnected ⺮ component, however, it might be unified in the next major version as per issue #398 Traditional forms usually go for ⿰工卂, not ⿰工凡 as per PRC conventions.
U+7BEC CN CN HK The disconnected ⺮ component, however, it might be unified in the next major version as per issue #398 It's better to show 亼 rather than 亽 for the 倉 component for traditional forms.
U+7C05 CN CN TW The disconnected ⺮ component, however, it might be unified in the next major version as per issue #398 It's better to have traditional 產 rather than simplified 産 (which is actually Japanese shinjitai) as per PRC conventions.
U+7C48 CN CN TW The disconnected ⺮ component, however, it might be unified in the next major version as per issue #398 Traditional forms usually show ⿱西土 rather than ⿱覀土 for the 垔 component.
U+8623 CN CN HK The split 艹 component, while acknowledged in Inherited Glyphs, is not normally seen in Japanese and Korean fonts, instead preferring 艹. It's better to have traditional 黃 rather than simplified 黄 as preferred in PRC conventions.
U+8628 CN CN HK The split 艹 component, while acknowledged in Inherited Glyphs, is not normally seen in Japanese and Korean fonts, instead preferring 艹. Also note the unbalanced 系 component. Traditional forms usually go for ⿱𱼀缶, not ⿱爫缶 as per PRC conventions.
U+8633 CN CN HK The split 艹 component, while acknowledged in Inherited Glyphs, is not normally seen in Japanese and Korean fonts, instead preferring 艹. It's better to have traditional 黃 rather than simplified 黄 as preferred in PRC conventions.
U+8742 CN CN TW The 片 component, rendered as ⿰丿⿱𠃎𠃍 in the Taiwan MOE forms rather than the ⿰丿⿱丄𠃍 preferred by Japanese and Korean fonts. It's better to show ⿸厂又 rather than ⿸𠂆又 for traditional forms.
U+8794 CN CN TW The lack of a hook in the middle 七 part of the 虎 component. It's better to have 儿 for the bottom part rather than 几 for the 虎 component for traditional forms.
U+8800 CN CN TW Slight difference in the way 次 is being written between TW and KR forms. On the left side, TW has a straight 二, while the second stroke in KR is a pickup stroke. It's better for 次 to follow traditional forms of ⿰二欠 rather than ⿰冫欠 as per PRC conventions.
U+89A4 CN CN TW The lack of a hook in the middle 七 part of the 虎 component. It's better to have 儿 for the bottom part rather than 几 for the 虎 component for traditional forms.
U+8D19 CN CN TW The lack of a hook in the middle 七 part of the 虎 component. It's better to have 儿 for the bottom part rather than 几 for the 虎 component for traditional forms.
U+9199 CN CN TW The bottom inner horizontal stroke is disconnected from the enclosure in the left 酉 radical. Traditional forms prefer a disconnected 臼 component in 叟.
U+942D CN CN HK The bottom-right 大 component is a drop stroke rather than a throw. It's better to show 奧 for traditional forms rather than the simplified 奥 as per PRC conventions.
U+9C4B CN CN TW The lack of a hook in the middle 七 part of the 虛 component. Traditional forms prefer 虛 rather than simplified 虚 as per PRC conventions.
U+9F2D CN CN HK Japanese standards are inconsistent in their dealing with the 鼠 component, most of the time it's ⿱臼⿲𠄌⺀⿲𠄌⺀㇂, but sometimes it's ⿱臼⿲𠄌二⿲𠄌二㇂. Taiwan MOE forms, however, prefer ⿱臼⿲𠄌二⿲𠄌二㇂. Traditional forms prefer a ⿺ composition for the 鼠 component rather than the ⿰ composition as per PRC conventions.
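
Some rows in both tables restrict the remap to one locale, for example `(JP only)`, or mark it as uncertain with a `?`. As a rough sketch of how those qualifiers could be turned into a per-locale plan, the snippet below builds one dictionary per locale and sets uncertain rows aside. The function name `build_plan` and the output shape are assumptions for illustration only; this is not how the Source Han Sans sources store their mappings.

```python
import re
from typing import Dict, List, Tuple

# Code point, optional character, JP mapping, KR mapping, target,
# optional "?" on the target, optional "(JP only)" / "(KR only?)" qualifier.
QUALIFIED_ROW_RE = re.compile(
    r"^U\+([0-9A-F]{4,6})\s+"
    r"(?:\S\s+)?"
    r"(CN|TW|HK|JP|KR)\s+"
    r"(CN|TW|HK|JP|KR)\s+"
    r"(CN|TW|HK)(\?)?"
    r"(?:\s+\((JP|KR) only(\?)?\))?"
)


def build_plan(lines: List[str]) -> Tuple[Dict[str, Dict[int, str]], List[int]]:
    """Return ({locale: {codepoint: target}}, [uncertain codepoints])."""
    plan: Dict[str, Dict[int, str]] = {"JP": {}, "KR": {}}
    uncertain: List[int] = []
    for line in lines:
        match = QUALIFIED_ROW_RE.match(line.strip())
        if match is None:
            continue  # not a table row
        codepoint = int(match.group(1), 16)
        current = {"JP": match.group(2), "KR": match.group(3)}
        target = match.group(4)
        only_locale = match.group(6)
        # A "?" on the target or on the qualifier means the author is unsure.
        if match.group(5) or match.group(7):
            uncertain.append(codepoint)
            continue
        locales = [only_locale] if only_locale else ["JP", "KR"]
        for locale in locales:
            # Skip locales that already use the target glyph.
            if current[locale] != target:
                plan[locale][codepoint] = target
    return plan, uncertain


if __name__ == "__main__":
    sample = [
        "U+3D32 CN CN HK The lack of a hook",
        "U+44DF CN CN HK (JP only) The split component",
        "U+595C CN TW TW (JP only) The bottom component",
        "U+63DD JP JP TW (KR only?) The JP glyph is out-of-scope",
        "U+9EB1 CN CN HK? Seems like the JP glyph in Serif",
    ]
    plan, uncertain = build_plan(sample)
    for locale, mapping in plan.items():
        for cp, target in sorted(mapping.items()):
            print(f"{locale}: U+{cp:04X} -> {target}")
    print("needs review:", [f"U+{cp:04X}" for cp in uncertain])
```

Keeping the uncertain rows in a separate list is a design choice of this sketch, so that they can be reviewed by hand rather than remapped automatically.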

Miscellaneous remappings from other issues

夗, although somewhat out of scope, could have its TW glyph mapped to JP/KR for the 夕 component.

Originally posted by @/NightFurySL2001 in #443 (comment)


As far as I know, there are two characters currently using TW/HK forms in Sans for JP and KR, 觺 (U+89FA) and 觾 (U+89FE), for which a separate CN glyph exists for both.


Because of this, we should remap more glyphs to TW/HK forms for JP and KR, so as to make them look a bit closer to traditional forms.
