Handling automatic spaces between CJK and Latin characters in LuaLaTeX #711

Open
BenjaminGalliot opened this issue Apr 23, 2024 · 1 comment

@BenjaminGalliot

(Sorry for using English!)

Hello,

I'm currently working on an automatically generated document using LuaLaTeX. I've encountered an issue where there isn't an automatic space between CJK and Latin script characters, which affects the readability of mixed-language texts. For example:

中文
…français

The ellipsis (…) directly follows the Chinese characters without any space. I would like a space to be inserted automatically between CJK and Latin characters whenever a line break in the source occurs between them.

Is there a known method or recommended practice in LuaLaTeX to handle this spacing automatically? Any guidance or workaround to ensure proper spacing between these scripts would be greatly appreciated.

Thank you in advance for your help!

MWE:

\documentclass{article}
\RequirePackage[french]{babel}
\RequirePackage{ctex}
\babelprovide[import=zh-Hans]{cmn}
\setCJKfamilyfont{cmn}{AR PL UKai CN}
\setmainfont{EB Garamond}
\RenewDocumentCommand \CJKrmdefault {} {cmn}
\babelfont[french]{rm}{EB Garamond}
\NewDocumentCommand \scriptcjk {} {\ltjsetparameter{jacharrange={-1, +2, +3, -4, -5, +6, +7, -8, +9}}}
\NewDocumentCommand \scriptlatin {} {\ltjsetparameter{jacharrange={-1, -2, -3, -4, -5, +6, +7, -8, -9}}}
\NewDocumentCommand \tfra { m } {\foreignlanguage{french}{\scriptlatin#1}}
\NewDocumentCommand \tcmn { m } {\foreignlanguage{cmn}{\scriptcjk#1}}
\frenchsetup{og=«, fg=», AutoSpacePunctuation=true} % Can be turned off if necessary.
            
\begin{document}
\scriptlatin
\selectlanguage{french}

中文 …français  % Reference.

中文…français  % Expected.

中文
…français  % Not wanted.

中文\ 
…français  % Workaround, but I am looking for something better.

---------
% It should also work with commands around.

\tcmn{中文} \tfra{…français}

\tcmn{中文}\tfra{…français}

\tcmn{中文}
\tfra{…français}

\tcmn{中文}\
\tfra{…français}

---------
% Without punctuation.

\tcmn{中文} \tfra{français}

\tcmn{中文}\tfra{français}

\tcmn{中文}
\tfra{français}

\tcmn{中文}\
\tfra{français}

---------
% Various behaviours depending on punctuation?

中文 
\tfra{«français}

中文\ 
\tfra{«français}

中文 
\tfra{"français}

中文\ 
\tfra{"français}

中文 
\tfra{(français}

中文\ 
\tfra{(français}

中文 
\tfra{[français}

中文\ 
\tfra{[français}

中文 
\tfra{-français}

中文\ 
\tfra{-français}

中文 
\tfra{–français}

中文\ 
\tfra{–français}

中文 
\tfra{—français}

中文\ 
\tfra{—français}

\end{document}

Screenshot:
Screenshot_20240423_202707

My workaround of manually inserting a control space (\ ) is not ideal. I am looking for a more elegant solution that inserts these spaces automatically, particularly after a line break in the source.

Beyond the main issue of spacing after line breaks, I've also noticed that the behaviour changes depending on the punctuation used. Is this the intended behaviour? Can it be customized?

Thank you very much.

@muzimuzhi
Contributor

The reported behavior may be inherited from luatexja. I haven't checked.
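
A reduced test that loads only luatexja, without ctex or babel, would be one way to check this. The following is a sketch that has not been verified in this thread; it reuses the jacharrange setting from \scriptlatin in the MWE and assumes luatexja's default fonts are installed. If the line break after 中文 still produces no space here, the behaviour would come from luatexja rather than from ctex or babel.

```latex
% Compile with lualatex. Only luatexja is loaded, so any difference from
% the MWE above would point at ctex or babel instead.
\documentclass{article}
\usepackage{luatexja}
% Same ranges as \scriptlatin in the MWE.
\ltjsetparameter{jacharrange={-1, -2, -3, -4, -5, +6, +7, -8, -9}}

\begin{document}

中文 …français  % Space in the source.

中文…français  % No space in the source.

中文
…français  % Line break in the source.

中文
français  % Line break in the source, no punctuation.

\end{document}
```

Comparing the third and fourth paragraphs with the first two should show whether the line break is dropped after a CJK character, and whether the leading punctuation changes the result.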
