Support OpTeX #215

Open
Witiko opened this issue Nov 12, 2022 · 12 comments · Fixed by #291
Labels: feature request, help wanted, optex (Related to the OpTeX interface and implementation), plaintex (Related to the plain TeX interface and implementation)

Comments

@Witiko
Owner

Witiko commented Nov 12, 2022

The Markdown package currently supports plain TeX, LaTeX, and ConTeXt. We should add support for further formats such as Petr @olsak's OpTeX.

@Witiko Witiko added feature request help wanted good first issue plaintex Related to the plain TeX interface and implementation labels Nov 12, 2022
@Witiko Witiko added this to the 3.0.0 milestone Nov 12, 2022
@Witiko
Owner Author

Witiko commented Nov 12, 2022

@olsak I can handle tasks 2–4, but I would appreciate help with task 1 (Define renderers for OpTeX).

In the user manual, there is a list of all renderers in the Markdown package. We have some definitions for plain TeX (see markdown.tex) that will be inherited by OpTeX but anything more complicated (citations, tables, headings, …) needs to be defined for each format separately (see definitions for LaTeX in markdown.sty and ConTeXt in t-markdown.tex).

If I can get a list of definitions such as the following, I can take it from there.

\def\markdownRendererHeadingOnePrototype#1{\tit #1^^M}%
\def\markdownRendererHeadingTwoPrototype#1{\chap #1^^M}%
\def\markdownRendererHeadingThreePrototype#1{\sec #1^^M}%
% ...

With correct definitions, we should be able to typeset the document example.md in OpTeX as follows:

\input markdown

% Options
\def\markdownOptionContentBlocks{true}%
\def\markdownOptionDebugExtensions{true}%
\def\markdownOptionDefinitionLists{true}%
\def\markdownOptionFancyLists{true}%
\def\markdownOptionHashEnumerators{true}%
\def\markdownOptionInlineNotes{true}%
\def\markdownOptionNotes{true}%
\def\markdownOptionPipeTables{true}%
\def\markdownOptionRawAttributes{true}%
\def\markdownOptionSmartEllipses{true}%
\def\markdownOptionStrikeThrough{true}%
\def\markdownOptionSubscripts{true}%
\def\markdownOptionSuperscripts{true}%
\def\markdownOptionTableCaptions{true}%
\def\markdownOptionTaskLists{true}%

% Renderer prototypes
\def\markdownRendererHeadingOnePrototype#1{\tit #1^^M}%
\def\markdownRendererHeadingTwoPrototype#1{\chap #1^^M}%
\def\markdownRendererHeadingThreePrototype#1{\sec #1^^M}%
% ...

% Example document
\fontfam[LMfonts]
\markdownInput{example.md}
\bye

@Witiko Witiko added the optex Related to the OpTeX interface and implementation label Nov 12, 2022
@Witiko Witiko changed the title Add support for OpTeX Support OpTeX Nov 12, 2022
@Witiko Witiko changed the title Support OpTeX Support OPTeX Nov 12, 2022
@olsak

olsak commented Nov 13, 2022

I think that \def\markdownRendererHeadingTwoPrototype#1{\chap #1^^M} will not work because ^^M is a TeX-unfriendly character. OpTeX scans the line after \chap, \sec, etc. to its end in verbatim mode, i.e. things like "verbatim in title" are read as they are, and the parameter is tokenized later when it is needed. So, a user can write something like

\sec We talk about the `{` character

and it will work in titles, outlines, tables of contents, etc. (Something similar is impossible in LaTeX.) But we can give up this feature of scanning to the end of the line and use the internal OpTeX macros for the Markdown package, i.e.

\def\markdownRendererHeadingOnePrototype#1{\_printtit{#1}}
\def\markdownRendererHeadingTwoPrototype#1{\_inchap{#1}}
\def\markdownRendererHeadingThreePrototype#1{\_insec{#1}}
\def\markdownRendererHeadingFourPrototype#1{\_insecc{#1}}

I'll take a look at the other prototypes...

@Witiko Witiko changed the title Support OPTeX Support OpTeX Nov 13, 2022
@olsak

olsak commented Dec 19, 2022

I tried to \input markdown in OpTeX but there are problems. It does \input expl3-generic and this macro file cannot be loaded in OpTeX without additional tricks. I hope that we need not to load this macro file in OpTeX, but I can mention the tricks.

  1. expl3-generic loads Lua code which expects luatexbase.new_luafunction to be defined. But OpTeX has its own luatexbase and this function isn't defined in it. So, we need
\directlua{
   function luatexbase.new_luafunction(name)
      return \string#lua.get_functions_table() + 1
   end }
  2. \foo_ macros are inaccessible in OpTeX by default; you need to run \mathsboff before \input expl3-generic
  3. expl3-generic expects that \newread and \newwrite are defined as in plain TeX, but this is not true in OpTeX. We need
\def\newread  {\_newread}
\def\newwrite {\_newwrite}
  4. expl3-generic needs the control sequence \e@alloc@ccodetable@count.
\slet{e@alloc@ccodetable@count}{_catcodetablealloc}

Summary: the following code must be inserted before \input markdown:

\directlua{
   function luatexbase.new_luafunction(name)
      return \string#lua.get_functions_table() + 1
   end }
\mathsboff
\def\newread  {\_newread}
\def\newwrite {\_newwrite}
\slet{e@alloc@ccodetable@count}{_catcodetablealloc}

\input markdown

\markdownBegin
Hello *world*!
\markdownEnd

\bye

@Witiko
Owner Author

Witiko commented Dec 20, 2022

I hope that we need not to load this macro file in OpTeX, but I can mention the tricks.

The Markdown package exposes a Lua module, so it is definitely possible to make the OpTeX layer sit directly on top of the Lua layer unlike the ConTeXt and LaTeX layers, which sit on top of the Plain TeX + expl3 layer:

[Image: block-diagram-optex]

This would honor the spirit of the OpTeX format (minimalistic, fast, no-nonsense). It would reduce code reuse, but that may be a good thing, because it allows us to reimagine e.g. the option-passing from TeX to Lua. Here is how you could use the Markdown package via Lua in OpTeX:

\directlua{
  local ran_ok, kpse = pcall(require, "kpse")
  if ran_ok then kpse.set_program_name("luatex") end
  local convert = require("markdown").new()

  function markdown(input)
    local output = convert(input)
    tex.print(output)
  end
}

\let\markdownRendererDocumentBegin\relax
\def\markdownRendererEmphasis#1{{\em #1}}
\let\markdownRendererDocumentEnd\relax

\directlua{ markdown"Hello *world*!" }

\bye

@Witiko
Owner Author

Witiko commented Dec 20, 2022

I hope that we need not to load this macro file in OpTeX, but I can mention the tricks.

This is useful to know and we may want to forward this to https://github.com/latex3/latex3, so that expl3-generic can be loaded in OpTeX without hassle.

@olsak

olsak commented Dec 20, 2022

It is amazing! If there is a list of all control sequences generated by the convert Lua function (like \markdownRendererEmphasis in your example) and they are documented, then I can prepare macros for each such control sequence. Moreover, if there is a need to set parameters in key=value format, then it can be done at the Lua level or at the macro level using the simple OpTeX macros \kv and \kvscan; see section 2.9 in the OpTeX documentation.

@Witiko
Owner Author

Witiko commented Dec 21, 2022

All \markdownRenderer... control sequences (also known as renderers) are listed and defined in the user manual. Furthermore, we can also use the internal reflection API exposed by the plain TeX layer of the Markdown package to list all renderers and the number of parameters that they accept:

% tricks needed to load expl3-generic.tex package
\directlua{
   function luatexbase.new_luafunction(name)
      return \string#lua.get_functions_table() + 1
   end }
\mathsboff
\def\newread  {\_newread}
\def\newwrite {\_newwrite}
\slet{e@alloc@ccodetable@count}{_catcodetablealloc}

\input markdown

% list all renderers and the number of parameters that they accept
\begitems
\ExplSyntaxOn
\seq_map_inline:Nn
  \g__markdown_renderers_seq
  {
    \tl_set:Nn
      \l_tmpa_tl
      { #1 }
    \regex_replace_once:nnN
      { ^. }
      { \c { bslash } markdownRenderer \c { str_uppercase:n } \cB\{ \0 \cE\} }
      \l_tmpa_tl
    \prop_get:NnN
      \g__markdown_renderer_arities_prop
      { #1 }
      \l_tmpb_tl
    * \l_tmpa_tl{}~(accepts~\l_tmpb_tl{}~parameters)
  }
\ExplSyntaxOff
\enditems

\bye

Here is the output of running OpTeX on the above source code with the current source code from branch main:

[Image: renderers-4bdde29]

Most renderers have a fixed number of parameters and have an obvious default definition, such as \markdownRendererHeadingOne, which has one parameter and corresponds to a top-level heading. However, there are some exceptions:

  • Some renderers have a variable number of parameters, such as \markdownRendererCite, \markdownRendererTextCite, and \markdownRendererTable, where the number of parameters depends on the number of citations or table rows/columns.
  • Some other renderers have no obvious default definition and should be defined by users, such as \markdownRendererDocumentBegin, \markdownRendererAttributeClassName, and \markdownRendererJekyllDataNumber.
  • Other renderers have been deprecated and are no longer listed in the user manual or produced by the Markdown package, such as \markdownRendererFootnote and \markdownRendererHorizontalRule (see also Rename renderers based on the semantics of elements #187).

@olsak

olsak commented Jan 12, 2023

I made a first attempt at defining OpTeX macros for the Markdown package. But there are many things I don't understand; I marked them with % ?? in the code. For example, why there is UlBeginTight , UlEndTight. I see only single behavior of lists in Markdown documentation. Moreover, the \markdownRendererLink cannot give the raw URI in #3 if it includes %, for example.

\fontfam[lm]
\hyperlinks\Blue\Blue

\_directlua{
  local ran_ok, kpse = pcall(require, "kpse")
  if ran_ok then kpse.set_program_name("luatex") end
}

\_eoldef \markdownBegin #1{% #1 includes the end of the current line, parameters can be here
   \_def\_markdownParams{#1}%
   \_bgroup \_setverb \_savemathsb \_endlinechar=`\^^J
   \_markdownBeginA
}
\_ea\_def \_ea\_markdownBeginA \_ea#\_ea1\_csstring\\markdownEnd#2^^J{%
   \_restoremathsb \_egroup 
   \_directlua{
      local convert = require("markdown").new({\_markdownParams})
      tex.print(convert("\_luaescapestring{#1}"))}%
}

\_edef \markdownRendererAmpersand   #1{\_csstring\&}
\_edef \markdownRendererBackslash   #1{\_csstring\\}
\_edef \markdownRendererCircumflex  #1{\_csstring\^}
\_edef \markdownRendererDollarSign  #1{\_csstring\$}
\_edef \markdownRendererHash        #1{\_csstring\#}
\_edef \markdownRendererLeftBrace   #1{\_csstring\{}
\_edef \markdownRendererPercentSign #1{\_csstring\%}
\_edef \markdownRendererPipe        #1{|}            % ??
\_edef \markdownRendererRightBrace  #1{\_csstring\}}
\_edef \markdownRendererTilde       #1{\_csstring\~}
\_edef \markdownRendererUnderscore  #1{_}            % ??

\_def\markdownRendererLink  #1#2#3#4{\_ea\_ulink\_ea[\_expanded{#3}]{#1}} % ?? raw URI? doesn't work with hybrid=true

\_def \markdownRendererAttributeIdentifier #1{} % ??
\_def \markdownRendererAttributeClassName  #1{} % ??
\_def \markdownRendererAttributeKeyValue   #1#2{} % ??

\_def \markdownOptionTightLists     {true}
\_def \markdownRendererUlBegin      {\_begitems}
\_def \markdownRendererUlBeginTight {\_begitems \_novspaces} % ??
\_def \markdownRendererUlEnd        {\_enditems}
\_def \markdownRendererUlEndTight   {\_enditems}
\_def \markdownRendererUlItem       {\_startitem} 
\_def \markdownRendererUlItemEnd    {\_par}

\_def \markdownRendererInterblockSeparator {\_par} % ??

\_def \markdownRendererInputVerbatim    #1{\_verbinput (-) {#1} }
\_def \markdownRendererInputFencedCode  #1#2{\_verbinput (-) {#1} } % ??

\_def \markdownRendererCodeSpan         #1{#1} % ??

\_def \markdownOptionContentBlocks             {true} % ??
\_def \markdownRendererContentBlock            #1#2#3#4{This is {\_tt #2}, #4.} % ??
\_def \markdownRendererContentBlockOnlineImage #1#2#3#4{This is the image {\tt #2}, #4.} % ??
\_def \markdownRendererContentBlockCode        #1#2#3#4#5{% ??
      This is the #2 (\_uppercase{#1}) document {\_tt #3}, #5.%
}

\_let \markdownRendererDocumentBegin   \_relax
\_let \markdownRendererDocumentEnd     \_relax
\_def \markdownRendererBlockQuoteBegin {\_begblock}
\_def \markdownRendererBlockQuoteEnd   {\_endblock}

\_def \markdownRendererEmphasis        #1{{\_em #1}}

Tests:

\markdownBegin hybrid=true
This is a list *without* vertical spaces above and below:

- the first item
  at more lines
- the second item: $\sum_k^n x\_k=b$
- the third item

Next paragraph.

> This is a block of text
> in more lines.

Final paragraph.
\markdownEnd

\bye

@Witiko
Owner Author

Witiko commented Jan 13, 2023

@olsak For example, why there is UlBeginTight , UlEndTight. I see only single behavior of lists in Markdown documentation.

See the documentation of option tightLists:

Unordered and ordered lists whose items do not consist of multiple paragraphs will be considered tight. Tight lists will produce tight renderers that may produce different output than lists that are not tight:

- This is
- a tight
- unordered list.

- This is

  not a tight

- unordered list.

See also the documentation of bullet item renderers:

The \markdownRendererUlBegin macro represents the beginning of a bulleted list that contains an item with several paragraphs of text (the list is not tight). The macro receives no arguments.

The \markdownRendererUlBeginTight macro represents the beginning of a bulleted list that contains no item with several paragraphs of text (the list is tight). This macro will only be produced when the tightLists option is enabled. The macro receives no arguments.

See also the CommonMark spec.

Since non-tight lists contain bullet items with multiple paragraphs, it may be a good idea to add vertical spaces not just around the list but also between the individual items. Here is how the above example is rendered in LaTeX by default:

[Image: scrot]
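The tight versus non-tight distinction can be mirrored in renderer definitions. The following is a minimal plain TeX sketch, not the default definitions of the Markdown package. The skip register \mditemskip is a hypothetical helper introduced only for illustration, and the sketch assumes that items of both tight and non-tight lists are emitted through \markdownRendererUlItem and \markdownRendererUlItemEnd, as in the definitions earlier in this thread.

```tex
% Hypothetical helper register: vertical space inserted before each item.
\newskip\mditemskip

% Non-tight list: space around the list and between the items.
\def\markdownRendererUlBegin{%
  \par\medskip\begingroup\mditemskip=\smallskipamount\relax}
\def\markdownRendererUlEnd{\par\endgroup\medskip}

% Tight list: space around the list only, none between the items.
\def\markdownRendererUlBeginTight{%
  \par\medskip\begingroup\mditemskip=0pt\relax}
\def\markdownRendererUlEndTight{\par\endgroup\medskip}

% Items are shared by both kinds of lists; only \mditemskip differs.
\def\markdownRendererUlItem{\vskip\mditemskip\item{$\bullet$}}
\def\markdownRendererUlItemEnd{\par}
```

Setting the register inside a group keeps the choice local to one list, so a tight list nested inside a non-tight one does not change the spacing of the outer list after it ends.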

@Witiko
Owner Author

Witiko commented Jan 13, 2023

@olsak Moreover, the \markdownRendererLink cannot give the raw URI in #3 if it includes %, for example.

The plain TeX layer changes the catcodes of % and # to other (12) inside \markdownBegin ... \markdownEnd. The main reason is that allowing %-comments with option hybrid=true produces unintuitive results, since the Markdown parser does not preserve newlines during conversion; see e.g. Section 2.2 in our TUGboat 42:2 article. Furthermore, both % and # are commonly featured in URLs and relative references, and having them both as category other makes renderer definitions easier.
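The effect of such a catcode change can be illustrated in isolation. The following is a minimal plain TeX sketch of the general technique, not code from the Markdown package; the macro names \readurl, \readurlA, and \lasturl are hypothetical.

```tex
% Switch % and # to category other (12) inside a group,
% then read the argument under the new catcodes.
\def\readurl{\begingroup
  \catcode`\%=12 \catcode`\#=12
  \readurlA}
% Close the group (restoring the catcodes) and store the argument.
\def\readurlA#1{\endgroup \def\lasturl{#1}}

% Here % does not start a comment and # is not a parameter character.
\readurl{https://example.com/a%20b#frag}
\message{\lasturl}
```

This only works when \readurl is called at the top level of the input: if the call appears inside the argument of another macro, the characters have already been tokenized with their original catcodes.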

@Witiko
Owner Author

Witiko commented Apr 6, 2023

@olsak I am planning to tackle OpTeX support in version 2.23.0 of the Markdown package (to be released at the end of April) and discuss it briefly at TUG 2023.

@Witiko Witiko mentioned this issue Apr 14, 2023
3 tasks
@Witiko Witiko reopened this Apr 21, 2023
@Witiko Witiko modified the milestones: 2.23.0, 3.0.0 Apr 27, 2023
@Witiko Witiko modified the milestones: 3.0.0, 3.1.0 May 19, 2023
@Witiko
Owner Author

Witiko commented Jun 14, 2023

@olsak In #292, I have just added a minimal demo of using OpTeX with the Markdown package to file examples/optex.tex. Here is the resulting PDF document: optex.pdf

This demo will be included in Markdown 3.0.0, to be released later this month and to be presented at TUG 2023. This is a minimal viable product that mainly includes base markdown elements and not syntax extensions such as tables, tickboxes, or notes.

Support for more syntax extensions can be added as follows:

  1. Enable said extension in the \markdownOptions macro, for example pipeTables=true and tableCaptions=true for tables.

  2. Add an example of the markdown element from examples/example.md between \markdownBegin ... \markdownEnd. For example:

    This is a table:
    
    | Right | Left | Default | Center |
    |------:|:-----|---------|:------:|
    |    12 | 12   | 12      |   12   |
    |   123 | 123  | 123     |   123  |
    |     1 | 1    | 1       |    1   |
    
      : Demonstration of pipe table syntax.
  3. Define the corresponding renderer macros, for example the table renderers.

I have scheduled these additions for the August 2023 release. Contributions are appreciated.

@Witiko Witiko modified the milestones: 3.1.0, 3.2.0 Jun 27, 2023
@Witiko Witiko removed the markdown3 label Aug 30, 2023
@Witiko Witiko modified the milestones: 3.2.0, 3.3.0 Oct 1, 2023
@Witiko Witiko removed this from the 3.3.0 milestone Nov 28, 2023
@Witiko Witiko modified the milestone: 3.5.0 Jan 31, 2024
@Witiko Witiko added this to the 3.5.0 milestone Mar 29, 2024
@Witiko Witiko removed this from the 3.5.0 milestone Apr 15, 2024