Apostrophe in contractions is turned into \*(Aq, subsequently swallowed by pandoc #38

Open
teythoon opened this issue Jan 12, 2024 · 3 comments
Labels
bug Not as expected

Comments

@teythoon

We produce manual pages using roff-rs, then render them as HTML for our web site. I have noticed that apostrophes in contractions and possessives are missing from the produced HTML:

$ cat src/main.rs
fn main() {
    let mut r = roff::Roff::new();
    r.text(vec!["I've been a good boy.".into()]);
    println!("{}", r.render());
}
$ cargo run > astropof.1
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/foobr`
$ cat astropof.1
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
I\*(Aqve been a good boy.

$ man ./astropof.1|hd
00000000  49 27 76 65 20 62 65 65  6e 20 61 20 67 6f 6f 64  |I've been a good|
00000010  20 62 6f 79 2e 0a 0a                              | boy...|
00000017
$ pandoc -o astropof.txt astropof.1
$ cat astropof.txt
Ive been a good boy.
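For illustration, the transcript above can be modeled with a minimal stdlib-only sketch. Both helpers are hypothetical: `escape_apostrophes` only mimics the substitution visible in astropof.1 and is not roff-rs's actual code, and `expand_aq` assumes that a reader which never sees a definition for the `Aq` string expands it to nothing, which is a guess based on the pandoc output rather than on pandoc's source.

```rust
/// Hypothetical helper: mimics the substitution visible in astropof.1,
/// where every ASCII apostrophe becomes the string interpolation `\*(Aq`.
/// This is not roff-rs's actual implementation.
fn escape_apostrophes(text: &str) -> String {
    text.replace('\'', r"\*(Aq")
}

/// Hypothetical model of a reader: `aq` is the definition of the `Aq`
/// string if the reader honored the preamble, or `None` if it did not.
/// An undefined string is assumed to expand to the empty string.
fn expand_aq(source: &str, aq: Option<&str>) -> String {
    source.replace(r"\*(Aq", aq.unwrap_or(""))
}
```

Under these assumptions, `expand_aq(&escape_apostrophes("I've been a good boy."), Some("'"))` gives back the original sentence, as `man` does, while passing `None` yields "Ive been a good boy.", matching what pandoc wrote to astropof.txt.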

Now, I'm not an expert on roff, but groff_man_style(7), one of the manual pages I consult for advice on writing manual pages, says not to use \(aq for ordinary apostrophes. https://man7.org/linux/man-pages/man7/groff_man_style.7.html says:

You should not use \(aq for an ordinary apostrophe (as in “can't”)

Through experimentation I discovered that pandoc renders both ' and \(aq just fine.

epage added the bug (Not as expected) label on Jan 12, 2024
@teythoon
Author

To clarify: I think there are two issues here:

  • roff-rs unconditionally replaces apostrophes where it shouldn't (e.g. in contractions).
  • roff-rs uses the \*(Aq workaround which doesn't sit well with pandoc.

I have no idea what to do about either issue, but I wanted to report it.

@epage
Contributor

epage commented Jan 15, 2024

Thanks for the report!

@teythoon
Author

I just noticed the documentation of Roff::to_roff says:

Without special handling, apostrophes get typeset as right single quotes, including in words like “don’t”. In most situations, such as in manual pages, that’s unwanted.

That comment gets it wrong. In contractions like "don't", you do want to allow renderers to use fancy glyphs, and in fact that is what rustdoc produces in the rendered example:

$ echo -n don’t | hd
00000000  64 6f 6e e2 80 99 74                              |don...t|
00000007

The byte sequence e2 80 99 is the UTF-8 encoding of U+2019 RIGHT SINGLE QUOTATION MARK. Where you don't want that kind of fancy glyph is in code samples, which you expect people to copy and paste and have work as-is.
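The byte values quoted above can be double-checked without `hd`. This is a small stdlib-only sketch; `hex_bytes` is a hypothetical helper name, not part of any crate mentioned in this thread.

```rust
/// Returns the UTF-8 bytes of `text` as space-separated lowercase hex,
/// similar to the middle column of `hd` output.
fn hex_bytes(text: &str) -> String {
    text.bytes()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}
```

For example, `hex_bytes("don\u{2019}t")` returns "64 6f 6e e2 80 99 74" (the typographic quote takes three bytes), while `hex_bytes("don't")` returns "64 6f 6e 27 74" (the ASCII apostrophe is the single byte 27).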

You probably want render or to_writer instead of this method.

In fact, I switched to using Roff::to_roff, and it yields perfect results for me: both Debian's man and pandoc render apostrophes as 0x27, i.e. APOSTROPHE, in running text as well as in code blocks.
