SmartyPants-Unicode handling of quotes, dashes, and ellipses #33

Open
jlevy opened this issue Oct 25, 2016 · 4 comments
@jlevy
Contributor

jlevy commented Oct 25, 2016

John Gruber's original Markdown has often been used with a long-standing hack called SmartyPants to improve typographic consistency on quotes, dashes, and ellipses. Python's Markdown package also implements it.

A variant of this, that converts ASCII quotes, dashes, and ellipses to their appropriate Unicode equivalents, could be helpful in markdownfmt. Instead of converting to HTML entities, it would convert to Unicode, and then the Markdown doc would be consistent (including on GitHub, which does not by default do smarty-style conversion).
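
To make the idea concrete, here is a minimal sketch of the dash and ellipsis part of such a conversion. It is only an illustration, not anything markdownfmt does today: the function name `smarten_punctuation` is made up, and the mapping (`--` to an en dash, `---` to an em dash) is an assumption borrowed from common SmartyPants-style conventions.

```python
def smarten_punctuation(text: str) -> str:
    """Replace ASCII dashes and ellipses with Unicode characters (hypothetical helper)."""
    # Longest patterns first, so "---" is not consumed as "--" followed by "-".
    replacements = [
        ("...", "\u2026"),  # horizontal ellipsis
        ("---", "\u2014"),  # em dash (assumed mapping)
        ("--", "\u2013"),   # en dash (assumed mapping)
    ]
    for ascii_form, unicode_form in replacements:
        text = text.replace(ascii_form, unicode_form)
    return text
```

Because the output is plain Unicode rather than HTML entities, the source file itself carries the final characters and renders the same way everywhere.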

This is just another feature to note and discuss/consider. Inconsistent typographic usage is yet another pain point I've seen with large-scale collaborative Markdown.

It could also be implemented as a separate tool, perhaps. Gruber notes some (rare) algorithmic shortcomings like

'Twas the night...  -> ‘Twas the night...

But it's worth remembering authors can avoid that by using the correct oriented quotes in the original:

’Twas the night...
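
The shortcoming is easy to reproduce with a naive orientation rule. The sketch below is not Gruber's algorithm. It assumes a quote is an opening quote only at the start of the text or right after whitespace or an opening bracket, and the name `smarten_quotes` is made up for illustration.

```python
# Characters after which a quote is assumed to be an opening quote.
OPENING_CONTEXT = set(" \t\n([{")


def smarten_quotes(text: str) -> str:
    """Convert ASCII quotes to oriented Unicode quotes with a naive rule."""
    out = []
    for i, ch in enumerate(text):
        if ch not in ("'", '"'):
            out.append(ch)  # includes quotes that are already Unicode
            continue
        # Treat the start of the text like a preceding space.
        prev = text[i - 1] if i > 0 else " "
        opening = prev in OPENING_CONTEXT
        if ch == "'":
            out.append("\u2018" if opening else "\u2019")
        else:
            out.append("\u201c" if opening else "\u201d")
    return "".join(out)
```

With this rule, `'Twas the night...` gets an opening quote where an apostrophe was intended, while a line that already starts with `’` passes through untouched.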
@dmitshur
Member

dmitshur commented Oct 25, 2016

Hi @jlevy,

Thanks for the issue. This is a valid point for discussion.

I should first say that I'm aware of the smarty-pants option that blackfriday offers, but so far, I choose to ignore it. I already think markdown is way too complicated, and I want to make it as simple as possible.

So I currently use plain ASCII quotes everywhere and have very little desire for them to become something else.

This is just another feature to note and discuss/consider.

I'm not dismissing it completely and I'm happy to discuss it, but realistically, I think it would have to be another project (possibly a fork) that takes this on. I think this is an interesting idea, and a tool like this has opportunities.

I just wanted to acknowledge this issue, and I'm ok with it staying open and having a discussion, but I don't plan to spend much time on this, and I'm unlikely to be able to accept PRs that implement this (please let me know before working on anything). There are other things that are occupying my budget for time and attention for now.

Thanks!

@tajmone

tajmone commented Dec 19, 2016

Hi,

I agree with @jlevy, this is a much needed feature. Basically, people using markdownfmt are looking for a markdown-tidy tool to keep their markdown source files clean — not only for aesthetic reasons, but mostly to avoid diffing nightmares and problems with Git (tidy sources make it easier to view what really changed in a commit).

I was already looking into implementing this feature, but as far as I understand, blackfriday's smartypants is for HTML rendering only.

As @jlevy pointed out, it would be nice to have UTF-8 Unicode chars, instead of HTML entities.

Any tips on how I could try to implement this? (I'm new to Go.)

Thanks

@dmitshur
Member

people using markdownfmt are looking for a markdown-tidy tool to keep their markdown source files clean

I've touched on that point in #34 (comment), so I won't repeat that here.

I'm still not convinced that this is a great thing.

I haven't looked at any of the implementation matters here, but I suspect it might be tricky and probably not very clean.

At this time, my recommendation would be to experiment with a prototype and consider it going into a fork or separate tool. If the prototype works well, I would consider pulling it in, but I don't expect that at this time.

At this stage of this project, I'm not looking to grow the feature set of markdownfmt. Instead, I want to keep it as simple as possible while still being viable to use. And in the last few years of using Markdown, I have had very little desire to have non-ASCII quotes.

@tajmone

tajmone commented Dec 24, 2016

Makes sense. I guess the issue really should be raised with GitHub, asking for a markdown previewer that implements smart quotes and punctuation — so users could stick to a strict ASCII markdown source, and let converters/previewers handle it.

But for some reason GitHub's markdown HTML renderer/previewer doesn't use smartypants — and this has some impact on users' expectations, because having to use Alt codes to get an em-dash (even here, as I type this comment) is quite tiring.

Anyhow, it seems that all markdown cleanup tools agree that ASCII-only characters (or HTML entities as a last resort) are the only sound approach — of course, the latter makes a document quite unreadable.

A solution could be writing some script to filter the output of markdownfmt — and since markdownfmt enforces an ASCII standard on quotes, dashes, etc., it would be quite easy. A single batch/shell script could handle invoking markdownfmt and piping its output to this script before rewriting the source file.

I guess that would be the simple approach, and it could be created in any language.
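
A rough sketch of such a filter, just to show the shape of it. Everything named here (`filter_markdown`, `ellipsis_only`, `run_filter`) is hypothetical, and the default conversion only handles ellipses as a stand-in for a fuller converter. The filter leaves fenced code blocks, single-backtick code spans, and lines made only of dashes, equals signs, pipes, or colons untouched, on the assumption that those are underlined headings, rules, or table separators that must stay ASCII. Indented code blocks, URLs, and raw HTML are not handled.

```python
import re
import sys

FENCE = re.compile(r"^\s*(```|~~~)")
# Heading underlines, horizontal rules, table separator rows, blank lines.
STRUCTURAL = re.compile(r"^[\s\-=|:]+$")
# Single-backtick code spans; the capturing group keeps them in re.split output.
INLINE_CODE = re.compile(r"(`[^`\n]*`)")


def ellipsis_only(text: str) -> str:
    """Stand-in conversion: replace three dots with a Unicode ellipsis."""
    return text.replace("...", "\u2026")


def filter_markdown(source: str, convert=ellipsis_only) -> str:
    """Apply `convert` to prose only, skipping code and structural lines."""
    out = []
    in_fence = False
    for line in source.splitlines(keepends=True):
        if FENCE.match(line):
            in_fence = not in_fence  # opening or closing fence line
            out.append(line)
        elif in_fence or STRUCTURAL.match(line):
            out.append(line)
        else:
            parts = INLINE_CODE.split(line)
            # Odd indices are the captured code spans; leave them as-is.
            out.append("".join(
                part if i % 2 else convert(part)
                for i, part in enumerate(parts)
            ))
    return "".join(out)


def run_filter(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read Markdown from stdin and write the converted text to stdout."""
    stdout.write(filter_markdown(stdin.read()))
```

Saved as, say, `smarten.py` with a final `run_filter()` call, it could sit at the end of a pipe such as `markdownfmt README.md | python smarten.py`. That assumes markdownfmt prints the formatted document to standard output when it is not asked to rewrite the file in place.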

Thanks again (and Season's Greetings)

Tristano
