Make PDF and EPUB opt-in, rather than opt-out? #8359

Closed
astrojuanlu opened this issue Jul 21, 2021 · 12 comments · Fixed by #10115
Labels
Needed: design decision A core team decision is required

Comments

@astrojuanlu
Contributor

At the moment, every newly created project on Read the Docs has PDF and EPUB builds enabled in the UI:

[Screenshot, 2021-07-21: "Edit Advanced Project Settings" page on Read the Docs]

If I understand correctly, this can be overridden by:

  1. Unchecking those boxes in the UI, or
  2. Using a .readthedocs.yaml file, where the default for formats is [] (https://docs.readthedocs.io/en/stable/config-file/v2.html#formats)
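
For illustration, option 2 could look like the fragment below. This is a minimal sketch showing only the keys relevant to this discussion, not a complete configuration: `docs/conf.py` is an assumed path for a Sphinx project, and other keys may be required depending on the project.

```yaml
# .readthedocs.yaml (sketch; docs/conf.py is an assumed, hypothetical path)
version: 2

sphinx:
  configuration: docs/conf.py

# Extra formats are opt-in with a v2 config file.
# Leaving this key out is the same as `formats: []`, i.e. HTML only.
formats:
  - pdf
  - epub
```

With this file present, the checkboxes in the UI no longer decide which formats are built.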

However, building PDFs is quite costly and error-prone, and it's unclear whether a majority of our users actually use them. On a related note, PDF errors currently pass silently (#7884).

Given that the config file (2) is itself opt-in, I wonder if we should make PDF and EPUB off by default in the UI.

Some questions that might help in making this decision, although I don't think we should block the decision on them:

  • What percentage of active* projects use a config file on their default version?
  • What percentage of active* projects have zero downloads on their PDFs and EPUBs?

(*active = with X builds in the past Y months? with NNNN visits in the past Y months?)

Thoughts @readthedocs/core ?

@astrojuanlu astrojuanlu added the Needed: design decision A core team decision is required label Jul 21, 2021
@agjohnson
Contributor

We can look at Google Analytics for an idea of the number of PDF downloads. We track some events on the flyout menu, so here are some numbers (for the month of June):

  • 855,989 pdf download click events
  • 271,383 htmlzip download click events
  • 263,023 epub download click events
  • Only a couple html or zip download click events

This doesn't necessarily include all of the downloads though, because you can download outside the flyout menu. The flyout is probably the most likely download point, though.

@astrojuanlu
Contributor Author

So, PDF download events are more or less 1.5% of the pageviews. And I bet they're not evenly distributed: surely a handful of projects account for most of the downloads.

In any case, it might sound like I'm advocating for removing them - I'm not, just to make it clear. But:

  • If projects want PDFs, I think they should enable them, and
  • Changing the UI defaults would make the UI consistent with the v2 config file

@humitos
Member

humitos commented Jul 22, 2021

I'm on the fence here because I think the main benefit of the PDF is for readers without internet access. Documentation authors don't always consider this, and in those cases building it by default is 👍🏼. I see this as a "feature of Read the Docs" (all projects have a PDF), not as a "feature of this particular documentation".

On the other hand, there are problems (like the ones you mentioned) that make me think that the benefit for readers causes problems for authors. So, if we could easily fix them, I'd keep it enabled by default.

Finally, I think consistency is best. Having them enabled by default in the UI and config file v1, but not in v2, sounds like bad UX to me. It seems our last decision here (config v2) was to disable them by default, so even though I don't really like the outcome, we should probably be consistent and do the same for the other cases if possible.

@astrojuanlu
Contributor Author

> I'm on the fence here because I think the main benefit of the PDF is for readers without internet access. Documentation authors don't always consider this, and in those cases building it by default is 👍🏼

Yeah, I'm with you on this one.

> I see this as a "feature of Read the Docs" (all projects have a PDF), not as a "feature of this particular documentation"

I agree, but at the same time, silently failing if the PDF doesn't build (#7884) or not having PDFs for projects with v2 configuration (~this issue) is not helping the feature. We discussed this a lot in #8106 with @ericholscher - at the beginning I was a big defender of LaTeX, but since then I have realized that PDFs are more fragile than I thought.

In my view, if we want to give some love to PDFs on RTD, we should have a roadmap to:

  1. Explore/promote/improve alternative PDF generation tools for Sphinx that do not depend on LaTeX, like https://github.com/rst2pdf/rst2pdf or https://github.com/brechtm/rinohtype
  2. Disable silent PDF failures, or make them configurable as @humitos suggested in "Error in latexmk run doesn't cause build failure" (#7884)
  3. Have a "config v3" (or whatever we end up having by that time) that leaves PDFs enabled

But for now, I'm advocating for consistency.

@ericholscher
Member

ericholscher commented Jul 22, 2021

The goal was to eventually move everyone to a config file, but that obviously hasn't happened. I'm 👍 on changing the default, since it will only apply to new projects and be consistent. It will also save us some build resources, and users on build times & concurrency -- and most users only use the HTML.

@agjohnson
Contributor

I'm also on the fence. I feel like this is a feature that we are really bad at promoting and developing, but should absolutely be focusing on more as it is a unique feature of RTD. But, having said that, making the option default off but also working on the feature or promoting the feature more can both happen simultaneously. So, I'm also 👍 on consistency here for now.

I was about to comment that rst2pdf is very unmaintained, but it looks like new maintainers finally added Python 3 support 💯

However, I don't know that I trust rst2pdf any more than I trust LaTeX translation with Sphinx, at least at the scale we need a solution to work at. rst2pdf was unmaintained for a long time and isn't as mature as Sphinx, and we'll definitely hit similar edge cases with rst2pdf. Sounds like you're probably advocating for supporting both options. I would probably only be 👍 on adding a secondary builder type, with the caveat that rst2pdf/whatever would have to solve a lot of the problems we have with Sphinx/LaTeX to be worth the cost. I don't think this will be the case though.

I believe there is already a user extension that uses rst2pdf instead of LaTeX for PDF generation. It might be worth exploring, and it might also lead to solving a long-standing bug with RTD where we only expect one PDF file as output.

Before we start changing tooling, we should look into our data and identify what the actual issues with PDF generation are. Core team sees a lot of PDF failures because it's part of our job to support/debug them. The average user's interaction with PDF builds is probably unnoticeable in many cases however.

For an absolute out-of-left-field option, we could discuss developing ePUB into a nicer experience. Really just a nicer theme would be required here, and there is prior art that we're just not using. ePUB could be a default-enabled build type if the experience is good. It's not as portable as PDF, but the experience with a nice theme is more usable across a larger number of devices. I've also gathered feedback on alternative formats in the past, and MOBI support (which is Kindle support, and a derivative of ePUB) was the most popular by far.

@ericholscher
Member

ericholscher commented Jul 29, 2021

I'm 👍 on moving forward with turning this off by default. I just had another support request where a user's build was breaking on PDF, and they didn't even know about it or need it. I think it's brittle enough to turn off by default, and doing so will also save us resources and save users build time.

@astrojuanlu
Contributor Author

A data point about the interest in PDFs: "pdf" is the fourth most searched term in our docs.

[Screenshot, 2021-10-20: "Search Analytics" page on Read the Docs]

@astrojuanlu
Contributor Author

Update: rst2pdf seems to be almost ready for Sphinx 4 compatibility, which is nice: rst2pdf/rst2pdf#1020

@humitos
Member

humitos commented Feb 2, 2023

Everything is good news! 📰 We now support building all the formats using the tooling you want (#9888). People can use SimplePDF, rst2pdf, and any other tool they want and still keep everything integrated in the same way as it currently is with LaTeX.
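
As a rough sketch of what such a setup might look like (this is not taken from #9888 or from any project in this thread), the fragment below overrides the build with custom commands and uses rinohtype for the PDF. It assumes a Sphinx project under `docs/`, that the `rinoh` Sphinx builder is available once rinohtype is installed, and that Read the Docs picks up artifacts from `$READTHEDOCS_OUTPUT/html` and `$READTHEDOCS_OUTPUT/pdf`. The OS and Python versions are placeholders.

```yaml
# .readthedocs.yaml (sketch under the assumptions stated above)
version: 2

build:
  os: ubuntu-22.04        # placeholder image
  tools:
    python: "3.11"        # placeholder version
  commands:
    # Custom commands replace the default build steps,
    # so the HTML build has to be spelled out as well.
    - python -m pip install sphinx rinohtype
    - python -m sphinx -b html docs $READTHEDOCS_OUTPUT/html
    # Build the PDF with rinohtype instead of LaTeX.
    - python -m sphinx -b rinoh docs $READTHEDOCS_OUTPUT/pdf
```

Whether the PDF output directory needs extra cleanup (for example, removing intermediate files so that only the PDF remains) depends on the tool and would need checking against a real build.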

That said, I'd like to come back to this and prioritize:

  • turn off building PDF and ePUB by default (because of the reasons mentioned in this thread)
  • fail the build if any of the default commands for building the documentation breaks (this includes sphinx -b latex, latexmk, etc.); see "Error in latexmk run doesn't cause build failure" (#7884)
  • communicate the error clearly to the user at the top, in the same way as we are doing with the other errors

I think this will avoid confusion for authors, and also for readers who might otherwise download a half-baked, broken PDF. Readers who care about these other formats will ask authors to generate them. Considering that this is now more flexible and supports more tools, I'd say authors are more likely to accept these proposals and care about having better-quality extra formats. Everybody wins! 🥇

@humitos
Member

humitos commented Feb 2, 2023

By the way, here is an example of using rinohtype to build a PDF on Read the Docs with a simple configuration: https://test-builds.readthedocs.io/en/pdf-rinohtype/

@humitos
Member

humitos commented Feb 20, 2023

Given the number of support requests we have received in the last few weeks about failing PDFs (now we show a confusing error 😅 ), I'd say that many people didn't even know they had PDF enabled. These PDFs have been failing forever, but the build was succeeding and the HTML was updated properly. So, they didn't even notice there was an issue with their PDF.

Because of this reason, I'm 👍🏼 on disabling these formats by default, starting with new projects.
