Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async rendering of tags within a post #4926

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ppwwyyxx
Copy link
Contributor

@ppwwyyxx ppwwyyxx commented Mar 30, 2022

What does it do?

Fix #4923.

Currently, if we use async rendering in custom tag plugins, rendering of tags can run concurrently between posts. However, tags within the same post are still rendered sequentially.

This is because when rendering a post, hexo makes a single render call to nunjucks to render the entire post. Inside nunjucks, it renders all the tags sequentially.

Instead of making one render call per-post, this PR does the following:

  1. split the post into chunks
  2. render each chunk separately, with async call

This way, tags in different chunks will be rendered concurrently. This has accelerated my whole-site generation by 10x because I have some expensive tags.

The following changes in this PR are also necessary:

  • Add a HTML parser, parse5 as a dependency. parse5 is pretty popular and is also the backend of cheerio. Let me know if a different parser is preferred.
  • Modify a test because the output is changed. Note that the new output is equivalent to the old output.

Screenshots

N/A

Pull request tasks

  • Add test cases for the changes.
  • Passed the CI test.

Things I'm not sure of:

  1. Should this be made a per-site option, per-post option, or no option at all?
  2. Will this affect rendering results? I found that in nunjucks, users can define variables in one tag and use it in another. If each tag is rendered independently, this might stop working. However, I don't know if this matters at all because this "feature" doesn't seem useful to hexo.

@github-actions
Copy link

How to test

git clone -b master https://github.com/ppwwyyxx/hexo.git
cd hexo
npm install
npm test

@ppwwyyxx ppwwyyxx marked this pull request as draft March 30, 2022 03:47
@SukkaW
Copy link
Member

SukkaW commented Mar 30, 2022

  1. Will this affect rendering results? I found that in nunjucks, users can define variables in one tag and use it in another. If each tag is rendered independently, this might stop working. However, I don't know if this matters at all because this "feature" doesn't seem useful to hexo.

Although Hexo's documentation doesn't mention this behavior, I believe some sites are depending on this (each post has its own context of nunjucks) to work. If the changes have to be made, it will become a breaking change.

@ppwwyyxx
Copy link
Contributor Author

Thanks @SukkaW for comments. I noticed another issue also related to nunjucks context, noted at #4923 (comment). So I changed this to draft as more discussions are needed on the proper solution.

@ppwwyyxx ppwwyyxx marked this pull request as ready for review May 29, 2022 04:41
@ppwwyyxx
Copy link
Contributor Author

I managed to fix the abovementioned problem with a different approach (starting from top-level HTML nodes). Now all tests can pass.
@SukkaW can you give it a review? Thanks.

@ppwwyyxx
Copy link
Contributor Author

Can maintainers take a look? Thanks!

@SukkaW
Copy link
Member

SukkaW commented Aug 27, 2022

Can maintainers take a look? Thanks!

The changes themselves look great, but I am still afraid of introducing the change would break existing sites.

Also, introducing an HTML parser to the post rendering might have created too much overhead as well.

@coveralls
Copy link

coveralls commented Aug 27, 2022

Coverage Status

coverage: 99.531% (+0.002%) from 99.529% when pulling 01ca1ab on ppwwyyxx:master into 7b588e7 on hexojs:v7.0.0.

@ppwwyyxx
Copy link
Contributor Author

The changes themselves look great, but I am still afraid of introducing the change would break existing sites.

The new approach in this PR seems less likely to break existing behaviors. I'm not familiar with advanced usage of tags, do you have an example of what usage might be broken?

If rendering speed or backward-compatibility is a concern, can we make this a per-post option? This would speed up rendering of posts with a lot of slow tags.

@yoshinorin
Copy link
Member

@ppwwyyxx

I pull this PR to my local machine and run hexo server with test repository. But, it has the following error. Do you know what a problem is?

TypeError: parse5.serializeOuter is not a function
    at C:\Users\hexo\hexo\lib\hexo\post.js:435:29
    at Array.forEach (<anonymous>)
    at C:\Users\hexo\hexo\lib\hexo\post.js:433:22
    at tryCatcher (C:\Users\hexo\hexo\node_modules\bluebird\js\release\util.js:16:23)
    at Promise._settlePromiseFromHandler (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:547:31)
    at Promise._settlePromise (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:604:18)
    at Promise._settlePromise0 (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:649:10)
    at Promise._settlePromises (C:\Users\hexo\hexo\node_modules\bluebird\js\release\promise.js:729:18)
    at _drainQueueStep (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:93:12)
    at _drainQueue (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:86:9)
    at Async._drainQueues (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:102:5)
    at Immediate.Async.drainQueues [as _onImmediate] (C:\Users\hexo\hexo\node_modules\bluebird\js\release\async.js:15:14)
    at processImmediate (node:internal/timers:466:21)

P.S
I can provide a test repository which the error occurs.

@ppwwyyxx
Copy link
Contributor Author

@yoshinorin did you install the correct version of parse5 indicated in package.json? The function does exist: https://parse5.js.org/functions/parse5.serializeOuter.html, but not in older versions.

@yoshinorin
Copy link
Member

yoshinorin commented Oct 28, 2022

@ppwwyyxx

did you install the correct version of parse5 indicated in package.json?

Sorry, you're right.

The changes themselves look great, but I am still afraid of introducing the change would break existing sites.
Also, introducing an HTML parser to the post rendering might have created too much overhead as well.

@SukkaW
How about merge this PR once & ship with v7.0.0?

Just now I tried hexo server & hexo generate with 1400 posts (3829 files) locally, generation speeds are the same with v6.3.0. Both versions generate content with in 25 ~ 26s. (I tried 3 times cold generation)

If we will face a problem after release we just revert it and release a patch version.

@stevenjoezhang
Copy link
Member

We need to benchmark on some real-world Hexo websites to evaluate the impact of generation speed
@SukkaW @yoshinorin

@yoshinorin
Copy link
Member

@stevenjoezhang

We need to benchmark on some real-world Hexo websites to evaluate the impact of generation speed

As I already wrote in #4926 (comment) I tried it locally with customized https://github.com/LouisBarranqueiro/hexo-theme-tranquilpeak (hexo-theme-tranquilpeak was a theme used in my blog.)

I think performance is enough, but I don't know what @SukkaW is concerned about. As you know @SukkaW is more knowledgeable about hexo than me. @SukkaW may notice some performance issue.

@SukkaW
Copy link
Member

SukkaW commented Nov 25, 2022

I think performance is enough, but I don't know what @SukkaW is concerned about. As you know @SukkaW is more knowledgeable about hexo than me. @SukkaW may notice some performance issue.

What I am really worried about is the breaking change.

{{ let a = '1' }}

A lazy dog jump over a fox

{{ a }}

This should render `1`

Hexo currently supports this behavior. But after the PR, this will break (as nunjucks blocks would be rendered separately).

Unlike the other breaking changes we decided to introduce, there is no way to migrate this easily. Users would probably have to migrate hundreds of posts and end up deciding not to upgrade at all.

@yoshinorin @stevenjoezhang

@stevenjoezhang
Copy link
Member

Perhaps we could add a new per-site option to the _config.yml to turn this on or off @ppwwyyxx

@ppwwyyxx
Copy link
Contributor Author

Happy to make this optional. Shall it be a per-post option instead of per-site?

@SukkaW
Copy link
Member

SukkaW commented Nov 25, 2022

Happy to make this optional. Shall it be a per-post option instead of per-site?

I'd prefer to have both: an option defined in _config.yml, and can opt-in/opt-out per post.

@stevenjoezhang stevenjoezhang added this to the 7.0.0 milestone Nov 26, 2022
@ppwwyyxx
Copy link
Contributor Author

I added a per-post option async_tags. I found it a bit confusing to add a site-wise option:

  • If I call the site-wise option async_tags then it's confusing because tags across posts can always be rendered async regardless of this option. This PR is only about tags in a post.
  • If I call the site-wise option something like async_tags_within_post, then it's too long and doesn't match the name of the per-post option.

I'd appreciate any suggestions on naming.

Also, I guess a site-wise option is not that important? In most cases, this option doesn't matter because I think (1) slow tags are rare (2) tags that depend on other tags are rare. So a per-post option might be already sufficient for users.

@renbaoshuo renbaoshuo mentioned this pull request Jan 8, 2023
6 tasks
@ppwwyyxx
Copy link
Contributor Author

ppwwyyxx commented Apr 6, 2023

I wonder if this can be merged!

SukkaW

This comment was marked as outdated.

@SukkaW SukkaW self-requested a review April 6, 2023 09:54
Copy link
Member

@SukkaW SukkaW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally don't like using parse5. parse5 causes way way way too much overhead during rendering (imagine you have 1000 posts and you will have to parse HTML 1000 times). I am wondering if we could have a more efficient way to do this.

@ppwwyyxx
Copy link
Contributor Author

ppwwyyxx commented Apr 6, 2023

In the current PR the option is opt-in per-post, which means users would have to enable this option 1000 times to experience that kind of overhead for 1000 posts :) There is no overhead if users don't enable the option.

Since we have to parse HTML to know where tags are, I feel there is not much room to improve. Unless nunjucks itself can support such feature, then we don't have to parsing on hexo's side.

@stevenjoezhang stevenjoezhang changed the base branch from master to v7.0.0 July 9, 2023 17:37
@stevenjoezhang stevenjoezhang changed the base branch from v7.0.0 to master October 25, 2023 05:02
Copy link

How to test

git clone -b master https://github.com/ppwwyyxx/hexo.git
cd hexo
npm install
npm test

@uiolee uiolee added javascript Pull requests that update Javascript code enhancement New feature or request labels Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request javascript Pull requests that update Javascript code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Concurrent rendering of nunjucks tags within a post
6 participants