Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node 20 appears to cause major slowdown in Post ccache step #181

Closed
firewave opened this issue Jan 29, 2024 · 13 comments · Fixed by #182
Closed

Node 20 appears to cause major slowdown in Post ccache step #181

firewave opened this issue Jan 29, 2024 · 13 comments · Fixed by #182

Comments

@firewave
Copy link

firewave commented Jan 29, 2024

The update to Node 20 appears to cause a slowdown in the Post ccache step.

In my case the step was taking single-digit seconds with the version used on Node 16 and always 2+ minutes when Node 20 was involved. The used platform or prior existence of a cache does not seem to have any impact.

See
#178 (comment)
#178 (comment)

@firewave
Copy link
Author

Continuing from #178 here.

My previous examples were flawed and your observations seemed correct. But let's look at a better example

https://github.com/danmar/cppcheck/actions/workflows/valgrind.yml?query=branch%3Amain.

This seems to be a good slice as it contains both versions and also both states.

The latest builds is descending order:

https://github.com/danmar/cppcheck/actions/runs/7644329997/job/20828411569 - 1.2.12 / cache found -> 2+ minutes
https://github.com/danmar/cppcheck/actions/runs/7687753686/job/20948157176 - 1.2.12 / cache found -> 2+ minutes
https://github.com/danmar/cppcheck/actions/runs/7687504907/job/20947611473 - 1.2.12 / no cache found -> 2+ minutes
https://github.com/danmar/cppcheck/actions/runs/7646437494/job/20835238755 - 1.2.11 / cache found -> 3 seconds
https://github.com/danmar/cppcheck/actions/runs/7644329997/job/20828411569 - 1.2.11 / no cache found -> 3 seconds

And some random builds from the past (unfortunately I did not find one with no cache):
https://github.com/danmar/cppcheck/actions/runs/7507173423/job/20440160792 - cache found / 5 seconds
https://github.com/danmar/cppcheck/actions/runs/7233590995/job/19709031614 - cache found / 4 second

@chirag-droid
Copy link
Contributor

I can take up this issue.

It is happening due to how Node 20 works

@chirag-droid
Copy link
Contributor

I think this issue needs to be fixed by @actions/ccache package or by NodeJS implementing a fix?

@firewave
Copy link
Author

I have no idea how Node works and what API it offers - but I think the proper fix would be in the code which is doing the HTTP request, telling the connection to be closed via the Connection: close HTTP header.

I am very vaguely remembering encountering some similar issue with keep-alive in Java quite a while ago.

@chirag-droid
Copy link
Contributor

I have opened the issue with actions/toolkit#1643

I am surprised many packages inside https://github.com/actions are using process.exit so the action doesn't hang.

I will look into this if I can fix it.

@hendrikmuhs
Copy link
Owner

Related to the other thread: Would a downgrade to node 18 help? We don't have to use node 20, all we need is not 16, because github put it EOL. In the long run we have to upgrade of course, but a downgrade would buy some time so github folks can figure out whats wrong.

@firewave
Copy link
Author

Related to the other thread: Would a downgrade to node 18 help?

Just updating to Node 18 should help with this issue since the behavior has only changed in Node 19.

We don't have to use node 20, all we need is not 16, because github put it EOL.

Actually we do: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/

But as you commented in the other ticket let's do this in incremental steps one problem at a time...

@chirag-droid
Copy link
Contributor

actions/toolkit seems unresponsive about issues and prs

@Crzyrndm
Copy link

Crzyrndm commented Mar 14, 2024

Any chance of #182 getting merged, even as a temporary fix? The current situation is unpleasant and there doesn't seem to be any activity from GH to resolve the root cause

@hendrikmuhs
Copy link
Owner

hendrikmuhs commented Mar 15, 2024

I can merge it once the test-failures are resolved (maybe it is just a matter of merging the latest changes).

jdupak added a commit to cvut/qtrvsim that referenced this issue Mar 30, 2024
jdupak added a commit to cvut/qtrvsim that referenced this issue Mar 30, 2024
@mweberxyz
Copy link

GitHub should be able to fix this. I identified the exact connection they are leaving open so hopefully every downstream consumer doesn't need to implement the process.exit hack: actions/toolkit#1702

As @chirag-droid said though.. they don't seem to take community PRs or issues at very high priority 😕

@Naros
Copy link

Naros commented Apr 18, 2024

Hi all, I've noticed that the post-ccache takes between 2.5 and 3 minutes while the restore takes 10-15 seconds. Is there any way I can speed this up on our builds?

@junaire
Copy link

junaire commented Apr 18, 2024

Hi, is there any workaround about this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants