Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-325: [Website] Update website deployment script to use latest LTS version of Node.js and Webpack 5.75.0 #326

Merged
merged 15 commits into from Mar 10, 2023

Conversation

kevingurney
Copy link
Member

@kevingurney kevingurney commented Mar 2, 2023

Overview

This pull request modifies the apache/arrow-site website deployment workflow (.github/workflows/deploy.yml) to use the latest LTS version of Node.js and Webpack 5.75.0 to work around the build issue described in #325.

Qualification

To qualify these changes, I:

  1. Submitted these changes to the main branch of the mathworks/arrow-site fork in order to trigger the gh-pages deployment workflow. I then selected gh-pages as the GitHub Pages deployment branch and verified that the site was deployed as expected to https://mathworks.github.io/arrow-site/. For an example of a successful workflow run, see: https://github.com/mathworks/arrow-site/actions/runs/4313253336/jobs/7524824999.
  2. I inspected the GitHub Actions workflow steps to ensure there are no errors.

Future Directions

  1. While qualifying with the fork deployment workflow, I realized that I needed to manually change the GitHub Pages deployment branch from asf-site to gh-pages in the "Pages" settings of the mathworks/arrow-site fork. This wasn't immediately obvious, and it isn't listed explicitly as a required step in the README.md of apache/arrow-site. It would helpful to add an explicit note about this step. I've captured this as [Website] Add note about the need to set GitHub Pages deployment source branch to gh-pages when previewing website changes on a fork of apache/arrow-site #327 and addressed it with PR GH-327: [Website] Add note about the need to set GitHub Pages deployment source branch to gh-pages when previewing website changes on a fork of apache/arrow-site #328.
  2. As described in the "Workarounds" section of the description of [Website] Website deployment workflow (deploy.yml) is failing due to Node.js 18 version bump in ubuntu-latest GitHub Actions runner image and Webpack usage of md4 hashing algorithm #325, there is still more we could choose to do to address the root cause of these build failures (the deprecation of the md4 hash algorithm in Node 18). This would include setting the output.hashFunction to xxhash64 for Webpack.
  3. We could move the workflow into a container to make it easier to reproduce the website build process on a local machine (see the discussion in the comments on this pull request).

Notes

  1. Thank you @sgilmore10 for your help with this pull request!
  2. Thank you to @avantgardnerio for your suggestion to move the deployment workflow inside of an ubuntu:latest container!

Closes #325.

Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Copy link
Member

@raulcd raulcd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kevingurney for the PR!

.github/workflows/deploy.yml Outdated Show resolved Hide resolved
Copy link

@avantgardnerio avantgardnerio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

.github/workflows/deploy.yml Outdated Show resolved Hide resolved
.github/workflows/deploy.yml Outdated Show resolved Hide resolved
.github/workflows/deploy.yml Outdated Show resolved Hide resolved
.github/workflows/deploy.yml Outdated Show resolved Hide resolved
@kou kou changed the title GH-325: [Website] Website deployment workflow (deploy.yml) is failing due to Node.js 18 version bump inubuntu-latest GitHub Actions runner image and Webpack usage of md4 hashing algorithm GH-325: [Website] Website deployment workflow (deploy.yml) is failing due to Node.js 18 version bump in ubuntu-latest GitHub Actions runner image and Webpack usage of md4 hashing algorithm Mar 2, 2023
@kou kou changed the title GH-325: [Website] Website deployment workflow (deploy.yml) is failing due to Node.js 18 version bump in ubuntu-latest GitHub Actions runner image and Webpack usage of md4 hashing algorithm GH-325: [Website] Use the official Ubuntu container for deployment workflow to use old Node.js Mar 2, 2023
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use Node.js 12 on Ubuntu 22.04 with this change, right?

https://github.com/apache/arrow-site/actions/runs/4314961494/jobs/7528685953#step:13:17

npm WARN EBADENGINE   current: { node: 'v12.22.9', npm: '8.5.1' }

Node.js 12 is an EOL-ed Node.js. So it looks like that this is similar to 1. in GH-325.

It seems that 2. in GH-325 can be implemented with the following:

$ npm install webpack # this updates package.json and package-lock.json

If it works, it seems that it's a straitforword approach.

.github/workflows/deploy.yml Outdated Show resolved Hide resolved
@@ -27,17 +27,31 @@ jobs:
deploy:
name: Deploy
runs-on: ubuntu-latest
container:
image: ubuntu:22.04 # Ubuntu 22.04 is the latest LTS version as of April 21, 2022: https://wiki.ubuntu.com/Releases
Copy link
Member

@kou kou Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we try reverting this and changes related to this? (e.g.: explicit apt-get, ImageOS: ubuntu22 and so on)

If we can use the ubuntu-latest image directly, we can use simpler deploy.yml and don't need to maintain Ubuntu version for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with just using ubuntu-latest and no container. However, I know there were some concerns about this approach expressed by @avantgardnerio (e.g. the proprietary nature and lack of debuggability of ubuntu-latest).

However, I can try this approach and if no one else in the community has reservations about it, then I am OK with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing out the workflow with no container now on https://github.com/mathworks/arrow-site/tree/GH-325-no-container.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I can understand the debuggability concern. If we make this job local debuggable, we should not use actions/* in ubuntu:XXX container like we did in apache/arrow: https://github.com/apache/arrow/blob/main/.github/workflows/cpp.yml#L92-L99

If we need local debuggability, we may need to choose another approach.

Copy link
Member

@kou kou Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
It seems to work. I prefer the with container approach to the no container approach but I want to hear opinions from others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for confirming this.

How about the following steps to improve the current situation?

  1. We merge "no container" version https://github.com/mathworks/arrow-site/tree/GH-325-no-container because it's simpler than this version
  2. We create a Dockerfile based on ubuntu:22.04 and use it in CI and https://github.com/apache/arrow-site#using-docker (This prevents breaking @alamb 's use case accidentally)
  3. We extract command lines used in 2. as a shell script and fine tune the shell script to work on host Ubuntu 22.04 too (@avantgardnerio 's use case)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @avantgardnerio and @kou for your sharing your thoughts! I agree. This sounds like a reasonable path forward.

Just to clarify:

"We create a Dockerfile based on ubuntu:22.04 and use it in CI"

By "use it in CI" do you mean that you would like the GitHub Actions workflow to run inside of a Docker container based on this Dockerfile? If so, doesn't this conflict with 1. (which is to use the "no container" workflow)?

Sorry for the confusion. I just want to make sure we are all on the same page before moving forward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea is that we just use docker build && docker run in GitHub Actions instead of container:.

(More specifically, we will use docker/build-push-action instead of raw docker build and push a built image to ghcr.io to reuse the built image on local for https://github.com/apache/arrow-site#using-docker . https://github.com/groonga/groonga-delta/blob/main/.github/workflows/docker.yml is a bit old but may help you to understand my idea.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my explanation isn't enough, please ask me more!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying @kou! This make sense!

I'll start working on these changes now.

@alamb
Copy link
Contributor

alamb commented Mar 7, 2023

Are we happy with this PR at the moment? Any concerns about merging it ?

@kou
Copy link
Member

kou commented Mar 7, 2023

Ah, sorry. I forgot to left a comment. I'll left a comment soon.

@avantgardnerio
Copy link

rgot to left a comment. I'll

I'm happy... I'll wait a little bit for @kou

@kou
Copy link
Member

kou commented Mar 7, 2023

Commented: #326 (comment)

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about revert ci/ changes for now and merge this pull request to fix the current CI failure?
We can work on the step 2. with ci/ changes as a follow-up pull request.

@kevingurney kevingurney changed the title GH-325: [Website] Use the official Ubuntu container for deployment workflow to use old Node.js GH-325: [Website] Upgrade website deployment script to use latest Node.js LTS and Webpack 5.75.0 Mar 9, 2023
@kevingurney
Copy link
Member Author

kevingurney commented Mar 9, 2023

Sounds good to me. I'll revert the ci/ changes.

I've created a follow up issue #331 to capture the desire to improve the container workflow.

@kevingurney
Copy link
Member Author

kevingurney commented Mar 9, 2023

I've reverted the changes and updated the PR title and description to more accurately reflect the changes.

@kevingurney kevingurney changed the title GH-325: [Website] Upgrade website deployment script to use latest Node.js LTS and Webpack 5.75.0 GH-325: [Website] Update website deployment script to use latest Node.js LTS and Webpack 5.75.0 Mar 9, 2023
@kevingurney kevingurney changed the title GH-325: [Website] Update website deployment script to use latest Node.js LTS and Webpack 5.75.0 GH-325: [Website] Update website deployment script to use latest LTS version of Node.js and Webpack 5.75.0 Mar 9, 2023
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Thanks!

@kou kou merged commit b955d79 into apache:main Mar 10, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants