example-wait-on fails on GitHub waiting for localhost #802

Closed
MikeMcC399 opened this issue Feb 19, 2023 · 14 comments

Comments

@MikeMcC399
Collaborator

MikeMcC399 commented Feb 19, 2023

IMPORTANT

  • Please see the issue wait-on issues with Node.js 18 #811 for a better set of workarounds. The descriptions below were written before I understood the root cause better. However, I don't want to delete the text in this issue, even though it is no longer up to date. (Mar 2, 2023.)

The example-wait-on workflow fails on GitHub in some jobs waiting for localhost after the ubuntu-22.04 runner was migrated to use Node.js 18 as default. GitHub reports that the migration was deployed on Feb 18, 2023, the same date on which example-wait-on began failing some of its test jobs. Other test jobs waiting for localhost succeed, however.

Problem description

The workflow .github/workflows/example-wait-on.yml fails in jobs:

  • wait-using-custom-command-v9 / wait-using-custom-command
  • wait-on-vite-v9 / wait-on-vite

The last successful run was on Feb 17, 2023 with ubuntu-22.04 image Version: 20230206.1 using the default Node.js 16.19.0.
The first failed run was on Feb 18, 2023 with ubuntu-22.04 image Version: 20230217.1 using the default Node.js 18.14.1.

See https://github.com/cypress-io/github-action/actions/workflows/example-wait-on.yml for workflow history.

wait-using-custom-command

          working-directory: examples/react-scripts
          start: npm start
          wait-on: 'npx wait-on --timeout 5000 http://localhost:3000'

shows

You can now view example-react-scripts in the browser.

  Local:            http://localhost:3000
  On Your Network:  http://10.1.0.40:3000

Note that the development build is not optimized.
To create a production build, use npm run build.

webpack compiled successfully
npm WARN exec The following package was not found and will be installed: wait-on@7.0.1
Error: Timed out waiting for: http://localhost:3000
    at /home/runner/.npm/_npx/04d57496964ca6d1/node_modules/wait-on/lib/wait-on.js:132:31

With wait-on --verbose enabled, the log shows that wait-on is trying to connect to the IPv6 address ::1, whereas the react-scripts server is only listening on the IPv4 address 127.0.0.1:

npm WARN exec The following package was not found and will be installed: wait-on@7.0.1
waiting for 1 resources: http://localhost:3000
making HTTP(S) head request to  url:http://localhost:3000 ...
  HTTP(S) error for http://localhost:3000 Error: connect ECONNREFUSED ::1:3000

... ECONNREFUSED repeated multiple times

wait-on(1747) Timed out waiting for: http://localhost:3000; exiting with error
Error: Timed out waiting for: http://localhost:3000

wait-on-vite

          working-directory: examples/wait-on-vite
          start: npm run dev
          wait-on: 'http://localhost:5173'

shows

waiting on "http://localhost:5173" with timeout of 60 seconds
/usr/local/bin/npm run dev

> example-wait-on-vite@2.0.0 dev
> vite


  VITE v4.0.4  ready in 261 ms

  ➜  Local:   http://localhost:5173/
  ➜  Network: use --host to expose
http://localhost:5173 timed out on retry 91 of 3, elapsed 90270ms, limit 90000ms
Error: connect ECONNREFUSED 127.0.0.1:5173

Investigation

GitHub ubuntu-22.04 image Version: 20230217.1 is set up as follows:

/etc/hosts

Run cat /etc/hosts
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
127.0.0.1 cpu-pool.com
127.0.0.1 MiningMadness.com
127.0.0.1 stratum-na.rplant.xyz
127.0.0.1 do-dear.com
127.0.0.1 web.do-dear.com
127.0.0.1 git.workflows.live
10.1.0.51 fv-az589-316.vc0434dfxmkevoezlasxvobd2h.cx.internal.cloudapp.net fv-az589-316

loopback

Run ip address show lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
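
The two listings above can be collected in any workflow with a small diagnostic job. The following is a minimal sketch: the job name and step names are made up for illustration, and only the `cat /etc/hosts` and `ip address show lo` commands come from the output shown here.

```yaml
jobs:
  show-network: # hypothetical job name
    runs-on: ubuntu-22.04
    steps:
      # Which Node.js version does the runner image provide by default?
      - name: Show Node.js version
        run: node --version
      # Which addresses is localhost mapped to?
      - name: Show hosts file
        run: cat /etc/hosts
      # Which IPv4 and IPv6 addresses are configured on the loopback interface?
      - name: Show loopback addresses
        run: ip address show lo
```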

Discussion

It is unclear why the wait-on step has problems with localhost on GitHub, whereas Cypress successfully tests against localhost after the wait-on step, once localhost in the wait-on step has been replaced by an unambiguous alternative (see the Workarounds section below). Running the Cypress tests locally succeeds, so the problem occurs only on GitHub.

A default installation of Ubuntu only assigns localhost to the IPv4 address 127.0.0.1, not to the IPv6 address ::1. GitHub, however, has ::1 localhost ip6-localhost ip6-loopback set up in addition.

Similar issues on GitHub have already been reported in:

The example-wait-on workflow uses different combinations of servers and wait-on providers to monitor the availability of a server on the network address localhost. Some succeed, others fail.

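| Code                   | Server        | Wait-on provider | Result  |
|------------------------|---------------|------------------|---------|
| examples/wait-on       | node          | built-in         | success |
| examples/react-scripts | react-scripts | built-in         | success |
| examples/react-scripts | react-scripts | npm wait-on      | failure |
| examples/wait-on-vite  | vite          | built-in         | failure |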

Workarounds

Use specific localhost

Tie tests to Node.js 16

Alternatively: If the failing tests are reverted to running under Node.js 16, then they succeed. Add the following:

      - name: Setup Node
        uses: actions/setup-node@v3
        with:
          node-version: 16

Suggestion

Replace the generic term localhost by a specific reference:

  • react-scripts 127.0.0.1
  • wait-on-vite ip6-localhost

for these web servers, which are not listening on both IP stacks.

@MikeMcC399
Collaborator Author

MikeMcC399 commented Feb 19, 2023

The migration of the default version of Node.js to 18 in GitHub runners has unmasked an issue with those webservers used for testing which only listen on either IPv4 or IPv6 network stacks but not on both.

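Node.js 17 changed the order in which dns.lookup() returns IP addresses: IPv4 addresses are no longer placed first, and addresses are instead returned in the order the resolver provides them. If web servers are not listening on IPv4 and IPv6 ports in parallel, then depending on the network setup for localhost, it may appear that they are not running.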
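
One way to restore the earlier ordering for a single step is the Node.js `--dns-result-order` option, which can be passed through the `NODE_OPTIONS` environment variable. The following is only a sketch and was not the resolution applied in this issue. It assumes that both the server and the wait-on command run under Node.js and inherit the step's environment; the step name and the action version are also assumptions.

```yaml
- name: Cypress tests with IPv4-first DNS ordering # hypothetical step name
  uses: cypress-io/github-action@v5 # version is an assumption
  with:
    working-directory: examples/react-scripts
    start: npm start
    wait-on: 'npx wait-on --timeout 5000 http://localhost:3000'
  env:
    # Ask Node.js to place IPv4 addresses before IPv6 ones again, as before v17
    NODE_OPTIONS: --dns-result-order=ipv4first
```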

There is a known issue with react-scripts (see facebook/create-react-app#11302) that the server does not bind to IPv6.

The vite server only binds to one stack and this depends on the version of Node.js and the result of the nameserver resolution (see vitejs/vite#10638 (comment)).

Due to GitHub assigning localhost to both IPv4 and IPv6 loopback addresses, trying to connect to localhost may not work.

Other webservers used in the examples listen on both IP stacks so do not exhibit any problem in this regard due to the Node.js migration. The non-impacted servers are:

Resolution

In example-wait-on

  • wait for react-scripts server using the IPv4 address 127.0.0.1
  • wait for the vite server using the IPv6 name ip6-localhost

Both of these references are unambiguous.
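
As a sketch of what this resolution might look like in a workflow, the two failing jobs could replace localhost as follows. The `working-directory`, `start` and port values are taken from the failing jobs described in this issue; the step names and the action version are assumptions for illustration.

```yaml
steps:
  - name: Cypress tests against react-scripts # hypothetical step name
    uses: cypress-io/github-action@v5 # version is an assumption
    with:
      working-directory: examples/react-scripts
      start: npm start
      # react-scripts listens on IPv4 only, so name the IPv4 loopback address
      wait-on: 'npx wait-on --timeout 5000 http://127.0.0.1:3000'

  - name: Cypress tests against vite # hypothetical step name
    uses: cypress-io/github-action@v5 # version is an assumption
    with:
      working-directory: examples/wait-on-vite
      start: npm run dev
      # vite was listening on the IPv6 loopback here, so use the IPv6-only name
      wait-on: 'http://ip6-localhost:5173'
```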

Reference

Node.js 17.0.0 in nodejs/node@1b2749ecbe "(SEMVER-MAJOR) dns: default to verbatim=true in dns.lookup() (treysis)".

See Node.js dns.lookup(hostname[, options], callback): History

"v17.0.0 The verbatim options defaults to true now."

"verbatim When true, the callback receives IPv4 and IPv6 addresses in the order the DNS resolver returned them. When false, IPv4 addresses are placed before IPv6 addresses. Default: true (addresses are not reordered). Default value is configurable using dns.setDefaultResultOrder() or --dns-result-order."

@beamery-tomht

Had the same issue with a vite server using node 16

Fixed with:

wait-on: 'npx --yes wait-on --timeout 60000 tcp:8080'

Seems the vite server was returning 404 from HEAD requests, worked fine when sending either an accept header for text/html in the wait-on request or using TCP as above ^

@MikeMcC399
Collaborator Author

@beamery-tomht

Many thanks for your tip using tcp to wait for vite!

I changed the example to
wait-on: 'npx wait-on --timeout 60000 tcp:5173'
and that worked fine.

I also tried it for the react-scripts example (using port 3000 instead) where it didn't work. It seems there is not a "one-size-fits-all" solution here.

Your npx --yes prompted me to think about pre-installing an explicit version of the npm package wait-on to avoid surprises of just getting the latest version of wait-on. I will submit a PR to suggest updating the examples in that respect.

@beamery-tomht

I found that using wait-on -c config-file.json also helped when the config file explicitly set an accept header.

Would be good to know why your react-scripts example isn't working. If you use wait-on with verbose logging, it should tell you the reason the pings are failing. Vite was giving a 404 to the HEAD request when accept wasn't supplied. This logging could help you understand the cause of react-scripts and potentially find a "one-size-fits-all"

@MikeMcC399
Collaborator Author

@beamery-tomht

Thanks again for your tips! Enabling wait-on --verbose shows

waiting for 1 resources: http://localhost:3000
making HTTP(S) head request to  url:http://localhost:3000 ...
  HTTP(S) error for http://localhost:3000 Error: connect ECONNREFUSED ::1:3000

This confirms the original finding that wait-on is trying to use IPv6 and react-scripts is listening on IPv4 only. It may be that GitHub needs to change their network setup and stop saying that localhost is ::1. It would be great if the Cypress.io team could comment. I didn't want to bring the issue to the attention of GitHub without that.

@MikeMcC399
Collaborator Author

MikeMcC399 commented Feb 22, 2023

Hi @flotwig !

What do you think about involving GitHub at this point? They seem to have a non-standard /etc/hosts setup where localhost is defined for ::1 in addition to 127.0.0.1 on ubuntu-22.04. An out-of-the-box Ubuntu installation does not look like that.

Some Cypress tests are failing because the wait-on function of the Cypress GitHub action is using the wrong stack (IPv4 or IPv6) and concludes that the server is not running, because the server is not listening on the stack that wait-on is pinging. This impacts both the built-in wait-on ping and the external wait-on package.

Is it too early to involve GitHub here (for instance through https://github.com/actions/runner-images/issues)?

@MikeMcC399
Collaborator Author

MikeMcC399 commented Feb 23, 2023

If the vite server is started with

npx vite --host

instead of

npm run dev (which is the equivalent of npx vite with no parameters)

then it responds locally to http://localhost:5173, http://127.0.0.1:5173 and http://[::1]:5173.

On GitHub, this works around the issue described here. The option is described in the Vite Server Options server.host documentation.
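
In workflow terms, this could look like the following sketch, which only changes the `start` command of the wait-on-vite job shown at the top of this issue. The step name and the action version are assumptions.

```yaml
- name: Cypress tests against vite on all addresses # hypothetical step name
  uses: cypress-io/github-action@v5 # version is an assumption
  with:
    working-directory: examples/wait-on-vite
    # --host makes vite listen on all addresses instead of a single loopback stack
    start: npx vite --host
    wait-on: 'http://localhost:5173'
```

Note that --host also exposes the dev server on the runner's network interfaces, which is usually acceptable in CI but may not be wanted on a developer machine.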

@flotwig
Contributor

flotwig commented Feb 27, 2023

Resolved in #803 - thank you for diagnosing this issue, @MikeMcC399

@flotwig flotwig closed this as completed Feb 27, 2023
@MikeMcC399
Collaborator Author

@flotwig

Thank you for merging! This is the best workaround we have available at this time.

We will need "Happy Eyeballs" to be implemented to deal with the DNS resolution changes in Node.js >= 17, which have been affecting a lot of repositories since GitHub updated the default Node.js to 18 on Feb 18, 2023. I will follow up on this as I try to understand more about what is going on!

@RicFer01

Should this fix also CRA scripts or just Vite?

@MikeMcC399
Collaborator Author

MikeMcC399 commented Feb 28, 2023

@RicFer01

Should this fix also CRA scripts or just Vite?

This was aimed at working around problems in the examples only. I can't guarantee that the same workarounds will work everywhere else, on every operating system, and with all the different incarnations that CRA can produce. If you are having problems under Node.js 18 and later, though, you may find that replacing localhost by 127.0.0.1 is a good starting point.

If you are using the built-in wait-on from cypress-io/github-action, for example
wait-on: 'http://localhost:8080'
and have problems, then this issue list is the right place to be.

If you are using the npm package wait-on, for example
wait-on: 'npx wait-on --timeout 5000 http://localhost:3000'
then you should use their issue list on https://github.com/jeffbski/wait-on/issues to find answers / report issues. I already have an open issue there myself under jeffbski/wait-on#137.

In some cases it may be possible simply to remove the wait-on parameter completely. Servers have become faster and Cypress waits a little anyway if necessary. This wasn't an acceptable workaround for the wait-on examples, however, because they are by definition supposed to be using wait-on!

@RicFer01

Thank you for the answer and the clarification. Yes, I'm using the npm package (npx wait-on http://localhost:3000). The workaround using 127.0.0.1 seems to work for now, even though I'm not sure I understand the reason exactly. I'll follow the issue you already opened.

jrw972 added a commit to jrw972/DDSPermissionsManager that referenced this issue Oct 31, 2023
Problem
-------

Cypress tests sporadically fail.

Solution
--------

Trying the approach from
cypress-io/github-action#802

Signed-off-by: Justin R. Wilson <wilsonj@unityfoundation.io>