IPv6 link-local address oddities on LinuxONE in CI #2368

Closed
Trott opened this issue Jun 26, 2020 · 8 comments

Comments

@Trott
Member

Trott commented Jun 26, 2020

I could be mistaken somewhere somehow, but I think the problem with a test on LinuxONE might be a misconfiguration on the machine. I'm not sure who to ping about this. Maybe @mhdawson knows?

IPv6 link-local addresses start with fe80:.

Below is the output of os.networkInterfaces() on that machine. As you can see, the lo interface does not have a link-local address. I believe this is a violation of RFC 4291 Section 2.8. (Wikipedia has it in more plain language: "Unlike IPv4, IPv6 requires a link-local address on every network interface on which the IPv6 protocol is enabled, even when routable addresses are also assigned.")

{
  lo: [
    {
      address: '127.0.0.1',
      netmask: '255.0.0.0',
      family: 'IPv4',
      mac: '00:00:00:00:00:00',
      internal: true,
      cidr: '127.0.0.1/8'
    },
    {
      address: '::1',
      netmask: 'ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff',
      family: 'IPv6',
      mac: '00:00:00:00:00:00',
      internal: true,
      cidr: '::1/128',
      scopeid: 0
    }
  ],
  'enccw0.0.1000': [
    {
      address: '148.100.86.21',
      netmask: '255.255.254.0',
      family: 'IPv4',
      mac: '02:c1:21:0e:b0:cb',
      internal: false,
      cidr: '148.100.86.21/23'
    },
    {
      address: '2620:91:0:649:c1:21ff:fe0e:b0cb',
      netmask: 'ffff:ffff:ffff:ffff::',
      family: 'IPv6',
      mac: '02:c1:21:0e:b0:cb',
      internal: false,
      cidr: '2620:91:0:649:c1:21ff:fe0e:b0cb/64',
      scopeid: 0
    },
    {
      address: 'fe80::c1:21ff:fe0e:b0cb',
      netmask: 'ffff:ffff:ffff:ffff::',
      family: 'IPv6',
      mac: '02:c1:21:0e:b0:cb',
      internal: false,
      cidr: 'fe80::c1:21ff:fe0e:b0cb/64',
      scopeid: 2
    }
  ]
}
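
A quick way to repeat this check on another machine is sketched below. It is only an illustration, not part of the test in the pull request: the helper name `interfacesMissingLinkLocal` is made up here, and it treats any address in fe80::/10 as link-local.

```javascript
const os = require('os');

// Hypothetical helper (not from the PR): returns the names of interfaces
// that have at least one IPv6 address but no link-local (fe80::/10) one.
function interfacesMissingLinkLocal(interfaces) {
  // fe80::/10 covers first groups fe80 through febf.
  const isLinkLocal = (addr) => /^fe[89ab][0-9a-f]:/i.test(addr);
  return Object.entries(interfaces)
    .filter(([, addrs]) => {
      // Some Node.js releases report family as the number 6, not 'IPv6'.
      const v6 = addrs.filter((a) => a.family === 'IPv6' || a.family === 6);
      return v6.length > 0 && !v6.some((a) => isLinkLocal(a.address));
    })
    .map(([name]) => name);
}

console.log(interfacesMissingLinkLocal(os.networkInterfaces()));
```

Given the output above, this would be expected to report only lo, since enccw0.0.1000 does carry an fe80: address.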

Originally posted by @Trott in nodejs/node#14500 (comment)

@sxa
Member

sxa commented Jun 29, 2020

Hmmm ... My linux laptop says the same for the loopback interface so I'm not sure this is a zLinux/Marist thing

Here's the ifconfig lo output from one Marist machine (not a Node one as I don't have access :'( ) which is effectively the same as my laptop:

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:208323 errors:0 dropped:0 overruns:0 frame:0
          TX packets:208323 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:26562838 (26.5 MB)  TX bytes:26562838 (26.5 MB)

@Trott
Member Author

Trott commented Jun 29, 2020

Hmmm ... My linux laptop says the same for the loopback interface so I'm not sure this is a zLinux/Marist thing

Do you have a non-loopback IPv6 interface that has an address starting with fe80:? If so, I'd be interested to know what happens if you apply nodejs/node#14500 to master, compile, and run the test added in that pull request. But I also know that can be an annoying amount of work...

Now that you mention it, though, whether or not the IPv6 loopback interface requires an fe80: address, the test should in theory still pass on LinuxONE. There's another IPv6-enabled interface, the test selects its fe80: link-local address, and the test still fails. I'm not sure why. I was hoping that adding an fe80: address to the loopback interface would make the test pass, but that would only mask whatever the other problem is, which is not a great way to go about this.

@sxa
Member

sxa commented Jun 29, 2020

I have seen IPv6 related failures on zLinux elsewhere (outside Node) - I'll try and take a look at that PR in the next day and see what happens on one of my systems unless someone else gets there first :-)

@sxa
Member

sxa commented Jun 29, 2020

test-dgram-udp6-link-local-address seems to have passed ok on my Marist system so I think we're probably ok

@Trott
Member Author

Trott commented Jun 29, 2020

test-dgram-udp6-link-local-address seems to have passed ok on my Marist system so I think we're probably ok

That is both encouraging and discouraging at the same time. I wonder why it fails consistently on LinuxONE in CI, but seemingly nowhere else. Bug in the code? Bug in the test? Bug in LinuxONE? Misconfiguration in CI?

@mhdawson
Member

mhdawson commented Jul 2, 2020

@sxa It would be useful if you clarified what you mean by "on my Marist system". If it passed on a different machine than the one we have in the CI, we need to figure out what the difference is in terms of configuration.

@sxa
Member

sxa commented Jul 2, 2020

@sxa It would be useful if you clarified what you mean by "on my Marist system"

I think a comment I posted before that got lost. I'm not sure what happened there, as I'd written more in that comment. Unfortunately there was quite a bit of detail that I've now lost, so I'll try to remember.

The issue is definitely with the network interface name that is selected by the test. If it has a - or . in the name then the test fails. I've verified this by changing network interface names and that changes whether the test works or not.

I was initially concerned that this might be a bug in libuv, but I'm not sure that is the case now, although I haven't managed to determine why. If I add debug statements to sockaddr_for_family in udp_wrap.c, then in the failing case the interface name has %ifname stripped from it (and the test fails). If the name doesn't have a - or . in it, then it doesn't get stripped and the test passes.

On that basis we have two options:

  1. Continue to debug why an interface name with a - or . causes problems
  2. Modify the test to exclude such interfaces, with code like the following in the top loop, so we can get this fix in (since it appears to fundamentally work...), and raise a follow-on issue to understand why those names are problematic:
	if (!ifname.includes('.') && !ifname.includes('-'))

If I make that change to the test case then it passes on my Marist RHEL77 system (otherwise it fails in the same way as you've seen in CI). As mentioned previously, Ubuntu Marist machines do not show the problem, as they have an interface name of eth0.
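
For illustration, here is roughly how that exclusion could sit inside a loop over os.networkInterfaces(). This is a sketch, not the actual test code: the function name `linkLocalCandidates` and its `skipOddNames` flag are hypothetical, and it assumes the Linux-style scoped form address%ifname.

```javascript
const os = require('os');

// Hypothetical helper (not the real test): collects scoped link-local
// addresses such as fe80::1%eth0, optionally skipping interfaces whose
// names contain '.' or '-' as proposed in option 2.
function linkLocalCandidates(interfaces, skipOddNames = true) {
  const candidates = [];
  for (const [ifname, addrs] of Object.entries(interfaces)) {
    if (skipOddNames && (ifname.includes('.') || ifname.includes('-'))) {
      continue; // workaround only; the underlying cause is still unknown
    }
    for (const a of addrs) {
      const isV6 = a.family === 'IPv6' || a.family === 6;
      if (isV6 && a.address.toLowerCase().startsWith('fe80:')) {
        // Assumption: on Linux the scope is written as %<interface name>.
        candidates.push(`${a.address}%${ifname}`);
      }
    }
  }
  return candidates;
}

console.log(linkLocalCandidates(os.networkInterfaces()));
```

With the filter on, a machine whose only non-loopback interface is named like enccw0.0.1000 yields no candidates, so the test would have nothing to run against there; that is the trade-off of this workaround.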

Interestingly, I tried the same change on my RHEL x64 laptop, and putting a . in the interface name (I just did it with my wifi adapter) seemed to stop the interface from even showing up in the output from os.networkInterfaces() for some reason, so somehow such "problematic" names (for want of a better description) are getting skipped on x64 but not on s390x.

Trott changed the title from "IPv6 link-local address missing on LinuxONE (I think)" to "IPv6 link-local address oddities on LinuxONE in CI" on Jul 2, 2020

@richardlau
Member

This turned out to be a bug in core and was fixed via nodejs/node#34364.
