src: get rid of fp arithmetic in ParseIPv4Host #46326

tniessen · 2023-01-24T03:31:05Z

Even though most compilers should not actually emit FPU instructions, it is unnecessary to use floating-point arithmetic for powers of 2.

Also change some signed counters to unsigned integers.
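
To illustrate the point about powers of 2, here is a minimal sketch that contrasts the two approaches. The helper names `Pow256Float` and `Pow256Shift` are hypothetical and are not taken from node_url.cc; the sketch only assumes that the parser needs values of the form 256^n.

```cpp
#include <cmath>
#include <cstdint>

// Floating-point form: computes 256^n as a double and converts it back.
// (Hypothetical helper for illustration only.)
uint64_t Pow256Float(unsigned int n) {
  return static_cast<uint64_t>(std::pow(256.0, static_cast<double>(n)));
}

// Integer form: 256^n == 2^(8n), so a single left shift is enough.
// Only valid for n <= 7, because shifting a 64-bit value by 64 or more
// bits is undefined behavior.
uint64_t Pow256Shift(unsigned int n) {
  return uint64_t{1} << (8 * n);
}
```

For an IPv4 address only n in the range 0 to 4 is needed, so the shift form stays well within 64 bits.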

targos · 2023-01-24T07:41:42Z

src/node_url.cc

  uint32_t val = 0;
  uint64_t numbers[4];
-  int tooBigNumbers = 0;
+  unsigned int tooBigNumbers = 0;


Out of curiosity, how does someone choose between unsigned int and uint32_t?

bnoordhuis · 2023-01-24

Taste, basically. There's no difference unless you're targeting Watcom C++ for DOS, where sizeof(int) == 2.

Of course Real Programmers(TM) don't care for repeating themselves and just write unsigned without the int.

This method is really quite something. Tobias cleans it up but it still looks super complicated. The code below is untested and off the cuff but I think that is what ParseIPv4Host's logic reduces to. The only thing I'm not completely sure about is whether e.g. http://00000000000000000000/ (over 19 chars) is considered a valid input.

if (length > 19) return;  // max in octal or hexadecimal
unsigned a, b, c, d, v, ndots = 0;
char s[20];
memcpy(s, input, length);
s[length] = '\0';
for (char* p = s; p = strchr(p, '.'); p++, ndots++);
switch (ndots) {
  default:
    return;
  case 0:
    if (1 != sscanf(s, "%u", &v)) return;
    break;
  case 1:
    if (2 != sscanf(s, "%u.%u", &a, &b)) return;
    if (a > 255 || b > 0xFFFFFF) return;
    v = a << 24 | b;
    break;
  case 2:
    if (3 != sscanf(s, "%u.%u.%u", &a, &b, &c)) return;
    if (a > 255 || b > 255 || c > 0xFFFF) return;
    v = a << 24 | b << 16 | c;
    break;
  case 3:
    if (4 != sscanf(s, "%u.%u.%u.%u", &a, &b, &c, &d)) return;
    if (a > 255 || b > 255 || c > 255 || d > 255) return;
    v = a << 24 | b << 16 | c << 8 | d;
    break;
}
// parse okay, address in |v|
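
For anyone who wants to experiment with the idea, the following is a compilable C++17 variant of the same approach. It is only a sketch: `ParseIPv4Sketch` is a hypothetical name, the function is not part of node_url.cc, and it inherits the limitations of `%u` (decimal parts only, trailing characters are not rejected).

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical helper: parses 1 to 4 dot-separated decimal parts into a
// 32-bit address, or returns std::nullopt on failure.
std::optional<uint32_t> ParseIPv4Sketch(const std::string& input) {
  if (input.empty() || input.size() > 19) return std::nullopt;
  unsigned a = 0, b = 0, c = 0, d = 0;
  const char* s = input.c_str();
  // The number of dots decides which layout applies.
  switch (std::count(input.begin(), input.end(), '.')) {
    case 0:
      if (std::sscanf(s, "%u", &a) != 1) return std::nullopt;
      return a;
    case 1:
      if (std::sscanf(s, "%u.%u", &a, &b) != 2) return std::nullopt;
      if (a > 255 || b > 0xFFFFFF) return std::nullopt;
      return a << 24 | b;
    case 2:
      if (std::sscanf(s, "%u.%u.%u", &a, &b, &c) != 3) return std::nullopt;
      if (a > 255 || b > 255 || c > 0xFFFF) return std::nullopt;
      return a << 24 | b << 16 | c;
    case 3:
      if (std::sscanf(s, "%u.%u.%u.%u", &a, &b, &c, &d) != 4)
        return std::nullopt;
      if (a > 255 || b > 255 || c > 255 || d > 255) return std::nullopt;
      return a << 24 | b << 16 | c << 8 | d;
    default:
      return std::nullopt;  // more than three dots
  }
}
```

Because `%u` reads decimal only, an input such as `0x7f.1` fails to parse and `010.1` is read as decimal 10 rather than octal 8, so a real implementation would need a different number parser for the hexadecimal and octal forms.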

tniessen · 2023-01-24

tl;dr: personally, I read unsigned int as "some smallish non-negative integer", whereas uint32_t makes me question why this variable has to be exactly 32 bits.

Ad-hoc thought process:

Let's assume that there is no obvious type with better semantics for this use case (e.g., size_t). Otherwise, that type should likely be used instead.

❌ If the variable must have the same size across all platforms, uint32_t is the logical choice (even though ILP32, LLP64, and LP64 always use 32-bit unsigned int anyway, and only more exotic architectures such as ILP64 or SILP64 deviate).

❌ If the algorithm requires or benefits from a specific size, then uint32_t is again the logical choice. This is particularly important if the algorithm relies on unsigned integer overflows (which are well-defined, unlike signed overflows).

❌ If the use case requires a fixed minimum size, then uint_fast32_t can be useful. However, the implementation might end up quietly relying on a specific type underlying uint_fast32_t.

64-bit architectures are likely to use 64-bit types for uint_fast32_t. The implementation might store values exceeding 32 bits in this variable, which will break if anyone tries to use the code on an architecture that uses a 32-bit type for uint_fast32_t.

Conversely, if an architecture uses a 32-bit type for uint_fast32_t, then assigning a uint_fast32_t value to a 32-bit variable works, but it will break if anyone tries to use the code on an architecture that uses a 64-bit type for uint_fast32_t.

❌ If the variable is mostly used to interact with some library, the type should match whatever the library uses. This is often true for OpenSSL, which for historic reasons frequently uses signed int values instead of semantically more appropriate types.

✔️ If none of that is true, and if the 16-bit minimum size of unsigned int is clearly sufficient, then unsigned int works just fine. Within Node.js, we can even rely on 32-bit unsigned int, but that's not necessarily true in general, e.g., on AVR.
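
The difference between the fixed-width and the "fast" types in the list above can be sketched as follows. The function names are hypothetical and exist only for this illustration (C++17).

```cpp
#include <cstdint>
#include <limits>

// uint32_t is exactly 32 bits wide, so the result always wraps
// modulo 2^32. Unsigned wraparound is well-defined.
constexpr uint32_t WrapAdd(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(a + b);
}

// uint_fast32_t only guarantees at least 32 bits. Whether this wraps at
// 2^32 depends on the platform's choice of underlying type.
constexpr uint_fast32_t FastIncrement(uint_fast32_t x) {
  return x + 1;
}

// Masking explicitly gives the same result on every platform.
constexpr uint_fast32_t FastIncrementMasked(uint_fast32_t x) {
  return (x + 1) & 0xFFFFFFFFu;
}

static_assert(std::numeric_limits<uint_fast32_t>::digits >= 32);
static_assert(WrapAdd(0xFFFFFFFFu, 1u) == 0);
static_assert(FastIncrementMasked(0xFFFFFFFFu) == 0);
```

`FastIncrement(0xFFFFFFFF)` yields 0 where `uint_fast32_t` is 32 bits wide and 4294967296 where it is 64 bits wide, which is exactly the kind of quiet dependence on the underlying type described above.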

Of course, following this argument, unsigned short would work just as well since it is also guaranteed to have a minimum size of 16 bits. However, aside from the (potentially) smaller size when allocating large arrays (e.g., unsigned short[16 * 1024] may be smaller than unsigned int[16 * 1024]), unsigned short has no benefit over unsigned int. It may even be slower since modern CPUs prefer sizeof(unsigned int) or sizeof(unsigned long) registers, and may have to mask the upper parts of those registers for computations on unsigned short. In fact, uint_fast16_t is usually the same as either uint32_t or uint64_t.
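
The size guarantees and the promotion behavior mentioned here can be checked at compile time. This is a small sketch under the assumption of a C++17 compiler; `AddShort` is a hypothetical helper.

```cpp
#include <climits>
#include <type_traits>

// Both types are guaranteed to hold at least 16 bits.
static_assert(sizeof(unsigned short) * CHAR_BIT >= 16);
static_assert(sizeof(unsigned int) * CHAR_BIT >= 16);
static_assert(sizeof(unsigned short) <= sizeof(unsigned int));

// Arithmetic on unsigned short is never performed in unsigned short:
// the operands are promoted to int (or unsigned int) first.
static_assert(!std::is_same_v<
              decltype(static_cast<unsigned short>(1) +
                       static_cast<unsigned short>(1)),
              unsigned short>);

// The promoted result has to be narrowed back explicitly, which is
// where the extra masking of the upper register bits comes from.
constexpr unsigned short AddShort(unsigned short a, unsigned short b) {
  return static_cast<unsigned short>(a + b);
}

static_assert(AddShort(USHRT_MAX, 1) == 0);
```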

Regardless, what is important to me is the signedness of these variables. I know that this is not a widespread opinion, but to me, "signedness correctness" is almost as important as const correctness.

nodejs-github-bot · 2023-01-24T11:49:06Z

CI: https://ci.nodejs.org/job/node-test-pull-request/49148/

nodejs-github-bot · 2023-01-26T03:37:16Z

Landed in 8ba54e5

Even though most compilers should not actually emit FPU instructions, it is unnecessary to use floating-point arithmetic for powers of 2. Also change some signed counters to unsigned integers.

PR-URL: #46326
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>

src: get rid of fp arithmetic in ParseIPv4Host

4341b6f

nodejs-github-bot added c++ needs-ci whatwg-url labels Jan 24, 2023

anonrig approved these changes Jan 24, 2023

lpinca approved these changes Jan 24, 2023

lpinca added the request-ci label Jan 24, 2023

github-actions bot removed the request-ci label Jan 24, 2023

targos reviewed Jan 24, 2023

jasnell approved these changes Jan 24, 2023

github-actions bot mentioned this pull request Jan 25, 2023

CI Reliability 2023-01-25 nodejs/reliability#494

Open

12 tasks

anonrig added author ready commit-queue labels Jan 25, 2023

nodejs-github-bot removed the commit-queue label Jan 26, 2023

nodejs-github-bot merged commit 8ba54e5 into nodejs:main Jan 26, 2023

ruyadorno mentioned this pull request Feb 1, 2023

v19.6.0 proposal #46455

Merged

juanarbol mentioned this pull request Mar 3, 2023

V18.15.0 proposal #46920

Merged
