src: get rid of fp arithmetic in ParseIPv4Host #46326

Merged


tniessen

Even though most compilers should not actually emit FPU instructions, it is unnecessary to use floating-point arithmetic for powers of 2.

Also change some signed counters to unsigned integers.
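
To illustrate the idea in the description: a power of 256 is a power of 2, so an integer shift can stand in for a call to pow(). This is a minimal sketch, not the code from the PR; Pow256 is a hypothetical helper name, and the sketch assumes a C++17 compiler.

```cpp
#include <cstdint>

// Hypothetical helper: returns 256^n for n in [0, 7] using only integer
// arithmetic. Since 256^n == 2^(8n), a left shift replaces pow(256, n).
constexpr uint64_t Pow256(unsigned n) {
  return uint64_t{1} << (8 * n);
}

static_assert(Pow256(0) == 1);
static_assert(Pow256(1) == 256);
static_assert(Pow256(4) == 4294967296ull);
```

The shift is exact for every valid n, whereas a floating-point result would have to be converted back to an integer before it could be compared against the parsed parts.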

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. whatwg-url Issues and PRs related to the WHATWG URL implementation. labels Jan 24, 2023
@lpinca lpinca added the request-ci Add this label to start a Jenkins CI on a PR. label Jan 24, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jan 24, 2023

  uint32_t val = 0;
  uint64_t numbers[4];
- int tooBigNumbers = 0;
+ unsigned int tooBigNumbers = 0;

Out of curiosity, how does someone choose between unsigned int and uint32_t?


Taste, basically. There's no difference unless you're targeting Watcom C++ for DOS, where sizeof(int) == 2.
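
The width question can be checked at compile time. The following is a small sketch, assuming a C++17 toolchain; kUnsignedBits and kSameWidthAsUint32 are illustrative names, not identifiers from Node.js.

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// `unsigned` and `unsigned int` name the same type.
static_assert(std::is_same_v<unsigned, unsigned int>);

// The language only guarantees that unsigned int has at least 16 bits.
constexpr unsigned kUnsignedBits = sizeof(unsigned int) * CHAR_BIT;
static_assert(kUnsignedBits >= 16);

// True where unsigned int and uint32_t have the same size, which is the
// case on the mainstream ILP32, LLP64, and LP64 data models.
constexpr bool kSameWidthAsUint32 = sizeof(unsigned int) == sizeof(uint32_t);
```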

Of course Real Programmers(TM) don't care for repeating themselves and just write unsigned without the int.


This method is really quite something. Tobias cleans it up but it still looks super complicated. The code below is untested and off the cuff but I think that is what ParseIPv4Host's logic reduces to. The only thing I'm not completely sure about is whether e.g. http://00000000000000000000/ (over 19 chars) is considered a valid input.

if (length > 19) return; // max in octal or hexadecimal
unsigned a, b, c, d, v, ndots = 0;
char s[20];
memcpy(s, input, length);
s[length] = '\0';
for (char* p = s; (p = strchr(p, '.')) != nullptr; p++, ndots++) {}  // count the dots
switch (ndots) {
default:
  return;
case 0:
  if (1 != sscanf(s, "%u", &v)) return;
  break;
case 1:
  if (2 != sscanf(s, "%u.%u", &a, &b)) return;
  if (a > 255 || b > 0xFFFFFF) return;
  v = a << 24 | b;
  break;
case 2:
  if (3 != sscanf(s, "%u.%u.%u", &a, &b, &c)) return;
  if (a > 255 || b > 255 || c > 0xFFFF) return;
  v = a << 24 | b << 16 | c;
  break;
case 3:
  if (4 != sscanf(s, "%u.%u.%u.%u", &a, &b, &c, &d)) return;
  if (a > 255 || b > 255 || c > 255 || d > 255) return;
  v = a << 24 | b << 16 | c << 8 | d;
  break;
}
// parse okay, address in |v|
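
One caveat with the snippet above: sscanf with %u reads decimal digits only, so parts written in octal or hexadecimal would not parse as intended. A self-contained variant that handles all three bases by hand might look like the following. It is a sketch under stated assumptions, not the code from the PR: ParsePart and ParseIPv4Sketch are hypothetical names, a trailing dot is rejected for simplicity, and the code has not been checked against the WHATWG URL test suite.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper: parses one dot-separated part as decimal, octal
// (leading 0), or hexadecimal (leading 0x). Returns nullopt for empty input,
// invalid digits, or values that cannot fit in 32 bits.
std::optional<uint64_t> ParsePart(std::string_view part) {
  if (part.empty()) return std::nullopt;
  unsigned base = 10;
  if (part.size() >= 2 && part[0] == '0' &&
      (part[1] == 'x' || part[1] == 'X')) {
    base = 16;
    part.remove_prefix(2);
  } else if (part.size() >= 2 && part[0] == '0') {
    base = 8;
    part.remove_prefix(1);
  }
  uint64_t value = 0;
  for (char c : part) {
    unsigned digit;
    if (c >= '0' && c <= '9') digit = c - '0';
    else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
    else return std::nullopt;
    if (digit >= base) return std::nullopt;
    value = value * base + digit;
    // Checking after every digit keeps the 64-bit accumulator from
    // overflowing, no matter how long the input is.
    if (value > 0xFFFFFFFFull) return std::nullopt;
  }
  return value;
}

// Hypothetical sketch: returns the address as a 32-bit value, or nullopt.
std::optional<uint32_t> ParseIPv4Sketch(std::string_view input) {
  uint64_t parts[4];
  unsigned count = 0;
  while (true) {
    if (count == 4) return std::nullopt;  // more than four parts
    const std::size_t dot = input.find('.');
    const auto value = ParsePart(input.substr(0, dot));
    if (!value) return std::nullopt;
    parts[count++] = *value;
    if (dot == std::string_view::npos) break;
    input.remove_prefix(dot + 1);
  }
  // Every part except the last must fit in one byte.
  for (unsigned i = 0; i + 1 < count; i++) {
    if (parts[i] > 255) return std::nullopt;
  }
  // The last part fills the remaining bytes: 32, 24, 16, or 8 bits.
  const unsigned last_bits = 8 * (5 - count);
  if (parts[count - 1] >= (uint64_t{1} << last_bits)) return std::nullopt;
  uint32_t v = static_cast<uint32_t>(parts[count - 1]);
  for (unsigned i = 0; i + 1 < count; i++) {
    v |= static_cast<uint32_t>(parts[i]) << (8 * (3 - i));
  }
  return v;
}
```

Because the range check runs after every digit, the sketch needs no upper bound on the input length, so a long run of leading zeros parses as 0 here. Whether that matches the behavior Node.js should have is the open question raised in the comment above.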

Member Author


tl;dr: personally, I read unsigned int as "some smallish non-negative integer", whereas uint32_t makes me question why this variable has to be exactly 32 bits.


Ad-hoc thought process:

Let's assume that there is no obvious type with better semantics for this use case (e.g., size_t). Otherwise, that type should likely be used instead.

❌ If the variable must have the same size across all platforms, uint32_t is the logical choice (even though ILP32, LLP64, and LP64 always use 32-bit unsigned int anyway, and only more exotic architectures such as ILP64 or SILP64 deviate).

❌ If the algorithm requires or benefits from a specific size, then uint32_t is again the logical choice. This is particularly important if the algorithm relies on unsigned integer overflows (which are well-defined, unlike signed overflows).

❌ If the use case requires a fixed minimum size, then uint_fast32_t can be useful. However, the implementation might end up quietly relying on a specific type underlying uint_fast32_t.

  • 64-bit architectures are likely to use 64-bit types for uint_fast32_t. The implementation might store values exceeding 32 bits in this variable, which will break if anyone tries to use the code on an architecture that uses a 32-bit type for uint_fast32_t.
  • Conversely, if an architecture uses a 32-bit type for uint_fast32_t, then assigning a uint_fast32_t value to a 32-bit variable works, but it will break if anyone tries to use the code on an architecture that uses a 64-bit type for uint_fast32_t.

❌ If the variable is mostly used to interact with some library, the type should match whatever the library uses. This is often true for OpenSSL, which for historic reasons frequently uses signed int values instead of semantically more appropriate types.

✔️ If none of that is true, and if the 16-bit minimum size of unsigned int is clearly sufficient, then unsigned int works just fine. Within Node.js, we can even rely on 32-bit unsigned int, but that's not necessarily true in general, e.g., on AVR.
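
Two of the points in the list above can be made concrete: wraparound on a fixed-width unsigned type, and the variable width of uint_fast32_t. This is a minimal sketch with hypothetical names (WrappingAdd, kFast32IsWider, Truncate), assuming a C++17 compiler.

```cpp
#include <cstdint>
#include <limits>

// Unsigned arithmetic wraps modulo 2^32 for uint32_t on every platform, so
// algorithms that rely on wraparound stay portable with a fixed-width type.
constexpr uint32_t WrappingAdd(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(a + b);
}

// uint_fast32_t has at least 32 bits but may have more. This reports whether
// the current platform widens it, which is where hidden assumptions break.
constexpr bool kFast32IsWider =
    std::numeric_limits<uint_fast32_t>::digits > 32;

// Storing a uint_fast32_t in 32 bits is only portable with an explicit mask.
constexpr uint32_t Truncate(uint_fast32_t v) {
  return static_cast<uint32_t>(v & 0xFFFFFFFFu);
}

static_assert(WrappingAdd(0xFFFFFFFFu, 2u) == 1u);
```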

Of course, following this argument, unsigned short would work just as well since it is also guaranteed to have a minimum size of 16 bits. However, aside from the (potentially) smaller size when allocating large arrays (e.g., unsigned short[16 * 1024] may be smaller than unsigned int[16 * 1024]), unsigned short has no benefit over unsigned int. It may even be slower since modern CPUs prefer registers of size sizeof(unsigned int) or sizeof(unsigned long), and may have to mask the upper parts of those registers for computations on unsigned short. In fact, uint_fast16_t is usually the same as either uint32_t or uint64_t.
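
The remark about unsigned short can be seen in the type system: operands narrower than int are promoted before arithmetic, and the result has to be narrowed again when it is stored. A short sketch, assuming C++17; SumType, AddShort, and kFast16Bytes are illustrative names.

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// Adding two unsigned short values happens after integer promotion, so the
// type of the sum is never unsigned short itself.
using SumType = decltype(static_cast<unsigned short>(1) +
                         static_cast<unsigned short>(1));
static_assert(!std::is_same_v<SumType, unsigned short>);

// Storing the sum back requires narrowing to the width of unsigned short.
constexpr unsigned short AddShort(unsigned short a, unsigned short b) {
  return static_cast<unsigned short>(a + b);
}
static_assert(AddShort(USHRT_MAX, 1) == 0);

// Size in bytes that this platform picked for uint_fast16_t.
constexpr unsigned kFast16Bytes = sizeof(uint_fast16_t);
```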


Regardless, what is important to me is the signedness of these variables. I know that this is not a widespread opinion, but to me, "signedness correctness" is almost as important as const correctness.
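
As a small illustration of that point, a counter that can never be negative reads naturally as unsigned. The helper below is hypothetical (CountTooBig is not a function from the PR) and assumes C++14 or later.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: counts how many parsed parts exceed one byte.
// A count cannot be negative, so the counter and return type are unsigned.
constexpr unsigned CountTooBig(const uint64_t* numbers, std::size_t n) {
  unsigned too_big = 0;
  for (std::size_t i = 0; i < n; i++) {
    if (numbers[i] > 255) too_big++;
  }
  return too_big;
}
```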


@anonrig anonrig added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. commit-queue Add this label to land a pull request using GitHub Actions. labels Jan 25, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Jan 26, 2023
@nodejs-github-bot nodejs-github-bot merged commit 8ba54e5 into nodejs:main Jan 26, 2023
@nodejs-github-bot

Landed in 8ba54e5

ruyadorno pushed a commit that referenced this pull request Feb 1, 2023
Even though most compilers should not actually emit FPU instructions, it
is unnecessary to use floating-point arithmetic for powers of 2.

Also change some signed counters to unsigned integers.

PR-URL: #46326
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@ruyadorno ruyadorno mentioned this pull request Feb 1, 2023
juanarbol pushed a commit that referenced this pull request Mar 3, 2023
@juanarbol juanarbol mentioned this pull request Mar 3, 2023
juanarbol pushed a commit that referenced this pull request Mar 3, 2023
juanarbol pushed a commit that referenced this pull request Mar 5, 2023