Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Require 1003.1-2008 for libunix #10505

Draft
wants to merge 25 commits into
base: trunk
Choose a base branch
from
Draft

Conversation

dra27
Copy link
Member

@dra27 dra27 commented Jul 11, 2021

Continuing from #10397 (comment), a brief Sunday musing on the amount of code which we could drop by requiring Issue 7 for libunix. I've marked the PR as draft partly for discussion and partly because if we do adopt it, the configure checks are not yet correct. However, this does all pass precheck as it stands.

The Unix library builds with either:

  • Issue 7 (1003.1-2008) with XSI, and the spawn, IPv6 and timers optional extensions (_POSIX_VERSION >= 200809L, _XOPEN_VERSION >= 700 plus _POSIX_IPV6 and _POSIX_SPAWN).
  • Issue 6 (1003.1-2001) with XSI, and the spawn, IPv6, timers, synchronized IO and any of mapped files, shared memory objects or typed memory objects optional extensions (_POSIX_VERSION >= 200112L, _XOPEN_VERSION >= 600 plus _POSIX_IPV6, _POSIX_SPAWN, _POSIX_TIMERS, _POSIX_SYNCHRONIZED_IO and _POSIX_MAPPED_FILES | _POSIX_SHARED_MEMORY_OBJECTS | _POSIX_TYPED_MEMORY_OBJECTS)

We couldn't entirely go all-in on 1003-1.2008 because at least macOS and FreeBSD (or at least FreeBSD on the precheck workers) have everything needed, but only report 1003-1.2001.

The removal of emulation modes which are presumably not really being tested anywhere is quite tempting, as is the reduction in the number of tests needed in configure (POSIX clearly defines these feature probing macros, so it'd be nice, and quicker, to adopt them).

References: Issue 6 (2004) and Issue 7 (2008). I have checked the definition and conformance level of each macro/function check which has been removed, and I believe all the changes are correct.

otherlibs/unix/wait.c Outdated Show resolved Hide resolved
otherlibs/unix/open.c Outdated Show resolved Hide resolved
@stedolan
Copy link
Contributor

This looks great!

In your selection of POSIX options, are there compatibility concerns with including _POSIX_THREAD_SAFE_FUNCTIONS as well? (This gives us the _r functions, which are handy. See #3784, #5193)

@dra27
Copy link
Member Author

dra27 commented Jul 12, 2021

The selection of POSIX options was done by "what gets rid of all the HAS_ macros in libunix" 🙂 in Issue 7, _POSIX_THREAD_SAFE_FUNCTIONS is mandatory.

@dra27
Copy link
Member Author

dra27 commented Jul 12, 2021

I think that what wants to happen here is that we determine the range of POSIX macros required (basically what's above, adding _POSIX_THREAD_SAFE_FUNCTIONS for the 1003.1-2001 list). Then, configure will probe that, and only build unix if either the probe check passes or you explicitly pass --enable-unix-lib (and for the latter, it would display a warning that the unix library may not build, but was explicitly enabled).

@stedolan
Copy link
Contributor

The selection of POSIX options was done by "what gets rid of all the HAS_ macros in libunix" slightly_smiling_face in Issue 7, _POSIX_THREAD_SAFE_FUNCTIONS is mandatory.

Fair enough! (I might add the _POSIX_THREAD_SAFE_FUNCTIONS stuff in a later PR, it would avoid some weird bugs on weird platforms that we're currently subject to)

@xavierleroy
Copy link
Contributor

xavierleroy commented Jul 12, 2021

Naively, I was expecting something like "we require POSIX 2003 and we use the POSIX feature macros _POSIX_SPAWN, etc, to know which optional features are there and #ifdef between an optimized implementation and a failsafe or error-raising implementation".

The improvements compared with the existing code are 1- we can remove a bunch of #ifdef already by relying on POSIX 2003 core, and 2- a lot of dubious configure tests are replaced by POSIX feature macro tests.

It sounds like you want to require POSIX 2008 and a bunch of optional features, and fail configuration if they are not there. Isn't this going too far? Or am I missing something obvious?

@pmetzger
Copy link
Member

FYI, I suspect the various BSDs will happily fix their feature macros if it turns out they support a newer version of the APIs than they advertise. That won't help things in the short term of course.

@dra27
Copy link
Member Author

dra27 commented Jul 13, 2021

Ah, I had assumed that the point in saying that we require something is that we would not need to test for it in the code, thus removing no-longer-tested emulations. Obviously for caml_invalid_argument raising versions, that's largely cosmetic, but none of our tier 1 and tier 2 platforms lack any of these functions.

Going through it for the emulations, rather than just the "not available" exceptions, requiring 1003.1-2001 + X/OPEN means:

  • gethostname.c - no need for fallback to uname
  • lockf.c - F_GETLK, F_SETLK and F_SETLKW are all required, so no need to fallback to lockf
  • mkfifo.c - no need for fallback to mknod
  • mmap.c - no need for fallback to lseek since pwrite is available (but XSI)
  • sleep.c - no need for fallback to sleep since select is available
  • times.c - no need for fallback from times since getrusage is available (but XSI)
  • utimes.c - no need for fallback to utime since utimes is available (but XSI)
  • wait.c - no need for fallback to wait4 or have v7 status word emulation since waitpid and WIFEXITED etc. are available

These emulations would remain, but would be removed if we required 1003.1-2008 + X/OPEN:

  • mmap.c - mmap, munmap and msync can be assumed; but the POSIX probing should be improved if we went with 1003.1-2001 (in particular, the probe for msync is wrong and will not be triggering on FreeBSD)
  • sleep.c - nanosleep can be assumed, so the fallback to select can be removed; the POSIX probe for nanosleep can be improved if we stick with 1003.1-2001 (I'd wrongly said before that we needed the timers option, but that was a misreading of 1003.1-2008)

The only emulation which requires an option in both is the fallback to execve/execvpe in spawn.c.

@dra27
Copy link
Member Author

dra27 commented Jul 13, 2021

Rather than going all-in on removing all the #ifdefs, the (considerable) probing in configure could be replaced with a single check for _POSIX_VERSION > 200112L, just to display a warning if you manage to try OCaml on a very old system (which, as we saw at the weekend, apparently does still happen!). The change in the C files could then be limited to replacing HAS_ macros with the appropriate _POSIX_ tests (or, more likely, redefining the HAS_ macros in terms of the _POSIX_ checks). There are a few things which might be worth sticking with removing: unistd.h is in Issue 1 (1985) and dirent.h is in Issue 2 (1987), so we can probably assume those headers without an #ifdef around them.

I don't particularly mind, this was just an experiment. I do find it slightly strange that it's possible to configure (in error) OCaml such that Unix.bind might raise Invalid_argument vs knowing that if unix.cmxa is built, then Unix.bind, etc. cannot (as opposed to are phenomenally unlikely) do that.

@xavierleroy
Copy link
Contributor

I don't particularly mind, this was just an experiment. I do find it slightly strange that it's possible to configure (in error) OCaml such that Unix.bind might raise Invalid_argument vs knowing that if unix.cmxa is built, then Unix.bind, etc. cannot (as opposed to are phenomenally unlikely) do that.

Please don't give up! I think this change goes in the right direction, but we need some collective thinking (before writing even more code) to agree on where to put the bar between "configure refuses to build OCaml" and "lots of functions may fail at run-time".

I agree that Unix.bind should be guaranteed to work. On the other hand, assume an oldish by working BSD system that doesn't declare _POSIX_VERSION = 200809L because it lacks a couple of obscure functions for Issue 7, functions that OCaml doesn't use. It would be sad to refuse to build OCaml just because configure requires _POSIX_VERSION >= 200809L.

@xavierleroy
Copy link
Contributor

Based on #10505 (comment), "requiring 1003.1-2001 + X/OPEN" and keeping some emulations in mmap.c and sleep.c sounds very reasonable to me. Again, I'm more concerned about removing configure tests (too many, not reliable enough) than about eradicating all emulation code.

@xavierleroy
Copy link
Contributor

Rather than going all-in on removing all the #ifdefs, the (considerable) probing in configure could be replaced with a single check for POSIX_VERSION > 200112L, just to display a warning if you manage to try OCaml on a very old system (which, as we saw at the weekend, apparently does still happen!). The change in the C files could then be limited to replacing HAS macros with the appropriate POSIX tests (or, more likely, redefining the HAS_ macros in terms of the POSIX checks).

Sounds reasonable, except that I'd really like to get rid of the HAS_ macros, Windows ports permitting. Defining the HAS_ macros in terms of the POSIX checks just adds one level of indirection and makes the code much harder to read.

There are a few things which might be worth sticking with removing: unistd.h is in Issue 1 (1985) and dirent.h is in Issue 2 (1987), so we can probably assume those headers without an #ifdef around them.

Yes, of course.

@xavierleroy
Copy link
Contributor

One last thing: what about getaddrinfo ? Currently we have a big emulation for it, written in OCaml, not in C, but still not very trustworthy.

@dra27
Copy link
Member Author

dra27 commented Sep 1, 2021

Please don't give up!

Don't worry, I wasn't 🙂 I meant that it's not a pressing thing to move forwards for me, so I wouldn't have minded if you had preferred it closed.

How does this sound based on everything so far:

  • Hard error in configure if _POSIX_VERSION < 200112L or _XOPEN_VERSION < 600 (i.e. Issue 6 + X/Open support must be declared).
  • That means we can add this to runtime/caml/config.h and remove a few explicit settings of it (for example in otherlibs/unix/mmap.c):
#ifdef CAML_INTERNALS
#define _POSIX_C_SOURCE 200112L
#define _XOPEN_SOURCE 600
#endif
  • No probe for Issue 7 in configure and all the HAS_ stuff removed already in this PR remains removed.
  • Replace uses of HAS_MMAP with (_POSIX_MAPPED_FILES >= 200112L) - so a runtime error if memory mapped files aren't available but no configure error.
  • Replace uses of HAS_NANOSLEEP with (_POSIX_TIMERS >= 200112L), but no need to check for select - so no error if nanosleep isn't available, unix_sleep just quietly falls back to select if necessary.
  • I'd missed getaddrinfo_emulation - that becomes unreachable code (since unix_getaddrinfo no longer raises Invalid_argument), so it can go.

@dra27
Copy link
Member Author

dra27 commented Mar 18, 2022

Cross-referencing #11115 (comment) with a note that if nanosleep becomes assumed, then the select call in systhreads can be changed (cf. also dra27@4d25406)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants