4.11.2: build fails with glibc 2.34 #10250
It looks like you are missing the configure step? Do you have more context? On the Inria CI, 4.11.2 builds fine (except a test that should have been disabled on Windows). |
No that was executed after configure.
Autoconf 2.70 is not supported. Can you check that everything is fine with 2.69?
But it looks like this is not an autoconf issue (?). Nevertheless, I don't see any warnings in the build log that would point to what exactly it may be :/
It is more a matter of reducing all unknown variables. The error is that somehow SIGSTKSZ ends up not being a constant on your system. |
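To illustrate why a non-constant SIGSTKSZ breaks the build: a file-scope array needs a constant size, whereas an ordinary expression does not. The sketch below is only an illustration under stated assumptions. The helper name `alt_stack_size` is hypothetical, and it assumes that `_SC_SIGSTKSZ` is available on newer glibc and absent elsewhere, which the `#ifdef` guards against.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* The pattern below needs a constant size, so it stops compiling as soon
   as SIGSTKSZ is no longer a constant expression:

       static char sig_alt_stack[SIGSTKSZ];
*/

/* Hypothetical helper: compute an alternate-stack size at run time. */
size_t alt_stack_size(void)
{
  /* Using SIGSTKSZ as a plain expression is legal whether or not it is
     a compile-time constant. */
  long size = (long) SIGSTKSZ;
#ifdef _SC_SIGSTKSZ
  /* Assumption: where this name exists, sysconf reports the size the
     system currently recommends; keep the larger of the two values. */
  long dynamic = sysconf(_SC_SIGSTKSZ);
  if (dynamic > size) size = dynamic;
#endif
  assert(size > 0);
  return (size_t) size;
}
```

Any code that needs the size can call the helper and allocate the buffer at run time instead of declaring a static array.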
You know .. that code looks a bit odd:
Then that static char is used to initialize:
That code could be OK if |
no sorry I was wrong .. |
Re: autoconf, if you don't want to use the same version of autoconf as the one we use, please do not run Re: Which C library are you using? Which Linux distribution? What does the following command print?
I'm using my own distribution, which is based on Fedora Rawhide. However, glibc is still the one from Rawhide.
Will try to run a test without executing autoconf and get back to you shortly. BTW GNU autotools. Looks like autoheader reports missing templates in configure.ac.
BTW probably it would be good to check what kind of changes proposes |
This is the source of the problem:
My Ubuntu 20.04 distro doesn't have Also, I'm positive this is an incorrect definition of |
We support released version of Fedora. In particular, Fedora 33 is part of our CI. But we don't support Rawhide, and we don't support your own distribution. |
I fully understand that. However, if the cost of fixing such an issue is relatively low, I would suggest fixing it ASAP, as it may save a lot of time reacting to similar issues :)
This looks like it might be because of this glibc patch from last month, which sometimes redefines SIGSTKSZ as a non-constant: http://patches-tcwg.linaro.org/patch/48061/ I think glibc's position is that we have opted-in to this non-POSIX definition by defining |
If I read the patch discussion correctly, this issue concerns the unreleased glibc 2.34 ? |
Looks like other projects are affected by the same issue.
But if |
Just FTR: I've asked on the Fedora list whether that change in glibc was a mistake or not: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/BR5DU2NSKRZAJHEUWOI4H6ZIQQNVAXAR/
I doubt glibc is going to revert this particular decision, and (as @stedolan observes) they have a "get out of POSIX" card by virtue of
The relevant commit in glibc is here: bminor/glibc@6c57d32 I don't entirely follow the compatibility logic described in the commit message, but glibc seems to have dug themselves into a hole by shadowing the kernel's definition to userspace, and then having to support older binaries. Musl doesn't seem to have this problem as far as I can tell. |
All right, we'll malloc the alternate stack. I just wish C standard libraries would take backward compatibility more seriously. |
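A minimal sketch of what "malloc the alternate stack" can look like, assuming a POSIX system. This is not the OCaml runtime's code: `probe_handler` and `handler_uses_heap_stack` are hypothetical names, and SIGUSR1 is used only as a convenient test signal. The function registers a heap-allocated stack, delivers a signal to a handler installed with SA_ONSTACK, and checks that the handler really ran on that stack.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static uintptr_t alt_lo, alt_hi;            /* bounds of the heap-allocated stack */
static volatile sig_atomic_t on_alt_stack;  /* set by the handler */

static void probe_handler(int sig)
{
  char local;                               /* lives on the stack we are running on */
  uintptr_t here = (uintptr_t) &local;
  (void) sig;
  on_alt_stack = (here >= alt_lo && here < alt_hi);
}

/* Hypothetical demo: returns 1 if the handler ran on the malloc'ed stack,
   0 if it did not, -1 if setup failed. */
int handler_uses_heap_stack(void)
{
  stack_t stk, off;
  struct sigaction sa, old_sa;
  size_t size = (size_t) SIGSTKSZ;          /* a run-time value is fine here */
  int result = -1;

  assert(size >= (size_t) MINSIGSTKSZ);
  stk.ss_sp = malloc(size);
  if (stk.ss_sp == NULL) return -1;
  stk.ss_size = size;
  stk.ss_flags = 0;
  if (sigaltstack(&stk, NULL) != 0) { free(stk.ss_sp); return -1; }
  alt_lo = (uintptr_t) stk.ss_sp;
  alt_hi = alt_lo + size;

  memset(&sa, 0, sizeof sa);
  sa.sa_handler = probe_handler;
  sa.sa_flags = SA_ONSTACK;                 /* ask for the alternate stack */
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGUSR1, &sa, &old_sa) == 0) {
    on_alt_stack = 0;
    raise(SIGUSR1);                         /* handler runs before raise returns */
    sigaction(SIGUSR1, &old_sa, NULL);      /* restore the previous disposition */
    result = on_alt_stack ? 1 : 0;
  }

  /* Unregister the stack before freeing it. */
  memset(&off, 0, sizeof off);
  off.ss_flags = SS_DISABLE;
  off.ss_size = size;                       /* ignored on Linux; some systems check it */
  sigaltstack(&off, NULL);
  free(stk.ss_sp);
  return result;
}
```

The only change compared with a static buffer is where the memory comes from; the `sigaltstack` and `sigaction` calls are the same.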
I've just hit this one while trying to build OCaml 4.12 for Rawhide (which uses the new glibc with this patch). Did you have a suggested patch already or would you like me to try doing one? |
This is the patch I came up with, not the greatest thing in the world, but feel free to take it and adapt it if it's useful to you: https://pagure.io/fedora-ocaml/c/dfb5e954a04f59b0456cc4c0ddf3acaf22e0ff07?branch=fedora-35-4.12.0 |
@rwmjones It's a detail, but I think that we want to use |
In Glibc 2.34 and later, SIGSTKSZ may not be a compile-time constant. It is no longer possible to statically allocate the alternate signal stack for the main thread, as we've been doing for the last 25 years. This commit implements dynamic allocation of the alternate signal stack even for the main thread. It reuses the code already in place to allocate the alternate signal stack for other threads. Fixes: ocaml#10250.
Collecting an alternate stack while it is still in use can end up in a disaster. We would need to |
I don't know what |
I did install 4.13.2 but unfortunately I need elpi, and when I try to install it opam says it is only possible if I downgrade to 4.12.2 |
…l#10726) In Glibc 2.34 and later, SIGSTKSZ may not be a compile-time constant. It is no longer possible to statically allocate the alternate signal stack for the main thread, as we've been doing for the last 25 years. This commit implements dynamic allocation of the alternate signal stack even for the main thread. It reuses the code already in place to allocate the alternate signal stack for other threads. The alternate signal stack is freed when the main OCaml code / an OCaml thread stops. Fixes: ocaml#10250
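The allocate-then-free lifecycle described in this commit message can be sketched as follows. This is an illustration, not the code from the commit: `alt_stack_install` and `alt_stack_remove` are hypothetical names, and the teardown assumes the registered stack was obtained from `malloc`. The stack is unregistered with SS_DISABLE before it is freed, so the kernel no longer refers to the memory when it is released.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Allocate and register an alternate signal stack for the calling thread.
   Returns the base address of the stack, or NULL on failure. */
void *alt_stack_install(void)
{
  stack_t stk;
  size_t size = (size_t) SIGSTKSZ;          /* evaluated at run time */

  assert(size >= (size_t) MINSIGSTKSZ);
  stk.ss_sp = malloc(size);
  if (stk.ss_sp == NULL) return NULL;
  stk.ss_size = size;
  stk.ss_flags = 0;
  if (sigaltstack(&stk, NULL) != 0) {
    free(stk.ss_sp);
    return NULL;
  }
  return stk.ss_sp;
}

/* Unregister the calling thread's alternate stack, then free it.
   Assumes the registered stack came from alt_stack_install().
   Returns 0 on success, -1 on failure. */
int alt_stack_remove(void)
{
  stack_t off, old;

  memset(&off, 0, sizeof off);
  off.ss_flags = SS_DISABLE;
  off.ss_size = (size_t) SIGSTKSZ;          /* ignored on Linux; some systems check it */
  /* sigaltstack fails (EPERM) if we are currently running on the stack,
     in which case nothing is freed. */
  if (sigaltstack(&off, &old) != 0) return -1;
  if (!(old.ss_flags & SS_DISABLE)) free(old.ss_sp);
  return 0;
}
```

A thread would call `alt_stack_install()` when it starts and `alt_stack_remove()` just before it stops.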
Quick question, since one of my students (@cosmoviola) just ran into this: Say the student needs to be on a version of OCaml prior to the fix. Is there a way to use an old glibc somehow that doesn't have this problem? We are confused because this problem didn't manifest on his old computer, but now manifests on his new one. He is on Ubuntu 21.10. |
@dra27 has worked on backporting the fix to older OCaml versions (see #10725); it is not merged in the compiler upstream (it would be in version branches, not in a released bugfix, but your student could build and install the version branch), but there is growing consensus that it should be, and I think @dra27 is planning to include the patch in opam switches as well (see ocaml/opam-repository#19855 for 4.10-4.12 versions of the backport). Long story short: @dra27 has worked on this and, if all goes according to his plan, in a few days/weeks (Whenever we discuss backporting fixes, people wonder if we really have users that want to run OCaml 4.02 or what have you. Would you mind sharing your own reasons to use old OCaml versions (which version)? My guess would be that you are working on an old Coq development that only works with old Coq versions, that are themselves only compatible with older OCaml versions.)
Thanks, that makes sense. We can work in a later OCaml for now, and revisit in a few weeks. Indeed the reason is that we need an old Coq (though not immediately, just eventually). The Coq plugin infrastructure changed really dramatically from 8.9.x to 8.10.x, so we need Coq 8.9.x for a plugin. Though if 4.10-4.12 already have that backported fix, that may be enough for Coq 8.9.x---I need to check what the assumptions are for 8.9.x, since I don't remember. We had been on 4.07.1+flambda, which worked for Coq 8.9.1 with a Coq plugin that also relies on a very specific version of SerAPI. I'd been too scared to mess around with other versions, but can check what the latest version of OCaml that works with that is. In general, though, the OCaml developments I use are always held back by any breaking changes to the Coq plugin infrastructure, which break my plugins (ironically, my proof repair plugins). |
(I wonder if the Coq people would consider backporting fixes to work with more recent OCaml versions, so that it gets easier in the future to install old Coq releases.) |
Using an old OCaml (or Coq) with a new OS is a bit antinomic. Maybe using a chroot could be an option? I suggest using schroot (with a Debian 11 "bullseye" environment, which has OCaml 4.11.1) in this case. A more trendy way might be using Docker. I don't know if it works well with WSL2, though. |
I don't really agree. (As you know well, but for context:) Coq is used in academic environments where it is common for a given software/proof project to sit unused for years, and then be dusted off and tested/used, and even re-developed. It is a common use-case to develop again with software that was written 5, 10 years ago, and we expect to be able to do this within the comfort of an up-to-date development system, rather than a time-capsule of the entire system as it existed at the time. (The tooling of the time was poorer, older systems may also not work for my hardware, etc.) (In the general case there are also security concerns with using frozen versions of old systems, but Coq does not talk to the network and I guess some OSes, possibly Debian, have kept backporting security fixes on 5-year-old OSes.) My desire as a user is to use a given Coq development with all the other components of my system as up-to-date as possible. Now Coq, by design, has very fragile compatibility, old proof developments (and all-the-more plugins) will only work with a small range of versions. This forces an old Coq version on the user. Coq also tends to be tightly coupled with OCaml versions (this is not by design, but rather due to (imho fairly debatable) implementation choices depending on low-level, unspecified and unstable details of OCaml for performance reasons); this in turn forces an older OCaml version on the user. (I think the Coq team could/should consider backporting OCaml-compatibility fixes to older Coq releases, to make their users' lives easier. It may also make them more careful about not depending too tightly on unspecified parts, or putting more work in asking for better-specified and compatible interfaces for their low-level needs.)
Personally, I would first go with a "time capsule" as you say to assess the worthiness of doing more efforts to make my target work with an up-to-date environment. And by "environment", I mean the OS as well as OCaml and Coq. I was suggesting using a chroot just to make progress, not as a long-term solution to the problem. |
ocaml/opam-repository#19855 now contains back-ports right the way back to 3.07 - it's only closed because the PR was causing too much pressure on opam-repository's CI system. As @gasche says, consensus is in the process of being reached, and there should at least be a plan next week! The discussion at the moment is where the patches should go, not whether opam-repository should carry them. For forward progress, I'm sure there will finally come a point where OCaml 3.07 needs too much effort to be compiled on a modern system, but the more years it goes on being possible, the more belligerent I seem to become in finding the required patches 😁 The fact that there's the occasional use for very old OCaml compilers at least means I don't feel totally mad doing that... |
Thanks, I appreciate it! |
Thank you @dra27 -- here's another Coq user who appreciates your back-porting work 👍 A note for other users whose Google search might lead them here: After running just |
Note: as far as I can tell, coq 8.13 should work fine with OCaml 4.13, which works fine with the last glibc. (But thanks for the usage instruction! Hopefully they will be replaced by "it just works" shortly.) |
(apologies for going off-topic)
When I tried to build coq 8.13.2 with OCaml 4.13.0 (using |
Updated separately to allow the previous patch to be used for multiple releases.
The back-ports have been done! 🥳 TL;DR the compilers work if installed via opam; there almost certainly will not be any new maintenance releases for 4.12 and earlier. Status:
Thanks a lot for your effort @dra27! |
@gasche FWIW, the Coq Platform is currently built with OCaml 4.10. The reason for this choice is a compromise between various requirements. OCaml For some use cases, 4.07.1 is still the recommended version for Coq because of this issue coq/coq#7698 that is actively being worked on (coq/coq#15220). And regarding extending the compatibility of old Coq versions with new OCaml versions, this is mostly a problem of manpower I would say.
That triggers a developer build, so all warnings are fatal. For a non dev build you can use |
(see ocaml/ocaml#10250 for background details)