Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

broken floating-point source constants after setlocale #22176

Open
ashutosh108 opened this issue Apr 26, 2024 · 7 comments
Open

broken floating-point source constants after setlocale #22176

ashutosh108 opened this issue Apr 26, 2024 · 7 comments

Comments

@ashutosh108
Copy link

ashutosh108 commented Apr 26, 2024

This is a bug report for perl from a.voloshin@postgrespro.ru,
generated with the help of perlbug 1.43 running under perl 5.38.0.

Module: POSIX


Description
Perl 5.38.x (or, at least, it's POSIX module) seem to break floating-point
constants, making 0 < 0.5 false in some locales after setlocale() in BEGIN block.

Verified to work ok: v5.36.3
Verified as buggy: v5.38.0, v5.38.2

Steps to Reproduce
If I set
LC_NUMERIC=ru_RU.UTF-8
in the environment, this script prints "not ok" under Perl 5.38.x, but "ok" under Perl 5.36.3:

BEGIN {
    use POSIX qw(locale_h);
    setlocale(LC_NUMERIC, '');
}

print "perl: $^V\n";

if (0 < 0.5) {
    print "ok\n";
} else {
    print "not ok\n";
}

Of course, reproduction requires corresponding locale to be present in the system. On my machine, uncommenting corresponding locale name (ru_RU.UTF-8) in /etc/locale.gen and running sudo locale-gen does the trick.

Expected behavior
Floating-point constants should work with any locales. Yes, in ru_RU.UTF-8 locale
the decimal separator is ",", not ".". But that should not affect parsing the
source code itself. Besides, what are we supposed to write instead, 0 < 0,5 ? That would mean
something else!

To sum up: I expected 0 < 0.5 to be true in any locale, even after setlocale()
in the BEGIN block.

Originally this behaviour was found under Ubuntu 24.04 (which has Perl v5.38.2 shipped as package) with some other module doing "use POSIX / setlocale in the BEGIN block" trick for some other reasons, so it does affect real-world usage.


Flags

  • category=library
  • severity=high
  • module=POSIX

Perl configuration

Site configuration information for perl 5.38.0:

Configured by ashutosh at Fri Apr 26 19:37:42 MSK 2024.

Summary of my perl5 (revision 5 version 38 subversion 0) configuration:
  Commit id: 76298ae68aa7796f0ffc05095b127d23f4b2de8f
  Platform:
    osname=linux
    osvers=6.5.0-26-generic
    archname=x86_64-linux
    uname='linux z 6.5.0-26-generic #26~22.04.1-ubuntu smp preempt_dynamic tue mar 12 10:22:43 utc 2 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Dprefix=/home/ashutosh/bin/perl5-38-0'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2'
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='11.4.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/x86_64-linux-gnu /usr/lib /usr/lib64
    libs=-lpthread -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -ldl -lm -lcrypt -lutil -lc
    libc=/lib/x86_64-linux-gnu/libc.so.6
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.35'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'


---
@INC for perl 5.38.0:
    /home/ashutosh/perl5/lib/perl5/x86_64-linux
    /home/ashutosh/perl5/lib/perl5
    /home/ashutosh/bin/perl5-38-0/lib/site_perl/5.38.0/x86_64-linux
    /home/ashutosh/bin/perl5-38-0/lib/site_perl/5.38.0
    /home/ashutosh/bin/perl5-38-0/lib/5.38.0/x86_64-linux
    /home/ashutosh/bin/perl5-38-0/lib/5.38.0

---
Environment for perl 5.38.0:
    HOME=/home/ashutosh
    LANG=en_US.UTF-8
    LANGUAGE=
    LC_ADDRESS=en_US.UTF-8
    LC_IDENTIFICATION=en_US.UTF-8
    LC_MEASUREMENT=en_US.UTF-8
    LC_MONETARY=en_US.UTF-8
    LC_NAME=en_US.UTF-8
    LC_NUMERIC=en_US.UTF-8
    LC_PAPER=en_US.UTF-8
    LC_TELEPHONE=en_US.UTF-8
    LC_TIME=en_GB.UTF-8
    LD_LIBRARY_PATH=/home/ashutosh/pg/master/ci_install/lib:/usr/local/lib:/opt/icu70.1/lib:/opt/icu69.1/lib:/opt/icu53.1/lib
    LOGDIR (unset)
    PATH=/home/ashutosh/pg/master/ci_install/bin:/opt/valgrind/bin:/home/ashutosh/perl5/bin:/home/ashutosh/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/opt/gdb-13/bin:/home/ashutosh/bin:/usr/lib/ccache:/home/ashutosh/.local/bin:/home/ashutosh/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
    PERL5LIB=/home/ashutosh/perl5/lib/perl5
    PERL_BADLANG (unset)
    PERL_LOCAL_LIB_ROOT=/home/ashutosh/perl5
    PERL_MB_OPT=--install_base "/home/ashutosh/perl5"
    PERL_MM_OPT=INSTALL_BASE=/home/ashutosh/perl5
    SHELL=/bin/bash
@khwilliamson
Copy link
Contributor

This would have been a release blocker for 5.38 if we had known about it then. But it has since been fixed, so will be in 5.40. I suspect it may be too complicated for a 5.38 maintenance release. But since its original date is so early after the 5.38 freeze, I could be wrong about that.

Bisecting the fix led to:

commit 5ba25c1
Author: Karl Williamson khw@cpan.org
AuthorDate: Tue Apr 11 16:06:11 2023 -0600
Commit: Karl Williamson khw@cpan.org
CommitDate: Mon Sep 11 19:58:21 2023 -0600

 Create S_native_querylocale_i() and use it
 
 This new function differs from the already-existing plain
 querylocale_i() in that it returns in the platform's native format,
 instead of the internal=to-perl one.
 
 The internal one is used generally so that code doesn't have to cope
 with multiple possible formats.  However, the format of the new locale
 in Perl_setlocale() is going to be in native format.  We effectively
 translate it into our internal one at the input edge, and that is used
 thereafter.
 
 But until this commit, the translation back to native format at the
 output edge was incomplete.
 
 This mostly worked because native format differs from locale.c
 internal format in just two ways:
 
 One is the locale for LC_NUMERIC.  perl keeps it generally in the C
 locale, except for brief intervals which higher level code specifies,
 when the real locale is swapped in.  (Actually, this isn't quite true.
 If the real locale is indistinguishable from C as far as LC_NUMERIC
 goes, perl is happy to use it rather than C, so as to save swapping.)
 locale.c had the code in it to translate the internal format back to
 native, so it worked for this case.
 
 The other is LC_ALL when not all categories are set to the same locale.
 Windows and Linux use 'name=value;' pairs notation, while things derived
 from BSD (and others) use a positional notation in which only the values
 are given, and the system knows which category a given value is for from
 its position in the string.  Perl worked fine for the name=value pairs
 notation, because that is the same as its internal one, so no
 translation got done, but until this commit, there were issues on
 positional platforms.  This seldom got in the way since most people, if
 they set the locale at all, will just set LC_ALL to some single 'foo'.
 
 What this commit effectively does is change Perl_setlocale() to return
 the value in the native format which the libc functions are expecting.
 This differs from what it used to return only on platforms which use the
 positional notation and only for LC_ALL when not all categories are set
 to the same locale.
 
 The new function subsumes much of the work previously done in
 Perl_setlocale(), and it is able to simplify some of that work.

@jkeenan
Copy link
Contributor

jkeenan commented Apr 26, 2024

I confirmed the problem on FreeBSD-13, and also confirmed that the problem is fixed in blead.

[perlmonger: p5p] $ perl gh-22176-locale-20240426.pl
perl: v5.32.1
ok

[perlmonger: v5.38.0] $ ./bin/perl -Ilib $P5P_DIR/gh-22176-locale-20240426.pl
perl: v5.38.0
not ok

[perlmonger: p5p] $ bleadperl gh-22176-locale-20240426.pl
perl: v5.39.10
ok

I checked out tag v5.38.2, created a branch and attempted to cherry-pick the commit @khwilliamson identified. I don't know how to interpret the result.

$ git cherry-pick 76298ae68aa7796f0ffc05095b127d23f4b2de8f
On branch gh-22176-attempt-20240426
You are currently cherry-picking commit 76298ae68a.
  (all conflicts fixed: run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

nothing to commit, working tree clean
The previous cherry-pick is now empty, possibly due to conflict resolution.
If you wish to commit it anyway, use:

    git commit --allow-empty

Otherwise, please use 'git cherry-pick --skip'

@mauke
Copy link
Contributor

mauke commented Apr 27, 2024

@jkeenan For cherry-picking, I get:

$ git cherry-pick 5ba25c116c8573b68a6103113d3b831e46f55bee
Auto-merging embed.fnc
CONFLICT (content): Merge conflict in embed.fnc
Auto-merging embed.h
CONFLICT (content): Merge conflict in embed.h
Auto-merging locale.c
CONFLICT (content): Merge conflict in locale.c
Auto-merging perl.h
CONFLICT (content): Merge conflict in perl.h
Auto-merging proto.h
error: could not apply 5ba25c116c... Create S_native_querylocale_i() and use it
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".
Recorded preimage for 'embed.fnc'
Recorded preimage for 'embed.h'
Recorded preimage for 'locale.c'
Recorded preimage for 'perl.h'

Which tells me the commit can't be cherry-picked in isolation.

I see that you are using a different commit hash (76298ae vs. 5ba25c1). Not sure what's up with that.

@jkeenan
Copy link
Contributor

jkeenan commented Apr 27, 2024

I see that you are using a different commit hash (76298ae vs. 5ba25c1). Not sure what's up with that.

Some error on my part. 76298ae was v5.38.0.

@ashutosh108
Copy link
Author

ashutosh108 commented Apr 29, 2024

My bisect led me to a different commit:

commit eacceb3
Author: Karl Williamson khw@cpan.org
Date: Sun Dec 31 08:00:53 2023
locale.c: Be sure to toggle into dot radix locale

Just in case, here is the bisect log: gh-22176-bisect-log.txt

Fortunately, this commit cherry-picks into v5.38.2 (==maint-5.38 currently) cleanly. It doesn't compile as it is, but it does after this obvious patch (needed because d65d6e9 have changed signature of set_numeric_locale in blead):

diff --git a/locale.c b/locale.c
index eacffab38f..b7e502b134 100644
--- a/locale.c
+++ b/locale.c
@@ -1920,7 +1920,7 @@ S_new_numeric(pTHX_ const char *newnum, bool force)
      * locale is indistinguishable from the C locale. */
     if (! force && strEQ(PL_numeric_name, newnum)) {
         if (! PL_numeric_underlying_is_standard) {
-            set_numeric_standard(__FILE__, __LINE__);
+            set_numeric_standard();
         }
 
         return;

The fixed v5.38.2+eacceb384 version works for me, the script from the first message prints "ok".

Could someone who knows Perl internals please verify that this kind of fix is acceptable to be included into Perl 5.38.x?

Upd: created a pull request #22185 with the fix described above.

@ashutosh108
Copy link
Author

Is there anything I could do for this to be fixed sooner? I'm new to Perl community and I don't know how the things are generally done here. If I need to squash commits or to clarify the commit message in PR #22185, please let me know. Or is there a mailing list to discuss this issue and review the fix for it?

In the mean time, I'm waiting for this bug to be fixed in 5.38.x because that version is going to be quite well-spread due to e.g. being present in the (fresh) current Ubuntu LTS 24.04.

@khwilliamson
Copy link
Contributor

@steve-m-hay It looks like we need to release a 5.38 maint

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants