Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Simplify npymath #22090

Merged
merged 16 commits into from Aug 23, 2022
Merged

MAINT: Simplify npymath #22090

merged 16 commits into from Aug 23, 2022

Conversation

mattip
Copy link
Member

@mattip mattip commented Aug 7, 2022

Towards solving #20880. So far this removes support for outdated Solaris, rearranges the headers a bit, adds some double C99 routines to mandatory math functions, removes emitting HAVE_* macros for mandatory functions, and removes code paths in the cases of missing mandatory functions.

It seems we cannot remove the complex math function implementations until we drop manylinux2014 (glibc 2.17) which does not go EOL until June 2024.

@mattip mattip marked this pull request as draft August 7, 2022 19:48
@mattip
Copy link
Member Author

mattip commented Aug 7, 2022

The result of #9567 from 2017 was to blocklist hypot on 32-bit windows. When I rerun the test file after compiling with cl.exe for x86 from VS2019, I get that the floating point flags are not changed:

>test_hypot.exe
FPU mode: 589855
If a right triangle has sides inf and 4.0, its hypotenuse is inf
FPU mode: 589855

Maybe in the mean time Microsoft fixed the problem? We do have a check_fpu_mode fixture on tests that should report when the mode changes

@rgommers
Copy link
Member

rgommers commented Aug 8, 2022

Thanks Matti, this looks like a pretty good start already. Some of the outdated long double formats also stood out to me, but probably only worth touching if they're a pain for moving build systems; otherwise best to wait and deprecate it all in one go later.

@mattip
Copy link
Member Author

mattip commented Aug 8, 2022

If we wish to vendor the npmath sources into the wheel, this will require thinking about how to package the build-time generated config.h (from @rgommers' experiment on vendoring npymath for scipy) file. Maybe we should rename it and put it in the regulare numpy include directory? This is similar to Python's _sysconfigdata__x86_64-linux-gnu.py (on linux) file that captures all the #define macros when building python

@rgommers
Copy link
Member

rgommers commented Aug 8, 2022

If we wish to vendor the npmath sources into the wheel, this will require thinking about how to package the build-time generated config.h

That can't work I think, the whole point of this header is to account for platform/OS/compiler-specific differences, so we can't just ship one config.h that works with the rest of the sources. EDIT: ah wait, you said "into the wheel" - that at least takes care of platform/OS

@charris charris changed the title Simplify npymath MAINT: Simplify npymath Aug 8, 2022
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI isn't happy yet, but the changes LGTM so far.

numpy/core/setup_common.py Outdated Show resolved Hide resolved
numpy/core/setup.py Show resolved Hide resolved
numpy/core/setup_common.py Show resolved Hide resolved
@mattip
Copy link
Member Author

mattip commented Aug 11, 2022

cygwin still needs a blocklisted log2. There is something off with one of the pow variants and macos. My first goal is to get CI passing and then clean up the resulting code.

@mattip
Copy link
Member Author

mattip commented Aug 11, 2022

also see #22108 #22043 to fix a warning about deprecated macOS images.

@mattip
Copy link
Member Author

mattip commented Aug 11, 2022

rabbit hole: powl is blocklisted for macoOS (and is not working correctly on macOS in this PR). In #8608 from 2017 it was noted that np.float128(2) ** 16383 incorrectly returns inf. PR #15074 was started to fix this but never finished. I wonder if removing the blocklist will solve both the inaccuracy from #8608 and the failure here.

Edit: looking again at what I did here, I removed the alternative npy_powl which caused the CI failure. I will leave #8608 for a future enhancement.

@mattip mattip force-pushed the simplify-npymath branch 3 times, most recently from 0300760 to 604e453 Compare August 15, 2022 10:31
@mattip
Copy link
Member Author

mattip commented Aug 15, 2022

I think I have gone about as far as I can go until we can drop manylinux2014 (in 2024), msvc 2015 (in 2025?) and 32-bit mingw (which may have fixed hypot and atan2, but we don't test it). Maybe by then cygwin will have fixed its complex functions as well.

@h-vetinari
Copy link
Contributor

until we can drop [...] msvc 2015 (in 2025?)

Let's please do this much earlier. I doubt that vs2015 support is actually still in working order, and the last blocker for moving away from vs2017 (AFAIU) is now finally about to be solved as well: conda-forge will move to vs2019+vc142 in a few days.

@matthew-brett
Copy link
Contributor

I agree with @h-vetinari - let's drop VS < 2019 . Does anyone have a good argument not to?

@rgommers
Copy link
Member

The overflow warning in the complex abs test looks like a problem.

@rgommers
Copy link
Member

Note that any pruning of symbols used in scipy.special is going to break already released SciPy versions on macOS arm64: MacPython/scipy-wheels#173 (comment). I don't think that's a hard blocker, as long as we've done one release of SciPy with a fix before the next NumPy release.

@mattip
Copy link
Member Author

mattip commented Aug 16, 2022

Note that any pruning of symbols used in scipy.special is going to break already released SciPy versions

This PR should not remove any of the exported functions in npymath.a. It does reduce the mandatory functions to the form double npy_sin(double x) { return sin(x); }, and removes the #define HAVE_SIN define. but as far as I can tell scipy does not use any of the HAVE_ macros in scipy.special.

@mattip mattip marked this pull request as ready for review August 16, 2022 08:36
@rgommers
Copy link
Member

I have started on the Meson build of npymath, so far I've got the checks to generate most of config.h implemented (with mandatory funcs based on this PR): https://github.com/numpy/numpy/compare/main...rgommers:numpy:meson?expand=1. It's getting a lot more understandable:)

Copy link
Member Author

@mattip mattip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready for review. npy_math_internal.h.src is the most complicated change. Most of what it does is to define functions like

float npy_sinf(float x) {
    return sinf(x);
}

A second pass through npymath could define macros for many of these mandatory functions rather than generating functions for them

#define npy_sin sin
#define npy_sinf sinf
#define npy_sinl sinl

Some "optional" functions have more correct implementations. These are the same, with a small refactoring tweak.

azure-pipelines.yml Show resolved Hide resolved
numpy/core/include/numpy/npy_math.h Show resolved Hide resolved
@@ -156,7 +157,7 @@ def check_funcs_once(funcs_name, headers=["feature_detection_math.h"]):
call_args=call_args,
headers=headers,
)
if st:
if st and add_to_moredefs:
moredefs.extend([(fname2def(f), 1) for f in funcs_name])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If add_to_moredefs=False then the function will not result in a HAVE_FUNC macro in config.h. There is no reason for the mandatory C99 functions to have that macro, if they are missing the NumPy build will fail during setup. I assume downstream users are not consuming these macros.

numpy/core/setup_common.py Outdated Show resolved Hide resolved
numpy/core/setup_common.py Show resolved Hide resolved
numpy/core/setup_common.py Show resolved Hide resolved
#include "numpy/numpyconfig.h"
#include "numpy/npy_cpu.h"
#include "numpy/utils.h"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removed headers are needed for simd dispatch, and are not part of the configuring done here.

numpy/core/src/common/numpyos.c Show resolved Hide resolved
numpy/core/src/npymath/ieee754.c.src Show resolved Hide resolved
@mattip
Copy link
Member Author

mattip commented Aug 21, 2022

#22139 was merged, once #22159 is resolved this is ready for another review

@rgommers
Copy link
Member

Needs a merge conflict resolved after gh-22159 was merged.

#else
/* ok on 64 bit posix */
return PyOS_strtol(str, endptr, base);
#endif
}

NPY_NO_EXPORT npy_ulonglong
NumPyOS_strtoull(const char *str, char **endptr, int base)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these now more or less pointless wrapper functions be deprecated somehow...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed there is more cleanup that can be done to turn one-line functions into defines (if they are exported) or to deprecate them (if they are internal). I was primarily targeting cleaning up the dead code from npymath in this PR. Maybe we should have a github project for code cleanup and add this as a task.

mattip added a commit to mattip/numpy that referenced this pull request Aug 22, 2022
@mattip
Copy link
Member Author

mattip commented Aug 22, 2022

I think this is a complete unit. There are at least two continuations:

  • refactoring the complex math functions in a similar way: turning the macros from HAVE_ to BLOCK_ and making the never-blocked ones mandatory
  • refactoring the mandatory functions so the npy_wrapperfunc(*args) {return func(*args);}; are header-only

@rgommers
Copy link
Member

refactoring the complex math functions in a similar way: turning the macros from HAVE_ to BLOCK_ and making the never-blocked ones mandatory

May make sense, not entirely clear given that complex.h will remain optional forever it looks like. But either way, probably not in this PR?

@seberg
Copy link
Member

seberg commented Aug 22, 2022

I just browsed through and the changes look great. I think I remember seeing that AIX might not define nextafter (not standard conform) due to using double-double. But even if that is the case, I am +1 to going ahead with this and putting the definitions back if absolutely necessary.

@rgommers
Copy link
Member

Last changes look good, I do want to do a bit of testing on macOS arm64 and verify no symbols went missing.

@mattip
Copy link
Member Author

mattip commented Aug 23, 2022

On a macmini running % arch -arm64 python runtests.py -m slow on main fails. Something about detecting the wrong long long type in dragon4.c:
numpy/core/src/multiarray/dragon4.c:3148:1: error: implicit declaration of function 'Dragon4_PrintFloat_Intel_extended128' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

@rgommers
Copy link
Member

I'm not seeing that. Running python runtests.py -m full on macOS 12.4 (arm64) I only see these 3 (known, unrelated) failures:

FAILED numpy/f2py/tests/test_kind.py::TestKind::test_all - AssertionError: selectedrealkind(16): ex...
FAILED numpy/linalg/tests/test_linalg.py::TestDet::test_types[complex64] - RuntimeWarning: divide b...
FAILED numpy/linalg/tests/test_linalg.py::TestDet::test_types[complex128] - RuntimeWarning: divide ...

% arch -arm64 python

I recommend not doing this, it is not necessary and running the non-native architecture like that has given me problems since the PowerPC OSX days.

@mattip
Copy link
Member Author

mattip commented Aug 23, 2022

In my m1 arm64 build using the system-provided compiler, NPY_SIZEOF_LONGDOUBLE is being reset here

#if defined(__arm64__)
#define NPY_SIZEOF_LONGDOUBLE 8
#define NPY_SIZEOF_COMPLEX_LONGDOUBLE 16
which is wrong. NPY_SIZEOF_LONGDOUBLE should be 16. I wonder why you do not hit that in your build. Do you see the __APPLE__ and __arm64__ macros if you do
touch /tmp/foo.c; clang -dM -E foo.c |grep "\(APPLE\)\|\(arm64\)"?

@rgommers
Copy link
Member

rgommers commented Aug 23, 2022

Yes:

% touch /tmp/foo.c; clang -dM -E /tmp/foo.c |grep "\(APPLE\)\|\(arm64\)"
#define __APPLE_CC__ 6000
#define __APPLE__ 1
#define __arm64 1
#define __arm64__ 1

EDIT: I have also verified that I get the same result with Apple Clang and conda-forge provided Clang.

which is wrong. NPY_SIZEOF_LONGDOUBLE should be 16.

I don't think so - long double is the same size as double on arm64, just like on aarch64 and on Windows.

@rgommers
Copy link
Member

This is a giveaway that you are not actually running the arm64 binary that you think you are: Dragon4_PrintFloat_Intel_extended128 (note Intel). If you cannot reproduce the error without using -arch arm64 python, then it's safe to ignore.

@rgommers rgommers added this to the 1.24.0 release milestone Aug 23, 2022
@rgommers
Copy link
Member

rgommers commented Aug 23, 2022

Tested on Linux as well (including rebuilding SciPy and ensuring scipy.special tests pass), and verified that the number of defined symbols stayed unchanged (at 273) with: nm -a libnpymath.a | grep "T npy_" | wc -l

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's give this a go, thanks a lot @mattip!

@rgommers rgommers merged commit d8c09c5 into numpy:main Aug 23, 2022
@charris
Copy link
Member

charris commented Aug 23, 2022

I don't think so - long double is the same size as double on arm64, just like on aarch64

I believe the spec is 16 on aarch64, but Apple broke the spec for M1 and made it the same as double.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants