Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Support character and character string arrays #19388

Merged
merged 4 commits into from Jun 6, 2022

Conversation

pearu
Copy link
Contributor

@pearu pearu commented Jul 1, 2021

This PR provides the following features:

  • support for wrapping Fortran functions with

    • character (e.g. character x)
    • character array (e.g. character, dimension(n) :: x)
    • character string (e.g. character(len=10) x)
    • and character string array (e.g. character(len=10), dimension(n, m) :: x)

    arguments, including passing Python unicode strings as Fortran character string arguments.
    This support includes comprehensive tests and documentation.

  • the generated extension modules don't use the deprecated NumPy CAPI anymore

  • improved f2py generated exception messages

  • numerous bug and flake8 warning fixes

  • various CPP macros that one can use within C-expressions of signature files are prefixed with f2py_. For example, one should use f2py_len(x) instead of len(x)

  • a new construct character(f2py_len=...) is introduced to support returning assumed length character strings (e.g. character(len=*)) from wrapper functions

  • introduce a hook to support rewriting f2py internal data structures after reading all its input files. This is required, for instance, for BC of scipy support where character arguments are treated as character strings arguments in C expressions.

Fixes #18684, #635, #6308, #4519, #3425, #10027

@pearu pearu marked this pull request as draft July 1, 2021 14:55
numpy/f2py/cfuncs.py Outdated Show resolved Hide resolved
@rgommers
Copy link
Member

rgommers commented Jul 5, 2021

Wow, that's an impressive list of things it fixes!

Perhaps this PR is not reviewable in detail, since it's both large and complex code. Perhaps good to review the tests and main approach in detail, and then if it passes and SciPy build is happy as well, then just merge it?

@pearu pearu requested a review from eric-wieser July 7, 2021 15:42
Copy link
Member

@melissawm melissawm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will not say I understand all the details :) I did take a look and one comment is that maybe we can make it clearer in the user guide when to use the new f2py_ macros (maybe just crosslinking the Advanced Usages and the Signature file guide.

I am also still investigating, but I think this breaks the SciPy tests. I'll check and see if I can pinpoint what is going on...

EDIT: Yes, this breaks the SciPy tests as character variables are not properly initialized. For example, running the SciPy tests, in subroutine _gesvx, in the flapack.pyf file, the variable fact should be initialized to "E", but it is instead being assigned the value E (note the lack of quotes!)

Here's a reproducer: saving the below as chartest.pyf

python module chartest
    interface
        subroutine chartest(fact, test)
            character optional, intent(in) :: fact = 'E'
            character, intent(out) :: test
        end subroutine chartest
    end interface
end python module chartest

generates chartestmodule.c with the following code:

/* Processing variable fact */
  if (fact_capi == Py_None) fact = E; else
    f2py_success = character_from_pyobj(&fact,fact_capi,"chartest.chartest() 1st keyword (fact) can't be converted to character");
  if (f2py_success) {
  /* Processing variable test */

It looks like character variables are being parsed as scalars and the initial value is going through _eval_scalar here

doc/source/f2py/advanced.rst Outdated Show resolved Hide resolved
numpy/f2py/src/fortranobject.c Outdated Show resolved Hide resolved
numpy/f2py/src/fortranobject.c Show resolved Hide resolved
numpy/f2py/src/fortranobject.c Show resolved Hide resolved
numpy/f2py/src/fortranobject.c Show resolved Hide resolved
@pearu
Copy link
Contributor Author

pearu commented Sep 1, 2021

There is another issue with scipy: consider https://github.com/scipy/scipy/blob/master/scipy/linalg/flapack_gen.pyf.src#L534

character optional,intent(in),check(*trans=='N'||*trans==<'T',\0,'C',\2>):: trans = 'N'

where the check expression assumes that trans has type char* while with this PR, it has char.

While this PR implements correct behavior for character, but in past, character was incorrectly treated as character*(1). So, we have a BC problem..

@rgommers
Copy link
Member

rgommers commented Sep 1, 2021

So, we have a BC problem..

Is there a way to work around it, like special casing SciPy within character array handling code in f2py?

@pearu
Copy link
Contributor Author

pearu commented Sep 2, 2021

The commit b176abe resolves the character BC issue by rewriting C-expressions after reading .pyf or Fortran files.
Now scipy builds with the current PR.

@pearu pearu requested a review from melissawm September 2, 2021 16:04
@pearu
Copy link
Contributor Author

pearu commented Sep 2, 2021

Using this PR, the current scipy builds ok and all scipy tests pass. The PR fixes a number of long-lasting issues and blocks other f2py-related PRs. Unless there will be feedback that requires considerable effort, I'd like to land it as soon as possible, say, in 5 days.

@melissawm @seberg @rgommers @eric-wieser @mattip @HaoZeke please review.

numpy/f2py/rules.py Outdated Show resolved Hide resolved
@HaoZeke
Copy link
Member

HaoZeke commented May 20, 2022

This should get a release note (at-least for the hook, and actually for all the bullet points).

TST: added test for issue numpy#18684

ENH: f2py opens files with correct encoding, fixes numpy#635

TST: added test for issue numpy#6308

TST: added test for issue numpy#4519

TST: added test for issue numpy#3425

ENH: Implement user-defined hooks support for post-processing f2py data structure. Implement character BC hook.

ENH: Add support for detecting utf-16 and utf-32 encodings.
@HaoZeke HaoZeke force-pushed the enh-f2py-character-support branch from 16406aa to 5bdeb35 Compare June 5, 2022 15:19
@HaoZeke HaoZeke force-pushed the enh-f2py-character-support branch from 0fc6adb to bcbd02b Compare June 5, 2022 18:53
@HaoZeke HaoZeke force-pushed the enh-f2py-character-support branch from bcbd02b to f20d90c Compare June 5, 2022 19:15
@HaoZeke HaoZeke marked this pull request as ready for review June 5, 2022 19:21
@HaoZeke
Copy link
Member

HaoZeke commented Jun 5, 2022

@melissawm @pearu I'm in favor of landing this ASAP, SciPy builds and all the tests pass. @charris, do you think the release notes are alright or should they be reworded?

@HaoZeke HaoZeke force-pushed the enh-f2py-character-support branch 2 times, most recently from a3bd111 to 1d9952d Compare June 5, 2022 19:48
@melissawm
Copy link
Member

Let's do this - it looks good to me now and we won't see any potential side effects unless this is in. Thanks a lot @pearu and @HaoZeke !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
7 participants