Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: Use pyproj 3+ for PROJ 8+ migration #1808

Merged
merged 52 commits into from Aug 31, 2021
Merged

Conversation

lib/cartopy/crs.py Outdated Show resolved Hide resolved
lib/cartopy/crs.py Outdated Show resolved Hide resolved
lib/cartopy/crs.py Outdated Show resolved Hide resolved
lib/cartopy/crs.py Outdated Show resolved Hide resolved
lib/cartopy/crs.py Outdated Show resolved Hide resolved
@snowman2 snowman2 force-pushed the pyprojv2 branch 5 times, most recently from 0a41861 to 1b1b3c0 Compare June 24, 2021 01:35
y1 = self.area_of_use.north
lons = np.array([x0, x0, x1, x1])
lats = np.array([y0, y1, y1, y0])
points = self.transform_points(self.geodetic_crs, lons, lats)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@greglucas
Copy link
Contributor

It looks like you took some of the CRS code from Cython up into Python with the Transformers, so is there additional overhead if we are making lots of calls to the transform method for each segment of the projected geometry? Would adding cdef Transformer into the Cython files to use pyproj's Cython definitions remove that function call overhead? These are all just thoughts off the top of my head on why there could be a difference.

Maybe we should add in timing durations output for the tests to see if the N slowest are changing between this PR and the base, to figure out if it is specific tests that are slower or just overall everything is slower.
https://docs.pytest.org/en/latest/how-to/usage.html#profiling-test-execution-duration

@dopplershift
Copy link
Contributor

Really excited to see this. Looks like the last vestige of PROJ directly are for the geodetic stuff?

@snowman2 snowman2 force-pushed the pyprojv2 branch 12 times, most recently from 913eabf to 0f211c5 Compare June 25, 2021 01:43
@snowman2
Copy link
Contributor Author

Got more tests passing. Current state of things:

= 49 failed, 577 passed, 19 skipped, 2 xfailed, 12 xpassed, 704 warnings in 79.52s (0:01:19) =

The good news is that it seems to be much faster now.

@greglucas
Copy link
Contributor

That is great news! Do you know what you did to get the speed up again? It might be good to document that somewhere so that we don't introduce the speed regression later on.

@snowman2
Copy link
Contributor Author

Looks like the last vestige of PROJ directly are for the geodetic stuff?

Yes, that is the current state of things with this PR.

Do you know what you did to get the speed up again?

I am not entirely sure what changed as I just noticed it after fixing some bugs. Possible that bugs were causing the troubles 🤷‍♂️

@snowman2 snowman2 force-pushed the pyprojv2 branch 2 times, most recently from 96d1ceb to ab65ca8 Compare June 26, 2021 16:29
lib/cartopy/tests/test_crs.py Outdated Show resolved Hide resolved
lib/cartopy/tests/test_crs.py Outdated Show resolved Hide resolved
@snowman2
Copy link
Contributor Author

lib/cartopy/tests/test_linear_ring.py::TestMisc::test_at_boundary is a test I don't entirely understand. It is currently failing at the present moment.

With the original test and DEBUG=True in trace.pyx:

Setting line:
    177.5 ,  -79.912
    178.333 ,  -79.946
Projected as:
    177.5 ,  -79.912
    178.333 ,  -79.94599999999997
Bisecting from:  0.0  (
IN
)
    177.5 ,  -79.912
    178.333 ,  -79.94599999999997
t:  1.0
   => valid:  True
   =>  1.0 to 1.0
   => ( 178.333 ,  -79.94599999999997 ) to ( 178.333 ,  -79.94599999999997 )
Setting line:
    178.333 ,  -79.946
    181.666 ,  -83.494
Projected as:
    178.333 ,  -79.94599999999997
    -178.33399999999997 ,  -83.49399999999999
...

With pyproj:

Setting line:
    177.5 ,  -79.912
    178.333 ,  -79.946
Projected as:
    177.5 ,  -79.912
    178.333 ,  -79.946
Bisecting from:  0.0  (
IN
)
    177.5 ,  -79.912
    178.333 ,  -79.946
t:  1.0
   => valid:  True
   =>  1.0 to 1.0
   => ( 178.333 ,  -79.946 ) to ( 178.333 ,  -79.946 )
Setting line:
    178.333 ,  -79.946
    181.666 ,  -83.494
Projected as:
    178.333 ,  -79.946
    181.666 ,  -83.494
...

At the bottom of the current version, the projection actually changes the coordinates somehow. But, for the new version no change occurs. I would expect no change to occur because the source and destination CRS are both ccrs.PlateCarree(). Was the previous version incorrect or was there something else going on?

@greglucas
Copy link
Contributor

I think that is due to us allowing PlateCarree input ranging from (-180, 360), but the output is always wrapped to (-180, 180) unless you've moved your central_longitude. Does pyproj take a shortcut if source and destination are the same? We may want to force the transform anyway for situations like this...

@snowman2
Copy link
Contributor Author

snowman2 commented Aug 22, 2021

Funny you mentioned _copytobuffer. I just made this change: pyproj4/pyproj#905

@greglucas
Copy link
Contributor

greglucas commented Aug 22, 2021

Hm... I just tried to test that out and see if it helps, but I can't import cartopy when using the latest pyproj (v3.1.0 from source works).

Python 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cartopy
Traceback (most recent call last):

Something with the Mercator projection having an unknown lon_0.

lib/cartopy/crs.py", line 1708

File "pyproj/_crs.pyx", line 2323, in pyproj._crs._CRS.__init__
pyproj.exceptions.CRSError: Invalid projection: +proj=merc +ellps=WGS84 +lon_0=unknown +x_0=0.0 +y_0=0.0 +units=m +no_defs +type=crs: (Internal Proj Error: proj_create: invalid value for lon_0)

bisects to pyproj4/pyproj@ea2cc18

@snowman2
Copy link
Contributor Author

snowman2 commented Aug 23, 2021

Glad you were able to catch the change. It corresponds to pyproj4/pyproj#902.

The benefits of this change is that the CRS.from_* methods will generate a cartopy.crs.CRS object instead of a pyproj.CRS object.

This commit in the pyprojv32 branch should address that: snowman2@d903879. I didn't add the commit to this branch yet. Before doing so, it would be good to either have tests in the CI against latest pyproj or wait until the release from pyproj. This will ensure the API from pyproj will be stable before the change is added. I am thinking that the _MakerCRS class should be renamed to MakerCRS and moved to pyproj.crs namespace if cartopy needs to use it.

@snowman2
Copy link
Contributor Author

I think this should do it: pyproj4/pyproj#912

lib/cartopy/crs.py Outdated Show resolved Hide resolved
@greglucas
Copy link
Contributor

I just pushed up two commits to try and speed things up.

  • dea2d9d - Force Cartopy to always use the same global context. I don't think we guarantee any thread-safety and this is a pretty substantial gain, so perhaps we just force this on users? Can be backed out if people disagree.
  • 75856be - Put in a fast-path for CRS comparison tests. There was a lot of == checks going on and pyproj was slow with them.

Running the slowest example currently, the all UTM, we get the following timings:

         91688276 function calls (91602543 primitive calls) in 48.082 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  3743763   11.231    0.000   18.811    0.000 {method '_transform' of 'pyproj._transformer._Transformer' objects}
  3743763    5.794    0.000   37.685    0.000 transformer.py:581(transform)
  7487526    5.627    0.000   10.816    0.000 utils.py:50(_copytobuffer)
     8820    3.559    0.000   42.418    0.005 {cartopy.trace.project_linear}
  7487402    3.289    0.000    3.289    0.000 utils.py:37(_copytobuffer_return_scalar)
      182    2.001    0.011    2.002    0.011 {pyproj._transformer.from_crs}
 22506308    1.908    0.000    1.909    0.000 {built-in method builtins.hasattr}
  3753071    1.538    0.000    2.250    0.000 enum.py:283(__call__)
  3743765    1.417    0.000    3.656    0.000 enums.py:8(create)
  3743883    1.243    0.000    1.243    0.000 transformer.py:314(_transformer)
  3769806    1.027    0.000    1.647    0.000 compat.py:18(pystrdecode)
  7438652    1.021    0.000    1.021    0.000 utils.py:116(_convertback)
  3743840    0.967    0.000    1.466    0.000 enum.py:646(__hash__)

On master:

         5481006 function calls (5394125 primitive calls) in 7.729 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     8820    2.097    0.000    2.212    0.000 {cartopy.trace.project_linear}
    97/95    0.474    0.005    0.479    0.005 {built-in method _imp.create_dynamic}
     3837    0.272    0.000    0.359    0.000 coords.py:143(xy)
    21406    0.252    0.000    0.368    0.000 inspect.py:2889(_bind)
       62    0.165    0.003    0.165    0.003 {method 'transform_points' of 'cartopy._crs.CRS' objects}

@snowman2
Copy link
Contributor Author

Thanks @greglucas 👍, that is great!

I am -1 on requiring the global context. This impacts all other applications in the python environment that use pyproj. Also, with PROJ 8.1+ there isn't a significant benefit to using the global context in many instances.

pyproj == comparison is slow, so if we are comparing two Cartopy
CRSs, return a fast-path comparison. Let the pyproj comparison
handle the other cases.
@greglucas
Copy link
Contributor

Good point! I was only thinking directly within this project. I just dropped that commit.

I also looked into a few other speedups as well, and there are some gains to be had in trace.pyx of ~2-3x easily (eliminating some transform calls) and probably more with some additional refactoring. I think that would be a bigger overhaul project after this gets in though since it would be a significant rework.

@snowman2
Copy link
Contributor Author

I added 2db5efc for the scenario when the strings aren't equal, but the CRS are equal.

@snowman2
Copy link
Contributor Author

snowman2 commented Aug 30, 2021

I have wondered whether we could send 1000+ points into the transform function and then bisect those results instead of projecting 100 times on each segment.

Each call to _copytobuffer_return_scalar creates a single element python array for the Python Buffer API. It will likely be more efficient to pass in a single numpy array of points and get a numpy array back.

@greglucas greglucas merged commit 05672e4 into SciTools:master Aug 31, 2021
@greglucas
Copy link
Contributor

I think this is in a good spot for merging, so lets get it in for people to test out in master.

Thanks for the work in updating both this and pyproj, @snowman2!

@snowman2
Copy link
Contributor Author

Thanks for your help with this @greglucas 👍

@QuLogic
Copy link
Member

QuLogic commented Sep 10, 2021

Thanks for pushing this through to the finish @greglucas!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

segmentation fault when instantiating Proj4Error Port to use pyproj v2 Convert to proj v5 API
6 participants