Provide a bundled flag for gdal-sys #517

Open: wants to merge 12 commits into base: master

weiznich
Contributor

This commit introduces a new gdal-src crate which bundles gdal and builds a minimal version from source via build.rs

Fixes #465

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

Cargo.toml (outdated)


[patch.crates-io]
proj-sys = { git = "https://github.com/GiGainfosystems/proj", rev = "54716dd8955d4f0561ce9bf8a83610b605e3c007" }
Contributor Author

This currently relies on a not-yet-accepted patch to proj-sys: georust/proj#190

Contributor Author

I pushed 27c2a5e, which changes this to use the master branch of proj-sys instead, as the relevant patch is now merged there. No release contains this change yet.

Contributor Author

@weiznich weiznich May 10, 2024

Seems like we still need georust/proj#192

.define("BUILD_GMOCK", "OFF")
.define("PROJ_INCLUDE_DIR", format!("{proj_root}/include"))
.define("PROJ_LIBRARY", format!("{proj_root}/lib/{proj_lib}"))
// enable the gpkg driver
Contributor Author

I require the GeoPackage driver for my use case, which is why I added these configurations. We already depend on libsqlite3 via proj, so that should hopefully be fine.

That said, anyone who tries to actually use that driver to create a GeoPackage dataset will likely run into OSGeo/gdal#9135 (which will hopefully be fixed upstream soon, so it might be worth pulling the fix into the bundled version afterwards).

@weiznich weiznich marked this pull request as draft January 26, 2024 07:59
@weiznich
Contributor Author

Marked as draft as I need to perform a final set of tests on macOS and Windows.

@@ -4,4 +4,7 @@
#![allow(clippy::upper_case_acronyms)]
#![allow(rustdoc::bare_urls)]

#[cfg(feature = "bundled")]
Member

I'm probably missing some subtler aspects of how dependencies work; why is this needed?

Contributor Author

That forces cargo to actually build and link that dependency. Otherwise it's clever and says that it is unused and therefore does not link the dependency.
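As a rough illustration of the mechanism described in this reply, the sketch below shows the shape of a feature-gated `extern crate` declaration. The crate name `gdal_src` is taken from the PR description; the declaration is left as a comment so the snippet compiles without that crate, and the `link_mode` helper is hypothetical and only exists to show that the `bundled` feature is a compile-time switch.

```rust
// Pattern under discussion, shown as a comment because `gdal_src` is not
// available outside the PR:
//
//     #[cfg(feature = "bundled")]
//     extern crate gdal_src;
//
// The declaration makes the compiler load the crate even though no item from
// it is referenced, so the native library it builds ends up being linked.

/// Hypothetical helper: reports which mode was selected at compile time.
#[allow(unexpected_cfgs)] // plain `rustc` does not know cargo feature names
pub fn link_mode() -> &'static str {
    if cfg!(feature = "bundled") {
        "bundled"
    } else {
        "system"
    }
}
```

Built without the feature, `link_mode()` returns `"system"`; with `--features bundled` under cargo it would return `"bundled"`.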

@@ -1,6 +1,7 @@
# Changes

## Unreleased
- Add a `bundled` feature for `gdal-sys` that allows building and statically linking a minimal bundled version of gdal during `cargo build`
Member

We should probably mention that it only supports SQLite, GPKG, GeoTIFF, some other less popular formats, and probably no compression. I'm afraid that users will see this and expect it to work for COG, JPEG2000, NetCDF and so on.

Also see https://www.mail-archive.com/gdal-dev@lists.osgeo.org/msg40172.html.

Contributor Author

The following 19 drivers are supported by this configuration according to DriverManager::all():

Virtual Raster
Derived datasets using VRT pixel functions
GeoTIFF
Cloud optimized GeoTIFF generator
Erdas Imagine Images (.img)
In Memory Raster
Geographic Network generic file based model
Geographic Network generic DB based model
ESRI Shapefile
MapInfo File
VRT - Virtual Datasource
Memory
Keyhole Markup Language (KML)
GeoJSON
GeoJSON Sequence
ESRIJSON
TopoJSON
GeoPackage
SQLite / Spatialite

"libproj.a"
};

let res = cmake::Config::new("source")
Member

For anyone wondering, cmake automatically sets CMAKE_BUILD_TYPE.
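The build script excerpts in this thread pass the location of the bundled PROJ build to CMake through `PROJ_INCLUDE_DIR` and `PROJ_LIBRARY`, with a platform-dependent library file name. A minimal sketch of that step is below. The function name `proj_defines`, the `is_msvc` parameter, and the MSVC file name `proj.lib` are assumptions made for illustration; only `libproj.a` and the two definition names appear in the excerpts.

```rust
/// Builds the two PROJ-related CMake definitions seen in the build script
/// excerpts. `proj_root` is the install prefix of the bundled PROJ build.
/// The MSVC library name `proj.lib` is an assumption, not taken from the PR.
pub fn proj_defines(proj_root: &str, is_msvc: bool) -> Vec<(String, String)> {
    let proj_lib = if is_msvc { "proj.lib" } else { "libproj.a" };
    vec![
        ("PROJ_INCLUDE_DIR".to_string(), format!("{proj_root}/include")),
        ("PROJ_LIBRARY".to_string(), format!("{proj_root}/lib/{proj_lib}")),
    ]
}
```

In a real `build.rs`, each pair could be handed to `cmake::Config::define` before calling `build()`.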

@lnicola
Member

lnicola commented Jan 26, 2024

This looks all right, but as I mentioned in the related issue, I'm worried it's only going to be useful to you.

GDAL is pretty large, and its users might have a lot of expectations, but a version that only supports GPKG is going to be very surprising. Even your use case might be better served by https://docs.rs/geozero.

CC @rouault long-term, how do you feel about ExternalProject (SuperBuild)?

@weiznich
Contributor Author

> This looks all right, but as I mentioned in the related issue, I'm worried it's only going to be useful to you.
>
> GDAL is pretty large, and its users might have a lot of expectations, but a version that only supports GPKG is going to be very surprising. Even your use case might be better served by https://docs.rs/geozero.

I agree that users might have expectations that are not (yet) covered by the proposed bundling support, but I disagree with the point that it only supports GPKG and is therefore not useful. See the earlier comment for the list of supported drivers. That list contains a few more useful drivers, such as the ones for shapefiles, GeoJSON, GeoTIFF or KML. In addition, gdal is more than just a tool that provides support for certain file formats. It also exposes other functionality that is helpful in certain situations, like support for spatial projections.

As for other drivers: it is certainly possible to enable other drivers as well. That mostly requires going through the list of supported drivers and classifying them by whether they need external dependencies or not. Those that do not can easily be enabled, while those that do require more work. Depending on the external dependency, that might mean adding another sys crate as a dependency and enabling the bundling support there (for example netcdf, which already has bundling support in Rust), or it might mean writing that bundling support for the other sys crate first. I decided to start with a minimal subset to have something working first.

> Even your use case might be better served by https://docs.rs/geozero.

Trust me when I say that no, geozero won't solve my use case, as it makes too many assumptions about which geometry types exist and how they should be handled. (It's specifically not about translating from one geometry format to another.)

@lnicola
Member

lnicola commented Jan 26, 2024

Yes, I think it's all right. We should probably document our features anyway, and that's a good place where we can explain the limitations of the bundled GDAL.

@weiznich
Contributor Author

I just checked the documentation. GDAL already provides a list of dependencies for each driver:

https://gdal.org/drivers/vector/index.html
https://gdal.org/drivers/raster/index.html

Anything that lists "Built-in by default" as dependency can probably just be enabled without problems.

I also know that bundling support exists for the following other dependencies:

So yes, it's probably possible to just enable most of the expected drivers, although I personally would prefer having all the native dependencies behind separate feature flags.
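The classification step described in this thread (drivers that are built in versus drivers that need an external library) can be sketched as a simple partition. The `Driver` struct, the function names, and the sample entries are illustrative assumptions; the listed dependencies (libsqlite3, libnetcdf, libpq) are the ones mentioned in this conversation, not a complete or authoritative table.

```rust
/// Illustrative driver description: a name plus the external libraries it needs.
pub struct Driver {
    pub name: &'static str,
    pub external_deps: &'static [&'static str],
}

/// Splits drivers into (no external dependencies, needs external dependencies).
pub fn classify(drivers: &[Driver]) -> (Vec<&'static str>, Vec<&'static str>) {
    let mut built_in = Vec::new();
    let mut needs_deps = Vec::new();
    for d in drivers {
        if d.external_deps.is_empty() {
            built_in.push(d.name);
        } else {
            needs_deps.push(d.name);
        }
    }
    (built_in, needs_deps)
}

/// Small hand-written sample, not generated from the GDAL documentation.
pub fn sample_drivers() -> Vec<Driver> {
    vec![
        Driver { name: "GeoJSON", external_deps: &[] },
        Driver { name: "ESRI Shapefile", external_deps: &[] },
        Driver { name: "GeoPackage", external_deps: &["libsqlite3"] },
        Driver { name: "netCDF", external_deps: &["libnetcdf"] },
        Driver { name: "PostgreSQL/PostGIS", external_deps: &["libpq"] },
    ]
}
```

Drivers in the first group could be enabled directly, while each entry in the second group would need a sys crate with bundling support, ideally behind its own feature flag as suggested above.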

@rouault
Contributor

rouault commented Jan 26, 2024

> CC @rouault long-term, how do you feel about ExternalProject (SuperBuild)?

I'm missing some context to understand your question.

@lnicola
Member

lnicola commented Jan 26, 2024

If you're not familiar with it, ExternalProject is a CMake component that can download libraries, apply patches, then link your project with them. So it can be useful for people who want a custom build but don't want to build every dependency manually.

If GDAL adds that in the future, we can integrate with it without creating Rust libraries that bundle the source code of the GDAL dependencies.

If it doesn't, that's fine, people can still use GeoTIFF, GPKG, SHP and a few other formats with the approach here.

@rouault
Contributor

rouault commented Jan 26, 2024

ok, I see. That's something I've considered, but I'd be hesitant to go into that business, as it equates to creating yet another packaging system, and for GDAL, with all its transitive dependencies, that can mean ~ 100 libraries for a full build... That said I do recognize that might be a recurring need for people in "unstandard" environments (Android, WASM, etc.) to have to rebuild the whole world. I'm not sure if that would belong to the OSGeo/GDAL repo itself, or if it might be a side repository where people from different teams can collaborate, and be independent of GDAL development itself. Might be worth raising the idea on gdal-dev.

@lnicola
Member

lnicola commented Jan 26, 2024

Yeah, that makes sense. You've already got a full plate with GDAL and the other libraries you're maintaining, you don't need to get into a whole new packaging system unless you really want to.

It would only make a difference for this PR if you were already planning to add it.

Cargo.toml (outdated)
cmake = "0.1.50"

[features]
default = []
Contributor Author

It's up for discussion which drivers should be built by default via cargo feature flags. We cannot easily build everything and let users opt out, as cargo feature flags are additive.
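To make the "additive" point concrete, here is a small model of feature unification: when several crates in a build depend on the same crate, it is compiled once with the union of all requested features, so one dependent cannot switch off a driver that another has enabled. This is a simplified sketch rather than cargo's actual resolver, and the feature names `drv_gpkg` and `drv_netcdf` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Models feature unification for a shared dependency: the effective feature
/// set is the union of what every dependent crate requests.
pub fn unify<'a>(requests: &[&[&'a str]]) -> BTreeSet<&'a str> {
    requests
        .iter()
        .flat_map(|request| request.iter().copied())
        .collect()
}
```

With one dependent asking for `drv_gpkg` and another for `drv_gpkg` plus `drv_netcdf`, the result contains both features; there is no request that would remove `drv_netcdf` again.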

Contributor

I guess NetCDF and Zarr could be interesting.

Contributor Author

The Zarr driver is currently not supported by the bundling. It would require figuring out how to bundle all the dependencies for that driver as well.

Netcdf is supported, as there is already a rust crate which provides bundling support and exposes the relevant information. Therefore it should be possible to enable it by default.

@weiznich
Contributor Author

I've pushed a new version of the build script that adds feature flags for a lot of the built-in drivers. If I enable all of them, DriverManager::all reports support for the following 162 drivers:

List with 162 driver names
Virtual Raster
Derived datasets using VRT pixel functions
GeoTIFF
Cloud optimized GeoTIFF generator
Erdas Imagine Images (.img)
CEOS SAR Image
CEOS Image
JAXA PALSAR Product Reader (Level 1.1/1.5)
Ground-based SAR Applications Testbed File Format (.gff)
ELAS
Arc/Info Binary Grid
Arc/Info ASCII Grid
GRASS ASCII Grid
International Service for the Geoid
SDTS Raster
DTED Elevation Raster
Portable Network Graphics
JPEG JFIF
In Memory Raster
Japanese DEM (.mem)
Graphics Interchange Format (.gif)
Graphics Interchange Format (.gif)
Envisat Image Format
Maptech BSB Nautical Charts
X11 PixMap Format
MS Windows Device Independent Bitmap
SPOT DIMAP
AirSAR Polarimetric Image
RadarSat 2 XML Product
Sentinel-1 SAR SAFE Product
PCIDSK Database File
PCRaster Raster File
ILWIS Raster Map
SGI Image File Format 1.0
SRTMHGT File Format
Leveller heightfield
Terragen heightfield
Network Common Data Format
EarthWatch .TIL
ERMapper .ers Labelled
NOAA Polar Orbiter Level 1b Data Set
FIT Image
GRIdded Binary (.grb, .grb2)
Raster Matrix Format
OGC Web Coverage Service
OGC Web Map Service
EUMETSAT Archive native (.nat)
Idrisi Raster A.1
Golden Software ASCII Grid (.grd)
Golden Software Binary Grid (.grd)
Golden Software 7 Binary Grid (.grd)
COSAR Annotated Binary Matrix (TerraSAR-X)
TerraSAR-X Product
DRDC COASP SAR Processor Raster
R Object Data Store
OziExplorer .MAP
Kml Super Overlay
Planet Labs Mosaics API
CALS (Type 1)
OGC Web Map Tile Service
Sentinel 2
Meta Raster Format
Portable Pixmap Format (netpbm)
USGS DOQ (Old Style)
USGS DOQ (New Style)
PCI .aux Labelled
Vexcel MFF Raster
Vexcel MFF2 (HKV) Raster
GSC Geogrid
EOSAT FAST Format
VTP .bt (Binary Terrain) 1.3 Format
Erdas .LAN/.GIS
Convair PolGASP
NLAPS Data Format
Erdas Imagine Raw
DIPEx
FARSITE v.4 Landscape File (.lcp)
NOAA Vertical Datum .GTX
NADCON .los/.las Datum Grid Shift
NTv2 Datum Grid Shift
CTable2 Datum Grid Shift
ACE2
Snow Data Assimilation System
KOLOR Raw
ROI_PAC raster
R Raster
Natural Resources Canada's Geoid
NOAA GEOCON/NADCON5 .b format
NSIDC Sea Ice Concentrations binary (.bin)
Swedish Grid RIK (.rik)
USGS Optional ASCII DEM (and CDED)
GeoSoft Grid Exchange Format
Bathymetry Attributed Grid
S-102 Bathymetric Surface Product
Hierarchical Data Format Release 5
HDF5 Dataset
Northwood Numeric Grid Format .grd/.tab
Northwood Classified Grid Format .grc/.tab
ARC Digitized Raster Graphics
Standard Raster Product (ASRP/USRP)
Magellan topo (.blx)
SAGA GIS Binary Grid (.sdat, .sg-grd-z)
ASCII Gridded XYZ
HF2/HFZ heightfield raster
OziExplorer Image File
USGS LULC Composite Theme Grid
ZMap Plus Grid
NOAA NGS Geoid Height Grids
IRIS data (.PPI, .CAPPi etc)
Racurs PHOTOMOD PRF
Scaled Integer Gridded DEM .sigdem
TGA/TARGA Image File Format
OGCAPI
Spatio-Temporal Asset Catalog Tiled Assets
Spatio-Temporal Asset Catalog Items
Geographic Network generic file based model
Geographic Network generic DB based model
ESRI Shapefile
MapInfo File
UK .NTF
SDTS
IHO S-57 (ENC)
Microstation DGN
VRT - Virtual Datasource
Memory
Comma Separated Value (.csv)
Keyhole Markup Language (KML)
GeoJSON
GeoJSON Sequence
ESRIJSON
TopoJSON
GMT ASCII Vectors (.gmt)
GeoPackage
SQLite / Spatialite
WAsP .map format
PostgreSQL/PostGIS
ESRI FileGDB
AutoCAD DXF
AutoCAD Driver
FlatGeobuf
Geoconcept
Czech Cadastral Exchange Data Format
PostgreSQL SQL dump
French EDIGEO exchange format
Idrisi Vector (.vct)
Elastic Search
Carto
AmigoCloud
Storage and eXchange Format
Selafin
VDV-451/VDV-452/INTREST Data Format
NextGIS Web
MapML
General Transit Feed Specification
OGC Features and Geometries JSON
U.S. Census TIGER/Line
Arc/Info Binary Coverage
Arc/Info E00 (ASCII) Coverage
Generic Binary (.hdr Labelled)
ENVI .hdr Labelled
ESRI .hdr Labelled
ISCE raster

Notably, this is still missing support for anything that depends on libexpat or libmysqlclient, and likely a few other more specific dependencies. For libexpat there does not seem to be a crate with up-to-date Rust bindings at all, and the libmysqlclient-sys crate still lacks bundling support. (With my diesel hat on, the latter is something I want to address at some point.)

It's up for discussion which of these drivers should be enabled by default.
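One way a build script could translate per-driver feature flags into CMake options is sketched below. Cargo exposes enabled features to `build.rs` as `CARGO_FEATURE_<NAME>` environment variables, and GDAL's CMake build has per-driver `GDAL_ENABLE_DRIVER_<NAME>` / `OGR_ENABLE_DRIVER_<NAME>` switches. The feature names in the table and the function `driver_options` are assumptions for illustration and are not taken from the PR's actual build script.

```rust
/// Hypothetical mapping from a cargo feature (as seen by build.rs through the
/// `CARGO_FEATURE_<NAME>` environment variables) to a GDAL CMake option.
const FEATURE_TO_OPTION: &[(&str, &str)] = &[
    ("CARGO_FEATURE_DRIVER_GPKG", "OGR_ENABLE_DRIVER_GPKG"),
    ("CARGO_FEATURE_DRIVER_SHAPE", "OGR_ENABLE_DRIVER_SHAPE"),
    ("CARGO_FEATURE_DRIVER_NETCDF", "GDAL_ENABLE_DRIVER_NETCDF"),
];

/// Returns (option, "ON" or "OFF") pairs. `is_set` abstracts the environment
/// lookup so the function can be exercised outside a build script.
pub fn driver_options(is_set: impl Fn(&str) -> bool) -> Vec<(&'static str, &'static str)> {
    FEATURE_TO_OPTION
        .iter()
        .map(|&(feature, option)| (option, if is_set(feature) { "ON" } else { "OFF" }))
        .collect()
}
```

Inside `build.rs` this might be called as `driver_options(|name| std::env::var_os(name).is_some())`, with each resulting pair passed on to `cmake::Config::define`.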

@weiznich weiznich marked this pull request as ready for review January 30, 2024 13:40
@weiznich
Contributor Author

I confirmed that this works on Linux (x86_64), Windows (msvc, x86_64) and macOS (aarch64).

@weiznich weiznich force-pushed the feature/bundled_build branch 3 times, most recently from e2de466 to 46d2093 Compare April 3, 2024 10:00
@weiznich
Contributor Author

weiznich commented Apr 5, 2024

It might sometimes not be possible to statically link both libpq and libnetcdf, because both include a vendored copy of strlcat. There is an upstream issue for this: Unidata/netcdf-c#927

@metasim
Contributor

metasim commented Apr 8, 2024

@lnicola Before this PR is merged, could we consider cutting a 0.17 release? I say we go with what we have. (I'm not going to have time to wrap up #508 before then. :/) It's been in the wild for a while, and it would be nice to close that chapter before a big change like this one.

Successfully merging this pull request may close these issues.

Provide a bundled feature
5 participants