[Enhancement] Parquet Support #1705

Open
hokieg3n1us opened this issue Apr 8, 2020 · 1 comment

Comments

@hokieg3n1us

Expand data types supported for ingest into GeoWave to include Apache Parquet.

Create SparkParquetIngestDriver to ingest Parquet from an S3 bucket or HDFS directory. It should be configurable to create geometry either from a single column containing WKT or WKB, or from separate longitude and latitude columns for point data.
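
To make the geometry configuration on the ingest side concrete, here is a rough sketch of how a row could be resolved to a point under each option. It is plain Python rather than GeoWave or Spark code, and every name in it (`point_from_row`, `geometry_mode`, `geometry_column`, `lon_column`, `lat_column`) is a hypothetical placeholder, not an existing GeoWave option. It only handles points; a real driver would presumably hand WKT/WKB to a geometry library.

```python
import struct


def point_from_row(row, config):
    """Return (lon, lat) for one row, based on hypothetical config keys."""
    mode = config["geometry_mode"]  # assumed values: "lonlat", "wkt", "wkb"

    if mode == "lonlat":
        # Geometry is spread over two numeric columns.
        return float(row[config["lon_column"]]), float(row[config["lat_column"]])

    if mode == "wkt":
        # Geometry is a single text column such as "POINT (1.5 2)".
        text = row[config["geometry_column"]].strip()
        if not text.upper().startswith("POINT"):
            raise ValueError("this sketch only handles WKT points")
        inner = text[text.index("(") + 1 : text.rindex(")")]
        x, y = inner.split()
        return float(x), float(y)

    if mode == "wkb":
        # Geometry is a single binary column holding a 2D WKB point:
        # 1 byte for byte order, a uint32 geometry type (1 = Point), two doubles.
        data = row[config["geometry_column"]]
        order = "<" if data[0] == 1 else ">"
        (geom_type,) = struct.unpack(order + "I", data[1:5])
        if geom_type != 1:
            raise ValueError("this sketch only handles WKB points")
        return struct.unpack(order + "dd", data[5:21])

    raise ValueError(f"unknown geometry_mode: {mode}")
```

For example, `point_from_row({"lon": "1.5", "lat": 2}, {"geometry_mode": "lonlat", "lon_column": "lon", "lat_column": "lat"})` resolves the point from the two coordinate columns.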

Create SparkParquetExportDriver to export data from GeoWave to an S3 bucket or HDFS directory in Parquet format. It should be configurable to write geometry either as a single column containing WKT or WKB, or as separate longitude and latitude columns for point data.
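
For the export side, a similarly rough sketch of the three output layouts for a single point. Again this is plain Python, not GeoWave code, and the function name, the `mode` values, and the output column names (`geometry`, `longitude`, `latitude`) are assumptions made up for illustration.

```python
import struct


def point_columns(lon, lat, mode):
    """Return the output column(s) for one point under a hypothetical mode."""
    if mode == "lonlat":
        # Two numeric columns.
        return {"longitude": lon, "latitude": lat}
    if mode == "wkt":
        # One text column.
        return {"geometry": f"POINT ({lon} {lat})"}
    if mode == "wkb":
        # One binary column: little-endian 2D WKB point
        # (byte order 1, geometry type 1 = Point, then x and y as doubles).
        return {"geometry": struct.pack("<BIdd", 1, 1, lon, lat)}
    raise ValueError(f"unknown mode: {mode}")
```

Non-point geometries cannot be expressed in the longitude/latitude layout, so that option would only apply to point data, as the description says.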

@michaeljfazio

I've written a plugin for AWS Glue MetaStore which allows you to ingest data described in a Glue metastore:

e.g.

geowave ingest localToGw -f glue --glue.database geospatial --glue.table gdelt s3://location gdelt index1,index2

I only support ingesting Parquet at the moment. It works but needs some cleanup. I'd be willing to contribute the source to the project if I can get sign-off from my employer.
