Respect spark.catalog.currentDatabase instead of hardcoded "default" #27

NJordan72 · 2018-04-13T22:07:00Z

In the event that a database isn't defined the behavior falls back to using a hardcoded "default" database. Instead we should respect spark.catalog.currentDatabase.

I have a simple pull request prepared if folks are interested.

rdblue · 2018-04-18T01:31:50Z

I'm open to changing this, but I don't think that spark.catalog.currentDatabase is necessarily the right choice. That is the current database for the global catalog, but there's no guarantee that Iceberg tables use that catalog. Should Iceberg have a way to set the default database instead? Or maybe this should be handled at a higher level and database should always be required.

NJordan72 · 2018-04-19T16:33:04Z

I don't have strong opinions here. To add a little more information here I ran into an issue as part of the Convert Table to Iceberg example. As part of SparkSchemaUtil and the Hive hack.

Both use "default" if no database is passed in. When passing in the database I was getting errors around unsupported characters in the database (even with back-ticks), likely because we're using AWSGlueDataCatalog as the store we're trying to convert. I'm not sure how EMR typically solves this, but by having those objects use spark.catalog.currentDatabase instead of default it "just worked."

The net net is that this may not be important generally as the problem is probably both related to Glue metastores and EMR.

rdblue · 2018-04-20T18:22:22Z

So it sounds like the problem was that those classes parse the table name instead of passing db and table separately? We'd definitely accept a PR that separated those options out and moved the single string parsing to a separate method. Supporting backticks would be great, too, but we generally expect to rely on Spark or a higher layer to provide that.

…lementation. (#27)

…lementation. (Netflix#27)

rdblue pushed a commit that referenced this issue Dec 18, 2018

Return FileAppender from Avro.WriteBuilder, not a package-private imp…

33776a8

…lementation. (#27)

Parth-Brahmbhatt pushed a commit to Parth-Brahmbhatt/iceberg that referenced this issue Apr 12, 2019

Return FileAppender from Avro.WriteBuilder, not a package-private imp…

93f8f18

…lementation. (Netflix#27)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Respect spark.catalog.currentDatabase instead of hardcoded "default" #27

Respect spark.catalog.currentDatabase instead of hardcoded "default" #27

NJordan72 commented Apr 13, 2018

rdblue commented Apr 18, 2018

NJordan72 commented Apr 19, 2018

rdblue commented Apr 20, 2018

Respect spark.catalog.currentDatabase instead of hardcoded "default" #27

Respect spark.catalog.currentDatabase instead of hardcoded "default" #27

Comments

NJordan72 commented Apr 13, 2018

rdblue commented Apr 18, 2018

NJordan72 commented Apr 19, 2018

rdblue commented Apr 20, 2018