Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[native] Fix protocol::FileFormat::ORC mapping to use dwio::common::FileFormat::ORC #22726

Merged
merged 1 commit into from May 27, 2024

Conversation

wypb
Copy link
Contributor

@wypb wypb commented May 13, 2024

Description

Iceberg's ORC format should be mapped to Velox's ORC format, not the DWRF format.

Because the open source Velox does not currently support reading ORC files (some adaptation is required), adding tests to read ORC will fail on the native side. So I didn't add the test to read ORC.

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@wypb wypb requested a review from a team as a code owner May 13, 2024 06:21
@wypb wypb force-pushed the fix_iceberg_orc branch 2 times, most recently from a5194dc to 4fa7aba Compare May 14, 2024 06:59
Copy link

linux-foundation-easycla bot commented May 14, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: wypb / name: wypb (96f637c)

@wypb wypb force-pushed the fix_iceberg_orc branch 2 times, most recently from 3202320 to 2df47c3 Compare May 14, 2024 07:08
@tdcmeehan tdcmeehan self-assigned this May 14, 2024
Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wypb The queries all failed, have you taken a look what caused it? Was it Iceberg related issue?

@wypb
Copy link
Contributor Author

wypb commented May 23, 2024

No, This is due to the fact that the open source Velox does not support reading ORC files(some adaptation is required: facebookincubator/velox#6588). I made an adaptation internally, and I need to remove this part of the code in this PR.

@majetideepak majetideepak changed the title [native] Fix file format mapping error for Iceberg Connector. [native] Fix protocol::FileFormat::ORC mapping to use dwio::common::FileFormat::ORC May 25, 2024
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wypb This is not limited to the Iceberg connector correct? This is a general protocol::FileFormat mapping issue.
Can you update the commit title and add a description?

@majetideepak
Copy link
Collaborator

I am also assuming the current mapping is a bug.

@wypb
Copy link
Contributor Author

wypb commented May 27, 2024

This is not limited to the Iceberg connector correct?

@majetideepak Currently, toVeloxFileFormat(const presto::protocol::FileFormat format) is only used by the Iceberg connector, but it is not ruled out that other connectors will also use it in the future.

Can you update the commit title and add a description?

I've updated the commit title and added a description, Thank you.

…ileFormat::ORC

Currently, Prestissimo's toVeloxFileFormat method maps the ORC file
format to Velox's DWRF file format, and an error will occur when
reading the Iceberg table in the ORC file format.
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @wypb

@majetideepak majetideepak merged commit 2625380 into prestodb:master May 27, 2024
59 checks passed
@wypb wypb deleted the fix_iceberg_orc branch May 27, 2024 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants