[INTEGRATION] Add support for Trino #164

Open
julienledem opened this issue Aug 11, 2021 · 15 comments

@julienledem
Member

No description provided.

@julienledem julienledem created this issue from a note in OpenLineage Roadmap 🚀 (Future) Aug 11, 2021
@harels
Member

harels commented Feb 2, 2023

Trino Airflow extractor introduced here #1288

@harels harels closed this as completed Feb 2, 2023
@mobuchowski
Member

There's interest in having standalone Trino integration, so I'm reopening the issue.

@mobuchowski mobuchowski reopened this Mar 8, 2024
@damian3031

Does adding support for Trino also mean that dbt-trino adapter will be supported in openlineage-dbt?

@JDarDagran
Contributor

The level of integration with OL-Trino depends on your needs. Here's a breakdown with an example:

Current Capabilities in Airflow

  • Basic Functionality: You can parse SQL statements, identify inputs and outputs.
  • Limited Metadata: Detailed information about input and output datasets can currently be retrieved only at the catalog level, which means the only way to get it is to query for it with SQL statements.

Potential Improvements:

Trino Plugin:

  • A future Trino plugin could be developed to retrieve more comprehensive information, such as the actual connection underlying each catalog.
  • dbt-trino Integration: If this plugin could be leveraged by the dbt-trino adapter, it would potentially enable similar support within dbt-trino.

Current dbt-trino Support:

  • While dbt-trino isn't directly supported at this moment, achieving the same level of information retrieval we have in Airflow might not require significant additional effort.
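
To make the "Basic Functionality" point above concrete, here is a minimal sketch of identifying inputs and outputs from a SQL statement and qualifying them with Trino's `catalog.schema.table` naming. It is a toy regex-based illustration, not the parser OpenLineage actually uses; the names `extract_lineage`, `qualify`, and `Lineage`, as well as the sample catalogs and tables, are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical illustration only: a naive regex scan, NOT the OpenLineage SQL parser.
# It handles simple INSERT INTO / CREATE TABLE ... SELECT ... FROM ... JOIN statements
# and will miss CTEs, subqueries, quoted identifiers, and many other SQL constructs.
_TABLE = r"([A-Za-z_]\w*(?:\.[A-Za-z_]\w*){0,2})"
_OUTPUT_RE = re.compile(
    r"\b(?:INSERT\s+INTO|CREATE\s+TABLE(?:\s+IF\s+NOT\s+EXISTS)?)\s+" + _TABLE,
    re.IGNORECASE,
)
_INPUT_RE = re.compile(r"\b(?:FROM|JOIN)\s+" + _TABLE, re.IGNORECASE)


@dataclass
class Lineage:
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def qualify(name: str, default_catalog: str, default_schema: str) -> str:
    """Expand a 1-, 2-, or 3-part table name to catalog.schema.table."""
    parts = name.split(".")
    if len(parts) == 1:
        parts = [default_catalog, default_schema, parts[0]]
    elif len(parts) == 2:
        parts = [default_catalog] + parts
    return ".".join(parts)


def extract_lineage(sql: str, default_catalog: str, default_schema: str) -> Lineage:
    """Collect fully qualified input and output table names from one statement."""
    outputs = {qualify(t, default_catalog, default_schema) for t in _OUTPUT_RE.findall(sql)}
    inputs = {qualify(t, default_catalog, default_schema) for t in _INPUT_RE.findall(sql)}
    return Lineage(inputs=sorted(inputs), outputs=sorted(outputs))


if __name__ == "__main__":
    sql = (
        "INSERT INTO hive.sales.daily_totals "
        "SELECT o.day, sum(o.amount) FROM orders o "
        "JOIN postgres.public.customers c ON o.customer_id = c.id "
        "GROUP BY o.day"
    )
    print(extract_lineage(sql, default_catalog="hive", default_schema="default"))
```

Note that the result stops at catalog-qualified names: nothing in the SQL text reveals which physical system sits behind the `hive` or `postgres` catalog, which is the gap a Trino-side plugin could fill.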

@mgorsk1

mgorsk1 commented Mar 20, 2024

Hey @JDarDagran @mobuchowski, we have actually just started looking at this as well. Maybe an idea would be to pull this plugin into the OL integrations: https://github.com/takezoe/trino-openlineage? I've just revived it and tested it with the latest deps in this PR, takezoe/trino-openlineage#1, and it looks like a quite functional base.

@kacpermuda
Contributor

kacpermuda commented Mar 21, 2024

Just chipping in here, but wouldn't it be best to have the OL integration where all other Trino plugins live, here: https://github.com/trinodb/trino/tree/master/plugin? I think that's the first place where users would look for it, and we could avoid potential compatibility problems like the ones we had with Airflow when OL was only an external package.

@JDarDagran
Contributor

We can of course start by having the integration within the OL repo and later encourage the Trino maintainers to add the code to their codebase. Proving that the integration works and is being used will surely help to convince them.

@takezoe

takezoe commented Mar 23, 2024

There was a similar discussion before in #1288. Sorry, I didn't have enough time to work on contributing trino-openlineage at that time. 🙇‍♂️

So what is the best direction now? Can I just send a pull request under https://github.com/OpenLineage/OpenLineage/tree/main/integration so that we will be able to improve it continuously in this project?

@mobuchowski
Member

I agree that the best direction would be to contribute the plugin directly to the Trino plugin directory: https://github.com/trinodb/trino/tree/master/plugin

@mgorsk1

mgorsk1 commented Mar 25, 2024

+1, so the direction is clear. @takezoe if you are short on time we can cooperate to make this happen.

@alprusty

We would be happy to collaborate as well, as we have done some POC and reference implementations, as mentioned in https://openlineage.slack.com/archives/C065PQ4TL8K/p1709843222415989

@mgorsk1

mgorsk1 commented Mar 26, 2024

Perfect, I propose moving the development collaboration efforts to trinodb/trino#21265

@takezoe

takezoe commented Mar 26, 2024

Oh, you already created a PR. Thank you. Let's see how the Trino community reacts.

@mgorsk1

mgorsk1 commented Mar 27, 2024

@alprusty can you share your thoughts & findings from the POC?

@alprusty

alprusty commented Apr 1, 2024

> @alprusty can you share your thoughts & findings from the POC?

@mgorsk1 I have added a few review comments on the PR (trinodb/trino#21265).
I would be happy to collaborate on the PR with enhancements.

9 participants