Lutra roadmap #4177

Open
2 of 11 tasks
aljazerzen opened this issue Feb 6, 2024 · 5 comments

@aljazerzen
Member

aljazerzen commented Feb 6, 2024

What's up?

Now that the initial implementation in #4134 has merged, there are still a lot of things to implement for Lutra. Tasks in order of priority:

  • Python bindings, so the results of queries can be used with Pandas/Polars (underway in feat: lutra python bindings #4174; I need help with devops),
  • test the CLI,
  • reuse connections between executed queries (easy),
  • define a proper lutra error type (medium),
  • generate database module definition (hard, requires connector_arrow support; see feat(lutra): pull database schema into source #4182),
  • support for @lutra.duckdb (medium),
  • support for @lutra.postgres (medium),
  • CLI execute option to write results to a dir,
  • use a connection pool to make execution parallel (medium),
  • reuse connections between lutra invocations (requires daemon, hard),
  • support for multi-database queries.
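
To make the connection-pool item concrete, here is a minimal sketch of the idea, not Lutra's implementation. `Connection`, `Pool`, and `run_parallel` are hypothetical names, and the connection is a stand-in that only formats a string. Each query runs on its own thread but borrows one of a fixed number of connections, which are reused once they are handed back.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical stand-in for a real database connection.
struct Connection {
    id: usize,
}

impl Connection {
    /// Pretends to run a query and reports which connection served it.
    fn execute(&self, query: &str) -> String {
        format!("conn {} ran: {}", self.id, query)
    }
}

/// A tiny blocking pool: idle connections live in a Vec guarded by a Mutex.
struct Pool {
    idle: Mutex<Vec<Connection>>,
    available: Condvar,
}

impl Pool {
    fn new(size: usize) -> Self {
        let idle = (0..size).map(|id| Connection { id }).collect();
        Pool { idle: Mutex::new(idle), available: Condvar::new() }
    }

    /// Borrows a connection for the duration of `f`, then returns it to the pool.
    fn with_connection<T>(&self, f: impl FnOnce(&Connection) -> T) -> T {
        let mut idle = self.idle.lock().unwrap();
        // Block until another thread hands a connection back.
        while idle.is_empty() {
            idle = self.available.wait(idle).unwrap();
        }
        let conn = idle.pop().unwrap();
        drop(idle); // release the lock while the query runs

        let result = f(&conn);

        self.idle.lock().unwrap().push(conn);
        self.available.notify_one();
        result
    }
}

/// Runs every query on its own thread, sharing at most `pool_size` connections.
/// Results come back in the same order as the input queries.
fn run_parallel(queries: Vec<String>, pool_size: usize) -> Vec<String> {
    let pool = Arc::new(Pool::new(pool_size));
    let handles: Vec<_> = queries
        .into_iter()
        .map(|q| {
            let pool = Arc::clone(&pool);
            thread::spawn(move || pool.with_connection(|c| c.execute(&q)))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Calling `run_parallel` with three queries and a pool size of 2 serves all three over two connections. A real pool would also need to handle panics inside `f` and broken connections, which this sketch ignores.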
@aljazerzen
Member Author

aljazerzen commented Feb 6, 2024

Regarding WASM support: it is limited by upstream library support. Currently, none of the data source libraries used by connector_arrow compile for wasm32-unknown-unknown, but it seems like rusqlite is close.

When that is done, lutra still won't compile, as it needs access to a file system to discover the project. I've specifically made sure that the discover module is standalone and could be hidden behind a feature. In this configuration, we could make a JS library, lutra-wasm, that accepts an already-discovered project that is stored somewhere else in browser memory.
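
A rough sketch of that split, under assumptions: the `discover` feature name, `ProjectSources`, `from_memory`, and `list_files` are all hypothetical and do not reflect lutra's actual API. File-system discovery sits behind the feature gate, while the rest only ever sees sources it is handed.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory project: file path -> PRQL source text.
#[derive(Debug, Default, PartialEq)]
pub struct ProjectSources {
    pub files: BTreeMap<String, String>,
}

impl ProjectSources {
    /// Builds a project from sources the caller already holds,
    /// for example files kept in browser memory.
    pub fn from_memory<I, P, S>(files: I) -> Self
    where
        I: IntoIterator<Item = (P, S)>,
        P: Into<String>,
        S: Into<String>,
    {
        let files = files.into_iter().map(|(p, s)| (p.into(), s.into())).collect();
        ProjectSources { files }
    }
}

/// File-system discovery, compiled only when the (assumed) `discover`
/// feature is enabled, so a wasm build can leave it out entirely.
#[cfg(feature = "discover")]
pub fn discover(root: &std::path::Path) -> std::io::Result<ProjectSources> {
    let mut files = BTreeMap::new();
    for entry in std::fs::read_dir(root)? {
        let path = entry?.path();
        if path.extension().map_or(false, |ext| ext == "prql") {
            let source = std::fs::read_to_string(&path)?;
            files.insert(path.display().to_string(), source);
        }
    }
    Ok(ProjectSources { files })
}

/// Entry point that needs no file system: it only sees the sources it is given.
pub fn list_files(project: &ProjectSources) -> Vec<&str> {
    project.files.keys().map(String::as_str).collect()
}
```

With the feature disabled, a wasm wrapper would call `ProjectSources::from_memory` with whatever the host page collected and never touch `std::fs`.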

@max-sixty
Member

Overall, great!

> When that is done, lutra still won't compile as it needs access to a file system to discover the project. I've specifically made sure that discover module is standalone and could be hidden behind a feature. In this configuration, we could make a js library lutra-wasm that accepts already-discovered project that is stored somewhere else in browser memory.

Yes! Or a separate function could do the collection and pass it to lutra-wasm as a string...

@eitsupi
Member

eitsupi commented Feb 9, 2024

I'm not sure how lutra works, but am I correct in assuming that it automatically recognizes the schema of tables?
Do you have plans to experiment with behaviors that would not be possible without the schema like #3133?

@aljazerzen
Member Author

It does have this capability (see https://github.com/aljazerzen/connector_arrow/blob/main/connector_arrow/src/api.rs for what connector_arrow supports), and my approach to passing this information to prqlc is to generate type definitions in PRQL.

A naive approach would be for lutra to pass schema information to prqlc directly in some internal representation, but that would couple lutra and prqlc very tightly. Instead, I added a pull-schema command to lutra (I don't have a better link for examples; I need to add CLI tests), and prqlc can then work with type definitions directly.

TL;DR: lutra will allow pulling the schema into PRQL source, which will avoid some compiler problems and will allow us to say "in this case, the compiler may error out and say it needs more schema info".
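
To illustrate the decoupling, here is a small sketch of turning schema information into source text instead of sharing an internal representation. Everything in it is an assumption: `Column`, `Table`, and `render_schema` are hypothetical, and the emitted declaration syntax is illustrative only, not necessarily what pull-schema writes or what prqlc accepts.

```rust
/// Hypothetical, simplified view of what a connector reports about a column.
struct Column {
    name: String,
    ty: String,
}

/// Hypothetical, simplified view of a table and its columns.
struct Table {
    name: String,
    columns: Vec<Column>,
}

/// Renders tables as PRQL-like type declarations inside a module.
/// The emitted syntax is an illustration, not a confirmed format.
fn render_schema(module: &str, tables: &[Table]) -> String {
    let mut out = format!("module {} {{\n", module);
    for table in tables {
        let fields: Vec<String> = table
            .columns
            .iter()
            .map(|c| format!("{} = {}", c.name, c.ty))
            .collect();
        // One declaration per table: a relation whose rows have these fields.
        out.push_str(&format!("  let {} <[{{{}}}]>\n", table.name, fields.join(", ")));
    }
    out.push_str("}\n");
    out
}
```

Because the result is plain source text, the compiler only needs to understand type definitions in its own language and never depends on lutra's data structures.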

@eitsupi
Member

eitsupi commented Feb 9, 2024

I think that approach seems like the best choice, given the current PRQL behavior of working without a schema, and given things like compiling to Substrait, which would not be possible without schema information. Great job!

3 participants