-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow materializer targets to specify inputs #713
Labels
Comments
Yep that's why I opened up this issue, cause I think it should work! It's a nice way to use materializers and track them. To avoid pains/encourage early failures we can add a validate_inputs param and default it to true. |
skrawcz
pushed a commit
that referenced
this issue
May 21, 2024
Following #877 Added the ability to add materializer nodes to the `FunctionGraph` at `Driver` build time. Use cases: - Easier to view materializers as part of `.display_all_functions()` - Allows to call materializers by name with `.execute()`. Would allow some users to completely ignore `.materialize()` - Possible to combine "static" materializers at build time and "dynamic" materializers at execution time - Catch invalid dataflows earlier than `dr.materialize()` ## Changes - Builder now accepts `.with_materializers(*materializers)` - `Builder.build()` adds materializer nodes after creating the `Driver` and returns the updated Driver. The logic is derived from `Driver.materialize()` - `Builder.copy()` copies the materializers ## How I tested this - Added tests to check materializer nodes (savers and loaders) are properly added - test that "static" and "dynamic" materializers can be used together ## Notes - each time materializers are added `post_graph_construct` hooks are triggered - Should be include materializers in the `version` hashing of the dataflow? should we treat "static" and "dynamic" differently. IMO, we might want two create two versions: one for "the dataflow transform, ignoring materializers" and one for "all nodes" since they answer different equality / diffing questions - there's duplicate logic between `Builder.build()` and `Driver.materialize()`. A better approach could be to have `Driver.add_materializers()` and `Driver.execute_materializers()`. Both `Builder.build()` and `Driver.materialize()` could call `Driver.add_materializers()` - Related to #713, users need to be aware that materializers need a valid `target` or `dependencies` (type and name) to connect to. Squashed commits of: * added Builder.with_materializers() * moved logic to Driver __init__ * added type annotation * made materializers an internal attribute * updated docs; pre-commit hook; typing bugfix --------- Co-authored-by: zilto <tjean@DESKTOP-V6JDCS2>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Is your feature request related to a problem? Please describe.
I want to be able to use materializers to create simple ETLs of just moving data from one place to another.
Describe the solution you'd like
A little odd but a good use-case. Specifically, I want something like this:
This should form a two-node DAG.
Describe alternatives you've considered
Just writing a function to do this, or doing it outside of Hamilton.
The text was updated successfully, but these errors were encountered: