-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/check model columns have meta keys #216
base: main
Are you sure you want to change the base?
Feature/check model columns have meta keys #216
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #216 +/- ##
==========================================
+ Coverage 96.95% 96.96% +0.01%
==========================================
Files 56 59 +3
Lines 2624 2799 +175
Branches 350 381 +31
==========================================
+ Hits 2544 2714 +170
- Misses 59 64 +5
Partials 21 21 ☔ View full report in Codecov by Sentry. |
Hey @BAntonellini I made this pr for a usecase we are facing regarding our data catalog. Are you the person I should ask to have a look at it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice contribution @roydobbe, we're sure it'd be a great addition to Checkpoint hooks. Let's discuss some details before merging.
else True | ||
) | ||
} | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which case the union from above will result in something other than Model
or ModelSchema
?
} | ||
else: | ||
continue | ||
seen = missing.get(model_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If L116
logic persists, couldn't model_name
be referenced before declaration? (undefined)
missing[model_name] = seen.union(missing_cols) | ||
elif missing_cols: | ||
missing[model_name] = missing_cols | ||
for model, columns in missing.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many details from L126 to L142:
- Couldn't results be calculated above, while traversing Model and ModelSchemas?
- Result string can surely be better handled (a multi-line template, f-strings, some auxiliary in
utils.py
, etc.) - Aren't we open to many KeyErrors? Take into account that dict's
get
method is almost always a better approach thanif key in dict
followed by direct[key]
access.
missing: Dict[str, Any] = {} | ||
meta_set = set(meta_keys) | ||
|
||
for item in itertools.chain(models, schemas): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see many DRY candidates from here until L113
Hey @BAntonellini I made some adjustments. Could you have a look? |
For data catalog purposes, we would like to have an option to check the columns for specific meta keys.
This function combines check_model_columns_have_desc and check_model_has_meta_keys