Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable Validators #62

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft

Conversation

elijahbenizzy
Copy link
Collaborator

This allows us to configure validation on the per-node level. Done at config-time, this sets us up to determine how to add a validator from passed in configuration.

[Short description explaining the high-level reason for the pull request]

Changes

How I tested this

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

@elijahbenizzy
Copy link
Collaborator Author

Thoughts:

  1. Do we want to enable a per-data-quality check? Currently this is per-node. I like it this way as the API is simpler, but it could get annoying if just one check is breaking.
  2. What if there's a node named 'global'? (it shouldn't be a function, and can't be a kwarg...). So I'm ok with there not being one)
  3. Need to add more tests

Re: (1) I think we can add an extra field later for this (or add a subfield for check name) but otherwise I'm pretty happy with this. The tricky part is that checks don't have identifiable names...

@@ -137,7 +137,7 @@ it executes on every column that's extracted.
## Handling the results

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file needs to be merged into the main docs...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same with basics.md)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep was waiting for the docs rewrite -- was planning to chop this commit then copy/paste.

This allows us to configure validation on the per-node level.
Done at config-time, this sets us up to determine how to add a validator
from passed in configuration.
@elijahbenizzy elijahbenizzy force-pushed the configure-data-quality branch 2 times, most recently from 4f4b647 to 64fecf2 Compare February 28, 2023 01:24
This works both globally and locally. Currently its only at the node
level, but it shouldn't be to hard to disable specific validators/set
warnings on them.
Node-specific should go *before* global overrides. This fixes it.
@elijahbenizzy
Copy link
Collaborator Author

Ok, I'm going to think through this API this weekend -- feel more confident about the others so i might hold off on this for a bit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants