WIP: Dataset Sidecar / Config prototype #1376
Draft: matbryan52 wants to merge 89 commits into LiberTEM:master from matbryan52:dataset_sidecar
Adds a prototype configuration file format / parser / validator for LiberTEM objects. Responds to the needs in #768 (and others referenced in that issue).

The functionality is split into two (loosely coupled) parts:

- `tree.py`, which handles parsing configuration files/strings (in TOML, JSON, etc.) into a tree structure, and implements some features like searching for keys in the tree and extracting sub-trees.
- `models.py`, which implements a set of Pydantic models for input data validation / casting (https://docs.pydantic.dev/).

There is a document describing the features in more detail in `schema_description.md`.

The advantage of Pydantic is that it's quite easy to express complex input validation and default values: it uses Python typing to coerce and validate input data, with the added ability to specify extra / complex validators as a set of classmethods on the model. Originally I tried to do this with `jsonschema`, but built up a lot of complexity in the schema definitions and the custom validation functions. Another nice feature is that the data is inserted into a `TypedDict`-like structure with easy dot-access for consumers. A further benefit is that it can also give type hints to an IDE under certain circumstances.

The data models are sub-classable to make more specific schemas, for example to express the fact that `RawDataSet` requires `nav_shape` and `sig_shape`, and additionally requires the `dtype` argument.

None of the functionality is integrated with any of the LiberTEM classes yet, as we would have to modify some signatures (e.g. make `RawDataSet` accept a single `path` argument without `dtype`, `nav_shape`, `sig_shape`, etc., so that these can be filled from the config information). An example of how this could be done is in `example.py`. Eventually we could use the data model itself to handle most input argument validation for a given class, even when not using a config file.
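To make the description above more concrete, here is a minimal sketch of how the two parts could fit together. It is not the code from this PR: the helper `find_subtree`, the model names `DataSetConfig` and `RawDataSetConfig`, and the config contents are all hypothetical and chosen only for illustration. JSON is used so that the sketch needs nothing beyond the standard library and Pydantic; a TOML string parsed with `tomllib.loads` (Python 3.11+) would give the same kind of nested dictionary.

```python
import json
from typing import Any, Optional, Tuple

from pydantic import BaseModel, ValidationError


def find_subtree(tree: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first value stored under `key`.

    Hypothetical stand-in for the key search / sub-tree extraction
    described for tree.py; only nested dicts are traversed.
    """
    if isinstance(tree, dict):
        if key in tree:
            return tree[key]
        for value in tree.values():
            found = find_subtree(value, key)
            if found is not None:
                return found
    return None


class DataSetConfig(BaseModel):
    # Generic schema (assumed): only the path is mandatory.
    path: str
    nav_shape: Optional[Tuple[int, ...]] = None
    sig_shape: Optional[Tuple[int, ...]] = None


class RawDataSetConfig(DataSetConfig):
    # More specific schema: the shapes become required, and dtype is added.
    nav_shape: Tuple[int, ...]
    sig_shape: Tuple[int, ...]
    dtype: str


CONFIG = """
{
  "experiment": {
    "my_raw_data": {
      "path": "./data.raw",
      "nav_shape": [16, 16],
      "sig_shape": [128, 128],
      "dtype": "float32"
    }
  }
}
"""

tree = json.loads(CONFIG)
section = find_subtree(tree, "my_raw_data")

# Pydantic coerces the JSON lists into tuples of int and exposes
# the validated values with dot-access.
config = RawDataSetConfig(**section)
print(config.nav_shape, config.dtype)

# Leaving out the required dtype field is rejected by the subclass.
try:
    RawDataSetConfig(path="./data.raw", nav_shape=[16, 16], sig_shape=[128, 128])
    missing_dtype_rejected = False
except ValidationError:
    missing_dtype_rejected = True
```

In a design like this the subclass only restates the fields whose requirements change, so the generic model keeps the shared defaults and each dataset type carries its own constraints.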
/azp run libertem.libertem-data
passed