JSON Schema for pypyr YAML files #229
Comments
Great idea @vlcinsky! 🎉 Is your idea to create a model file for Pydantic somewhere? Sadly there isn't a one-stop "valid" pipeline schema... but the valid keys & structure are fully documented.
Also keep in mind that YAML (and also pypyr!) supports YAML references/anchors; I don't know how or whether this works with a schema.
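One observation on the anchors question (my own sketch, not from the thread): PyYAML resolves anchors, aliases and merge keys at load time, so a JSON Schema validator only ever sees the already-expanded data structure. A minimal illustration:

```python
# Sketch: YAML anchors/aliases are expanded by the loader, so schema
# validation on the loaded data never encounters them.
import yaml

doc = """
defaults: &defaults
  run: true
  swallow: false
steps:
  - name: pypyr.steps.echo
    <<: *defaults
"""

data = yaml.safe_load(doc)
# the merge key has been resolved into plain dict keys
print(data["steps"][0])
```

So anchors should be transparent to schema validation, provided validation runs on the loaded data rather than on the raw YAML text.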
@yaythomas thanks for the hints. Last weekend I worked on it a bit. It is probably doable, but you are right, it is an extensive task and there is a risk that it would not serve as well as it could. For that reason I am putting this task on hold on my side. My current attempt (in the form of a pytest test file) looks as follows:

```python
from typing import List, Mapping, Optional, Union, Any

import jsonschema
import pytest
import yaml
from pydantic import BaseModel, Field, constr

# custom "types" expressed as constrained strings
pymodule = constr(regex=r"^[a-z][a-z]*(\.[a-z][a-z0-9]*)*$")
substring = constr(regex=r".*{.+}.*")  # expect at least one pair of {}
pystring = constr(regex=r"^!py .+")  # line starting with '!py '


class Step(BaseModel):
    name: str
    description: Optional[str]
    comment: Optional[str]
    incontext: Mapping = Field(alias="in")
    run: bool
    skip: bool
    swallow: bool
    foreach: Optional[Union[List, substring, pystring]]
    onError: Optional[Any]


class MainModel(BaseModel):
    """This is the description of the main model"""

    context_parser: pymodule
    steps: List[Union[pymodule, Step]]
    on_success: Optional[List[Union[pymodule, Step]]]
    on_failure: Optional[List[Union[pymodule, Step]]]

    class Config:
        title = "Main"


@pytest.fixture
def schema():
    return MainModel.schema()


@pytest.fixture
def data():
    fname = "tests/data/pipelinename.yaml"
    with open(fname, "r", encoding="utf-8") as f:
        data = yaml.safe_load(f)
    return data


def test_schema(schema, data):
    print(schema)
    jsonschema.validate(schema=schema, instance=data)
```

and the schema looks as follows:

```json
{
  "title": "Main",
  "description": "This is the description of the main model",
  "type": "object",
  "properties": {
    "context_parser": {
      "title": "Context Parser",
      "pattern": "^[a-z][a-z]*(\\.[a-z][a-z0-9]*)*$",
      "type": "string"
    },
    "steps": {
      "title": "Steps",
      "type": "array",
      "items": {
        "anyOf": [
          {
            "type": "string",
            "pattern": "^[a-z][a-z]*(\\.[a-z][a-z0-9]*)*$"
          },
          {
            "$ref": "#/definitions/Step"
          }
        ]
      }
    },
    "on_success": {
      "title": "On Success",
      "type": "array",
      "items": {
        "anyOf": [
          {
            "type": "string",
            "pattern": "^[a-z][a-z]*(\\.[a-z][a-z0-9]*)*$"
          },
          {
            "$ref": "#/definitions/Step"
          }
        ]
      }
    },
    "on_failure": {
      "title": "On Failure",
      "type": "array",
      "items": {
        "anyOf": [
          {
            "type": "string",
            "pattern": "^[a-z][a-z]*(\\.[a-z][a-z0-9]*)*$"
          },
          {
            "$ref": "#/definitions/Step"
          }
        ]
      }
    }
  },
  "required": [
    "context_parser",
    "steps"
  ],
  "definitions": {
    "Step": {
      "title": "Step",
      "type": "object",
      "properties": {
        "name": {
          "title": "Name",
          "type": "string"
        },
        "description": {
          "title": "Description",
          "type": "string"
        },
        "comment": {
          "title": "Comment",
          "type": "string"
        },
        "in": {
          "title": "In",
          "type": "object"
        },
        "run": {
          "title": "Run",
          "type": "boolean"
        },
        "skip": {
          "title": "Skip",
          "type": "boolean"
        },
        "swallow": {
          "title": "Swallow",
          "type": "boolean"
        },
        "foreach": {
          "title": "Foreach",
          "anyOf": [
            {
              "type": "array",
              "items": {}
            },
            {
              "type": "string",
              "pattern": ".*{.+}.*"
            },
            {
              "type": "string",
              "pattern": "^!py .+"
            }
          ]
        },
        "onError": {
          "title": "Onerror"
        }
      },
      "required": [
        "name",
        "in",
        "run",
        "skip",
        "swallow"
      ]
    }
  }
}
```

Keep in mind this is WIP; it is definitely not complete.
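As a quick illustration of what such a schema buys you (my own sketch, using a trimmed-down stand-in schema so the example is self-contained, not the full generated one above): jsonschema will reject a pipeline that is missing required top-level keys before it ever runs.

```python
# Sketch: validating a hand-written pipeline dict against a schema shaped
# like the generated one above (trimmed down here for brevity).
import jsonschema

schema = {
    "type": "object",
    "required": ["context_parser", "steps"],
    "properties": {
        "context_parser": {
            "type": "string",
            "pattern": r"^[a-z][a-z]*(\.[a-z][a-z0-9]*)*$",
        },
        "steps": {
            "type": "array",
            "items": {"anyOf": [{"type": "string"}, {"type": "object"}]},
        },
    },
}

good = {"context_parser": "pypyr.parser.keyvaluepairs", "steps": ["pypyr.steps.echo"]}
bad = {"context_parser": "pypyr.parser.keyvaluepairs"}  # missing "steps"

jsonschema.validate(schema=schema, instance=good)  # passes silently
try:
    jsonschema.validate(schema=schema, instance=bad)
except jsonschema.ValidationError as err:
    print(err.message)  # 'steps' is a required property
```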
Thanks so much @vlcinsky, this looks great! The regex-ing to make a custom "type" for a py module and pypyr-style strings is a nice touch.

Yes, I agree with you, this is very much an extensive task. To illustrate this even more, we've not even talked about individual step inputs themselves yet... because from the experience/feedback I've received in real-world usage, the place where most of the mistakes happen in yaml-authoring is actually more in the inputs to the individual steps.

So what you've done already is a great start - if nothing else, I'm sure at some point it will be useful to someone at least to have the broad outline of the schema for the over-all structure like you've done here, even if the individual built-in steps are out of scope.

As useful as a schema like this will be for those who like a more "full" IDE experience... there is actually another purpose too: to do validation of pipelines before/without actually running them, per ref #116. I've not fully (haha, or at all, really) thought through how this should work best, but you've definitely given me food for thought here that we might be able to serve both objectives at the same time. Thank you again!
@yaythomas you are welcome. I am astonished by pypyr. So far I see two goals:

- [authoring]: live validation of a pipeline file while writing it in an editor/IDE
- [validation]: checking a pipeline before/without actually running it (per #116)

I would say that [authoring] could be feasible. [validation] could turn out to be very difficult or impossible; if we aim for it, we would need custom validators, which would not be reflected in the JSON schema. Handy links:
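To illustrate the custom-validator point (my own sketch, not pypyr code): pydantic can enforce rules that JSON Schema cannot express, but such logic runs only in Python and never appears in the exported schema.

```python
# Sketch (pydantic v1-style API): a validator enforcing a rule that the
# exported JSON Schema knows nothing about.
from typing import List, Optional

from pydantic import BaseModel, ValidationError, validator


class Pipeline(BaseModel):
    steps: List[str]
    on_failure: Optional[List[str]] = None

    @validator("steps")
    def steps_not_empty(cls, v):
        if not v:
            raise ValueError("steps must contain at least one step")
        return v


try:
    Pipeline(steps=[])
except ValidationError as err:
    print("rejected:", err.errors()[0]["msg"])

schema = Pipeline.schema()
# the validator leaves no trace (e.g. no "minItems") in the emitted schema
print(schema["properties"]["steps"])
```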
the links are very handy indeed, thank you!
Having a JSON schema for pypyr YAML configuration files would be handy. Many editors (at least my neovim) can be configured to validate the authored file at edit time if a JSON schema is available.
My question is: is there any JSON schema for pypyr already in place? My quick research did not reveal any.
If it does not exist, I am considering authoring one (using pydantic makes it very simple to create such a schema).
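The pydantic route can be sketched roughly like this (my own minimal stand-in model, pydantic v1-style API; not pypyr's real structure): define the model, export its schema, and write it to a JSON file that an editor's YAML language server can be pointed at.

```python
# Sketch: generating a JSON Schema file from a pydantic model. The model
# here is a hypothetical stand-in, not pypyr's actual pipeline structure.
import json
from typing import List, Optional

from pydantic import BaseModel


class Pipeline(BaseModel):
    context_parser: Optional[str] = None
    steps: List[str]


schema = Pipeline.schema()
print(json.dumps(schema, indent=2))

# write to disk so an editor/language server can reference it
with open("pypyr-schema.json", "w", encoding="utf-8") as f:
    json.dump(schema, f, indent=2)
```

Editors that use yaml-language-server can then map `*.yaml` pipeline files to this schema file for on-the-fly validation and completion.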