Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pydantic should default to v2, or hera should allow it to be configured #984

Open
2 of 7 tasks
terrykong opened this issue Feb 26, 2024 · 6 comments
Open
2 of 7 tasks

Comments

@terrykong
Copy link

Pre-bug-report checklist

1. This bug can be reproduced using pure Argo YAML

If yes, it is more likely to be an Argo bug unrelated to Hera. Please double check before submitting an issue to Hera.

2. This bug occurs in Hera when...

  • exporting to YAML
  • submitting to Argo
  • running on Argo with the Hera runner
  • other: when using hera's pydantic models

Bug report

Describe the bug
A clear and concise description of what the bug is:
When mixing in hera models into my own pydantic models, I get errors like this:

         return cls.model_validate(kwargs, strict=True)
E       TypeError: BaseModel.validate() takes 2 positional arguments but 3 were given

Because I'm using the default in pydantic (V2 models), but hera defaults to the v1 models:

I think this is a bug b/c _PYDANTIC_VERSION gets set to 2 even though the imports are v1.

Error log if applicable:

To Reproduce
Full Hera code to reproduce the bug:

python -c 'from hera.workflows.models import Volume; print(Volume.__mro__)'
(<class 'hera.workflows.models.io.k8s.api.core.v1.Volume'>, <class 'hera.shared._pydantic.BaseModel'>, <class 'pydantic.v1.main.BaseModel'>, <class 'pydantic.v1.utils.Representation'>, <class 'object'>)

Expected behavior
A clear and concise description of what you expected to happen:

I think this should default to import pydantic.v2 first, but given that both versions are supported according to comments in _pydantic.py, I think this could also be a feature request in that there should be some variable like HERA_PYDANTIC_API that defaults to v2, but lets others choose v1 if they're still stuck on that version.

Environment

  • Hera Version: 5.14.0
  • Python Version: 3.X.X
  • Argo Version: 3.X.X

Additional context
Add any other context about the problem here.

@elliotgunton
Copy link
Collaborator

Hi @terrykong, thanks for adding the issue. The original PR #795 to add V2 has some history.

The problem is specifically that Hera's codebase was written using Pydantic V1, including the generated Argo model classes. This would require a fairly large time investment to provide models for both versions, and increase the maintenance burden for the two codepaths throughout the codebase (@samj1912 explored this initially but decided against it). Long term we do plan to migrate the codebase to V2, but it should not affect end users at this time.

I'm curious why you would need to mix Hera models with your own? As Hera builds workflows into yaml I'm not sure what missing feature your own V2 "mixin" models would be providing?

Also, the script runner feature lets you use Pydantic V1 or V2 models within your script template functions.

@elliotgunton elliotgunton added the note:needs-info We need more info from the issue submitter label Mar 5, 2024
@terrykong
Copy link
Author

Hi @elliotgunton

So in my use case, I was authoring configuration that used pydantic's BaseModel, but tried to add in hera types as fields, and ran into this issue. So something like this:

from hera.workflows.models import Volume
from pydantic import BaseModel

class MyConfig(BaseModel):
  foobar: str
  volume: Volume

MyConfig.model_validate({"foobar":"a", "volume": {"host_path": {"path": "/mnt"}}})
# Gives
'''
File "/.../venv/lib/python3.10/site-packages/pydantic/main.py", line 509, in model_validate
    return cls.__pydantic_validator__.validate_python(
'''

class H(BaseModel):
  path: str

class V(BaseModel):
  host_path: H

class MyConfigWhichDoesWork(BaseModel):
  foobar:str
  volume: V

MyConfigWhichDoesWork.model_validate({"foobar":"a", "volume": {"host_path": {"path": "/mnt"}}})
# And this works

So basically I want to use pydantic with hera's types but I think the v1 v2 API mismatch is causing pydantic's cryptic error message.

@elliotgunton
Copy link
Collaborator

Thank you @terrykong! Using the Hera classes as part of your configs makes sense 馃憤 The point remains that it will take a significant effort to allow both v1 and v2 classes to be exported from the library, so I can't give an expected timeframe/exact details for the time being.

@elliotgunton elliotgunton removed the note:needs-info We need more info from the issue submitter label Mar 12, 2024
@dev-goyal
Copy link

Seeing a potentially related error

hera generate yaml --to generated/ -e __pycache__/* --recursive _hera/
warning: The `hera` CLI is a work-in-progress, subject to change at any time!
Traceback (most recent call last):
  File "/Users/devgoyal/.pyenv/versions/argo/bin/hera", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/hera/__main__.py", line 26, in main
    cappa.invoke(Hera, argv=argv)
  File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/cappa/base.py", line 141, in invoke
    with resolved.get(concrete_output) as value:
  File "/Users/devgoyal/.pyenv/versions/3.11.7/lib/python3.11/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/cappa/invoke.py", line 65, in get
    result = callable(**finalized_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/hera/_cli/generate/yaml.py", line 35, in generate_yaml
    for workflow in load_workflows_from_module(path):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/hera/_cli/generate/yaml.py", line 123, in load_workflows_from_module
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/Users/devgoyal/Development/argo/ml-features/dbt-features/_hera/dating_outcomes/rating_matches_daily_cron.py", line 84, in <module>
    "success": LifecycleHook(
               ^^^^^^^^^^^^^^
  File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for LifecycleHook
arguments
  value is not a valid dict (type=type_error.dict)
make: *** [Makefile:11: hera-generate] Error 1

From this code

        Step(
            name="post-processing-step",
            template_ref=matches_etl_ref,
            arguments=[
                Parameter(name="actions", value=["matches"]),
            ],
            hooks={
                "success": LifecycleHook(
                    arguments=[Parameter(name="metro", value="matches_etl")],
                    expression="steps['post-process-step']=='Succeeded'",
                    template_ref=metrics_ref,
                )
            },
        )

@elliotgunton
Copy link
Collaborator

Hey @dev-goyal - that error is unrelated as it's because LifecycleHook isn't yet a special Hera class. The error

arguments
  value is not a valid dict (type=type_error.dict)

is because you need to use from hera.workflows.models import Arguments to rewrite it as

    ...
    arguments=Arguments(
        parameters=[
            Parameter(name="metro", value="matches_etl"),
        ],
    ),
    ...

or as a dict without the import

    ...
    arguments={
        "parameters": [
            Parameter(name="metro", value="matches_etl"),
        ],
    },
    ...

@dev-goyal
Copy link

Hey @dev-goyal - that error is unrelated as it's because LifecycleHook isn't yet a special Hera class. The error

arguments
  value is not a valid dict (type=type_error.dict)

is because you need to use from hera.workflows.models import Arguments to rewrite it as

    ...
    arguments=Arguments(
        parameters=[
            Parameter(name="metro", value="matches_etl"),
        ],
    ),
    ...

or as a dict without the import

    ...
    arguments={
        "parameters": [
            Parameter(name="metro", value="matches_etl"),
        ],
    },
    ...

Sorry for the noise, that is correct!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants