Callbacks to Encoder/Decoder are not respected in datetime objects #669

Open
TheMythologist opened this issue Apr 19, 2024 · 2 comments

Comments

@TheMythologist

Description

Neither the dec_hook nor the enc_hook argument is respected by the encoders and decoders (tested with JSON and YAML) when datetime objects are involved. Note that the print calls in both hooks never run, and the variable buf contains an ISO 8601 duration string instead of the number that enc_hook would have returned.

Attached is a sample script to show that custom decoding of datetime.timedelta objects is not supported. It also doesn't work for datetime.datetime objects.

import msgspec
from typing import Any, Type
from datetime import timedelta


def enc_hook(obj: Any) -> Any:
    print("Encoding")
    if isinstance(obj, timedelta):
        # convert the timedelta to a number
        return obj.total_seconds()
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type(obj)} are not supported")


def dec_hook(type: Type, obj: Any) -> Any:
    print("Decoding", type)
    # `type` here is the value of the custom type annotation being decoded.
    if type is timedelta:
        # Convert ``obj`` (which should be a ``number``) to a timedelta
        return timedelta(seconds=obj)
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type} are not supported")


class MyMessage(msgspec.Struct):
    field_1: str
    field_2: timedelta


enc = msgspec.json.Encoder(enc_hook=enc_hook)
dec = msgspec.json.Decoder(MyMessage, dec_hook=dec_hook)

msg = MyMessage("some string", timedelta(seconds=5))

# Doesn't work for JSON decoder
buf = enc.encode(msg)
print(buf)
a = dec.decode(buf)
print(a)

# Doesn't work for YAML decoders either
buf = msgspec.yaml.encode(msg, enc_hook=enc_hook)
print(buf)
a = msgspec.yaml.decode(buf, type=MyMessage, dec_hook=dec_hook)
print(a)

TheMythologist commented Apr 19, 2024

Update: This was broken sometime between version 0.16.0 and version 0.17.0.

Update: It was this specific commit that broke the hook for datetime.timedelta objects: 2b72ebb

Update: It seems that hooks for datetime.datetime objects have never worked.


wikiped commented Apr 24, 2024

Under the hood, the .encode and .decode methods call the msgspec.to_builtins and msgspec.convert functions respectively.

Both functions take a builtin_types parameter, which stops msgspec from processing the specified builtin types. It does not, however, pass those types to the *_hook functions: only non-builtin types are passed to the hooks.

Whether this is a bug or by design, only @jcrist can tell (no pun intended :-)
But it definitely feels like a bug.

The above can be illustrated with:

import msgspec as ms
import datetime as dt
from typing import Any

def enc_hook(obj: Any) -> Any:
    print("Encoding")
    if isinstance(obj, T):
        return obj.name
    if isinstance(obj, dt.timedelta):
        # convert the timedelta to a number
        return obj.total_seconds()
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type(obj)} are not supported")


class T:

    def __init__(self, name='some name'):
        self.name = name


class MyMessage(ms.Struct):
    field_1: T
    field_2: dt.timedelta


msg = MyMessage(T(), dt.timedelta(seconds=5))

msg_encoded = ms.to_builtins(
        msg,
        builtin_types=(
                dt.timedelta,
        ),
        enc_hook=enc_hook
    )

print(msg_encoded)

The above outputs:

Encoding
{'field_1': 'some name', 'field_2': datetime.timedelta(seconds=5)}

I can see 2 ways to work around this behaviour until (if ever) it gets changed:

  1. Implement your own encode/decode functions, so that you control what happens to the dict produced by msgspec before it is sent to the encoders/decoders.
  2. Wrap the builtin type in a custom type, so that it is handled by the *_hook functions.
