Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement #[serde(other)] on enum variant #1382

Merged
merged 5 commits into from Sep 12, 2018

Conversation

roblabla
Copy link
Contributor

@roblabla roblabla commented Sep 10, 2018

Part of #912.

It might require more complex testing WRT non-unit other fields?

Untagged enums are not supported. I'm not quite sure the checks are correct, so if someone could make sure they make sense (they might be a bit too lax).

@roblabla roblabla changed the title Implement #serde(other) on enum variant Implement #[serde(other)] on enum variant Sep 10, 2018
@@ -1850,6 +1866,10 @@ fn deserialize_generated_identifier(
let ignore_variant = quote!(__other(_serde::private::de::Content<'de>),);
let fallthrough = quote!(_serde::export::Ok(__Field::__other(__value)));
(Some(ignore_variant), Some(fallthrough))
} else if other_idx.is_some() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be written as else if let Some(other_idx) = other_idx to avoid the unwrap below.

if i < variants.len() - 1 {
cx.error("#[serde(other)] must be the last variant");
}
}

// Variant with `other` attribute must be a unit variant.
(_, Identifier::Field, true) => {
// TODO: Only in field_identifier ?
(_, Identifier::Field, true, _) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What currently happens if you write serde(other) on a variant that is not unit?

@dtolnay
Copy link
Member

dtolnay commented Sep 10, 2018

Here are two cases that I would expect to work but do not. Do you think these would be possible to support?

#[derive(Deserialize, PartialEq, Debug)]
enum E {
    A,
    #[serde(other)]
    Other(String),
}

fn main() {
    assert_eq!(
        serde_json::from_str::<E>(r#" "foo" "#).unwrap(),
        E::Other("foo".to_owned()),
    );
}
#[derive(Deserialize, PartialEq, Debug)]
struct Artifact {}

#[derive(Deserialize, PartialEq, Debug)]
enum Message {
    CompilerArtifact(Artifact),
    #[serde(other)]
    Unknown,
}

fn main() {
    assert_eq!(
        serde_json::from_str::<Message>(r#" {"CompilerMessage":[]} "#).unwrap(),
        Message::Unknown,
    );
}

@roblabla
Copy link
Contributor Author

🤔 this is a bit of a puzzle. The first one IMO doesn't make sense: it's deserializing an externally tagged enum, so we expect something in the form { "Variant": data }, but instead getting "data". I expect this to fail. If you still think it should work, I can look into it though.

The second one should work though, so I probably messed up somewhere! (I mostly tested on internally tagged enums for my use-case ^^'). I'll take a look.

@dtolnay
Copy link
Member

dtolnay commented Sep 10, 2018

I would expect the first one to work by analogy with how E::A is represented: playground link.

@roblabla
Copy link
Contributor Author

roblabla commented Sep 10, 2018

Ah, so this is quite different from how I originally envisioned this to work. I thought the data contained in the Other variant would be the "inner" object, e.g. for an externally tagged enum { "UnknownVariant": "data" } => Enum::Other("data").

So here's a proposal:

#[derive(Deserialize)]
enum Test {
    A,
    #[serde(other)]
    B(#[serde(other_tag)] String, serde_json::Value)
}

The field equipped with #[serde(other_tag)] (bikeshed for a better name welcome) would automatically be populated with the tag name. (Obviously, it doesn't have to be a String, anything deserializable should work).

Then, we'd get { "Unknown": "data" } => Test::B("Unknown", Str("data")).

The upside is that we can keep parsing data even in the Unknown case (maybe we can still get some useful data that we know are guaranteed to always be present regardless of the enum ?) while having a consistent way to access the variant name.

Would this work?

@dtolnay
Copy link
Member

dtolnay commented Sep 10, 2018

I guess my confusion is in when you would ever want to capture the data of an unknown variant but not its variant name. I would expect the default least-effort way of using serde(other) would capture both of those or neither of those. Skimming through the use cases in #912, either the other variant is a unit variant and captures nothing, or is a newtype variant that captures everything necessary for round-tripping the input data through that Rust data structure, which includes the unrecognized tag. In your suggestion involving serde(other_tag) the default least-effort use of serde(other) (so without other_tag) captures the data but not the tag, which is not a behavior that has been requested.

@roblabla
Copy link
Contributor Author

roblabla commented Sep 11, 2018

Alright, here's my reasoning:

As a user, I want the ability to access both the data and the tag in a simple way. Because of this, I want Other to behave like the other enum variant, because that's the intuitive thing. Here's an example: we have an API that sends requests following this schema:

{
    "action": "some_action",
    "version": "1.0.0",
    "field_of_some_action": "1",
    "field_of_some_action2": "2",
}

(this is an internally tagged enum). The API contract says version will always be present. As such, it might be useful for the Other variant to have the ability to parse it, in order to inform the user of what the latest version is. This is a totally made-up example, but it's a pattern that I've seen in the past.

So, how would I parse this API?

#[derive(Deserialize, PartialEq, Debug)]
#[serde(tag = "action")]
enum E {
    Action1, // whatever
    #[serde(other)]
    UnknownAction { version: String },
}

fn main() {
    assert_eq!(
        serde_json::from_str::<E>(r#" "{ "action": "unknown", "version": "2.0.0" }" "#).unwrap(),
        E::UnknownAction { version: "2.0.0" }),
    );
}

Now, obviously, in many/most cases, we'd still want the ability to access the tag. For this case, we need to put it somewhere in that UnknownAction enum variant. The question is: where? I can only come up with two ideas:

  1. Have an attribute defining where. That's #[serde(other_tag)]
  2. Enforce that the first element of the other variant has to be a String, and will be the enum.
  3. "Reverse" the problem and put a tag on data fields? I'm not sure how that would even look. UnknownAction { #[serde(data_field)] version: u32, tag: String }

The first felt the most intuitive to me. With this case, in order to access the action field above, we'd do this:

#[derive(Deserialize, PartialEq, Debug)]
#[serde(tag = "action")]
enum E {
    Action1, // whatever
    #[serde(other)]
    UnknownAction { #[serde(other_tag)] action: String, version: String },
}

fn main() {
    assert_eq!(
        serde_json::from_str::<E>(r#" "{ "action": "unknown", "version": "2.0.0" }" "#).unwrap(),
        E::UnknownAction { action: "unknown", version: "2.0.0" }),
    );
}

I'm still trying to figure out if this is even implementable though.


Maybe I could just enforce that the Other is a unit struct for now, and then we can discuss what the proper course of action is for accessing the tag/data in a separate issue and follow-up PR ?

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks like the right behavior for unit variants.

@dtolnay dtolnay merged commit 2753ec7 into serde-rs:master Sep 12, 2018
@derekdreery
Copy link
Contributor

derekdreery commented Sep 26, 2018

Does this implement the following use-case:

Data shape:

{
    "errors": ["ErrorKind1", "ErrorKind2", ...]
}

So we get 0 or more errors.

Rust code

#[derive(Deserialize)]
struct Result {
    errors: Vec<ErrorKind>
}

enum ErrorKind {
    ErrorKind1,
    ErrorKind2,
    // ...
    #[serde(other)]
    Unrecognised(String),
}

That is, I receive a list of errors, and most of the time I can strongly type them, but on occasion I don't recognize the string, and so fall back to just storing the string value.

This seems to be different to the use-case discussed in the PR.

@roblabla
Copy link
Contributor Author

roblabla commented Sep 26, 2018

The pr only allows the #[serde(other)] variant to be unit-style. A future pr could extend it to allow recovering the tag. From my testing, however, there are several problems that don't have am obvious solution:

  • The enums can parse their discriminant as either String or usize. How do we deal with unknown numeric tags in the case of Unknown(String), and conversely, deal with string tags in the case of Unknown(num)?

  • For some of my use-cases, I need the ability to parse data in the other tag. I gave some examples in my last comment above. How should we deal with that?

@derekdreery
Copy link
Contributor

A conservative approach would be to just reject the parse if the discriminant type and the extra type don't match, and just support numbers and strings. Then the functionality you mention in your 2 points could be added incrementally, whilst retaining backwards compatibility.

@derekdreery
Copy link
Contributor

I'll have a go at implementing this and see what it looks like.

@spease
Copy link

spease commented May 8, 2019

Guess this has gotten stalled?

@nixpulvis
Copy link

nixpulvis commented Feb 22, 2021

The pr only allows the #[serde(other)] variant to be unit-style.

This gets me started, thanks.

I'm using serde_json and I'm finding myself wanting to just store a generic Value inside the other variant. I'm guessing this use case is pretty common across formats.

EDIT: Using an Other(serde_json::Value) inside a #[serde(untagged)] seems to be working for this case, since it checks in order.

@maxcountryman
Copy link

EDIT: Using an Other(serde_json::Value) inside a #[serde(untagged)] seems to be working for this case, since it checks in order.

What if I have a tag I want but also want to have a blanket Other variant (specifically when there's a tag that I have intentionally not provided as an enum variant)? I'm not seeing how these two can work together.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants