`FromData` implementation for `Option<T>` causes malformed bodies to be silently ignored. #2543

djrenren · 2023-05-18T16:39:28Z

Rocket Version: 0.5.0-rc.3

Description

FromData implementation for Option<T> causes malformed bodies to be silently ignored.

To Reproduce

Sometimes a route handler needs to support an optional request body (for example, when making a backwards-compatible addition to an endpoint that previously required no request body).

So the definition of such a route looks like so:

#[post("/my_route", data = "<body>")]
pub async fn my_route_handler(body: Option<Json<Foo>>) {
    /* ... */
}

The desired behavior is that, if no request data is present, we will get None for the body, and if some data is provided, we'll try to parse it. This is mostly how it works except, in the case where the body does not represent a valid Json<Foo> that is, Json::<Foo>::from_data returns a Failure result. Rather than a failure, we get None.

Expected Behavior

If a request body is supplied, it should be validated.

Environment:

OS Distribution and Kernel: all
Rocket Version: 0.5.0-rc.3

Additional Context

For prior art, we can look at what serde does. In general, serde follows the model that optional values are validated if supplied, so the following fails:

use serde::Deserialize;

#[derive(Debug, Deserialize)]
enum Foo {
    MyType
}

#[derive(Debug, Deserialize)]
struct Test {
    #[serde(default)]
    field: Option<Foo>
}

fn main() {
    serde_json::from_str::<Test>(r#"{"field": "garbage"}"#).unwrap();
}

Playground

That is, because a value was provided for field, an incorrect value results in an error, not a silent None.

It seems to me that rocket should follow this example because it provides stronger guarantees and more information.
If users want infallible parsing, they could use Result and just ignore any errors.

Proposal for updating FromData impl

Currently, the implementation of FromData for Option<T> clearly swallows all errors:

impl<'r, T: FromData<'r>> FromData<'r> for Option<T> {
    type Error = std::convert::Infallible;

    async fn from_data(req: &'r Request<'_>, data: Data<'r>) -> Outcome<'r, Self> {
        match T::from_data(req, data).await {
            Success(v) => Success(Some(v)),
            Failure(..) | Forward(..) => Success(None),
        }
    }
}

I propose this safer implementation of FromData:

/// Unsure how to write this function body cuz I'm not as familiar with rocket internals.
fn is_empty(data: &Data<'_>) -> bool { /* ... */}

impl<'r, T: FromData<'r>> FromData<'r> for Option<T> {
    type Error = T::Error;

    async fn from_data(req: &'r Request<'_>, data: Data<'r>) -> Outcome<'r, Self> {
        if is_empty(data) {
            return None;
        }

       T::from_data(req, data).await.map(Some)
    }
}

The text was updated successfully, but these errors were encountered:

SergioBenitez · 2023-05-18T19:53:57Z

I think what you're looking for is Option<Result<Json<Foo>, Error>>, where Error is Error.

djrenren · 2023-05-18T20:25:09Z

^I don't think in this workaround you even need the Option because the option will always be Some.

But I don't think using Result<Json<Foo>, Error> as a workaround is a viable solution. It would be impossible (at least I don't see a way) to tell if the error was the result of an empty body. We could take a guess based on the error like if it has an unexpected EOF. So we're back to having to ignore all manners of malformed input.

I think the only way around it is to take an Option<String> and then parse within the body of the function. Technically, even then you'd allow inputs with invalid strings, so you'd actually need to take an Option<Vec<u8>> and then parse the data using serde_json::from_slice within the body of the handler. Now your handler return type needs to accommodate this kind of parsing error that rocket normally just forwards to catchers.

All this to say, it's actually very hard to get this desired behavior of "Nothing or Exactly this". You basically have to create a new Option-like type that implements FromData like I've suggested above.

Overall it seems that ignoring malformed input is a bad default for Option<T>. If you do in fact want to silently ignore malformed inputs, then it's easy and clear to use Result<Json<Foo>, Error> and then in the handler say: let body = body.ok().

SergioBenitez · 2023-05-18T23:35:36Z

^I don't think in this workaround you even need the Option because the option will always be Some.

That's not quite the case. The Result guard forwards if T forwards. So Option would be None in the case of a forward from T.

But I don't think using Result<Json<Foo>, Error> as a workaround is a viable solution. It would be impossible (at least I don't see a way) to tell if the error was the result of an empty body.

Using Result here isn't a workaround; there isn't a problem we're trying to resolve. The semantics of a <T as FromData> is that it succeeds if T can be parsed from the request body data or otherwise it fails or forwards. What it means to "parse successfully" is obviously implementation specific. Identically, what it means to fail, and the error you get when a failure does occur, is also implementation specific. If an empty body is a failure, then it is up to the implementation to make that information available in its error value if it so wishes.

It would not make sense for Rocket to codify any meaning to any particular semantics for any particular case, failure or otherwise, and as it stands it does not. Just because an empty body is a failure for one type does not mean it would be a failure for another. Option<T> has very clear, general semantics as it stands: it is a Some(T) if T succeeds (for whatever that means for T) and None if it fails or forwards (for whatever that means for T). Result<T, E> has similarly clear and general semantics. Neither of these are bugs nor a result of overlooking behavior.

I think the only way around it is to take an Option<String> and then parse within the body of the function.

There's a much simpler solution that isn't a workaround and takes full advantage of Rocket: implement a custom, generic data guard that works the way you want it to. Let's call it Maybe<T, E>:

use rocket::request::Request;
use rocket::data::{self, Data, FromData};

enum Maybe<T, E> {
    Some(T),
    Err(E),
    Empty,
}

#[rocket::async_trait]
impl<'r, T: FromData<'r>> FromData<'r> for Maybe<T, <T as FromData<'r>>::Error> {
    type Error = std::convert::Infallible;

    async fn from_data(req: &'r Request<'_>, mut data: Data<'r>) -> data::Outcome<'r, Self> {
        if data.peek().await.is_empty() && data.peek_complete() {
            return data::Outcome::Success(Maybe::Empty);
        }

        match T::from_data(req, data).await {
            data::Outcome::Success(v) => data::Outcome::Success(Maybe::Some(v)),
            data::Outcome::Failure((s, e)) => data::Outcome::Success(Maybe::Err(e)),
            data::Outcome::Forward(data) => data::Outcome::Forward(data)
        }
    }
}

Even this type encodes particular functionality. You may wish to modify it to your liking. Now you can just use it:

#[post("/my_route", data = "<body>")]
pub async fn my_route_handler(body: Maybe<Json<Foo>, Error>) {
    /* ... */
}

djrenren · 2023-05-19T00:14:41Z

That's not quite the case. The Result guard forwards if T forwards. So Option would be None in the case of a forward from T.

Ah that's good to know in general, but Json never produces a forwards so in this case it'll always be Some.

there isn't a problem we're trying to resolve

It seems we're disagreeing on this point.

Just because an empty body is a failure for one type does not mean it would be a failure for another. Option has very clear, general semantics as it stands: it is a Some(T) if T succeeds (for whatever that means for T) and None if it fails or forwards (for whatever that means for T).

I have learned what Option does through experience not intuition, so I wouldn't exactly call it clear, but it is well defined. I'm presenting an alternately clear general semantic: it is None if the body is empty, otherwise it defers to T, mapping Success to Some.

Both my proposed semantics and the existing one are sound interpretations of what Option<T> could mean when referring to a request body. I argue that this one fails closed; it rejects all possible things the user could mean. While the existing one is more lenient. In the presence of ambiguity, the former can prevent unexpected behaviors because it fails fast.

Neither of these are bugs nor a result of overlooking behavior.

I'm just arguing that this change would improve the library. We can call it whatever we want.

I'm aware of the ability to make a custom type (seeing as I recommended an implementation of FromData for Option that encodes the behavior I'm suggesting.

SergioBenitez · 2023-05-19T01:53:45Z

I have learned what Option does through experience not intuition, so I wouldn't exactly call it clear, but it is well defined.

You're right. I'm not sure why this wasn't documented. I've pushed a commit (b6b060f) that documents all built-in guards, hopefully clearly.

It seems we're disagreeing on this point.

What I'm trying to say is that there isn't a bug here (besides arguably a doc bug, since this clearly wasn't documented well prior): there isn't something Rocket claims to do but fails to do, or something that Rocket clearly shouldn't be doing or is doing incorrectly. Instead, there's a disagreement about what Rocket should be doing.

As far as I understand, the disagreement rests on the what the semantics of Option<T> as a data guard should be. Either:

The existing functionality: infallible: Some(T) if T succeeds, None if T fails or forwards.
The proposed functionality: fallible: None if the request body is empty. Some(T) if the body is non-empty and T succeeds. Fail/forward if T fails/forwards, respectively.

There are a number of problems with the proposed functionality:

The implementation no longer depends solely on T's behavior.
This interpretation of Option<T> as a guard would not match the interpretation chosen for Option<T> as any other guard. That includes request guards (where there is no such concept as "missing" or "empty"), form guards (None if T fails. Note that by controlling strict/lenient parsing, an Option<T> can be made to be None on "empty" cases when desired, though "empty" is defined by T here, not Option), parameter guards (where we get None if T fails), and segments guards (None if T fails).
If T succeeds on empty cases, Option<T> will still return None.

If we wanted to change the interpretation of Option<T>, it would behoove us to do so in a universal manner. That is, change every guard implementation for Option<T> to agree with this new interpretation. This is problematic, of course, because the interpretation has no analog for request guards, and because this is a strictly limiting interpretation for form guards where strict/lenient parsing exists.

I think points 1) and 3) above make Option<T> harder to understand than it is today. Point 3) in particular is concerning: it would seem that if T succeeds, we should get a Some(T), but the proposed implementation makes that not the case. In other words, 3) is consequence of 1).

In all, this line of reasoning leads me to believe that Option<T> as it is implemented today is a more reasonable interpretation than the proposed interpretation. I would not be opposed to a shipping a guard that functions the way you propose, however, I just don't believe it's the right choice for Option given all of the above.

As an aside: another equally reasonable interpretation is to have Option<T> forward if T forwards xor have Option<T> fail if T fails. Failing on failure in particular is an interesting change as it would mean that Result and Option as data guards are strictly orthogonal. That is, Result catches only failures and Option catches only forwards: their composition Option<Result<T, E>> or Result<Option<T>, E> catches both. I could be convinced that this is a path worth pursuing. But this doesn't resolve your concern.

SergioBenitez · 2023-05-19T02:02:28Z

Just a thought: if we 1) change Option so that it only captures forwards and fails on failures, and 2) introduce a NonEmpty<T> guard that forwards on empty body data, fails if T fails, and otherwise succeeds if T succeeds, then Option<NonEmpty<T>> would provide the behavior you're looking for while having none of the drawbacks and being completely composable.

djrenren · 2023-05-19T16:27:47Z

I've pushed a commit (b6b060f) that documents all built-in guards, hopefully clearly.

Yeah this is fantastic! Thanks!

The implementation no longer depends solely on T's behavior.

I'm not sure this is a real problem, but I agree it's kinda cool viewing Option<impl FromData> as a combinator. I'm just not sure this definition of it guides people towards writing the best code.

This interpretation of Option as a guard would not match the interpretation chosen for Option as any other guard...

I'm less worried about this, especially in 0.5 where FromData is a completely disjoint type from FromRequest. I think it's okay, good even, to leverage the more specific context to do helpful things.

If T succeeds on empty cases, Option will still return None

This is a really interesting case. Good catch. I'd say I'd be fine with that behavior, and if your T does accept empty bodies, it just shouldn't be used with Option. That's definitely a matter of taste though.

That is, Result catches only failures and Option catches only forwards... But this doesn't resolve your concern.

I'm not sure this understanding of Option would be intuitive but it is kinda neat to think about.

The NonEmpty case is kinda neat and elegant in a theory way, but it feels like opt-in good behavior. In general, users are gonna write the simplest thing first and learn more about the library when it doesn't work. A quick github search shows that at least some devs are using Option<Json<..>> in their rocket apps for their request bodies:

https://github.com/search?q=language%3Arust+%22data%3A+Option%3CJson%22+rocket&type=code

I doubt most of them are intending to just ingest and ignore malformed data. Granted, that's really not a lot of data but searching github is hard and most web apps aren't publicly available so it's hard to estimate. Anyway, I think at this point it comes down to:

whether you think user's should ignore malformed data or not
If not, whether the current behavior is elegant enough to put up with some user confusion

I land pretty firmly on "they should not, and it's not" but ultimately that's a matter of taste.

djrenren added the triage A bug report being investigated label May 18, 2023

SergioBenitez added suggestion A suggestion to change functionality and removed triage A bug report being investigated labels May 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`FromData` implementation for `Option<T>` causes malformed bodies to be silently ignored. #2543

`FromData` implementation for `Option<T>` causes malformed bodies to be silently ignored. #2543

djrenren commented May 18, 2023

SergioBenitez commented May 18, 2023 •

edited

djrenren commented May 18, 2023 •

edited

SergioBenitez commented May 18, 2023 •

edited

djrenren commented May 19, 2023

SergioBenitez commented May 19, 2023 •

edited

SergioBenitez commented May 19, 2023

djrenren commented May 19, 2023

FromData implementation for Option<T> causes malformed bodies to be silently ignored. #2543

FromData implementation for Option<T> causes malformed bodies to be silently ignored. #2543

Comments

djrenren commented May 18, 2023

SergioBenitez commented May 18, 2023 • edited

djrenren commented May 18, 2023 • edited

SergioBenitez commented May 18, 2023 • edited

djrenren commented May 19, 2023

SergioBenitez commented May 19, 2023 • edited

SergioBenitez commented May 19, 2023

djrenren commented May 19, 2023

`FromData` implementation for `Option<T>` causes malformed bodies to be silently ignored. #2543

`FromData` implementation for `Option<T>` causes malformed bodies to be silently ignored. #2543

SergioBenitez commented May 18, 2023 •

edited

djrenren commented May 18, 2023 •

edited

SergioBenitez commented May 18, 2023 •

edited

SergioBenitez commented May 19, 2023 •

edited