Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"unknown" option should apply to nested fields so that "exclude" works #1490

Open
psverkada opened this issue Jan 14, 2020 · 15 comments · May be fixed by #1575
Open

"unknown" option should apply to nested fields so that "exclude" works #1490

psverkada opened this issue Jan 14, 2020 · 15 comments · May be fixed by #1575
Labels

Comments

@psverkada
Copy link

The fact that exclude takes dotted notation for nested fields, but unknown does not apply to nested fields, means that exclude doesn't work for nested fields in some common use cases unless the nested schema(s) have the desired Meta.unknown setting, and then they are not runtime-flexible, which is the great benefit of Schema and load taking exclude. Somewhat new to marshmallow so maybe there's a way to accomplish this (setting exclude on nested fields at load time) in a different way?

Happy to contribute a PR if you guys think this is a good idea.

@deckar01
Copy link
Member

Can you provide a reproducible example of this issue? Are you getting errors about unknown nested fields when you are expecting them to be excluded?

@deckar01
Copy link
Member

Schema and load taking exclude

The Schema constructor takes an exclude argument, but the load() method doesn't.

https://marshmallow.readthedocs.io/en/stable/api_reference.html#marshmallow.Schema.load


I tried building a repro case from the description. Let me know if I missed anything.

from marshmallow import Schema, fields, EXCLUDE

class Foo(Schema):
    data = fields.Str()

class Test(Schema):
    foo = fields.Nested(Foo)

schema = Test(exclude=['foo.data'], unknown=EXCLUDE)
schema.load({'foo': {'data': 'bar'}})
# marshmallow.exceptions.ValidationError: {'foo': {'data': ['Unknown field.']}}

In this example, unknown=EXCLUDE does not get inherited by Foo, so it will continue to raise when it encounters unknown keys. exclude=['foo.data'] propagates down to Foo and removes data from its internal list of fields. When Foo tries to load data it is treated as an unknown key and raises an error.

I initially though this was expected behavior, but the docs for exclude say it contains "fields to exclude in the serialized result". This seems to indicate that the excluded field shouldn't be treated as unknown, but just be excluded from the serialized object. It seems like exclude should take precedence over unknown.

https://marshmallow.readthedocs.io/en/stable/api_reference.html#marshmallow.Schema

A minimal repro case for this can be constructed without Nested or unknown.

from marshmallow import Schema, fields

class Foo(Schema):
    data = fields.Str()

schema = Foo(exclude=['data'])
schema.load({'data': 'bar'})
# marshmallow.exceptions.ValidationError: {'data': ['Unknown field.']}

@lafrech
Copy link
Member

lafrech commented Jan 15, 2020

IIUC, behaviour is correct and conforms to the documentation:

exclude – Blacklist of the declared fields to exclude when instantiating the Schema.

The field is excluded, so treated as unknown.

Indeed, the "fields to exclude in the serialized result" phrasing is not precise enough and could be reworded. The part I quote above is clearer.

The issue seems to be that unknown does not propagate. This was already discussed here and we settled on "don't propagate". Generally, people will use a base schema for that.

@deckar01
Copy link
Member

deckar01 commented Jan 15, 2020

I copied from Meta.exclude, but linked to Schema(exclude). The constructor parameter definition is clear and matches the implementation. The Meta.exclude definition should probably be fixed to match that.

@deckar01 deckar01 added the docs label Jan 15, 2020
@psverkada
Copy link
Author

psverkada commented Jan 15, 2020 via email

@deckar01
Copy link
Member

Discussion on proposals to recursively inherit unknown can be found on #980 and #1428. Any suggestions or PRs for updating the docs would be appreciated.

@psverkada
Copy link
Author

psverkada commented Jan 16, 2020 via email

@deckar01
Copy link
Member

Seeing how many times this definition is duplicated, and considering how we just noticed the definitions for exclude were out of sync, we might want to consider defining once, and referencing with links. The restructured text code reference syntax should make this fairly painless.

@mahenzon
Copy link
Contributor

mahenzon commented Apr 23, 2020

Hi there! We definitely need to implement this functionality. Here's a live example:

from marshmallow import Schema, fields, EXCLUDE, INCLUDE


class SpamSchema(Schema):
    meat = fields.String()


class CanSchema(Schema):
    spam = fields.Nested(SpamSchema)


def main():
    can = CanSchema(unknown=EXCLUDE)

    # just passing expected data
    data = can.load({
        'spam': {'meat': 'pork'},
    })
    print(data)
    # prints {'spam': {'meat': 'pork'}}

    # passing unknown field at the top level
    data = can.load({
        'spam': {'meat': 'pork'},
        'foo': 'bar'
    })
    print(data)
    # prints {'spam': {'meat': 'pork'}}

    # or with include
    data = can.load(
        {
            'spam': {'meat': 'pork'},
            'foo': 'bar'
        },
        unknown=INCLUDE,
    )
    print(data)
    # prints {'spam': {'meat': 'pork'}, 'foo': 'bar'}

    # now passing unknown in a nested field
    can.load({
        'spam': {
            'meat': 'pork',
            'add-on': 'eggs',
        },
    })
    # fails with marshmallow.exceptions.ValidationError: {'spam': {'add-on': ['Unknown field.']}}

    # even if stating directly, it does not help
    can.load({
        'spam': {
            'meat': 'pork',
            'add-on': 'eggs',
        },
    }, unknown=EXCLUDE)
    # fails with marshmallow.exceptions.ValidationError: {'spam': {'add-on': ['Unknown field.']}}


if __name__ == '__main__':
    main()

@sirosen
Copy link
Contributor

sirosen commented May 15, 2020

Looking at this issue from the perspective of webargs, I see two (related) issues with #1575. I think I have a good idea of how to resolve them, but want to see if people like the results before I try to write it.

And then, as an ancillary concern, if someone is writing a library on top of marshmallow (👋 hi guys!), they'll want to be able to control this or offer control to their users.

Maybe backwards incompatibility isn't an issue in this case -- if it's considered a bug that's being fixed -- but I think it will be a problem for someone.


So my idea:

  • add a new parameter in all of the same places as unknown, propagate_unknown
  • propagate_unknown can be True or False (by default, for backwards compat) and can be removed (True everywhere) in marshmallow v4
  • if propagate_unknown is True, behave as in Propagate unknown parameter to nested fields #1575 when unknown is passed to load, and if False, behave as today

I would take #1575 as a basis for this work. It looks solid, has nice tests, and credit where it's due -- I just don't quite agree with the behavior.

The downsides are that we have a new attribute to control unknown, which might feel messy, and that we don't get the desired behavior by default. To me, those are worthwhile in the first case and a necessary evil in the second.

You can do it all with just one attribute, but I promise that it's worse.

This is just for fun. Don't take this suggestion seriously. Please. 😄

We could change the constants a bit...

RAISE = 1
INCLUDE = 2
EXCLUDE = 4
PROPAGATE = 8
...
schema.load(data, unknown=EXCLUDE|PROPAGATE)

but then we're just asking for trouble because you have to check for INCLUDE|EXCLUDE, which makes no sense.

sirosen added a commit to sirosen/marshmallow that referenced this issue Jul 17, 2020
The changelog entry, including credit to prior related work, covers
the closed issues and describes how `propagate_unknown` is expected to
behave.

Inline documentation attempts to strike a balance between clarify and
brevity.

closes marshmallow-code#1428, marshmallow-code#1429, marshmallow-code#1490, marshmallow-code#1575
sirosen added a commit to sirosen/marshmallow that referenced this issue Jul 17, 2020
The changelog entry, including credit to prior related work, covers
the closed issues and describes how `propagate_unknown` is expected to
behave.

Inline documentation attempts to strike a balance between clarify and
brevity.

closes marshmallow-code#1428, marshmallow-code#1429, marshmallow-code#1490, marshmallow-code#1575
@wanderrful
Copy link

wanderrful commented Sep 3, 2020

Here's another demonstration of this issue, as I just bumped into it as well here:

from dataclasses import dataclass
import desert  # similar to marshmallow-dataclass
import marshmallow
from typing import Optional

@dataclass
class Address:
    city: Optional[str]
     
@dataclass
class User:
    name: Optional[str]
    address: Address
    
>>> user1_data = { "name": "johnny", "address": { "city": "Seoul", "line": "21 Noksapyeong-daero 3-gil" } }

# desert.schema() just returns a marshmallow.Schema
>>> schema = desert.schema(User, meta={"unknown": marshmallow.EXCLUDE, "partial": True})

>>> schema.load(user1_data)

yields...

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/.../lib/python3.8/site-packages/marshmallow/schema.py", line 723, in load
    return self._do_load(
  File "/.../lib/python3.8/site-packages/marshmallow/schema.py", line 905, in _do_load
    raise exc

marshmallow.exceptions.ValidationError: {'address': {'line': ['Unknown field.']}}

But of course it works for the first "layer" of nesting, because that's where we specifically made the Schema:

>>> user2_data = {"name": "bobby", "last": "hill", "address": { "city": "Suwon" } }
>>> schema.load(user2_data)

User(name='bobby', address=Address(city='Suwon'))

The workaround that @sirosen suggested in his documentation proposal (#1634) wouldn't really work for this use case as far as I can tell, and so it would be a design constraint for marshmallow extensions if a change is not included in the base Marshmallow itself, as we're not able to make use of schema inheritance when dataclasses are involved as far as I know. I'm hopefully just ignorant of something, though.

Best case scenario, third party extensions would just have to find some way to cope with their Schema wrapper classes like with the one shown above.

sirosen added a commit to sirosen/marshmallow that referenced this issue Sep 4, 2020
The changelog entry, including credit to prior related work, covers
the closed issues and describes how `propagate_unknown` is expected to
behave.

Inline documentation attempts to strike a balance between clarify and
brevity.

closes marshmallow-code#1428, marshmallow-code#1429, marshmallow-code#1490, marshmallow-code#1575
@mahenzon
Copy link
Contributor

any news here?

@deckar01
Copy link
Member

deckar01 commented Oct 12, 2020

There was some discussion on the PR based on #1490 (comment). My recommendation was #1634 (comment).

@mahenzon
Copy link
Contributor

I like solution suggested by @sirosen in #1490. Patched marshmallow in my project with propagate_unknown

@caio-vinicius
Copy link

For the people looking how to solve the issue, this helps me:

"user": fields.Nested(
{"id": fields.Int(required=True),
"login": fields.Str(required=True),
"url": fields.Str()
},
required=True,
unknown=EXCLUDE),

Adding unknown=EXCLUDE inside the fields.Nested to make sure that the Nested field ignore other data coming from the request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants