
Better config support in ModelHubMixin #2001

Merged: 13 commits merged into main on Feb 22, 2024

Conversation

@Wauplin (Contributor) commented Jan 25, 2024

Fixes #1750, following the plan described in #1750 (comment).

In particular:

  • When saving a model, we serialize the config either from the input parameter or from self.config (if it exists). Serialization is supported whether config is a dict or a dataclass.
  • When loading a model, we check whether __init__ expects a config value. If it does, we check whether the expected type from the annotation is a dataclass type, in which case we instantiate it accordingly. Otherwise, we pass the raw dict (see the sketch below).
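
For context, here is a minimal, hedged sketch of what that loading-side check could look like (the helper name build_config_arg is hypothetical; this is not the actual hub_mixin.py code):

import dataclasses
import inspect


def build_config_arg(cls, config_dict):
    """Return the value to pass as `config`, or None if __init__ does not expect one."""
    params = inspect.signature(cls.__init__).parameters
    if "config" not in params:
        return None
    annotation = params["config"].annotation
    if dataclasses.is_dataclass(annotation):
        # The annotation is a dataclass type: instantiate it from the stored config dict.
        return annotation(**config_dict)
    # Otherwise, pass the raw dict through unchanged.
    return config_dict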

cc @NielsRogge @LysandreJik could you have a look at the API? Once I have feedback, I'll move forward with the tests + docs. Thanks in advance! 🙏

With this PR, it is possible to do:

from dataclasses import dataclass

import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


@dataclass
class Config:
    hidden_size: int = 512
    vocab_size: int = 30000


class MyModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config: Config):
        super().__init__()
        self.param = nn.Parameter(torch.rand(config.hidden_size, config.vocab_size))
        self.linear = nn.Linear(4, 5)

    def forward(self, x):
        return self.linear(x + self.param)

>>> config = Config(hidden_size=256)
>>> model = MyModel(config)
>>> model.push_to_hub("my-small-model")

>>> reloaded = MyModel.from_pretrained("Wauplin/my-small-model")
>>> reloaded.config
Config(hidden_size=512, vocab_size=30000)

TODO:

  • add test
  • add documentation

⚠️ Breaking changes

This PR introduces a few breaking changes. We could be more conservative, but given the low usage of ModelHubMixin, I don't think we need to be (IMO). Here is the list I can think of:

  • If self.config is an attribute of the model that cannot be serialized, saving will fail. Prior to this PR, the config simply wasn't saved; with this PR, we try to serialize it to JSON, which might fail. Alternative: try/except + warn on failure + ignore the config (see the sketch after this list).
  • If _from_pretrained needs the config object but __init__ does not, we won't pass it anymore. Alternative: check whether _from_pretrained expects it? (not sure it's worth it)
  • If __init__ does not declare a config parameter but only accepts **kwargs, we won't forward the config, which means model instantiation might differ. Alternative: pass the config dict if __init__ accepts **kwargs. I feel we would encourage bad behavior by doing so.
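
As a rough, hedged illustration of the first point and its alternative (the name serialize_config is made up; this is not the actual implementation):

import dataclasses
import json
import warnings


def serialize_config(config):
    """Serialize a dict or dataclass config to JSON, or return None if it cannot be serialized."""
    if dataclasses.is_dataclass(config) and not isinstance(config, type):
        # Convert a dataclass *instance* to a plain dict before dumping.
        config = dataclasses.asdict(config)
    try:
        return json.dumps(config)
    except TypeError as e:
        # The conservative alternative: warn on failure and ignore the config instead of raising.
        warnings.warn(f"Config could not be serialized to JSON and will be ignored: {e}")
        return None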

Otherwise, I think that's it.

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@NielsRogge (Contributor) commented:

The API looks good to me. Just wondering whether users could pass the config to the init of the superclass rather than adding it specifically as an attribute (similar to how this is done in the Transformers library):

class MyModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config: Config):
+       super().__init__(config)
-       self.config = config
        self.param = nn.Parameter(torch.rand(config.hidden_size, config.vocab_size))
        self.linear = nn.Linear(4, 5)

    def forward(self, x):
        return self.linear(x + self.param)

It's great that users don't need to pass the config argument anymore when using save_pretrained and push_to_hub. I assume those still work in case one has a regular Python dictionary?

@Wauplin (Contributor, PR author) commented Jan 26, 2024

Just wondering whether users could pass the config to the init of the superclass rather than adding it specifically as an attribute (similar to how this is done in the Transformers library):

But how would we be assured that the superclass accepts config as input? This is not guaranteed, right?

What we could do in ModelHubMixin.from_pretrained is to set self.config manually once the object has been created (and only if self.config does not already exist). This way, users can do whatever they want when subclassing ModelHubMixin/PytorchModelHubMixin, and we would be guaranteed to have the config when saving to file or to the Hub. WDYT? (It feels a bit non-standard, but it should work.)

EDIT: made the change in 26e1c57. It's now possible to do:

from dataclasses import dataclass, asdict

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


@dataclass
class Config:
    ...


class MyModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config: Config):
        ...

# MyModel.from_pretrained(...).config should be set (except if config doesn't exist)
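
A minimal sketch of that post-instantiation step, assuming a hypothetical helper (this is not the actual diff from 26e1c57):

def _attach_config(instance, config):
    # Set `config` on the freshly created model only if the subclass didn't set it itself.
    if config is not None and not hasattr(instance, "config"):
        instance.config = config
    return instance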

@NielsRogge (Contributor) commented:

Ok, that looks good to me.

But how would we be assured that the superclass accepts config as input? This is not guaranteed, right?

If you mean the nn.Module superclass, that class indeed doesn't support it, but we could make the Mixin class accept it in its init, although this wouldn't be backwards compatible, I assume.

And I assume this still works if your config is a regular Python dictionary rather than a dataclass? So we would support two use cases for the config: dataclass and plain Python dictionary?

For now, I've always used a regular Python dictionary, but that wasn't very convenient, as I couldn't do config.num_classes, for instance. It's nice to access config arguments as attributes.
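
(Concretely, the difference is attribute access versus key access; a trivial illustration with a made-up num_classes field:)

from dataclasses import dataclass


@dataclass
class Config:
    num_classes: int = 10


Config().num_classes                # attribute access with the dataclass
{"num_classes": 10}["num_classes"]  # key access required with a plain dict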

@LysandreJik (Member) left a comment:

This sounds like a good idea to me. I would document it with examples of how to work with it though.

@julien-c (Member) commented Feb 5, 2024

Linking to the related doc PR: huggingface/hub-docs#1196

Just to reiterate, we don't want to push the Python mixin as the preferred way of uploading models, given that our preferred way is to have native library support (i.e., natively supported libraries are listed in the hf.co/models filters), but I agree this mixin is useful in some cases.

Let's try to get it more widely used @NielsRogge?

This PR looks good to me conceptually ✅

@julien-c (Member) commented Feb 5, 2024

agree this mixin is useful in some cases

"library-less models" i guess we could call them?

@Wauplin (Contributor, PR author) commented Feb 8, 2024

Hey there 👋 Thanks for the feedback. I came back to this PR to clean it up, improve the documentation, and add tests. The PR is in its final state and ready to be reviewed, IMO.

@LysandreJik @NielsRogge mind having a second look at it? Thanks in advance! 🙏

And I assume this still works if your config is a regular Python dictionary, rather than a Dataclass? Hence we will support 2 use cases of the config, being Dataclass + Python dictionary?

Yes, that's the case. It works whether your class expects a dataclass or a dictionary (and in a backward-compatible way).
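
For completeness, a small example of the dict-based case, assuming the same mixin behavior as in the example above (model and directory names are made up):

import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class MyDictConfigModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config: dict):
        super().__init__()
        self.config = config  # plain dict, serialized to JSON when saving
        self.param = nn.Parameter(torch.rand(config["hidden_size"], config["vocab_size"]))

    def forward(self, x):
        return x @ self.param


model = MyDictConfigModel(config={"hidden_size": 8, "vocab_size": 16})
model.save_pretrained("my-dict-config-model")
# On reload, `config` is passed back to __init__ as the raw dict (no dataclass annotation).
reloaded = MyDictConfigModel.from_pretrained("my-dict-config-model")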

@LysandreJik (Member) left a comment:

Looks great! Played with it locally, works very well. Very extensive test suite!

Also I got to learn that a dataclass with non-typed parameters doesn't behave the same at all 😀
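
(For readers who hit the same surprise: in a dataclass, a class attribute without a type annotation is not registered as a field at all. A quick illustration:)

from dataclasses import dataclass, fields


@dataclass
class Config:
    hidden_size: int = 512  # annotated: becomes a dataclass field
    vocab_size = 30000      # not annotated: plain class attribute, NOT a field


print([f.name for f in fields(Config)])  # ['hidden_size']
print(Config(hidden_size=256))           # Config(hidden_size=256), vocab_size is not a field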

Review comments on docs/source/en/guides/integrations.md and tests/test_hub_mixin.py: resolved.
Wauplin and others added 4 commits on February 22, 2024 at 15:45 (two co-authored by Lysandre Debut <hi@lysand.re>), including a merge of huggingface/huggingface_hub into 1750-access-config-in-model-hub-mixin.
@Wauplin (Contributor, PR author) commented Feb 22, 2024

Thanks for the thorough review @LysandreJik! Clearly a TIL for me as well (I still can't quite believe it...).
Merged your suggestions and now waiting for the CI to complete!

codecov bot commented Feb 22, 2024

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (4de2e4f) 82.23% compared to head (c55ff07) 82.28%.

File                               Patch %   Missing lines
src/huggingface_hub/hub_mixin.py   95.83%    2 missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2001      +/-   ##
==========================================
+ Coverage   82.23%   82.28%   +0.04%     
==========================================
  Files          66       66              
  Lines        8309     8347      +38     
==========================================
+ Hits         6833     6868      +35     
- Misses       1476     1479       +3     


@Wauplin merged commit 40315a0 into main on Feb 22, 2024, with 16 checks passed. The 1750-access-config-in-model-hub-mixin branch was deleted on February 22, 2024 at 15:13.
Merging this pull request closed the following issue: Support passing config as dataclass in ModelHubMixin (#1750).

5 participants