
[awq] replace scale when we have GELU #30074

Merged: 9 commits into huggingface:main on May 13, 2024
Conversation

@SunMarc (Member) commented Apr 5, 2024

What does this PR do?

This PR replaces the GELU activation with ScaledActivation when quantizing with AWQ. Before this PR, the scales calculated during quantization were not loaded properly. As a quick fix that doesn't depend on each model, I just check whether the module containing the GELU has one of the layers layers = ["fc_in", "dense_h_to_4h", "up_proj", "c_fc"]. This gives us the shape of the scales, which we need in order to load them; otherwise we get an error saying that the shapes don't match.
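For context, here is a minimal sketch of the ScaledActivation wrapper this fix relies on, modeled on awq.modules.act in AutoAWQ (treat the exact upstream signature as an assumption):

import torch
import torch.nn as nn

class ScaledActivation(nn.Module):
    """Wrap an activation so the per-channel smoothing scales stored in an
    AWQ checkpoint can be loaded as a parameter and applied at runtime."""

    def __init__(self, module: nn.Module, scales: torch.Tensor):
        super().__init__()
        self.act = module
        # The checkpoint's scales tensor must match this parameter's shape,
        # which is why the fix needs the out_features of the layer feeding
        # the GELU.
        self.scales = nn.Parameter(scales.data)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(x) / self.scales.view(1, 1, -1).to(x.device)

Without knowing the right out_features, the parameter created at load time would not match the checkpoint's scales tensor, producing the shape-mismatch error described above.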

Fixes #29421
Fixes #30225

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc SunMarc requested a review from ArthurZucker April 9, 2024 09:20
@amyeroberts (Collaborator) left a comment

Thanks for working on this!

Two comments:

  • I'm worried that the current implementation will result in silent errors, since the logging is only at the info level. If a model uses module names other than ["fc_in", "dense_h_to_4h", "up_proj", "c_fc"], the issue won't be fixed and we'll be relying on another user to flag it.
  • Tests should be added to make sure the replacement happens as expected

if isinstance(module, nn.GELU) and name not in modules_to_not_convert:
    # get the module just before applying act
    # the easiest way is just checking if the mlp has these layers
    layers = ["fc_in", "dense_h_to_4h", "up_proj", "c_fc"]
Collaborator:

This seems super brittle - let's do this in a more generalized way. In particular, this doesn't check that these layers are actually used before applying the activation.

@SunMarc (Member, author) May 7, 2024:

Sounds good! I decided to only modify the impacted models in the end. While this is a bit rigid, I think it is fine for AWQ since the number of model types that get quantized is pretty small.
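The merged approach replaces name probing with an explicit per-model table. A hedged sketch of that idea follows; the entries and helper below are illustrative, not the exact code merged into transformers:

import torch
from awq.modules.act import ScaledActivation

# Illustrative mapping from model_type to the activation submodule name and
# the linear layer whose out_features determines the scales' shape.
AWQ_SCALES_MAPPINGS = {
    "starcoder2": {"act": "act", "layer_before_act": "c_fc"},
    "mpt": {"act": "act", "layer_before_act": "up_proj"},
    "gptj": {"act": "act", "layer_before_act": "fc_in"},
    "falcon": {"act": "act", "layer_before_act": "dense_h_to_4h"},
}

def replace_quantization_scales(model, model_type):
    """Recursively wrap the mapped activation in ScaledActivation so the
    checkpoint's smoothing scales have a parameter to load into."""
    if model_type not in AWQ_SCALES_MAPPINGS:
        return model
    act_name = AWQ_SCALES_MAPPINGS[model_type]["act"]
    layer_before_act_name = AWQ_SCALES_MAPPINGS[model_type]["layer_before_act"]
    for name, module in model.named_children():
        if name == act_name and hasattr(model, layer_before_act_name):
            size = getattr(model, layer_before_act_name).out_features
            # Placeholder scales; the real values are loaded from the checkpoint.
            model._modules[name] = ScaledActivation(module, torch.ones(size))
        replace_quantization_scales(module, model_type)
    return model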

@ArthurZucker ArthurZucker removed their request for review April 30, 2024 07:47
@@ -145,6 +147,17 @@ def replace_with_awq_linear(

# Force requires grad to False to avoid unexpected errors
model._modules[name].requires_grad_(False)
if isinstance(module, nn.GELU) and name not in modules_to_not_convert:
Contributor:
See #30225 (comment).
May need to adapt to variants of GELU.
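For reference, transformers ships several GELU variants in transformers.activations, so a plain isinstance(module, nn.GELU) check misses them. One hedged way to broaden the check (the variant list below may be incomplete):

import torch.nn as nn
from transformers.activations import (
    FastGELUActivation,
    GELUActivation,
    NewGELUActivation,
    QuickGELUActivation,
)

# Activation classes that behave like GELU; nn.GELU alone would miss the
# transformers-specific implementations used by many model architectures.
GELU_LIKE = (
    nn.GELU,
    GELUActivation,
    NewGELUActivation,
    FastGELUActivation,
    QuickGELUActivation,
)

def is_gelu_like(module: nn.Module) -> bool:
    return isinstance(module, GELU_LIKE)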

@SunMarc (Member, author) May 7, 2024:

The latest commit should include all affected models to date.

SunMarc added 2 commits May 7, 2024 13:39
@SunMarc SunMarc requested a review from amyeroberts May 7, 2024 13:55
@amyeroberts (Collaborator) left a comment

Thanks for adding this!

The only comment I'd make is to pick a clearer name than replace_scales. "scales" isn't very descriptive: just from the method name, I don't really know what's happening, nor what a "scale" is.

@SunMarc SunMarc merged commit de6e0db into huggingface:main May 13, 2024
20 checks passed
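After the merge, loading an affected AWQ checkpoint through transformers should work out of the box (with autoawq installed, the quantization config in the checkpoint triggers the AWQ path). An illustrative check; the model id below is a placeholder, not a real repo:

from transformers import AutoModelForCausalLM

# Hypothetical AWQ checkpoint id; substitute a real StarCoder2 AWQ repo.
model = AutoModelForCausalLM.from_pretrained("some-org/starcoder2-awq")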
Successfully merging this pull request may close these issues:

  • [BUG] Load StarCoder2 AWQ using Transformers
  • Unused smoothing scales when loading AutoAWQ checkpoints
4 participants