[awq] replace scale when we have GELU #30074
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for working on this!
Two comments:
- I'm worried that the current implementation will result in silent errors, as the logging is at the info level. If module names other than ["fc_in", "dense_h_to_4h", "up_proj", "c_fc"] are used, this issue will not be solved and we'll be relying on another user to flag it (a sketch of a louder failure mode follows below).
- Tests should be added to make sure the replacement happens as expected.
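As a rough illustration of the first point, a lookup that fails loudly instead of logging at the info level could look like the sketch below; the helper name and the error handling are illustrative, not the PR's actual code.

```python
import torch.nn as nn

# Illustrative only: layer names taken from the snippet discussed in this PR.
EXPECTED_LAYERS = ["fc_in", "dense_h_to_4h", "up_proj", "c_fc"]


def find_layer_before_act(mlp_module: nn.Module) -> nn.Linear:
    """Return the linear layer assumed to feed the GELU, or fail loudly."""
    for candidate in EXPECTED_LAYERS:
        layer = getattr(mlp_module, candidate, None)
        if isinstance(layer, nn.Linear):
            return layer
    # Raising (or at least logger.warning) makes unsupported module names
    # visible immediately instead of silently skipping the scale replacement.
    raise ValueError(
        f"None of {EXPECTED_LAYERS} found in {type(mlp_module).__name__}; "
        "AWQ activation scales would not be loaded for this module."
    )
```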
src/transformers/integrations/awq.py
Outdated
if isinstance(module, nn.GELU) and name not in modules_to_not_convert:
    # get the module just before applying the activation
    # the easiest way is just checking if the MLP has one of these layers
    layers = ["fc_in", "dense_h_to_4h", "up_proj", "c_fc"]
This seems super brittle - let's do this in a more generalized way. In particular, this doesn't check that these layers are actually used right before the activation is applied.
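For context, a more generic detection could look roughly like this sketch, which picks the last nn.Linear registered before the activation instead of hardcoding layer names; the function name is hypothetical, and registration order only approximates forward order, which is precisely the caveat raised above.

```python
from typing import Optional

import torch.nn as nn


def linear_before_activation(parent: nn.Module, act_name: str) -> Optional[nn.Linear]:
    """Return the last nn.Linear registered before the child named `act_name`."""
    last_linear = None
    for child_name, child in parent.named_children():
        if child_name == act_name:
            # Heuristic: assumes registration order mirrors forward order,
            # which is not guaranteed for every model.
            return last_linear
        if isinstance(child, nn.Linear):
            last_linear = child
    return None
```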
Sounds good! I decided to only modify the impacted models in the end. While this is a bit rigid, I think it is fine for AWQ since the number of model types that get quantized is pretty small.
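The per-model approach can be pictured as a small mapping from model type to the activation module and the linear that feeds it; the dictionary name and entries below are illustrative examples rather than the exact mapping added in this PR.

```python
# Example mapping: model type -> names of the activation module and the
# linear layer that precedes it inside the MLP block. Entries are illustrative.
EXAMPLE_SCALES_MAPPING = {
    "gptj": {"act": "act", "layer_before_act": "fc_in"},
    "falcon": {"act": "act", "layer_before_act": "dense_h_to_4h"},
    "gpt_bigcode": {"act": "act", "layer_before_act": "c_fc"},
}


def needs_scale_replacement(model_type: str) -> bool:
    # Only model types listed in the mapping get their activation wrapped.
    return model_type in EXAMPLE_SCALES_MAPPING
```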
src/transformers/integrations/awq.py
Outdated
@@ -145,6 +147,17 @@ def replace_with_awq_linear(

    # Force requires grad to False to avoid unexpected errors
    model._modules[name].requires_grad_(False)
    if isinstance(module, nn.GELU) and name not in modules_to_not_convert:
See #30225 (comment). May need to adapt to variants of GELU.
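One possible way to cover GELU variants is sketched below; the exact set of activation classes worth matching is an assumption and would need to be checked against transformers.activations.

```python
import torch.nn as nn
from transformers.activations import GELUActivation, NewGELUActivation

# Assumed list of GELU-like classes; extend as needed for other variants.
GELU_LIKE = (nn.GELU, GELUActivation, NewGELUActivation)


def is_gelu_like(module: nn.Module) -> bool:
    return isinstance(module, GELU_LIKE)
```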
The latest commit should include all affected models to date.
Thanks for adding this!
The only comment I'd make is to use a clearer name than replace_scales. scales isn't very descriptive - just from the method name, I don't really know what's happening, nor what a "scale" is.
What does this PR do?
This PR replaces the scales with ScaleActivation when we have a GELU activation. Before this PR, the scales calculated during quantization were not loaded properly. To have a quick fix that doesn't depend on each model, I just check whether one of the layers ["fc_in", "dense_h_to_4h", "up_proj", "c_fc"] is present in the module that contains the GELU. This is done in order to get the shape of the scales; we need that to be able to load the scales, otherwise we will get an error saying that the shapes don't match. A minimal sketch of the idea follows the linked issues below.
Fixes #29421
Fixes #30225
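To make the shape requirement concrete, here is a minimal sketch of the kind of wrapper involved, assuming an AutoAWQ-style scaled activation that divides the activation output by a scales vector sized to the preceding linear's out_features; the class and function names are illustrative.

```python
import torch
import torch.nn as nn


class ScaledActivationSketch(nn.Module):
    """Wraps an activation and exposes a `scales` parameter to be loaded."""

    def __init__(self, act: nn.Module, num_features: int):
        super().__init__()
        self.act = act
        # Placeholder values; the real scales come from the AWQ checkpoint,
        # and their shape must match the preceding linear's output dimension.
        self.scales = nn.Parameter(torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(x) / self.scales


def wrap_gelu(mlp: nn.Module, layer_before_act: str, act_name: str) -> None:
    linear = getattr(mlp, layer_before_act)  # e.g. "fc_in", "c_fc", ...
    act = getattr(mlp, act_name)
    setattr(mlp, act_name, ScaledActivationSketch(act, linear.out_features))
```

With such a wrapper in place, the checkpoint's activation scales have a parameter of the right shape to load into.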