fix GroupNorm init #10494

Merged
merged 3 commits into from
May 20, 2024

Conversation

fpzh2011
Contributor

Creating a GroupNorm with both device and dtype arguments raises a TypeError.

import oneflow as flow
m = flow.nn.GroupNorm(2, 3, device=flow.device("cpu"), dtype=flow.float32)

Exception message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zhengjianhua/oneflow/python/oneflow/nn/modules/normalization.py", line 140, in __init__
    self.weight = flow.nn.Parameter(flow.Tensor(num_channels, **factory_kwargs))
TypeError: Error: _legacy_tensor_ctor(): received an invalid combination of arguments. The valid signatures are:
        *0: Tensor (*, Device device=None)
        *1: Tensor (*, Placement placement, SbpList sbp)
        *2: Tensor (Tensor other)
        *3: Tensor (PyObject* data, *, Device device=None)
        *4: Tensor (PyObject* data, *, Placement placement, SbpList sbp)
        *5: Tensor (Shape size, *, Device device=None)
        *6: Tensor (Shape size, *, Placement placement, SbpList sbp)
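
None of the signatures listed in the traceback accepts a dtype keyword, so unpacking factory_kwargs (which carries both device and dtype) into the legacy constructor cannot match any of them. The sketch below reproduces that mechanism in plain Python. Both functions are hypothetical stand-ins, not OneFlow code: legacy_tensor_ctor mimics a constructor that takes only device, and empty mimics a factory that takes both keywords. In OneFlow, a factory function such as flow.empty is assumed to play the second role; this thread does not state which replacement the merged commits use.

```python
# Hypothetical stand-ins, not OneFlow code: they mimic only the keyword
# arguments that matter for this bug.

def legacy_tensor_ctor(*size, device=None):
    """Stand-in for a constructor that accepts `device` but not `dtype`."""
    return {"size": size, "device": device}


def empty(*size, device=None, dtype=None):
    """Stand-in for a factory that accepts both `device` and `dtype`."""
    return {"size": size, "device": device, "dtype": dtype}


# The module forwards whatever the caller passed for device and dtype.
factory_kwargs = {"device": "cpu", "dtype": "float32"}

try:
    # Fails: `dtype` is not a keyword the legacy-style constructor knows.
    legacy_tensor_ctor(3, **factory_kwargs)
    legacy_failed = False
except TypeError as err:
    legacy_failed = True
    print("legacy constructor rejected the call:", err)

# Works: the factory accepts both keywords.
weight = empty(3, **factory_kwargs)
print(weight)
```

The point of the sketch is only that the keyword set forwarded through factory_kwargs has to be a subset of what the callee accepts.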


Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@fpzh2011 requested a review from ShawnXuan April 22, 2024 09:28
github-actions bot commented May 6, 2024

Speed stats:

@fpzh2011 enabled auto-merge (squash) May 6, 2024 03:46
Speed stats:
GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.6ms (= 4357.7ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 56.6ms (= 5658.9ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.30 (= 56.6ms / 43.6ms)

OneFlow resnet50 time: 26.6ms (= 2657.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 38.2ms (= 3815.3ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.44 (= 38.2ms / 26.6ms)

OneFlow resnet50 time: 17.7ms (= 3536.9ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 35.5ms (= 7108.8ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 2.01 (= 35.5ms / 17.7ms)

OneFlow resnet50 time: 17.6ms (= 3511.0ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 32.2ms (= 6432.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.83 (= 32.2ms / 17.6ms)

OneFlow resnet50 time: 17.1ms (= 3414.7ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.0ms (= 5802.7ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.70 (= 29.0ms / 17.1ms)

OneFlow swin dataloader time: 0.202s (= 40.335s / 200, num_workers=1)
PyTorch swin dataloader time: 0.127s (= 25.378s / 200, num_workers=1)
Relative speed: 0.629 (= 0.127s / 0.202s)

OneFlow swin dataloader time: 0.058s (= 11.654s / 200, num_workers=4)
PyTorch swin dataloader time: 0.033s (= 6.614s / 200, num_workers=4)
Relative speed: 0.568 (= 0.033s / 0.058s)

OneFlow swin dataloader time: 0.030s (= 6.026s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.385s / 200, num_workers=8)
Relative speed: 0.562 (= 0.017s / 0.030s)

❌ OneFlow resnet50 time: 49.4ms (= 4938.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.4ms (= 6642.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 66.4ms / 49.4ms)

OneFlow resnet50 time: 36.5ms (= 3654.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 47.0ms (= 4701.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 47.0ms / 36.5ms)

OneFlow resnet50 time: 27.6ms (= 5516.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.6ms (= 7723.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.40 (= 38.6ms / 27.6ms)

OneFlow resnet50 time: 25.1ms (= 5016.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 40.0ms (= 7997.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.59 (= 40.0ms / 25.1ms)

OneFlow resnet50 time: 24.9ms (= 4986.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 36.1ms (= 7224.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.45 (= 36.1ms / 24.9ms)
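
For readers checking the figures, each entry in the speed stats follows the same arithmetic: total time divided by iteration count gives the per-iteration time, and the relative speed is the PyTorch time divided by the OneFlow time. A minimal sketch, with hypothetical helper names (per_iteration and relative_speed are not part of the CI tooling), applied to the first ResNet-50 entry:

```python
def per_iteration(total, iterations):
    """Average time per iteration, in the same unit as `total`."""
    return total / iterations


def relative_speed(pytorch_time, oneflow_time):
    """PyTorch time divided by OneFlow time; above 1 means OneFlow was faster."""
    return pytorch_time / oneflow_time


# First ResNet-50 entry: input_shape=[16, 3, 224, 224], 100 iterations.
oneflow_ms = per_iteration(4357.7, 100)
pytorch_ms = per_iteration(5658.9, 100)

print(f"OneFlow: {oneflow_ms:.1f}ms, PyTorch: {pytorch_ms:.1f}ms")
print(f"Relative speed: {relative_speed(pytorch_ms, oneflow_ms):.2f}")
```

By the same formula, the dataloader entries with a ratio below 1 are cases where the OneFlow run took longer than the PyTorch run.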

@fpzh2011 merged commit b8c457c into master May 20, 2024
22 of 23 checks passed
@fpzh2011 deleted the fix_group_norm branch May 20, 2024 12:25