modify clip_grad with no to_global #10443

hanwen-sun · 2024-03-11T03:38:44Z

Remove the first to_global in the clip_grad norm computation, to reduce unnecessary all gather in the tensor parallel case.
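
To illustrate why the gather can be skipped, here is a minimal sketch in plain Python. It is not the OneFlow implementation and uses no OneFlow API. It assumes an L2 norm and a gradient that is split (not partially summed) across ranks; the helper names and the `shards` list are hypothetical stand-ins for per-rank local tensors. Under those assumptions, each rank only needs to contribute one scalar (its local sum of squares) instead of its whole shard.

```python
import math


def norm_via_all_gather(shards):
    # Simulates gathering every shard onto one rank before taking the norm.
    # Communication cost grows with the size of the full gradient.
    full = [v for shard in shards for v in shard]
    return math.sqrt(sum(v * v for v in full))


def norm_via_local_reduce(shards):
    # Each rank computes the sum of squares of its own shard (no communication),
    # then only these scalars are summed across ranks (a scalar all-reduce).
    local_sq = [sum(v * v for v in shard) for shard in shards]
    return math.sqrt(sum(local_sq))


# Hypothetical gradient split over two ranks.
shards = [[3.0, 0.0], [0.0, 4.0]]
gathered = norm_via_all_gather(shards)
reduced = norm_via_local_reduce(shards)
```

Both paths give the same L2 norm for split shards, so the second one can stand in for the first whenever the total norm is all that clipping needs.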

github-actions · 2024-03-11T03:39:53Z

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

levi131

LGTM

github-actions · 2024-03-13T14:35:16Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10443/

github-actions · 2024-03-13T14:41:30Z

Speed stats:

GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 44.0ms (= 4398.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 58.1ms (= 5810.7ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.32 (= 58.1ms / 44.0ms)

OneFlow resnet50 time: 26.1ms (= 2606.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 38.5ms (= 3845.4ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.48 (= 38.5ms / 26.1ms)

OneFlow resnet50 time: 19.1ms (= 3815.7ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 35.9ms (= 7176.3ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.88 (= 35.9ms / 19.1ms)

OneFlow resnet50 time: 16.9ms (= 3383.3ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 31.7ms (= 6337.6ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.87 (= 31.7ms / 16.9ms)

OneFlow resnet50 time: 22.3ms (= 4460.8ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.5ms (= 5908.5ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.32 (= 29.5ms / 22.3ms)

OneFlow swin dataloader time: 0.202s (= 40.326s / 200, num_workers=1)
PyTorch swin dataloader time: 0.128s (= 25.677s / 200, num_workers=1)
Relative speed: 0.637 (= 0.128s / 0.202s)

OneFlow swin dataloader time: 0.054s (= 10.740s / 200, num_workers=4)
PyTorch swin dataloader time: 0.040s (= 8.059s / 200, num_workers=4)
Relative speed: 0.750 (= 0.040s / 0.054s)

OneFlow swin dataloader time: 0.031s (= 6.112s / 200, num_workers=8)
PyTorch swin dataloader time: 0.016s (= 3.300s / 200, num_workers=8)
Relative speed: 0.540 (= 0.016s / 0.031s)

❌ OneFlow resnet50 time: 49.1ms (= 4909.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.7ms (= 6972.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.42 (= 69.7ms / 49.1ms)

OneFlow resnet50 time: 35.7ms (= 3566.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 48.3ms (= 4829.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 48.3ms / 35.7ms)

OneFlow resnet50 time: 29.1ms (= 5824.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 44.3ms (= 8851.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.52 (= 44.3ms / 29.1ms)

OneFlow resnet50 time: 25.9ms (= 5183.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 41.1ms (= 8216.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 41.1ms / 25.9ms)

OneFlow resnet50 time: 24.0ms (= 4791.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 36.3ms (= 7258.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.51 (= 36.3ms / 24.0ms)

levi131 · 2024-03-14T01:24:52Z

The clip_grad-related unit tests did not pass in CI; this needs further debugging.

github-actions · 2024-03-14T01:33:56Z

CI failed when running job: cuda-misc. PR label automerge has been removed

github-actions · 2024-03-26T06:18:40Z

CI failed when running job: cuda-module. PR label automerge has been removed

github-actions · 2024-03-26T07:59:49Z

CI failed when running job: cuda-module. PR label automerge has been removed

github-actions · 2024-03-26T08:54:21Z

CI failed when running job: cuda-module. PR label automerge has been removed

github-actions · 2024-04-03T05:57:30Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10443/

github-actions · 2024-04-03T06:57:10Z

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions · 2024-04-03T07:05:53Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10443/

github-actions · 2024-04-07T14:40:28Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10443/

github-actions · 2024-04-08T00:33:48Z

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions · 2024-04-08T00:42:50Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10443/

github-actions · 2024-04-08T01:24:23Z

Speed stats:

GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.7ms (= 4372.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.8ms (= 5775.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.32 (= 57.8ms / 43.7ms)

OneFlow resnet50 time: 26.2ms (= 2622.2ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 38.0ms (= 3801.6ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.45 (= 38.0ms / 26.2ms)

OneFlow resnet50 time: 19.1ms (= 3814.3ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 35.7ms (= 7133.9ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.87 (= 35.7ms / 19.1ms)

OneFlow resnet50 time: 16.4ms (= 3286.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 34.2ms (= 6833.8ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 2.08 (= 34.2ms / 16.4ms)

OneFlow resnet50 time: 17.3ms (= 3460.1ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.5ms (= 5908.9ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.71 (= 29.5ms / 17.3ms)

OneFlow swin dataloader time: 0.199s (= 39.800s / 200, num_workers=1)
PyTorch swin dataloader time: 0.130s (= 25.972s / 200, num_workers=1)
Relative speed: 0.653 (= 0.130s / 0.199s)

OneFlow swin dataloader time: 0.056s (= 11.289s / 200, num_workers=4)
PyTorch swin dataloader time: 0.033s (= 6.521s / 200, num_workers=4)
Relative speed: 0.578 (= 0.033s / 0.056s)

OneFlow swin dataloader time: 0.032s (= 6.384s / 200, num_workers=8)
PyTorch swin dataloader time: 0.018s (= 3.696s / 200, num_workers=8)
Relative speed: 0.579 (= 0.018s / 0.032s)

❌ OneFlow resnet50 time: 49.2ms (= 4920.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 65.5ms (= 6548.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.33 (= 65.5ms / 49.2ms)

OneFlow resnet50 time: 36.3ms (= 3626.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 44.9ms (= 4489.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.24 (= 44.9ms / 36.3ms)

OneFlow resnet50 time: 27.6ms (= 5529.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.6ms (= 7729.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.40 (= 38.6ms / 27.6ms)

OneFlow resnet50 time: 25.0ms (= 5006.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.6ms (= 7716.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 38.6ms / 25.0ms)

OneFlow resnet50 time: 24.8ms (= 4953.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 36.1ms (= 7218.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.46 (= 36.1ms / 24.8ms)

github-actions · 2024-04-08T01:45:12Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10443/

github-actions · 2024-04-08T02:27:01Z

Speed stats:

GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.7ms (= 4370.5ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.9ms (= 5785.8ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.32 (= 57.9ms / 43.7ms)

OneFlow resnet50 time: 26.1ms (= 2607.5ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 38.0ms (= 3796.4ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.46 (= 38.0ms / 26.1ms)

OneFlow resnet50 time: 18.3ms (= 3666.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 34.3ms (= 6856.0ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.87 (= 34.3ms / 18.3ms)

OneFlow resnet50 time: 17.2ms (= 3444.0ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 31.2ms (= 6241.2ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.81 (= 31.2ms / 17.2ms)

OneFlow resnet50 time: 16.7ms (= 3334.0ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 28.3ms (= 5651.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.69 (= 28.3ms / 16.7ms)

OneFlow swin dataloader time: 0.198s (= 39.676s / 200, num_workers=1)
PyTorch swin dataloader time: 0.128s (= 25.627s / 200, num_workers=1)
Relative speed: 0.646 (= 0.128s / 0.198s)

OneFlow swin dataloader time: 0.055s (= 11.083s / 200, num_workers=4)
PyTorch swin dataloader time: 0.032s (= 6.457s / 200, num_workers=4)
Relative speed: 0.583 (= 0.032s / 0.055s)

OneFlow swin dataloader time: 0.031s (= 6.240s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.368s / 200, num_workers=8)
Relative speed: 0.540 (= 0.017s / 0.031s)

❌ OneFlow resnet50 time: 49.4ms (= 4936.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 65.9ms (= 6591.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 65.9ms / 49.4ms)

OneFlow resnet50 time: 36.6ms (= 3656.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 44.6ms (= 4460.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.22 (= 44.6ms / 36.6ms)

OneFlow resnet50 time: 27.8ms (= 5561.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 39.4ms (= 7885.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.42 (= 39.4ms / 27.8ms)

OneFlow resnet50 time: 25.5ms (= 5103.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.2ms (= 7645.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.50 (= 38.2ms / 25.5ms)

OneFlow resnet50 time: 25.0ms (= 4995.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 36.2ms (= 7235.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.45 (= 36.2ms / 25.0ms)

github-actions · 2024-04-16T08:16:37Z

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions · 2024-04-17T02:48:50Z

Speed stats:

github-actions · 2024-04-17T09:02:42Z

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10443/

github-actions · 2024-04-17T09:49:09Z

Speed stats:

GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.8ms (= 4378.7ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 58.1ms (= 5806.3ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.33 (= 58.1ms / 43.8ms)

OneFlow resnet50 time: 26.8ms (= 2675.1ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 37.9ms (= 3794.8ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.42 (= 37.9ms / 26.8ms)

OneFlow resnet50 time: 18.6ms (= 3724.7ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 37.0ms (= 7393.5ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.99 (= 37.0ms / 18.6ms)

OneFlow resnet50 time: 15.9ms (= 3183.7ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 30.9ms (= 6171.0ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.94 (= 30.9ms / 15.9ms)

OneFlow resnet50 time: 17.5ms (= 3509.0ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.4ms (= 5871.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.67 (= 29.4ms / 17.5ms)

OneFlow swin dataloader time: 0.201s (= 40.136s / 200, num_workers=1)
PyTorch swin dataloader time: 0.129s (= 25.741s / 200, num_workers=1)
Relative speed: 0.641 (= 0.129s / 0.201s)

OneFlow swin dataloader time: 0.052s (= 10.493s / 200, num_workers=4)
PyTorch swin dataloader time: 0.033s (= 6.639s / 200, num_workers=4)
Relative speed: 0.633 (= 0.033s / 0.052s)

OneFlow swin dataloader time: 0.030s (= 5.987s / 200, num_workers=8)
PyTorch swin dataloader time: 0.016s (= 3.298s / 200, num_workers=8)
Relative speed: 0.551 (= 0.016s / 0.030s)

❌ OneFlow resnet50 time: 49.3ms (= 4934.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.0ms (= 6596.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 66.0ms / 49.3ms)

OneFlow resnet50 time: 37.0ms (= 3701.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 47.3ms (= 4725.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.28 (= 47.3ms / 37.0ms)

OneFlow resnet50 time: 27.6ms (= 5529.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.5ms (= 7699.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.39 (= 38.5ms / 27.6ms)

OneFlow resnet50 time: 25.0ms (= 5008.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 40.3ms (= 8068.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.61 (= 40.3ms / 25.0ms)

OneFlow resnet50 time: 24.6ms (= 4922.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 36.0ms (= 7206.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.46 (= 36.0ms / 24.6ms)

github-actions · 2024-04-17T10:25:23Z

Speed stats:

GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.9ms (= 4393.5ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.5ms (= 5751.8ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.31 (= 57.5ms / 43.9ms)

OneFlow resnet50 time: 26.6ms (= 2659.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 38.2ms (= 3816.2ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.43 (= 38.2ms / 26.6ms)

OneFlow resnet50 time: 17.7ms (= 3543.4ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 34.4ms (= 6878.0ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.94 (= 34.4ms / 17.7ms)

OneFlow resnet50 time: 16.4ms (= 3283.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 30.7ms (= 6149.7ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.87 (= 30.7ms / 16.4ms)

OneFlow resnet50 time: 16.5ms (= 3301.3ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.8ms (= 5965.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.81 (= 29.8ms / 16.5ms)

OneFlow swin dataloader time: 0.200s (= 39.976s / 200, num_workers=1)
PyTorch swin dataloader time: 0.128s (= 25.586s / 200, num_workers=1)
Relative speed: 0.640 (= 0.128s / 0.200s)

OneFlow swin dataloader time: 0.056s (= 11.252s / 200, num_workers=4)
PyTorch swin dataloader time: 0.033s (= 6.562s / 200, num_workers=4)
Relative speed: 0.583 (= 0.033s / 0.056s)

OneFlow swin dataloader time: 0.032s (= 6.326s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.360s / 200, num_workers=8)
Relative speed: 0.531 (= 0.017s / 0.032s)

❌ OneFlow resnet50 time: 49.5ms (= 4953.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.2ms (= 6618.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 66.2ms / 49.5ms)

OneFlow resnet50 time: 35.8ms (= 3581.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 45.5ms (= 4550.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.27 (= 45.5ms / 35.8ms)

OneFlow resnet50 time: 28.0ms (= 5605.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 39.8ms (= 7951.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.42 (= 39.8ms / 28.0ms)

OneFlow resnet50 time: 25.3ms (= 5067.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 39.1ms (= 7827.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 39.1ms / 25.3ms)

OneFlow resnet50 time: 24.4ms (= 4882.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 35.7ms (= 7144.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.46 (= 35.7ms / 24.4ms)

hanwen-sun · 2024-05-06T06:51:20Z

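Problem

This PR still has one open problem:
The 1n2d test for clip_grad does not pass. On the same hardware (machines 26 and 28), using the same docker as the CI environment and the whl built from this PR, I still cannot reproduce the failure seen in CI.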

modify clip_grad

6560c00

hanwen-sun requested review from MARD1NO and levi131 March 11, 2024 03:38

auto format by CI

1e35800

levi131 approved these changes Mar 11, 2024

levi131 requested a review from oneflow-ci-bot March 11, 2024 10:15

levi131 added enhancement automerge system labels Mar 13, 2024

levi131 enabled auto-merge (squash) March 13, 2024 14:03

levi131 self-requested a review March 14, 2024 01:24

github-actions bot removed the automerge label Mar 14, 2024

modify test_clip_grad

355dfe2

hanwen-sun and others added 2 commits April 3, 2024 14:55

test

dddf91e

auto format by CI

908f1c6

Merge branch 'master' into modify_clip_grad_with_no_to_global

766bf1e

hanwen-sun and others added 2 commits April 8, 2024 08:32

add comment

ac47c0d

auto format by CI

a19966e

hanwen-sun requested review from oneflow-ci-bot and removed request for oneflow-ci-bot April 8, 2024 01:33

levi131 mentioned this pull request Apr 16, 2024

modify clip grad #10441

Closed

levi131 and others added 2 commits April 16, 2024 08:15

recover threshold of 1e-4

381b985

auto format by CI

e55a7be

Merge branch 'master' into modify_clip_grad_with_no_to_global

8eadd8e
