[SME] Add scalable fp16->fp32 dense schedule #16981

Open: lhutton1 wants to merge 5 commits into main from sme-fp16-fp32-dense-schedule

lhutton1 (Contributor) commented May 8, 2024

This commit extends the functionality of the SME dense and matmul schedules to support operations with fp16 inputs and an fp32 output, where transpose_a=False and transpose_b=True.
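
As a rough illustration of the operation described above (not the SME schedule itself), the following pure-Python sketch computes a dense layer with `transpose_a=False` and `transpose_b=True`, rounding the inputs to fp16 and accumulating in fp32. The helper names `to_fp16`, `to_fp32` and `dense_fp16_fp32` are hypothetical and do not come from this PR.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dense_fp16_fp32(a, b):
    """Reference dense: a is M x K, b is N x K (i.e. transpose_b=True).

    out[i][j] = sum_k a[i][k] * b[j][k], with fp16 inputs and an
    accumulator that is rounded to fp32 after every addition.
    """
    m, k, n = len(a), len(a[0]), len(b)
    assert all(len(row) == k for row in a), "ragged A"
    assert all(len(row) == k for row in b), "B must be N x K"
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for kk in range(k):
                # Inputs are quantised to fp16 before the multiply.
                prod = to_fp16(a[i][kk]) * to_fp16(b[j][kk])
                # Approximate fp32 accumulation by rounding each partial sum.
                acc = to_fp32(acc + prod)
            out[i][j] = acc
    return out


a = [[1.0, 2.0], [3.0, 4.0]]              # M=2, K=2
b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # N=3, K=2
result = dense_fp16_fp32(a, b)
```

Because `b` is stored as N x K, each output element is a dot product of one row of `a` with one row of `b`, so both operands are read along their contiguous dimension.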

For convenience, it also adds a utility called get_vscale_factor which creates the correct multiplier for vscale given a data type, reflecting ideas from an early design of the SVE RFC.
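
To make the idea of a per-data-type `vscale` multiplier concrete, here is a minimal sketch in plain Python. It assumes the SVE convention that vector lengths are multiples of 128 bits, so a vector holds `(128 / element bits) * vscale` elements. The function `vscale_multiplier`, the `DTYPE_BITS` table and `lanes_for` are hypothetical names for illustration; the actual `get_vscale_factor` in this PR may have a different signature and is likely to work on TIR expressions rather than plain integers.

```python
SVE_GRANULE_BITS = 128  # assumption: SVE vector lengths are multiples of 128 bits

# Hypothetical lookup table; a real implementation would query the dtype itself.
DTYPE_BITS = {"int8": 8, "float16": 16, "float32": 32, "float64": 64}


def vscale_multiplier(dtype, granule_bits=SVE_GRANULE_BITS):
    """Return how many elements of `dtype` fit in one 128-bit granule."""
    bits = DTYPE_BITS[dtype]
    if granule_bits % bits != 0:
        raise ValueError(f"{dtype} does not evenly divide {granule_bits} bits")
    return granule_bits // bits


def lanes_for(dtype, vscale):
    """Number of lanes in a scalable vector for a concrete runtime vscale."""
    return vscale_multiplier(dtype) * vscale


fp32_factor = vscale_multiplier("float32")
fp16_factor = vscale_multiplier("float16")
```

With these assumptions the multiplier is 4 for fp32 and 8 for fp16, so an fp16 vector has twice as many lanes as an fp32 vector at the same `vscale`.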

Note: this commit depends on #16921 so also contains the contents of #16921.

This commit extends the functionality of the SME dense and matmul
schedules to support operations with fp16 inputs and an fp32 output,
where `transpose_a=False` and `transpose_b=True`.

For convenience, it also adds a utility called `get_vscale_factor`
which creates the correct multiplier for `vscale` given a data type,
reflecting ideas from an early design of the
[SVE](apache/tvm-rfcs#104) RFC.

Change-Id: I8c00bc6baf2df6015fa41200a238781126c73589
Change-Id: Ie7fb7a0a76119aa5c82e03ea0b2cc10de9f15f5e
lhutton1 force-pushed the sme-fp16-fp32-dense-schedule branch from d2a164c to 1fe9bac on May 15, 2024 12:11
Change-Id: I0e9e45b285082b42676e53e74158e11d7e08608b
Change-Id: I32273241ae7569b65e082759e4f2ca4355ac6933
lhutton1 marked this pull request as ready for review on May 16, 2024 07:58
lhutton1 (Contributor, Author) commented May 16, 2024

cc @ekalda @Anndrey24 @leandron

ekalda (Contributor) left a comment

Thanks @lhutton1, really cool stuff! I only have nits.

Review threads:
tests/python/relay/strategy/arm_cpu/test_dense.py (outdated, resolved)
python/tvm/tir/op.py (outdated, resolved)
python/tvm/tir/op.py (outdated, resolved)
python/tvm/tir/op.py (outdated, resolved)
python/tvm/topi/arm_cpu/dense_alter_op.py (resolved)
python/tvm/tir/tensor_intrin/arm_cpu.py (resolved)
Change-Id: I237b4c5cb5ca22e33529d98cbd75177b94904857
lhutton1 force-pushed the sme-fp16-fp32-dense-schedule branch from 0d2be71 to bc02e47 on May 22, 2024 16:37
2 participants