Issues: flashinfer-ai/flashinfer
#258 [Bug report] BatchPrefillWithPagedKVCachePyTorchWrapper failed to dispatch group_size 3 (opened May 24, 2024 by merrymercy)
#254 Qwen1.5-32B failed: BatchPrefillWithPagedKVCachePyTorchWrapper failed to dispatch group_size 5 (opened May 23, 2024 by QwertyJack)
#250 Can BatchDecodeWithPaddedKVCache be used in cascade inference? (opened May 22, 2024 by joey12300)
#249 CUDA Error: no kernel image is available for execution on the device (209) /tmp/build-via-sdist-nl8se4dx/flashinfer-0.0.4+cu118torch2.2/include/flashinfer/attention/decode.cuh: line 871 at function cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize, smem_size) (opened May 16, 2024 by lucasjinreal)
#248 Circular import error when importing built-from-source flashinfer (opened May 15, 2024 by vedantroy)
#166 stack smashing detected in begin_forward when compiling directly from the repo (opened Mar 8, 2024 by mkrima)