Issues: flashinfer-ai/flashinfer
Open issues, newest first:
- #510: Will AOT compilation still be supported after JIT compilation is added? (opened Sep 25, 2024 by danieldk)
- #506: [feature request] Support moving num_layers into a KV cache page, or support a non-contiguous KV cache (opened Sep 25, 2024 by reyoung; see the layout sketch below)
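For context on the layout #506 asks for, here is a rough sketch contrasting the conventional per-layer paged KV cache with one whose pages carry every layer. All sizes, shapes, and variable names are illustrative assumptions, not flashinfer's actual API:

```python
import torch

# Illustrative sizes (assumptions, not taken from the issue).
num_layers, num_pages, page_size = 32, 1024, 16
num_kv_heads, head_dim = 8, 128

# Conventional layout: one contiguous [pages, K/V, page_size, heads, dim]
# buffer per layer, so each layer needs its own allocation.
kv_per_layer = [
    torch.empty(num_pages, 2, page_size, num_kv_heads, head_dim,
                dtype=torch.float16, device="cuda")
    for _ in range(num_layers)
]

# Layout requested in #506: fold the layer dimension into each page,
# so a single buffer and a single page table serve the whole model.
kv_layer_in_page = torch.empty(
    num_pages, num_layers, 2, page_size, num_kv_heads, head_dim,
    dtype=torch.float16, device="cuda",
)

# A per-layer view is then a strided slice that shares storage but is
# no longer contiguous, which is why the issue also asks for
# non-contiguous KV cache support.
layer0_view = kv_layer_in_page[:, 0]
assert not layer0_view.is_contiguous()
```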
- #452: SingleDecodeWithKVCache hits an illegal memory access when the input tensors are placed on cuda:1 [bug] (opened Aug 17, 2024 by jason-huang03; see the repro sketch below)
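A minimal reproduction of #452 presumably looks like the sketch below. single_decode_with_kv_cache is flashinfer's documented single-request decode entry point; the tensor shapes are assumptions and may differ from the original report:

```python
import torch
import flashinfer

num_qo_heads, num_kv_heads, head_dim, kv_len = 32, 8, 128, 2048

# Placing everything on the second GPU is what triggers the reported
# illegal memory access; the same call works on cuda:0.
dev = torch.device("cuda:1")
q = torch.randn(num_qo_heads, head_dim, dtype=torch.float16, device=dev)
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device=dev)
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device=dev)

o = flashinfer.single_decode_with_kv_cache(q, k, v)
```

A common workaround for this class of bug is to make cuda:1 the current device before the call, e.g. by wrapping it in `with torch.cuda.device(dev):`, so the kernel launches on the device that owns the tensors.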
- #397: [FEAT REQ][CUDA GRAPH] Allow an explicit control flag to force-enable/disable split KV (opened Jul 26, 2024 by AgrawalAmey)
- #249: CUDA Error: no kernel image is available for execution on the device (209), raised from /tmp/build-via-sdist-nl8se4dx/flashinfer-0.0.4+cu118torch2.2/include/flashinfer/attention/decode.cuh line 871 in cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize, smem_size) (opened May 16, 2024 by lucasjinreal; see the diagnostic sketch below)
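The error in #249 usually means the installed wheel was not compiled for the GPU's compute capability. A quick diagnostic, using only standard torch calls (torch's own arch list is a proxy here; a flashinfer wheel carries its own):

```python
import torch

# Which architecture is this GPU, and which ones was torch built for?
major, minor = torch.cuda.get_device_capability(0)
print(f"GPU compute capability: sm_{major}{minor}")
print(f"torch compiled for: {torch.cuda.get_arch_list()}")
```

If the GPU's `sm_XY` is missing from what the wheel targets, rebuilding from source with `TORCH_CUDA_ARCH_LIST` set to the right architecture is the typical recipe for torch CUDA extensions (an assumption here, not a command from flashinfer's docs).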
- #248: Circular import error when importing built-from-source flashinfer (opened May 15, 2024 by vedantroy)
- #166: Stack smashing detected in begin_forward when compiling directly from the repo (opened Mar 8, 2024 by mkrima)
- #139: Can I profile only the dense layer or the attention layer in flashinfer, rather than the whole kernel? (opened Feb 27, 2024 by yintao-he; see the profiling sketch below)
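Assuming #139 is asking how to scope a measurement to a single attention call rather than a full forward pass, one way is to wrap just that call in torch's profiler. Shapes are illustrative; the entry point is flashinfer's documented single-decode API:

```python
import torch
import flashinfer
from torch.profiler import profile, ProfilerActivity

q = torch.randn(32, 128, dtype=torch.float16, device="cuda")
k = torch.randn(2048, 8, 128, dtype=torch.float16, device="cuda")
v = torch.randn(2048, 8, 128, dtype=torch.float16, device="cuda")

# Profile only the attention kernel, not a whole model forward.
with profile(activities=[ProfilerActivity.CUDA]) as prof:
    flashinfer.single_decode_with_kv_cache(q, k, v)
    torch.cuda.synchronize()

print(prof.key_averages().table(sort_by="cuda_time_total"))
```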
- #125: How to use a low-bit KV cache in flashinfer? [enhancement] (opened Feb 18, 2024 by zhaoyang-star; see the quantization sketch below)
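As background for #125, the sketch below shows the storage-side transform a low-bit KV cache implies: quantizing an fp16 cache to fp8 with a per-head scale. Whether flashinfer kernels can consume fp8 directly depends on the version; the scaling scheme here is an illustrative assumption:

```python
import torch

kv_len, num_kv_heads, head_dim = 2048, 8, 128
k_fp16 = torch.randn(kv_len, num_kv_heads, head_dim,
                     dtype=torch.float16, device="cuda")

# Per-head absolute max -> per-head scale into the fp8 range.
amax = k_fp16.abs().amax(dim=(0, 2), keepdim=True).float()
scale = amax / torch.finfo(torch.float8_e4m3fn).max

# Quantized cache: half the memory of fp16 per element.
k_fp8 = (k_fp16.float() / scale).to(torch.float8_e4m3fn)

# Dequantize on read; a fused kernel would fold this into the matmul.
k_deq = k_fp8.float() * scale
```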