@jcaip - Worth a try. Essentially you'd need to dequantize within the score mod (before the softmax), and the inputs would have to be quantized. I think at this point only the query and key could be quantized, because the values need to be matmul'd with the result of the softmax.
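For anyone following along, here is a rough sketch of that idea in plain Python. It deliberately does not use the real FlexAttention API; the helper names `quantize` and `attention_row` are made up for illustration, and it assumes symmetric int8 quantization with one scale per vector. The point is only to show where the dequantization sits: the integer query/key dot product is rescaled before the softmax, and the values stay in full precision.

```python
import math

def quantize(vec, num_bits=8):
    """Symmetric quantization (assumed scheme): returns (ints, scale)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(x) for x in vec)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [round(x / scale) for x in vec], scale

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_row(q, keys, values, quantized=True):
    """Attention output for a single query over all keys and values."""
    d = len(q)
    if quantized:
        q_int, q_scale = quantize(q)
        scores = []
        for k in keys:
            k_int, k_scale = quantize(k)
            # Integer dot product between quantized query and key.
            int_score = sum(a * b for a, b in zip(q_int, k_int))
            # Dequantize the score before the softmax (the "score mod" step).
            scores.append(int_score * q_scale * k_scale / math.sqrt(d))
    else:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    probs = softmax(scores)
    # Values are kept in full precision and combined with the softmax output.
    return [sum(p * v[j] for p, v in zip(probs, values))
            for j in range(len(values[0]))]

q = [0.5, -1.0, 0.25, 2.0]
keys = [[1.0, 0.5, -0.5, 0.25], [-0.75, 1.5, 0.1, 0.3], [0.2, -0.4, 0.9, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]

reference = attention_row(q, keys, values, quantized=False)
approx = attention_row(q, keys, values, quantized=True)
print(reference, approx)
```

Comparing `reference` and `approx` on your own inputs gives a quick feel for how much error the quantized scores introduce before deciding whether the speedup is worth it.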
Hi all, I saw this tweet and thought I'd share it. The accuracy degradation doesn't look too good, but maybe the speed makes it worth it?
https://x.com/papers_anon/status/1839131401322639805?s=46
To be clear: I am not requesting the feature, just mostly sharing it. Thanks! :)