Hey!
Does FlashInfer have B200 FP8 single-prefill kernels other than the CUTLASS implementation? As far as I can see there are none, but I wanted to double-check. I plan to use them mainly for diffusion models. Since the batched prefill kernels are mostly designed for streaming, I believe the only option is the CUTLASS backend. What are your thoughts, and am I missing something?
Thanks