Pull requests: vllm-project/vllm
Pull requests list
[Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model
#4799
opened May 14, 2024 by
linxihui
[Build/CI] Extending the set of AMD tests with Regression, Basic Correctness, Distributed, Engine, Llava Tests
#4797
opened May 13, 2024 by
Alexei-V-Ivanov-AMD
[Kernel] add bfloat16 support for gptq marlin kernel
#4788
opened May 13, 2024 by
jinzhen-lin
[CI/Build] Enable entrypoints tests to be run in a single command
#4759
opened May 11, 2024 by
DarkLight1337
[Frontend] Re-enable custom roles in Chat Completions API
#4758
opened May 11, 2024 by
DarkLight1337
[Core][Distributed] add fast broadcast for tensor dict
#4757
opened May 11, 2024 by
youkaichao
1 task
[CI/Build] Enforce style for C++ and CUDA code with clang-format
#4722
opened May 9, 2024 by
mgoin