Pull requests: vllm-project/vllm

Pull requests list

[Doc] Add page for PoolingParams
#4800 opened May 14, 2024 by DarkLight1337
[CI/Build] PEP 517/518 improvements
#4791 opened May 13, 2024 by dtrifiro
Add GPTQ Marlin 2:4 sparse structured support
#4790 opened May 13, 2024 by alexm-nm
[WIP] Support long context lora
#4787 opened May 13, 2024 by rkooo567
[Kernel] add bfloat16 support for gptq kernel
#4781 opened May 13, 2024 by jinzhen-lin
[Misc] Separate 'dtype' out as a parameter
#4778 opened May 13, 2024 by AllenDou
support QLoRA
#4776 opened May 12, 2024 by chenqianfzh
Sync huggingface modifications of qwen Moe model
#4774 opened May 12, 2024 by eigen2017
[CI/Build] Platform agnostic wheel
#4773 opened May 12, 2024 by tomeras91
[Misc] Logits processor plugins
#4769 opened May 11, 2024 by NadavShmayo
[Bugfix] Fix call to init_logger in openai server
#4765 opened May 11, 2024 by NadavShmayo
[Core][Bugfix]: fix prefix caching for blockv2
#4764 opened May 11, 2024 by leiwen83
[Core][Distributed] add fast broadcast for tensor dict
#4757 opened May 11, 2024 by youkaichao
1 task
[Kernel] Add w8a8 CUTLASS kernels
#4749 opened May 10, 2024 by tlrmchlsmth