vllm-project / vllm Public

Fork 2.6k
Star 19.4k

Code
Issues 812
Pull requests 225
Discussions
Actions
Security
Insights

Pull requests: vllm-project/vllm

Labels 41 Milestones 0

225 Open 1,650 Closed

Pull requests list

[Doc] Add page for PoolingParams

#4800 opened May 14, 2024 by DarkLight1337

1

[Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model

#4799 opened May 14, 2024 by linxihui

[Build/CI] Extending the set of AMD tests with Regression, Basic Correctness, Distributed, Engine, Llava Tests

#4797 opened May 13, 2024 by Alexei-V-Ivanov-AMD

[Frontend] Support OpenAI batch file format

#4794 opened May 13, 2024 by wuisawesome

9

[CI/Build] PEP 517/518 improvements

#4791 opened May 13, 2024 by dtrifiro

1

Add GPTQ Marlin 2:4 sparse structured support

#4790 opened May 13, 2024 by alexm-nm

1

[Kernel] add bfloat16 support for gptq marlin kernel

#4788 opened May 13, 2024 by jinzhen-lin

6

[WIP] Support long context lora

#4787 opened May 13, 2024 by rkooo567

2

[Kernel] add bfloat16 support for gptq kernel

#4781 opened May 13, 2024 by jinzhen-lin

[Misc] Separate 'dtype' out as a parameter

#4778 opened May 13, 2024 by AllenDou

2

support QLoRA

#4776 opened May 12, 2024 by chenqianfzh

8

[core] SequenceController in SamplingParams

#4775 opened May 12, 2024 by mmoskal • Draft

Sync huggingface modifications of qwen Moe model

#4774 opened May 12, 2024 by eigen2017

3

[CI/Build] Platform agnostic wheel

#4773 opened May 12, 2024 by tomeras91

2

[Misc] Logits processor plugins

#4769 opened May 11, 2024 by NadavShmayo

1

[Kernel] sliding window support in paged_attention_v1/v2 kernels

#4768 opened May 11, 2024 by mmoskal • Draft

2

[Bugfix] Fix call to init_logger in openai server

#4765 opened May 11, 2024 by NadavShmayo

[Core][Bugfix]: fix prefix caching for blockv2

#4764 opened May 11, 2024 by leiwen83

1

[CI/Build] Enable entrypoints tests to be run in a single command

#4759 opened May 11, 2024 by DarkLight1337

[Frontend] Re-enable custom roles in Chat Completions API

#4758 opened May 11, 2024 by DarkLight1337

[Core][Distributed] add fast broadcast for tensor dict

#4757 opened May 11, 2024 by youkaichao

1 task

Support fp8 KV cache in context_attention_fwd

#4753 opened May 10, 2024 by Yard1 • Draft

[Kernel] Add w8a8 CUTLASS kernels

#4749 opened May 10, 2024 by tlrmchlsmth

[CI/Build] use setuptools-scm to set __version__

#4738 opened May 10, 2024 by dtrifiro • Draft

3

[CI/Build] Enforce style for C++ and CUDA code with clang-format

#4722 opened May 9, 2024 by mgoin

3
