vllm-project / vllm Public

Fork 2.6k
Star 19.5k

Code
Issues 828
Pull requests 230
Discussions
Actions
Security
Insights

Pull requests: vllm-project/vllm

Labels 41 Milestones 0

230 Open 1,660 Closed

Pull requests list

[WIP] [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support)

#4837 opened May 15, 2024 by afeldman-nm • Draft

[Bugfix] fix rope error when load models with different dtypes

#4835 opened May 15, 2024 by jinzhen-lin

[Build/CI] Enabling AMD Entrypoints Test

#4834 opened May 15, 2024 by Alexei-V-Ivanov-AMD

[Hardware][Intel] Add LoRA adapter support for CPU backend

#4830 opened May 15, 2024 by Isotr0py • Draft

1

Support to serve vLLM on Kubernetes with LWS

#4829 opened May 15, 2024 by kerthcet

6

[Core][Distributed] remove graph mode function

#4818 opened May 14, 2024 by youkaichao

7

Add marlin unit tests and marlin benchmark script

#4815 opened May 14, 2024 by alexm-nm

1

[Bugfix][Model] Add base class for vision-language models

#4809 opened May 14, 2024 by DarkLight1337

[Speculative decoding] Enable TP>1 speculative decoding

#4808 opened May 14, 2024 by cadedaniel

[Doc] Add page for PoolingParams

#4800 opened May 14, 2024 by DarkLight1337

1

[Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model

#4799 opened May 14, 2024 by linxihui

[Build/CI] Extending the set of AMD tests with Regression, Basic Correctness, Distributed, Engine, Llava Tests

#4797 opened May 13, 2024 by Alexei-V-Ivanov-AMD

2

[Frontend] Support OpenAI batch file format

#4794 opened May 13, 2024 by wuisawesome

[CI/Build] PEP 517/518 improvements

#4791 opened May 13, 2024 by dtrifiro

1

Add GPTQ Marlin 2:4 sparse structured support

#4790 opened May 13, 2024 by alexm-nm

7

[Kernel] add bfloat16 support for gptq marlin kernel

#4788 opened May 13, 2024 by jinzhen-lin

[Lora] Support long context lora

#4787 opened May 13, 2024 by rkooo567

4

[Kernel] add bfloat16 support for gptq kernel

#4781 opened May 13, 2024 by jinzhen-lin

[Misc] Separate 'dtype' out as a parameter

#4778 opened May 13, 2024 by AllenDou

2

support QLoRA

#4776 opened May 12, 2024 by chenqianfzh

9

[core] SequenceController in SamplingParams

#4775 opened May 12, 2024 by mmoskal • Draft

Sync huggingface modifications of qwen Moe model

#4774 opened May 12, 2024 by eigen2017

7

[CI/Build] Platform agnostic wheel

#4773 opened May 12, 2024 by tomeras91

2

[Misc] Logits processor plugins

#4769 opened May 11, 2024 by NadavShmayo

5

[Kernel] sliding window support in paged_attention_v1/v2 kernels

#4768 opened May 11, 2024 by mmoskal • Draft

2
