Pull requests: InternLM/lmdeploy
Pull requests list
Refactor converter about get_input_model_registered_name and get_output_model_registered_name_and_config
#1702
opened Jun 3, 2024 by
lvhan028
feat: skip invokeFlattenKV_v2_ when fp16 with CacheType::kBlock
#1683
opened May 29, 2024 by
zhyncs
Add interfaces to the pipeline to obtain logits and ppl
enhancement
WIP
#1652
opened May 24, 2024 by
irexyc
[Feature]: Support llava for pytorch engine
WIP
#1641
opened May 23, 2024 by
RunningLeon
1 task
[benchmark] optimize benchmark: counting tokenlizer tokens and error requests
#1607
opened May 17, 2024 by
NiuBlibing
fix: update api_server_backend.py to adapt latest gradio
improvement
#1541
opened May 3, 2024 by
kv-chiu
Optimize kernel launch for triton2.2.0 and triton2.3.0
improvement
#1499
opened Apr 25, 2024 by
grimoire
Add docs of support new vl model
documentation
#1332
opened Mar 22, 2024 by
irexyc
remove chat template config in turbomind engine
BC-breaking
#1161
opened Feb 20, 2024 by
irexyc
Visualize layer activations and weights to simplify the quantization process.
#607
opened Oct 24, 2023 by
HIT-cwh