Release LMDeploy Release V0.2.6 · InternLM/lmdeploy

Highlight

Support vision-languange models (VLM) inference pipeline and serving.
Currently, it supports the following models, Qwen-VL-Chat, LLaVA series v1.5, v1.6 and Yi-VL

VLM Inference Pipeline

from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('liuhaotian/llava-v1.6-vicuna-7b')

image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
response = pipe(('describe this image', image))
print(response)

Please refer to the detailed guide from here

VLM serving by openai compatible server

lmdeploy server api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 8000

VLM Serving by gradio

lmdeploy serve gradio liuhaotian/llava-v1.6-vicuna-7b --server-port 6006

What's Changed

🚀 Features

Add inference pipeline for VL models by @irexyc in #1214
Support serving VLMs by @AllentDan in #1285
Serve VLM by gradio by @irexyc in #1293
Add pipeline.chat api for easy use by @irexyc in #1292

💥 Improvements

Hide qos functions from swagger UI if not applied by @AllentDan in #1238
Color log formatter by @grimoire in #1247
optimize filling kv cache kernel in pytorch engine by @grimoire in #1251
Refactor chat template and support accurate name matching. by @AllentDan in #1216
Support passing json file to chat template by @AllentDan in #1200
upgrade peft and check adapters by @grimoire in #1284
better cache allocation in pytorch engine by @grimoire in #1272
Fall back to base template if there is no chat_template in tokenizer_config.json by @AllentDan in #1294

🐞 Bug fixes

lazy load convert_pv jit function by @grimoire in #1253
[BUG] fix the case when num_used_blocks < 0 by @jjjjohnson in #1277
Check bf16 model in torch engine by @grimoire in #1270
fix bf16 check by @grimoire in #1281
[Fix] fix triton server chatbot init error by @AllentDan in #1278
Fix concatenate issue in profile serving by @ispobock in #1282
fix torch tp lora adapter by @grimoire in #1300
Fix crash when api_server loads a turbomind model by @irexyc in #1304

📚 Documentations

fix config for readthedocs by @RunningLeon in #1245
update badges in README by @lvhan028 in #1243
Update serving guide including api_server and gradio by @lvhan028 in #1248
rename restful_api.md to api_server.md by @lvhan028 in #1287
Update readthedocs index by @lvhan028 in #1288

🌐 Other

Parallelize testcase and refactor test workflow by @zhulinJulia24 in #1254
Accelerate sample request in benchmark script by @ispobock in #1264
Update eval ci cfg by @RunningLeon in #1259
Test case bugfix and add restful interface testcases. by @zhulinJulia24 in #1271
bump version to v0.2.6 by @lvhan028 in #1299

New Contributors

@jjjjohnson made their first contribution in #1277

Full Changelog: v0.2.5...v0.2.6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LMDeploy Release V0.2.6