diff --git a/README.md b/README.md
index 88b593f924..3fe3905e25 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,7 @@ ______________________________________________________________________

 ## News 🎉

+- \[2023/12\] Turbomind supports multimodal input. [Gradio Demo](./examples/vl/README.md)
 - \[2023/11\] Turbomind supports loading hf model directly. Click [here](./docs/en/load_hf.md) for details.
 - \[2023/11\] TurboMind major upgrades, including: Paged Attention, faster attention kernels without sequence length limitation, 2x faster KV8 kernels, Split-K decoding (Flash Decoding), and W4A16 inference for sm_75
 - \[2023/09\] TurboMind supports Qwen-14B
diff --git a/README_zh-CN.md b/README_zh-CN.md
index 1f3e24a36a..0f734d9cda 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -20,6 +20,7 @@ ______________________________________________________________________

 ## 更新 🎉

+- \[2023/12\] Turbomind 支持多模态输入。[Gradio Demo](./examples/vl/README.md)
 - \[2023/11\] Turbomind 支持直接读取 Huggingface 模型。点击[这里](./docs/en/load_hf.md)查看使用方法
 - \[2023/11\] TurboMind 重磅升级。包括:Paged Attention、更快的且不受序列最大长度限制的 attention kernel、2+倍快的 KV8 kernels、Split-K decoding (Flash Decoding) 和 支持 sm_75 架构的 W4A16
 - \[2023/09\] TurboMind 支持 Qwen-14B
diff --git a/examples/vl/app.py b/examples/vl/app.py
index bb1b109594..735fd573d4 100644
--- a/examples/vl/app.py
+++ b/examples/vl/app.py
@@ -180,7 +180,9 @@ def cancel(chatbot, session):

 def reset(session):
     stop(session)
-    return [], Session(), enable_btn
+    session._step = 0
+    session._message = []
+    return [], session, enable_btn

 with gr.Blocks(css=CSS, theme=THEME) as demo:
     with gr.Column(elem_id='container'):
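The `app.py` hunk changes `reset()` to clear the existing session in place (`_step = 0`, `_message = []`) instead of returning a brand-new `Session()`. A minimal self-contained sketch of that behavior follows; the `Session` fields and `reset()` body come from the diff, while the stand-in `stop()` and `enable_btn` definitions are assumptions (the real ones live in `examples/vl/app.py`):

```python
class Session:
    """Minimal stand-in for the demo's per-conversation state (assumed shape)."""

    def __init__(self):
        self._step = 0       # decoding progress so far
        self._message = []   # accumulated chat history


# Placeholder for the Gradio button-state update used in app.py (assumption).
enable_btn = {'interactive': True}


def stop(session):
    # In the real app this interrupts any in-flight generation for the
    # session; modeled here as a no-op.
    pass


def reset(session):
    # Per the diff: clear state on the existing session object rather than
    # constructing a fresh Session(), so the object identity — and anything
    # keyed to it, such as a pending request to cancel — is preserved.
    stop(session)
    session._step = 0
    session._message = []
    return [], session, enable_btn
```

Resetting in place means any other reference to the session (for example, one held by the cancel path shown in the hunk's context) continues to point at the live, now-cleared object.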