diff --git a/README.md b/README.md
index 88b593f924..3fe3905e25 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,7 @@ ______________________________________________________________________

 ## News 🎉

+- \[2023/12\] Turbomind supports multimodal input. [Gradio Demo](./examples/vl/README.md)
 - \[2023/11\] Turbomind supports loading hf model directly. Click [here](./docs/en/load_hf.md) for details.
 - \[2023/11\] TurboMind major upgrades, including: Paged Attention, faster attention kernels without sequence length limitation, 2x faster KV8 kernels, Split-K decoding (Flash Decoding), and W4A16 inference for sm_75
 - \[2023/09\] TurboMind supports Qwen-14B
diff --git a/README_zh-CN.md b/README_zh-CN.md
index 1f3e24a36a..0f734d9cda 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -20,6 +20,7 @@ ______________________________________________________________________

 ## 更新 🎉

+- \[2023/12\] Turbomind 支持多模态输入。[Gradio Demo](./examples/vl/README.md)
 - \[2023/11\] Turbomind 支持直接读取 Huggingface 模型。点击[这里](./docs/en/load_hf.md)查看使用方法
 - \[2023/11\] TurboMind 重磅升级。包括:Paged Attention、更快的且不受序列最大长度限制的 attention kernel、2+倍快的 KV8 kernels、Split-K decoding (Flash Decoding) 和 支持 sm_75 架构的 W4A16
 - \[2023/09\] TurboMind 支持 Qwen-14B
diff --git a/examples/vl/app.py b/examples/vl/app.py
index bb1b109594..735fd573d4 100644
--- a/examples/vl/app.py
+++ b/examples/vl/app.py
@@ -180,7 +180,9 @@ def cancel(chatbot, session):

 def reset(session):
     stop(session)
-    return [], Session(), enable_btn
+    session._step = 0
+    session._message = []
+    return [], session, enable_btn

 with gr.Blocks(css=CSS, theme=THEME) as demo:
     with gr.Column(elem_id='container'):
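The `app.py` hunk changes `reset()` to clear the existing session in place (`_step = 0`, `_message = []`) instead of returning a brand-new `Session()`. A minimal self-contained sketch of that behavior follows; the `Session` fields and `reset()` body come from the diff, while the stand-in `stop()` and `enable_btn` definitions are assumptions (the real ones live in `examples/vl/app.py`):

```python
class Session:
    """Minimal stand-in for the demo's per-conversation state (assumed shape)."""

    def __init__(self):
        self._step = 0       # decoding progress so far
        self._message = []   # accumulated chat history


# Placeholder for the Gradio button-state update used in app.py (assumption).
enable_btn = {'interactive': True}


def stop(session):
    # In the real app this interrupts any in-flight generation for the
    # session; modeled here as a no-op.
    pass


def reset(session):
    # Per the diff: clear state on the existing session object rather than
    # constructing a fresh Session(), so the object identity — and anything
    # keyed to it, such as a pending request to cancel — is preserved.
    stop(session)
    session._step = 0
    session._message = []
    return [], session, enable_btn
```

Resetting in place means any other reference to the session (for example, one held by the cancel path shown in the hunk's context) continues to point at the live, now-cleared object.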