LMDeploy Release V0.2.3
What's Changed
🚀 Features
💥 Improvements
- Remove caching tokenizer.json by @grimoire in #1074
- Refactor get_logger to remove the dependency of MMLogger from mmengine by @yinfan98 in #1064
- Use TM_LOG_LEVEL environment variable first by @zhyncs in #1071
- Speed up the initialization of w8a8 model for torch engine by @yinfan98 in #1088
- Make logging.logger's behavior consistent with MMLogger by @irexyc in #1092
- Remove owned_session for torch engine by @grimoire in #1097
- Unify engine initialization in pipeline by @irexyc in #1085
- Add skip_special_tokens in GenerationConfig by @grimoire in #1091
- Use default stop words for turbomind backend in pipeline by @irexyc in #1119
- Add input_token_len to Response and update Response document by @AllentDan in #1115
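The TM_LOG_LEVEL change above makes the environment variable take precedence over the configured default. A minimal sketch of that lookup pattern, using only the standard library (the function name and the WARNING fallback here are illustrative, not LMDeploy's actual implementation):

```python
import logging
import os

def resolve_log_level(default: str = "WARNING") -> int:
    """Return a logging level, preferring the TM_LOG_LEVEL env var."""
    # Read the environment variable first; fall back to `default`.
    name = os.environ.get("TM_LOG_LEVEL", default).upper()
    # Map the level name ("INFO", "DEBUG", ...) to its numeric value,
    # falling back to WARNING for unrecognized names.
    level = getattr(logging, name, None)
    return level if isinstance(level, int) else logging.WARNING
```

With this pattern, `TM_LOG_LEVEL=DEBUG` overrides whatever level the caller passes in, which matches the "environment variable first" behavior described in the bullet.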
🐞 Bug fixes
- Fix fast tokenizer swallowing a prefix space when there are too many white spaces by @AllentDan in #992
- Fix turbomind CUDA runtime error "invalid argument" by @zhyncs in #1100
- Add safety check for incremental decode by @AllentDan in #1094
- Fix device type of get_ppl for turbomind by @RunningLeon in #1093
- Fix pipeline init turbomind from workspace by @irexyc in #1126
- Add dependency version check and fix ignore_eos logic by @grimoire in #1099
- Change configuration_internlm.py to configuration_internlm2.py by @HIT-cwh in #1129
📚 Documentation
🌐 Other
New Contributors
Full Changelog: v0.2.2...v0.2.3