LumionHXJ/SegHist


(ICDAR 2024) SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection

Official implementation based on MMOCR for paper "SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection".

🔍 Examples

[Figure: two examples, each pairing a ground-truth image (left) with the corresponding prediction (right).]

📄 Abstract

Text line detection is a key task in historical document analysis, and it faces several challenges: arbitrary-shaped text lines, dense text, and text lines with high aspect ratios. In this paper, we propose a general Segmentation-based framework for Historical document text detection (SegHist), which enables existing text detection methods to address these challenges effectively, especially text lines with high aspect ratios. Integrating the SegHist framework with the commonly used method DB++, we develop DB-SegHist. This approach achieves state-of-the-art (SOTA) results on the CHDAC and MTHv2 datasets and competitive results on the HDRC dataset, with a significant improvement of 1.19% on the most challenging CHDAC dataset, which features more text lines with high aspect ratios. Moreover, our method attains SOTA results on rotated MTHv2 and rotated HDRC, demonstrating its rotational robustness.

⚙️ Requirements

Install from the environment file:

conda env create -f environment.yml

Or install step by step:

conda create --name openmmlab python=3.8 -y
conda activate openmmlab
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 -c pytorch
pip install -U openmim
mim install mmengine==0.10.4 mmcv==2.0.1 mmdet==3.0.0 mmocr==1.0.0rc5
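
After installing, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: it reads installed versions with the standard-library `importlib.metadata` and compares them with the pins from the step-by-step commands. The helper name `find_mismatches` is hypothetical, and it assumes the conda `pytorch` package is registered under the distribution name `torch`.

```python
from importlib import metadata

# Pins copied from the step-by-step install commands.
# Assumption: the conda "pytorch" package registers as the distribution "torch".
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "torchaudio": "0.12.1",
    "mmengine": "0.10.4",
    "mmcv": "2.0.1",
    "mmdet": "3.0.0",
    "mmocr": "1.0.0rc5",
}


def find_mismatches(expected):
    """Return {name: (wanted, found)} for packages that are missing or differ."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Some builds append a local suffix such as "+cu113"; compare the public part only.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

If the script prints nothing, every listed package was found at the pinned version.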

🚀 Training

Training DB-SegHist as an example (other models are trained the same way, using their own config file in place of the one below):

python -m torch.distributed.run --nproc_per_node=4 train.py --launcher pytorch --work-dir work_dirs/ config/seghist/seghist_resnet50-dcnv2_fpnc.py

🧠 Inferencing

python test.py --work-dir work_dirs/ config/seghist/seghist_resnet50-dcnv2_fpnc.py [your_checkpoint]
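
When scripting several runs, the training and inference commands above can be assembled programmatically. This is only a sketch under stated assumptions: the helper names `build_train_cmd` and `build_test_cmd` are hypothetical, the checkpoint path in the usage lines is a placeholder, and only the flags shown in the commands above are used.

```python
import shlex

DEFAULT_CONFIG = "config/seghist/seghist_resnet50-dcnv2_fpnc.py"


def build_train_cmd(config=DEFAULT_CONFIG, work_dir="work_dirs/", gpus=4):
    """Assemble the distributed training command as an argument list."""
    return [
        "python", "-m", "torch.distributed.run",
        f"--nproc_per_node={gpus}",  # one process per GPU
        "train.py", "--launcher", "pytorch",
        "--work-dir", work_dir,
        config,
    ]


def build_test_cmd(checkpoint, config=DEFAULT_CONFIG, work_dir="work_dirs/"):
    """Assemble the inference command as an argument list."""
    return ["python", "test.py", "--work-dir", work_dir, config, checkpoint]


if __name__ == "__main__":
    # Print the commands for review; pass the lists to subprocess.run to execute them.
    print(shlex.join(build_train_cmd(gpus=2)))
    print(shlex.join(build_test_cmd("work_dirs/your_checkpoint.pth")))  # placeholder path
```

Building the command as a list rather than a single string avoids quoting problems if a path contains spaces.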

📚 Acquiring Data

The data we used can be accessed as follows:

🏆 Our Results on CHDAC

| Method | P (%) | R (%) | F (%) |
| --- | --- | --- | --- |
| EAST [Zhou et al. 2017] | 61.41 | 73.13 | 66.76 |
| Mask R-CNN [He et al. 2017] | 89.03 | 80.90 | 84.77 |
| Cascade R-CNN [Cai et al. 2018] | 92.82 | 83.63 | 87.98 |
| OBD [Liu et al. 2021] | 94.73 | 81.52 | 87.63 |
| TextSnake [Long et al. 2018] | 96.33 | 89.62 | 92.85 |
| PSENet [Wang et al. 2019] | 76.99 | 89.62 | 82.83 |
| PAN [Wang et al. 2019] | 92.74 | 85.71 | 89.09 |
| FCENet [Zhu et al. 2021] | 88.42 | 85.04 | 86.70 |
| DBNet++ [Liao et al. 2022] | 91.39 | 89.15 | 90.26 |
| HisDoc R-CNN [Jian et al. 2023] | 98.19 | 93.74 | 95.92 |
| PSE-SegHist (ours) | 97.00 | 95.31 | 96.15 |
| PAN-SegHist (ours) | 97.52 | 94.77 | 96.12 |
| DB-SegHist (ours) | 98.36 | 95.88 | 97.11 |

*P, R, and F indicate the precision, recall, and F-measure, respectively, at an IoU threshold of 0.5.
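
For reference, the F-measure is the harmonic mean of precision and recall. The sketch below shows the standard formula, not the repository's evaluation code; the function names and the count-based helper are illustrative assumptions. Because the P and R values listed here are already rounded, recomputing F from them can differ from the reported F in the last digit.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf_from_counts(num_matched, num_pred, num_gt):
    """Precision, recall, and F from match counts.

    num_matched: predictions matched to a ground-truth line (e.g. IoU >= 0.5)
    num_pred:    total number of predicted text lines
    num_gt:      total number of ground-truth text lines
    """
    precision = num_matched / num_pred if num_pred else 0.0
    recall = num_matched / num_gt if num_gt else 0.0
    return precision, recall, f_measure(precision, recall)


if __name__ == "__main__":
    # EAST row: P = 61.41, R = 73.13
    print(round(f_measure(61.41, 73.13), 2))  # 66.76
```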

🔒 LICENSE

This code is distributed under the Apache License. Please note that the datasets we rely on may not permit commercial use.

🔗 CITATION

@inproceedings{hu2024seghist,
  title={SegHist: A General Segmentation-Based Framework for Chinese Historical Document Text Line Detection},
  author={Hu, Xingjian and Wei, Baole and Gao, Liangcai and Wang, Jun},
  booktitle={International Conference on Document Analysis and Recognition},
  pages={391--410},
  year={2024},
  organization={Springer}
}

📧 CONTACT US

If you have any questions, please contact huxingjian@pku.edu.cn.
