(ICDAR 2024) SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection

Official implementation, based on MMOCR, of the paper "SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection".

🔍 Examples

[Example images: ground truth vs. prediction]

📄 Abstract

Text line detection is a key task in historical document analysis, and it faces several challenges: arbitrarily shaped text lines, dense text, and text lines with high aspect ratios. In this paper, we propose a general Segmentation-based framework for Historical document text detection (SegHist), which enables existing text detection methods to address these challenges effectively, especially text lines with high aspect ratios. Integrating the SegHist framework with the commonly used method DB++, we develop DB-SegHist. This approach achieves state-of-the-art (SOTA) results on the CHDAC and MTHv2 datasets and competitive results on the HDRC dataset, with a significant improvement of 1.19% (F-measure) on the most challenging dataset, CHDAC, which features more text lines with high aspect ratios. Moreover, our method attains SOTA results on rotated MTHv2 and rotated HDRC, demonstrating its robustness to rotation.

⚙️ Requirements

Install from the provided environment file:

conda env create -f environment.yml

Or install step by step:

conda create --name openmmlab python=3.8 -y
conda activate openmmlab
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 -c pytorch
pip install -U openmim
mim install mmengine==0.10.4 mmcv==2.0.1 mmdet==3.0.0 mmocr==1.0.0rc5
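
After installation, it can help to confirm that the environment actually contains the pinned versions. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the distribution names (`torch`, `mmcv`, and so on) are assumed to match what conda, pip, and mim register in the environment.

```python
from importlib import metadata

# Versions pinned in the step-by-step install commands above.
# Assumption: these keys are the distribution names registered at install time.
PINNED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "mmengine": "0.10.4",
    "mmcv": "2.0.1",
    "mmdet": "3.0.0",
    "mmocr": "1.0.0rc5",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned=PINNED, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # A local build tag such as "1.12.1+cu113" still counts as a match.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches().items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Running the script inside the activated `openmmlab` environment prints one line per mismatching package and nothing if every pinned version is present.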

🚀 Training

To train DB-SegHist, for example (to train another model, change the config file):

python -m torch.distributed.run --nproc_per_node=4 train.py --launcher pytorch --work-dir work_dirs/ config/seghist/seghist_resnet50-dcnv2_fpnc.py
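
When the GPU count or config file varies between runs, the command above can be assembled programmatically. This is a rough sketch under stated assumptions: `build_train_command` is a hypothetical helper, not part of the repository, and it only reproduces the flags shown in the command above.

```python
import shlex


def build_train_command(config, work_dir="work_dirs/", num_gpus=4):
    """Assemble the distributed training command as an argument list.

    Only the flags that appear in the README command are used here.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "python", "-m", "torch.distributed.run",
        f"--nproc_per_node={num_gpus}",  # one worker process per GPU
        "train.py",
        "--launcher", "pytorch",
        "--work-dir", work_dir,
        config,
    ]


if __name__ == "__main__":
    cmd = build_train_command(
        "config/seghist/seghist_resnet50-dcnv2_fpnc.py", num_gpus=2
    )
    # Print a shell-ready string for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues.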

🧠 Inference

python test.py --work-dir work_dirs/ config/seghist/seghist_resnet50-dcnv2_fpnc.py [your_checkpoint]

📚 Acquiring Data

The datasets we used can be obtained as follows:

CHDAC: Contact the dataset organizers by email or visit their official website.
MTHv2: https://github.com/HCIILAB/MTHv2_Datasets_Release
ICDAR 2019 HDRC: https://tc11.cvc.uab.es/datasets/ICDAR2019HDRC

🏆 Our Results on CHDAC

Method	P	R	F
EAST [Zhou et al. 2017]	61.41	73.13	66.76
Mask R-CNN [He et al. 2017]	89.03	80.90	84.77
Cascade R-CNN [Cai et al. 2018]	92.82	83.63	87.98
OBD [Liu et al. 2021]	94.73	81.52	87.63
TextSnake [Long et al. 2018]	96.33	89.62	92.85
PSENet [Wang et al. 2019]	76.99	89.62	82.83
PAN [Wang et al. 2019]	92.74	85.71	89.09
FCENet [Zhu et al. 2021]	88.42	85.04	86.70
DBNet++ [Liao et al. 2022]	91.39	89.15	90.26
HisDoc R-CNN [Jian et al. 2023]	98.19	93.74	95.92
PSE-SegHist (ours)	97.00	95.31	96.15
PAN-SegHist (ours)	97.52	94.77	96.12
DB-SegHist (ours)	98.36	95.88	97.11

*P, R, and F denote precision, recall, and F-measure (all in %), respectively, at an IoU threshold of 0.5.
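
To make the metric concrete, the sketch below shows how precision, recall, and F-measure follow from matching detections to ground truth at an IoU threshold of 0.5. It is an illustration only, not the repository's evaluation code: it assumes axis-aligned boxes (real text lines are polygons) and a simple greedy one-to-one matching, and the function names are hypothetical.

```python
from typing import List, Tuple

# Simplifying assumption: boxes are axis-aligned (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def evaluate(preds: List[Box], gts: List[Box], thr: float = 0.5):
    """Greedily match each prediction to its best unmatched ground truth."""
    matched = set()
    true_positives = 0
    for pred in preds:
        best_idx, best_iou = -1, 0.0
        for idx, gt in enumerate(gts):
            if idx in matched:
                continue
            value = iou(pred, gt)
            if value > best_iou:
                best_idx, best_iou = idx, value
        if best_idx >= 0 and best_iou >= thr:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(gts) if gts else 0.0
    return precision, recall, f_measure(precision, recall)


if __name__ == "__main__":
    # Two tall ground-truth lines; one prediction overlaps the first line.
    gts = [(0, 0, 10, 100), (20, 0, 30, 100)]
    preds = [(0, 0, 10, 90), (50, 0, 60, 100)]
    print(evaluate(preds, gts))
```

As a check against the table, `f_measure(61.41, 73.13)` gives approximately 66.76, which matches the EAST row. Some other rows may differ from the harmonic mean in the last digit, presumably because the reported P and R values are themselves rounded.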

🔒 LICENSE

This code is distributed under the Apache License. Please note that the datasets we rely on may not permit commercial use.

🔗 CITATION

@inproceedings{hu2024seghist,
  title={SegHist: A General Segmentation-Based Framework for Chinese Historical Document Text Line Detection},
  author={Hu, Xingjian and Wei, Baole and Gao, Liangcai and Wang, Jun},
  booktitle={International Conference on Document Analysis and Recognition},
  pages={391--410},
  year={2024},
  organization={Springer}
}

📧 CONTACT US

If you have any questions, please contact huxingjian@pku.edu.cn.
