Python-based pipeline to prepare scanned PDFs in the DSCC collection for publication
Image correction -> OCR -> PDF resizing -> Cover page addition -> Metadata embedding -> Final PDF output
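As a rough illustration of the stage order above, the sketch below chains six placeholder functions so that each stage receives the previous stage's output. The names `passthrough`, `STAGES`, and `run_stages` are hypothetical and are not taken from this repository; the actual stages are run by src/pipeline.sh and may be organized differently.

```python
from pathlib import Path
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[Path], Path]]


def passthrough(pdf: Path) -> Path:
    """Hypothetical placeholder: a real stage would write and return a new file."""
    return pdf


# Stage order follows the pipeline description; every function is a stand-in.
STAGES: List[Stage] = [
    ("image correction", passthrough),
    ("OCR", passthrough),
    ("PDF resizing", passthrough),
    ("cover page addition", passthrough),
    ("metadata embedding", passthrough),
    ("final PDF output", passthrough),
]


def run_stages(pdf: Path, stages: List[Stage] = STAGES) -> Tuple[Path, List[str]]:
    """Feed each stage's output into the next and record the completion order."""
    completed: List[str] = []
    current = pdf
    for name, stage in stages:
        current = stage(current)
        completed.append(name)
    return current, completed
```

Keeping each stage as a function from path to path would make it straightforward to skip or reorder a step while testing, though that is a design assumption rather than a description of the existing scripts.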
- Place the PDF in data/input
- Add metadata to data/metadata.csv
- Run sh src/pipeline.sh
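Before running the pipeline, it may help to confirm that the inputs are in place. The following is a minimal sketch of such a check, not part of the repository: the function name `check_inputs` is hypothetical, and it assumes that data/metadata.csv begins with a header row, which the source does not confirm.

```python
import csv
from pathlib import Path
from typing import List


def check_inputs(root: Path = Path(".")) -> List[str]:
    """Return a list of problems found; an empty list means the inputs look ready."""
    problems: List[str] = []
    input_dir = root / "data" / "input"
    metadata = root / "data" / "metadata.csv"

    # Collect PDFs only if the input directory exists.
    pdfs = sorted(input_dir.glob("*.pdf")) if input_dir.is_dir() else []
    if not pdfs:
        problems.append(f"no PDF files found in {input_dir}")

    if not metadata.is_file():
        problems.append(f"missing metadata file {metadata}")
    else:
        with metadata.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        # Assumption: the first row is a header, so data starts on row two.
        if len(rows) < 2:
            problems.append(f"{metadata} has no data rows")
    return problems
```

Calling `check_inputs()` from the repository root returns an empty list when at least one PDF and one metadata row are present; otherwise each string in the result describes one missing piece.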
Written by Patrick J. Burns, ISAW Library; 2022-2023.