We present GDP, a generic document pretraining approach that learns rich representations from large-scale unlabeled document corpora. Our method improves document understanding across a range of downstream tasks by capturing the structural and semantic properties of documents through a unified pretraining objective.
@inproceedings{trivedi2024gdp,
  title     = {{GDP}: Generic Document Pretraining to Improve Document Understanding},
  author    = {Trivedi, Akkshita and Upadhyay, Akarsh and others},
  booktitle = {18th International Conference on Document Analysis and Recognition (ICDAR)},
  year      = {2024},
}