This dataset was created by the Natural Language Understanding and Human-Computer Interaction Laboratory of Shanghai University with the purpose of research on human-like visually-rich document understanding.
Note: The DocTrack dataset should only be used for non-commercial research purposes. For any person/institution/company working on this direction, please contact us for a commercial license.
DocTrack contains 539 images along with their eye-tracking order annotations. The original images are collected from the FUNSD, SEABILL and Inforgraphic VQA datasets. For more details, please refer to our paper accepted by EMNLP2023(findings) DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading.
@misc{wang2023dc,
title={DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading},
author={Hao Wang, Qingxuan Wang, Yue Li, Changqing Wang, Chenhui Chu and Rui Wang},
year={2023},
eprint={2310.14802},
archivePrefix={arXiv},
primaryClass={cs.HC}
}