DocTrack Dataset

This dataset was created by the Natural Language Understanding and Human-Computer Interaction Laboratory of Shanghai University to support research on human-like visually-rich document understanding.

Note: The DocTrack dataset may be used only for non-commercial research purposes. Any person, institution, or company that wishes to use it commercially should contact us for a commercial license.

Description

DocTrack contains 539 images along with their eye-tracking order annotations. The original images are collected from the FUNSD, SEABILL, and InfographicVQA datasets. For more details, please refer to our paper, accepted to Findings of EMNLP 2023: DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading.
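As a quick sanity check after cloning, the image count can be tallied per top-level folder. The following is a minimal sketch, not part of the official tooling. The folder names `infograph`, `structured`, and `weak` are those in the repository listing. The image extensions, the `doctrack` path, and the helper name `count_images` are assumptions made for illustration, and the README does not describe the annotation file format, so annotations are not parsed here.

```python
from pathlib import Path

# Top-level folders as listed in the repository; adjust if your copy differs.
SUBSETS = ("infograph", "structured", "weak")
# Assumed image extensions; the README does not state the file format.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def count_images(root, subsets=SUBSETS, exts=IMAGE_EXTS):
    """Return {subset: number of image files found recursively under root/subset}."""
    counts = {}
    for name in subsets:
        folder = Path(root) / name
        if not folder.is_dir():
            # A missing folder is reported as zero rather than raising.
            counts[name] = 0
            continue
        counts[name] = sum(
            1
            for p in folder.rglob("*")
            if p.is_file() and p.suffix.lower() in exts
        )
    return counts


if __name__ == "__main__":
    counts = count_images("doctrack")  # hypothetical path to a local clone
    for name, n in counts.items():
        print(f"{name}: {n}")
    print("total:", sum(counts.values()))
```

If the extension assumption holds and each image is stored once, the total should be comparable to the 539 images stated above. A mismatch more likely points to a wrong assumption in this sketch than to a problem with the dataset.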

Citation

@misc{wang2023dc,
    title={DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading},
    author={Hao Wang and Qingxuan Wang and Yue Li and Changqing Wang and Chenhui Chu and Rui Wang},
    year={2023},
    eprint={2310.14802},
    archivePrefix={arXiv},
    primaryClass={cs.HC}
}
