The paper used 500 scanned Electronic Theses and Dissertation cover pages (i.e., front pages). The dataset contains several intermediate datasets, briefly discussed in the paper.