Computer Science and Engineering, Department of
Date of this Version
2016
Citation
in Procs. ICPR 2016, Cancun, Mexico
Abstract
Algorithmic methods are demonstrated for extracting information from table header elements, including data categories and data hierarchies. The table headers are found with the Minimum Index Point Search algorithm. The header-path alignment and header completion algorithms then yield database-ready table content and configuration statistics. The methods are applied to a random sample of 400 diverse tables with ground truth and 1120 tables without ground truth from international statistical data sites.
Included in
Computer Engineering Commons, Electrical and Computer Engineering Commons, Other Computer Sciences Commons