Computer Science and Engineering, Department of

 

Date of this Version

2016

Citation

In Proceedings of ICPR 2016, Cancun, Mexico

Abstract

Algorithmic methods are demonstrated for extracting information from table header elements, including data categories and data hierarchies. The table headers are located with the Minimum Index Point Search algorithm. The header-path alignment and header completion algorithms then yield database-ready table content and configuration statistics. The methods are applied to a random sample of 400 diverse tables with ground truth and 1120 tables without ground truth from international statistical data sites.
