Computer Science and Engineering, Department of

 

Computer Science, Computer Engineering, and Bioinformatics: Conference and Workshop Papers

Date of this Version

2016

Document Type

Article

Citation

In Proceedings of ICPR 2016, Cancun, Mexico

Abstract

Algorithmic methods are demonstrated for extracting information from table header elements, including data categories and data hierarchies. The table headers are located with the Minimum Index Point Search algorithm. The header-path alignment and header completion algorithms then yield database-ready table content and configuration statistics. The methods are applied to a random sample of 400 diverse tables with ground truth and to 1120 tables without ground truth, all drawn from international statistical data sites.
