Libraries at University of Nebraska-Lincoln


Date of this Version

Fall 10-8-2020


Heterogeneous and voluminous unstructured data is produced by various sources such as emails, social media tweets, reviews, videos, audio, images, PDFs, and scanned documents. Organizations need to store this wide range of unstructured data in greater volumes and for longer periods so that they can examine the information more deeply, make better decisions, and extract useful insights. Manual processing of such unstructured data is a challenging, time-consuming, and expensive task for any organization. Automating unstructured document processing with Optical Character Recognition (OCR) and Robotic Process Automation (RPA) has limitations, because those techniques are driven by rules or templates. A template or rule must be defined for every new input, which limits the use of rule-based or template-based techniques for unstructured document processing. These limitations call for a tool that can process unstructured documents using Artificial Intelligence techniques. This bibliometric survey on Cognitive Document Processing brings out these challenges in unstructured data processing. The survey is performed on scientific documents from the Scopus database. Various tools, such as Microsoft Excel, Sciencescape, VOSviewer, Leximancer, and Gephi, are used to draw the network data analysis diagrams. The study revealed that most publications on Cognitive Document Processing have appeared very recently. It is observed that universities and institutions in India are leading the research on this topic.


