Computer Science and Engineering, Department of



G. Nagy, D. Lopresti, M. Krishnamoorthy, Y. Lin, S. Mehta, and S. Seth, "A Nonparametric Classifier for Unsegmented Text", Document Recognition and Retrieval XI (part of IS&T/SPIE Int. Symposium on Electronic Imaging 2004), IS&T/SPIE, 6 pp., January 2004


Symbolic Indirect Correlation (SIC) is a new classification method for unsegmented patterns. SIC requires two levels of comparison. First, the feature sequence of an unknown query signal is matched against the feature sequence of a known multi-pattern reference signal. Then, the order of the matched features is compared with the order of the matches between each lexicon symbol string and the reference string in the lexical domain. The query is assigned to the lexicon string that matches best in this second comparison. Accuracy increases as classified feature-and-symbol strings are added to the reference string.
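The two levels of comparison can be illustrated with a minimal Python sketch. It is a toy under strong assumptions, not the authors' implementation: features are taken to be discrete tokens aligned one-to-one with the reference symbols, matching is exact on bigrams, and agreement in match order is scored by the length of the longest common subsequence. The names `ordered_matches`, `lcs_length`, and `classify` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def ordered_matches(probe: Sequence, reference: Sequence, n: int = 2) -> List[int]:
    """Return reference start positions of every n-gram shared with `probe`,
    listed in the order the n-grams occur in `probe`.
    Exact n-gram matching is an assumption made for this toy example."""
    hits = []
    for i in range(len(probe) - n + 1):
        gram = tuple(probe[i:i + n])
        for j in range(len(reference) - n + 1):
            if tuple(reference[j:j + n]) == gram:
                hits.append(j)
    return hits


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence, used here as an
    (assumed) measure of how well two match orders agree."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


def classify(query_features: Sequence, ref_features: Sequence,
             ref_symbols: Sequence, lexicon: List[str]) -> Tuple[str, Dict[str, int]]:
    """First level: match query features against reference features.
    Second level: compare that match order with the match order of each
    lexicon string against the reference symbol string."""
    query_order = ordered_matches(query_features, ref_features)
    scores = {
        word: lcs_length(query_order, ordered_matches(word, ref_symbols))
        for word in lexicon
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: the "feature" of each symbol is simply its uppercase form.
ref_symbols = "thecatsat"
ref_features = ref_symbols.upper()
best, scores = classify("HAT", ref_features, ref_symbols, ["the", "hat", "tea"])
print(best, scores)
```

In this toy setting the query is never segmented into symbols: it is compared only with the reference features, and the lexicon is compared only with the reference symbols. Appending more labeled material to `ref_symbols` and `ref_features` gives both comparisons more matches to work with, which is in line with the observation that accuracy increases as the reference string grows.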