Computer Science and Engineering, Department of

 

Date of this Version

2-15-2013

Citation

P. Z. Revesz, C. Assi, Data mining the functional characterizations of proteins to predict their cancer-relatedness, International Journal of Biology and Biomedical Engineering, 7 (1), 7-14, 2013.

Comments

OPEN ACCESS journal.

Christopher Assi, M.S. in Computer Science, University of Nebraska-Lincoln, 2012.

Abstract

This paper considers two types of protein data. The first is data about protein function, described in a number of ways, such as GO terms and PFAM families. The second is data about whether individual proteins are experimentally associated with cancer through an anomalous elevation or lowering of their expression within cancerous cells. We combine these two types of data and test whether the first type, the functional descriptors, can predict the second, cancer-relatedness. Using data mining and machine learning, we derive a classifier algorithm that, using only the GO term and PFAM family descriptions of a protein, can predict with over 73 percent accuracy whether it is associated with pancreatic cancer.
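As a rough illustration of the setup described in the abstract, the sketch below treats each protein as a set of functional descriptor strings and trains a small classifier to predict a cancer-related label. The abstract does not say which classifier or feature encoding the paper uses, so everything here is an assumption: a Bernoulli naive Bayes model is swapped in purely for illustration, the function names `train` and `predict` are hypothetical, and the descriptors (`GO:a`, `PF:x`, and so on) and labels are made-up placeholders, not real GO terms, PFAM families, or experimental results.

```python
import math
from collections import Counter


def train(proteins, labels):
    """Fit a Bernoulli naive Bayes model on sets of descriptor strings.

    proteins: list of sets of descriptor strings (hypothetical placeholders)
    labels:   list of booleans, True meaning "cancer-related"
    """
    vocab = sorted(set().union(*proteins))
    class_counts = Counter(labels)
    # feature_counts[c][d] = number of class-c proteins annotated with d
    feature_counts = {c: Counter() for c in class_counts}
    for descriptors, label in zip(proteins, labels):
        feature_counts[label].update(descriptors)
    return vocab, class_counts, feature_counts


def predict(model, descriptors):
    """Return the label with the highest log-posterior for one protein."""
    vocab, class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # class prior
        for d in vocab:
            # Laplace-smoothed probability that a class-c protein has d
            p = (feature_counts[c][d] + 1) / (n_c + 2)
            score += math.log(p if d in descriptors else 1 - p)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Toy training data: placeholder descriptors, not real annotations.
train_proteins = [
    {"GO:a", "PF:x"},
    {"GO:a", "GO:b", "PF:x"},
    {"GO:c", "PF:y"},
    {"GO:c", "GO:d", "PF:y"},
]
train_labels = [True, True, False, False]

model = train(train_proteins, train_labels)
prediction = predict(model, {"GO:a", "PF:x"})
```

Descriptors that never appeared in training are simply ignored by `predict`. An accuracy figure like the one reported in the abstract would come from evaluating such a model on proteins held out from training, which this toy example does not attempt.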
