Novel protein functional analysis based on weighted & directed protein overlap network and adjusted entropy measurements

Yixiang Zhang, University of Nebraska - Lincoln


Protein functional analysis is one of the main branches of molecular biology research. During the past decade, several computational methods have been introduced to investigate protein function at the molecular level. In this thesis, we propose novel methods for both general protein function prediction and more specific functional analysis related to a particular disease. For general protein function prediction, we propose two new methods: 1) building a weighted and directed protein overlap network using association analysis; and 2) using an adjusted neighbor-counting criterion that incorporates information from domain composition. We also conduct specific protein functional analysis focusing on Influenza A viruses (IAV), proposing a novel entropy-based host-specific signature identification method that uses a similarity coefficient to account for amino acid substitutions. In addition, we provide a new approach, based on the dynamic time warping algorithm, for analyzing chronological and geographical effects on signature identification. For general protein function prediction, our validation experiments show that the proposed methods perform more reliably across organisms with different protein domain composition properties. For specific functional analysis, both simulation and real data analysis show the advantages of our new method in identification sensitivity and specificity. We believe our methods provide reliable and widely applicable protein function predictions and signature identifications for the research community.
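
As a rough illustration of the entropy idea behind host-specific signature identification, the sketch below computes plain Shannon entropy at each position of two aligned sequence groups and flags positions that are conserved within each host but differ between hosts. It is not the adjusted entropy measure proposed in the thesis: it does not include the similarity coefficient for amino acid substitutions, and the function names, the threshold, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def position_entropies(sequences):
    """Entropy at each position of equal-length, pre-aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


def candidate_signatures(group_a, group_b, max_entropy=0.5):
    """Positions conserved within each group whose majority residues differ.

    max_entropy is a hypothetical cutoff chosen for this sketch only.
    """
    hits = []
    for pos, (col_a, col_b) in enumerate(zip(zip(*group_a), zip(*group_b))):
        conserved = (column_entropy(col_a) <= max_entropy
                     and column_entropy(col_b) <= max_entropy)
        major_a = Counter(col_a).most_common(1)[0][0]
        major_b = Counter(col_b).most_common(1)[0][0]
        if conserved and major_a != major_b:
            hits.append(pos)
    return hits


# Hypothetical toy alignments, not real IAV data.
avian = ["MKTA", "MKTA", "MKSA"]
human = ["MRTA", "MRTA", "MRTA"]

if __name__ == "__main__":
    print(position_entropies(avian))
    print(candidate_signatures(avian, human))
```

In this toy input, only the second position (index 1) is fully conserved within each group while differing between them, so it is the only candidate returned. A substitution-aware measure such as the one described in the abstract would additionally weigh how similar the differing residues are.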

