Computer Science and Engineering, Department of

 

First Advisor

Peter Z. Revesz

Date of this Version

12-2016

Document Type

Article

Citation

Ramanan, J. (2016). Testing the Independence Hypothesis of Accepted Mutations for Pairs of Adjacent Amino Acids in Protein Sequences (Master's thesis).

Comments

A thesis presented to the faculty of the Graduate College at the University of Nebraska in partial fulfillment of Requirements for the Degree of Master of Science, Major: Computer Science, under the supervision of Peter Z. Revesz, Lincoln, Nebraska: December, 2016

Copyright (c) 2016 Jyotsna Ramanan

Abstract

Evolutionary studies usually assume that genetic mutations are independent of each other. However, that does not imply that the observed mutations are independent of each other: when a nucleotide mutates, it may be biologically beneficial for an adjacent nucleotide to mutate too.

Hence the independence question also arises for pairs of adjacent amino acids within proteins. With the large number of decoded genes currently available in various genome libraries and online databases, it is now possible to conduct a large-scale computer-based study to test whether the independence assumption holds for such pairs. The independence question can be tested by considering the evolution of proteins within closely related sets of proteins, which are called protein families.

In this thesis, we test the independence hypothesis for three protein families from the PFAM library, which is a publicly available online database that records a growing number of protein families. For each protein family, we construct a hypothetical common ancestor, or consensus sequence. We compare the hypothetical common ancestor of a protein family with each of the descendant protein sequences in the family to determine where the mutations occurred during evolution. The comparison yields actual probabilities for each pair of amino acids changing into another pair of amino acids. By comparing the actual probabilities with the theoretical probabilities under the independence assumption, we identify anomalies that indicate that the independence assumption does not hold for many pairs of amino acids.
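The comparison described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy alignment, the function names, the column-wise majority rule for the consensus, and the pooling of single-residue substitution counts across all positions are assumptions made here for illustration. Under independence, the probability that an ancestral pair ab becomes cd would be the product P(a→c)·P(b→d), and the sketch reports that product next to the observed pair probability.

```python
from collections import Counter, defaultdict


def consensus(alignment):
    """Assumed rule: take the most common residue in each alignment column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*alignment))


def normalize(counts):
    """Turn nested counts {source: {target: n}} into conditional probabilities."""
    return {
        src: {tgt: n / sum(targets.values()) for tgt, n in targets.items()}
        for src, targets in counts.items()
    }


def single_probs(alignment, ancestor):
    """Estimate P(a -> c) for single residues, pooled over all positions."""
    counts = defaultdict(Counter)
    for seq in alignment:
        for a, c in zip(ancestor, seq):
            counts[a][c] += 1
    return normalize(counts)


def pair_probs(alignment, ancestor):
    """Estimate P(ab -> cd) for adjacent pairs, pooled over all positions."""
    counts = defaultdict(Counter)
    for seq in alignment:
        for i in range(len(ancestor) - 1):
            counts[ancestor[i:i + 2]][seq[i:i + 2]] += 1
    return normalize(counts)


def independence_gaps(alignment):
    """Map (ancestral pair, descendant pair) to (observed, expected) probabilities."""
    ancestor = consensus(alignment)
    single = single_probs(alignment, ancestor)
    pair = pair_probs(alignment, ancestor)
    gaps = {}
    for ab, targets in pair.items():
        for cd, observed in targets.items():
            # Expected value if the two positions mutate independently.
            expected = single[ab[0]][cd[0]] * single[ab[1]][cd[1]]
            gaps[(ab, cd)] = (observed, expected)
    return gaps


# Hypothetical gap-free toy alignment; real PFAM families are far larger.
alignment = ["ACDA", "ACDA", "GTDA", "ACDA"]
gaps = independence_gaps(alignment)
for (ab, cd), (observed, expected) in sorted(gaps.items()):
    print(f"{ab} -> {cd}: observed={observed:.4f} expected={expected:.4f}")
```

In this toy data the pair AC changes to GT in one of four sequences, so the observed probability is well above the product of the single-residue estimates. A gap of that kind is the sort of anomaly the abstract refers to, although a real analysis would also need to handle alignment gaps and assess statistical significance.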

Adviser: Peter Z. Revesz
