Computer Science and Engineering, Department of


First Advisor

Myra Cohen

Date of this Version



Mikaela Cashman. Using Software Testing Techniques to Infer Biological Models. Master’s thesis, University of Nebraska-Lincoln, 2016.


A THESIS Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Master of Science, Major: Computer Science, Under the Supervision of Professor Myra B. Cohen. Lincoln, Nebraska: May, 2016

Copyright © 2016 Mikaela Cashman


Years of research in software testing have given us novel ways to reason about and test the behavior of complex software systems that contain hundreds of thousands of lines of code. Many of these techniques have been inspired by nature, such as genetic algorithms, swarm intelligence, and ant colony optimization. However, they rely on a unidirectional analogy, taking from nature without giving back.

In this thesis we invert this view and ask whether we can utilize techniques from the testing and modeling of highly configurable software systems to aid the emerging field of systems biology, which aims to model and predict the behavior of biological organisms. As in configurable systems, the underlying source code (metabolic model) contains both common and variable code elements (reactions) that are executed only under particular configurations (environmental conditions), and these directly impact an organism’s observable behavior. We propose the use of sampling, classification, and modeling techniques commonly used in software testing and combine them into a process called BioSIMP, which can lead to simplified models and biological predictions.

We perform two case studies, the first of which explores and evaluates different classification techniques to infer influential factors in microbial organisms. We then compare several sampling methods to limit the number of experiments required in the laboratory. We show that we can reduce testing by more than two-thirds without negatively impacting the quality of our models. Finally, we perform an end-to-end case study on BioSIMP using both laboratory and simulation data and show that we can find influential environmental factors in two microbial organisms, some of which were previously unknown to biologists.

Our findings suggest that the configurable-software analogy holds, and we can identify the variable and common regions of reactions that change with respect to the environment.

Advisor: Myra B. Cohen