Date of Completion

8-2-2018

Embargo Period

8-1-2018

Advisors

Mukul S. Bansal, Ion I. Mandoiu, Yufeng Wu

Field of Study

Computer Science and Engineering

Degree

Master of Science

Open Access

Open Access

Abstract

Gene and sub-gene family evolution is usually represented in a framework where domain trees evolve inside one or more gene trees, each of which evolves inside a species tree. The Duplication-Transfer-Loss (DTL) reconciliation and the Domain-Gene-Species (DGS) reconciliation algorithms allow us to infer the evolutionary histories of a given set of species, genes, and protein domains. However, because the true evolutionary histories of these species, genes, and domains are not known from biological data, we must rely on simulated data to validate the accuracy of these methods. Although numerous probabilistic simulation frameworks exist for gene family evolution, none of them accounts for several important aspects of that process, such as replacing horizontal gene transfers, phylogenetic distance bias in the choice of transfer recipients, and variation in where a gene family is born in the species tree. Furthermore, no existing simulation framework can simulate sub-gene level events such as partial gene transfers and the evolution of domain families.

In this work, we modify an existing simulation framework to simulate both replacing and additive horizontal gene transfers, account for phylogenetic distance bias in choosing transfer recipients, and randomly select the location of gene birth in the species tree. In addition, we introduce the ability to simulate sub-gene level events such as partial gene transfers through the simulated evolution of protein domains within gene families.
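To make the transfer model described above more concrete, the following is a minimal Python sketch of how a recipient might be chosen with a phylogenetic distance bias and how additive and replacing transfers differ in their effect on gene copy counts. It is an illustration only, not the framework's actual implementation: the function names, the exponential decay of sampling weight with distance, the `alpha` and `p_replacing` parameters, and the choice to treat a replacing transfer into a species with no existing copy as a gain are all assumptions made for this example.

```python
import math
import random


def recipient_weights(distances, alpha=1.0):
    """Return sampling probabilities that decay with phylogenetic distance.

    `distances` maps each candidate recipient species to its distance from
    the donor. The exponential form and `alpha` are assumptions for this
    sketch; a larger `alpha` favors closer relatives more strongly.
    """
    raw = {sp: math.exp(-alpha * d) for sp, d in distances.items()}
    total = sum(raw.values())
    return {sp: w / total for sp, w in raw.items()}


def sample_transfer(distances, alpha=1.0, p_replacing=0.5, rng=None):
    """Pick a recipient species and a transfer type for one transfer event."""
    rng = rng or random.Random()
    probs = recipient_weights(distances, alpha)
    species = list(probs)
    recipient = rng.choices(species, weights=[probs[s] for s in species], k=1)[0]
    kind = "replacing" if rng.random() < p_replacing else "additive"
    return recipient, kind


def apply_transfer(gene_counts, recipient, kind):
    """Return updated per-species gene copy counts after one transfer.

    Additive: the recipient gains one copy.
    Replacing: the incoming copy displaces an existing copy, so the count
    is unchanged. If the recipient has no copy to displace, this sketch
    assumes the transfer behaves like an additive one.
    """
    updated = dict(gene_counts)
    current = updated.get(recipient, 0)
    if kind == "additive" or current == 0:
        updated[recipient] = current + 1
    return updated


if __name__ == "__main__":
    # Hypothetical distances from a donor species to three candidates.
    distances = {"sp_B": 0.2, "sp_C": 1.0, "sp_D": 3.0}
    rng = random.Random(0)
    recipient, kind = sample_transfer(distances, alpha=2.0, rng=rng)
    counts = apply_transfer({"sp_B": 1, "sp_C": 1, "sp_D": 0}, recipient, kind)
    print(recipient, kind, counts)
```

In this sketch, setting `alpha` to 0 removes the distance bias entirely, and setting `p_replacing` to 0 yields only additive transfers, which is the assumption made by DTL reconciliation.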

To demonstrate the utility of our new simulation framework, we systematically evaluate the accuracy of DTL reconciliation on simulated datasets that contain both additive and replacing transfers. Our results from this simulation study indicate that DTL reconciliation, which assumes that all transfers are additive, is surprisingly robust to the presence of replacing transfers.

Major Advisor

Mukul S. Bansal
