Horizontal gene transfer and innovation in genome evolution

Date of Completion

January 2004


Biology, Genetics; Biology, Microbiology




Horizontal gene transfer (HGT) plays an important role in shaping microbial genomes. In addition to genes under sporadic selection, HGT also affects housekeeping genes, genes involved in information processing, and even ribosomal RNA-encoding genes. Assessing the frequency and impact of HGT remains a major challenge in understanding microbial evolution. To address this challenge, new visualization tools were developed to depict the mosaicism of microbial genomes. Initially, maximum likelihood, Bayesian probability, and bootstrap support mapping were developed to depict the mosaicism in four genomes. Application of these tools to different genomes indicated that in some instances no majority or plurality signal is observed, and that genes from all functional categories are affected by HGT. The mapping tools were further improved to avoid the taxon-sampling problem and extended to five genomes. This method was applied to genome analyses of photosynthetic bacteria and to the study of different gene contributions to a eukaryotic genome. Bipartition plotting was developed to allow analysis of more than five genomes at the same time. Application of bipartition plotting to genomes of two bacterial phyla, cyanobacteria and gamma-proteobacteria, indicated that different classes of bacteria are affected by HGT to different extents.

Genes are not the smallest unit of HGT; they can themselves be mosaic. The extent of intra-gene mosaicism of ribosomal RNA, a widely used molecular marker in microbial taxonomy, was investigated.

In light of widespread HGT, it is difficult to reconstruct the evolutionary history of the three domains of life. Simulations of genes and organisms were performed using a simple model with a constant number of species and rates of speciation equal to the rates of extinction. The consequences of HGT for the concept of a most recent common ancestor of all living organisms are discussed.

To study the evolutionary history of the three domains of life, high-quality datasets of orthologous genes that are widely represented in known life forms are needed. Whole-genome analyses were performed to find ancient gene duplication events, reconstruct their evolutionary histories, and assess how congruent the evolutionary histories of those genes are.
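The simulation model mentioned in the abstract (a constant number of species, with speciation balanced by extinction) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `simulate`, the parameters `n_species` and `hgt_rate`, and the choice to pair exactly one extinction with one speciation per step are all assumptions made for illustration. Each species carries two labels, the founder of its organismal lineage and the founder of the gene copy it holds, and the sketch records how many steps pass before every surviving lineage traces back to a single founder.

```python
import random


def simulate(n_species=20, hgt_rate=0.0, max_steps=100_000, seed=0):
    """Toy model (hypothetical): constant species count, optional HGT.

    Returns a dict with the step at which all organismal lineages, and all
    gene copies, first trace back to a single founder, plus those founders.
    """
    rng = random.Random(seed)
    org = list(range(n_species))   # founder label of each organismal lineage
    gene = list(range(n_species))  # founder label of the gene each species carries
    result = {"org_steps": None, "org_founder": None,
              "gene_steps": None, "gene_founder": None}

    for step in range(1, max_steps + 1):
        # Assumption: one extinction is paired with one speciation per step,
        # so the number of species stays constant.
        dead, parent = rng.sample(range(n_species), 2)
        org[dead] = org[parent]
        gene[dead] = gene[parent]

        # Assumption: with probability hgt_rate, a donor's gene copy
        # replaces the recipient's copy (a transfer event).
        if rng.random() < hgt_rate:
            recipient, donor = rng.sample(range(n_species), 2)
            gene[recipient] = gene[donor]

        if result["org_steps"] is None and len(set(org)) == 1:
            result["org_steps"], result["org_founder"] = step, org[0]
        if result["gene_steps"] is None and len(set(gene)) == 1:
            result["gene_steps"], result["gene_founder"] = step, gene[0]
        if result["org_steps"] is not None and result["gene_steps"] is not None:
            break
    return result


if __name__ == "__main__":
    print(simulate(hgt_rate=0.0))
    print(simulate(hgt_rate=0.5))
```

Without transfer, the gene labels simply follow the organismal labels, so both coalesce at the same step in the same founder. With `hgt_rate` above zero, the gene copies may coalesce at a different time, and possibly in a different founder, than the organismal lineages. That is the kind of discrepancy that complicates the notion of a single most recent common ancestor.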