Hierarchical Bayesian analysis of genetic diversity in geographically structured populations

Date of Completion

January 2005

Keywords

Biology, Biostatistics; Biology, Genetics; Statistics

Degree

Ph.D.

Abstract

Populations may become differentiated from one another as a result of genetic drift. The amounts and patterns of differentiation at neutral loci are determined by local population sizes, migration rates among populations, and mutation rates. We provide exact analytical expressions for the mean, variance and covariance of a stochastic model for hierarchically structured populations subject to migration, mutation, and drift. In addition to the expected correlation in allele frequencies among populations in the same geographical region, we demonstrate that there is a substantial correlation in allele frequencies among regions at the top level of the hierarchy. We propose a hierarchical Bayesian model for inference of Wright's F-statistics in a two-level hierarchy in which we estimate the among-region correlation in allele frequencies by substituting replication across loci for replication across time. We illustrate the approach through an analysis of human microsatellite data, revealing that approaches ignoring the among-population correlation of allele frequencies underestimate the amount of genetic differentiation among major geographical population groups by approximately 50%, and we discuss the implications of these results for the use and interpretation of F-statistics in evolutionary studies.

Microsatellite loci are widely used for investigating patterns of genetic variation within and among populations. Those patterns are in turn determined by population sizes, migration rates, and mutation rates. We provide exact expressions for the first two moments of a stochastic model appropriate for studying microsatellite evolution with migration, mutation, and drift under the assumption that the range of allele sizes is bounded. Using these results we study the behavior of several measures related to Wright's FST, including Slatkin's RST. Our analytical approximations show that familiar relationships between Nem and FST or RST hold when migration and mutation rates are small. Using the exact expressions, our numerical results show that when migration and mutation rates are large, these relationships no longer hold. Our numerical results also show that the diversity measures most closely related to FST depend on mutation rates, mutational models (stepwise versus two-phase), migration rates, and population sizes. Surprisingly, RST is relatively insensitive to mutation rates and mutational models. (Abstract shortened by UMI.)
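For readers unfamiliar with the "familiar relationships between Nem and FST" mentioned in the abstract, the short sketch below illustrates what is presumably meant: the classical infinite-island approximation, FST ≈ 1 / (4 Nem + 1), which holds only when migration and mutation rates are small. It is not the exact expression derived in the dissertation, and the function names `fst_from_nem` and `nem_from_fst` are hypothetical, chosen here for illustration.

```python
def fst_from_nem(nem: float) -> float:
    """Approximate F_ST from Ne*m (effective number of migrants per generation).

    Uses the classical infinite-island approximation F_ST ~ 1 / (4*Ne*m + 1).
    This is an assumption for illustration, valid only for small migration
    and mutation rates; it is not the dissertation's exact result.
    """
    if nem < 0:
        raise ValueError("Ne*m must be non-negative")
    return 1.0 / (4.0 * nem + 1.0)


def nem_from_fst(fst: float) -> float:
    """Invert the same approximation: Ne*m ~ (1/F_ST - 1) / 4."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    # Tabulate the approximation for a few values of Ne*m.
    for nem in (0.1, 0.5, 1.0, 5.0, 10.0):
        print(f"Ne*m = {nem:5.1f}  ->  approximate F_ST = {fst_from_nem(nem):.3f}")
```

Under this approximation, one migrant per generation (Nem = 1) corresponds to FST of about 0.2. The abstract's numerical results indicate that such back-of-the-envelope conversions break down when migration and mutation rates are large.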
