Date of Completion
5-7-2015
Embargo Period
5-6-2015
Keywords
computational biology, bioinformatics, scaffolding, genome assembly, biomarker selection, deconvolution
Major Advisor
Ion Mandoiu
Associate Advisor
Craig Nelson
Associate Advisor
Yufeng Wu
Associate Advisor
Sanguthevar Rajasekaran
Associate Advisor
Alexander Zelikovsky
Field of Study
Computer Science and Engineering
Degree
Doctor of Philosophy
Open Access
Open Access
Abstract
The problem of interpreting biological data is often cast into a mathematical optimization framework where a large body of existing computational theory and practical techniques can be leveraged. While this strategy has been particularly successful in the bioinformatics domain, the massive datasets generated by high-throughput genomic technologies are challenging the scalability of even the most advanced mathematical optimization algorithms. Indeed, as the cost per base of DNA sequencing has dropped precipitously, even outpacing Moore's law, the size of many bioinformatics problems has grown beyond the limit of existing methods, necessitating new algorithms. This effect is felt even more acutely in the burgeoning field of single-cell biology, where advances in microfluidics have rapidly increased the ability of bench biologists to capture and sequence the genomes and transcriptomes of hundreds of cells per experiment.
This dissertation presents novel computational methods for answering three distinct biological questions: genome scaffolding, biomarker selection, and computational deconvolution of gene expression data from heterogeneous samples assisted by single-cell expression data. Each method strives to balance computational efficiency with the biological relevance of computed solutions.
Recommended Citation
Lindsay, James, "Scalable Optimization Algorithms for High-throughput Genomic Data" (2015). Doctoral Dissertations. 754.
https://digitalcommons.lib.uconn.edu/dissertations/754