Information mining algorithms for bioinformatics problems

Date of Completion

January 2008


Biology, Bioinformatics




Algorithms can be applied to a wide range of problems around us. With the significant growth of biomolecular sequence data over the last decade, the need for algorithms that extract patterns and meaningful information from such data has been strongly felt. Multiple sequence alignment is an area of biology that has been studied extensively, and many algorithms for the problem have been proposed in the literature. Alignment of sequences to determine regions of common descent has also been an important area of research, as it helps scientists trace the evolution of species. We employ sampling to improve both the running time and the accuracy of multiple sequence alignment.

With the growing number of publications in bioinformatics in particular and science in general, it has become almost impossible to read every paper available on the internet on a given research subject. Researchers are putting considerable effort into new methods for document summarization, so that the most relevant information in a document can be extracted as a summary, reducing the time and effort required to go through the entire document.

As algorithms for various bioinformatics problems reach their computational lower bounds, parallelization has become imperative to further speed up computation on large data sets. New multiprocessor architectures such as the Cell Broadband Engine have the potential to perform extensive calculations and act as mini-supercomputers. Other applications for these architectures include onboard aircraft fault diagnosis and prognosis.

In this research work we propose novel algorithms, as well as implementations, to address these problems in the field of bioinformatics.