A divide-and-conquer implementation of three sequence alignment and ancestor inference

Feng Yue, Jijun Tang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

In this paper, we present an algorithm that simultaneously aligns three biological sequences under an affine gap model and infers their common ancestral sequence. The algorithm can be extended to perform tree alignment for more sequences, and eventually to unify the two procedures of phylogenetic reconstruction and sequence alignment. Its novelty is that it applies a divide-and-conquer strategy to reduce memory usage from O(n^3) to O(n^2) while remaining based on dynamic programming, so an optimal alignment is still guaranteed. Traditionally, three-sequence alignment has been limited by its huge memory demand and could only handle sequences shorter than two hundred characters. With the improved algorithm, we can produce optimal alignments of sequences several thousand characters long. We implemented the algorithm as a C program package, MSAM. It has been extensively tested with BAliBASE, a manually refined database of real multiple sequence alignments, as well as with simulated datasets generated by Rose (Random Model of Sequence Evolution). We compared our results with those of other popular multiple sequence alignment tools, including the widely used programs ClustalW and T-Coffee. The experiments show that MSAM produces not only better alignments but also better ancestral sequences. The software can be downloaded for free at http://www.cse.sc.edu/phylo/MSAM.html.
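To make the memory argument concrete, the following is a minimal Python sketch of the divide step, not the MSAM implementation. It makes several simplifying assumptions that are not in the abstract: a linear gap penalty instead of the affine model, a sum-of-pairs column cost with illustrative values (mismatch 1, gap 2), and no ancestor inference. The names `column_cost`, `last_plane`, `three_way_score`, and `split_point` are hypothetical. The idea shown is that the three-dimensional dynamic programming table can be filled while keeping only two planes in memory, and that a forward pass over the first half of one sequence plus a backward pass over the second half locates a cell through which an optimal alignment passes.

```python
from itertools import product

GAP = "-"
INF = float("inf")

def column_cost(x, y, z, mismatch=1, gap=2):
    """Sum-of-pairs cost of one alignment column (linear gaps: an assumption)."""
    cost = 0
    for p, q in ((x, y), (x, z), (y, z)):
        if p == GAP and q == GAP:
            continue              # a gap facing a gap costs nothing
        if p == GAP or q == GAP:
            cost += gap           # a residue facing a gap
        elif p != q:
            cost += mismatch      # two different residues
    return cost

# The seven possible column types: which of the three sequences advance.
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]

def last_plane(a, b, c):
    """Fill the 3D table plane by plane over `a`, keeping only two planes.

    Returns the final (len(b)+1) x (len(c)+1) plane, so memory is
    quadratic rather than cubic in the sequence length.
    """
    nb, nc = len(b), len(c)
    prev = None
    for i in range(len(a) + 1):
        cur = [[INF] * (nc + 1) for _ in range(nb + 1)]
        for j in range(nb + 1):
            for k in range(nc + 1):
                if i == 0 and j == 0 and k == 0:
                    cur[j][k] = 0
                    continue
                best = INF
                for di, dj, dk in MOVES:
                    if di > i or dj > j or dk > k:
                        continue
                    src = prev if di else cur   # previous plane only if `a` advances
                    x = a[i - 1] if di else GAP
                    y = b[j - 1] if dj else GAP
                    z = c[k - 1] if dk else GAP
                    best = min(best, src[j - dj][k - dk] + column_cost(x, y, z))
                cur[j][k] = best
        prev = cur
    return prev

def three_way_score(a, b, c):
    """Optimal sum-of-pairs cost of aligning a, b and c."""
    return last_plane(a, b, c)[len(b)][len(c)]

def split_point(a, b, c):
    """Divide step: return (cost, j, k) where an optimal path crosses the middle of `a`."""
    mid = len(a) // 2
    nb, nc = len(b), len(c)
    fwd = last_plane(a[:mid], b, c)                       # prefixes
    bwd = last_plane(a[mid:][::-1], b[::-1], c[::-1])     # suffixes, reversed
    return min((fwd[j][k] + bwd[nb - j][nc - k], j, k)
               for j in range(nb + 1) for k in range(nc + 1))
```

Under these assumptions, the cost returned by `split_point` equals the full optimal score, and a complete aligner would recurse on the two sub-problems on either side of the returned cell `(j, k)`. Handling affine gaps would require tracking additional gap states per cell, which this sketch omits.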

Original language: English (US)
Title of host publication: Proceedings - 2007 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2007
Pages: 143-150
Number of pages: 8
State: Published - 2007
Event: 2007 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2007 - Fremont, CA, United States
Duration: Nov 2 2007 → Nov 4 2007

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • General Computer Science
  • Biomedical Engineering
