Abstract
We propose a novel algorithm that simultaneously aligns three biological sequences under an affine gap model and infers their common ancestral sequence. It applies a divide-and-conquer strategy to reduce memory usage from O(n³) to O(n²). Because it is based on dynamic programming, the optimal alignment is guaranteed. We implemented the algorithm and tested it extensively on both the BAliBASE dataset and simulated data generated by the Random Model of Sequence Evolution (ROSE). Compared with other popular multiple sequence alignment tools such as ClustalW and T-Coffee, our program produces not only better alignments but also better ancestral sequences.
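
The memory reduction mentioned in the abstract rests on a general property of alignment dynamic programming: the optimal score can be computed while keeping only two adjacent planes of the three-dimensional table. The sketch below illustrates that property only and is not the authors' algorithm. It assumes a sum-of-pairs score with a linear gap penalty rather than the affine model, does not infer an ancestral sequence, and returns the score without the alignment. The scoring constants and the function names are hypothetical.

```python
from itertools import product

# Assumed scoring scheme (illustrative values, not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(x, y):
    """Score one pair of symbols in an alignment column; '-' is a gap."""
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return GAP
    return MATCH if x == y else MISMATCH


def column_score(x, y, z):
    """Sum-of-pairs score of a three-symbol alignment column."""
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)


# The seven possible moves: each sequence either consumes a character (1)
# or receives a gap (0); the all-gap column is excluded.
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]


def three_way_score(a, b, c):
    """Optimal three-sequence score using O(len(b) * len(c)) memory."""
    neg_inf = float("-inf")
    prev = None  # plane for index i - 1
    for i in range(len(a) + 1):
        cur = [[neg_inf] * (len(c) + 1) for _ in range(len(b) + 1)]
        for j in range(len(b) + 1):
            for k in range(len(c) + 1):
                if i == 0 and j == 0 and k == 0:
                    cur[j][k] = 0
                    continue
                best = neg_inf
                for di, dj, dk in MOVES:
                    if di > i or dj > j or dk > k:
                        continue  # move would step outside the table
                    # Moves that consume a[i-1] read the previous plane;
                    # the rest read cells already filled in this plane.
                    plane = prev if di else cur
                    x = a[i - 1] if di else "-"
                    y = b[j - 1] if dj else "-"
                    z = c[k - 1] if dk else "-"
                    cand = plane[j - dj][k - dk] + column_score(x, y, z)
                    best = max(best, cand)
                cur[j][k] = best
        prev = cur  # the older plane is discarded here
    return prev[len(b)][len(c)]
```

A divide-and-conquer scheme in the style of Hirschberg's method would run such a score-only pass forward and backward to find a split point and then recurse on the two halves, recovering the alignment itself without ever storing the full cubic table.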
Original language | English (US) |
---|---|
Pages (from-to) | 192-204 |
Number of pages | 13 |
Journal | International Journal of Data Mining and Bioinformatics |
Volume | 3 |
Issue number | 2 |
State | Published - 2009 |
All Science Journal Classification (ASJC) codes
- Information Systems
- General Biochemistry, Genetics and Molecular Biology
- Library and Information Sciences