TY - GEN
T1 - HCV quasispecies assembly using network flows
AU - Westbrooks, Kelly
AU - Astrovskaya, Irina
AU - Campo, David
AU - Khudyakov, Yury
AU - Berman, Piotr
AU - Zelikovsky, Alex
PY - 2008
Y1 - 2008
N2 - Understanding how the genomes of viruses mutate and evolve within infected individuals is critically important in epidemiology. By exploiting knowledge of the forces that guide viral microevolution, researchers can design drugs and treatments that are effective against newly evolved strains. Therefore, it is critical to develop a method for typing the genomes of all of the variants of a virus (quasispecies) inside an infected individual cell. In this paper, we focus on sequence assembly of Hepatitis C Virus (HCV) based on the 454 Lifesciences system, which produces around 250K reads, each 100-400 bases long. We introduce several formulations of the quasispecies assembly problem and a measure of assembly quality. We also propose a scalable quasispecies assembly method based on a novel network flow formulation. Finally, we report the results of assembling 44 quasispecies from the 1700 bp long E1E2 region of HCV.
AB - Understanding how the genomes of viruses mutate and evolve within infected individuals is critically important in epidemiology. By exploiting knowledge of the forces that guide viral microevolution, researchers can design drugs and treatments that are effective against newly evolved strains. Therefore, it is critical to develop a method for typing the genomes of all of the variants of a virus (quasispecies) inside an infected individual cell. In this paper, we focus on sequence assembly of Hepatitis C Virus (HCV) based on the 454 Lifesciences system, which produces around 250K reads, each 100-400 bases long. We introduce several formulations of the quasispecies assembly problem and a measure of assembly quality. We also propose a scalable quasispecies assembly method based on a novel network flow formulation. Finally, we report the results of assembling 44 quasispecies from the 1700 bp long E1E2 region of HCV.
UR - http://www.scopus.com/inward/record.url?scp=49949112775&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=49949112775&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-79450-9_15
DO - 10.1007/978-3-540-79450-9_15
M3 - Conference contribution
AN - SCOPUS:49949112775
SN - 3540794492
SN - 9783540794493
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 159
EP - 170
BT - Bioinformatics Research and Applications - Fourth International Symposium, ISBRA 2008, Proceedings
T2 - 4th International Symposium on Bioinformatics Research and Applications, ISBRA 2008
Y2 - 6 May 2008 through 9 May 2008
ER -