Estimating viral haplotypes in a population using k-mer counting

Raunaq Malhotra, Shruthi Prabhakara, Mary Poss, Raj Acharya

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations


Estimating viral haplotypes in a population is an important problem in virology. Viruses undergo frequent mutation and recombination during replication, which helps them survive in host cells, and they therefore exist as a population of closely related genetic variants. This makes it challenging to estimate the number of haplotypes and their relative frequencies in the population. A sequenced reference genome is of limited use because of the high mutation rates of viruses. We propose a method for estimating viral haplotypes based only on the counts of k-mers present in the viral population, without using a reference genome. We identify pairs of k-mers that differ from each other by one mutation and, from these pairs, compute a minimal set of viral haplotypes that explains the whole population. We compare our method to the software ShoRAH (which uses a reference genome) on a simulated dataset and obtain comparable results, even without a reference genome.
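
The first two steps described in the abstract, counting k-mers and pairing those that differ by one mutation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `count_kmers` and `one_mutation_pairs` are hypothetical, "one mutation" is assumed here to mean a single substitution (Hamming distance 1), sequencing errors are ignored, and the step that computes a minimal set of haplotypes is not shown.

```python
from collections import Counter
from itertools import combinations


def count_kmers(reads, k):
    """Count every length-k substring across all reads (exact matching)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def one_mutation_pairs(kmers):
    """Return sorted pairs of distinct k-mers that differ at exactly one position.

    Assumption: "one mutation" means a single substitution.
    """
    # Bucket k-mers by (position, k-mer with that position removed).
    # Two distinct k-mers in the same bucket differ only at that position.
    buckets = {}
    for kmer in set(kmers):
        for i in range(len(kmer)):
            key = (i, kmer[:i] + kmer[i + 1:])
            buckets.setdefault(key, []).append(kmer)

    pairs = set()
    for group in buckets.values():
        for a, b in combinations(sorted(group), 2):
            pairs.add((a, b))
    return sorted(pairs)


# Two toy "reads" that differ by a single substitution (T vs. A).
reads = ["ACGTAC", "ACGAAC"]
counts = count_kmers(reads, k=4)
pairs = one_mutation_pairs(counts)
```

With these toy reads, each of the three 4-mers from the first read is paired with the corresponding 4-mer from the second read. Bucketing by "k-mer with one position removed" avoids comparing every k-mer against every other, which matters when the number of distinct k-mers is large.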

Original language: English (US)
Title of host publication: Pattern Recognition in Bioinformatics - 8th IAPR International Conference, PRIB 2013, Proceedings
Publisher: Springer Verlag
Number of pages: 12
ISBN (Print): 9783642391583
State: Published - 2013
Event: 8th IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2013 - Nice, France
Duration: Jun 17 2013 → Jun 20 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7986 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349



All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

