Computational Methods for Transcript Assembly from RNA-SEQ Reads

Stefan Canzar, Liliana Florea

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Scopus citation


A major goal in bioinformatics is to identify genes and their transcript variants, which collectively define the transcriptome of a cell or species. There are two main classes of transcript assembly methods: de novo methods, which assemble reads based solely on sequence overlap, and genome-based methods, which first align the reads to a reference genome and then assemble the overlapping alignments. The main classes of artifacts are redundancies resulting from incomplete merging of reads and contigs, fragmented transcripts, chimeric constructs, and collapsed paralogs. This chapter describes the general principles underlying current methods for genome-based transcriptome assembly. Genome-based methods allow better resolution of repeats and paralogous sequences, as well as of overlapping gene models, and offer higher sensitivity, particularly in capturing low-coverage transcripts. Transcript reconstruction methods and their mathematical foundations must continually evolve to provide more accurate solutions and to accommodate the characteristics and biases of evolving sequencing technologies.

Original language: English (US)
Title of host publication: Computational Methods for Next Generation Sequencing Data Analysis
Number of pages: 24
ISBN (Electronic): 9781119272182
ISBN (Print): 9781118169483
State: Published - Sep 6 2016

All Science Journal Classification (ASJC) codes

  • General Computer Science

