Opportunities and challenges in long-read sequencing data analysis

Genome Biol. 2020 Feb 7;21(1):30. doi: 10.1186/s13059-020-1935-5.

Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characteristics of long-read data are thus required, but the fast pace of development of such tools can be overwhelming. To assist in the design and analysis of long-read sequencing projects, we review the current landscape of available tools and present an online interactive database, long-read-tools.org, to facilitate their browsing. We further focus on the principles of error correction, base modification detection, and long-read transcriptomics analysis and highlight the challenges that remain.

Keywords: Data analysis; Long-read sequencing; Oxford Nanopore; PacBio.

Publication types

  • Review

MeSH terms

  • Animals
  • Data Science / methods
  • Data Science / standards
  • Genomics / methods*
  • Genomics / standards
  • Humans
  • Nanopore Sequencing / methods*
  • Nanopore Sequencing / standards
  • Whole Genome Sequencing / methods*
  • Whole Genome Sequencing / standards