Question

Why is DNA split into many fragments before being sequenced by an Illumina sequencer?

10.5 years ago

John Smith ▴ 320

I am new to bioinformatics. Currently, I am working with software that generates artificial FASTQ files from a given reference genome. These FASTQ files are supposed to resemble the reads that would come out of a modern next-generation sequencer. The user can also customize the read length (the default is 76 bp). What I have understood so far is that each read in a FASTQ file is a sequenced fragment of the genome. Why does the genome need to be fragmented into short pieces in order to be sequenced by a modern sequencer?
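To make the setup concrete, here is a minimal sketch of what such a read simulator does. It assumes uniformly random start positions, error-free reads, and a constant placeholder quality score; the function names `simulate_reads` and `to_fastq` are hypothetical and not taken from any particular tool.

```python
import random


def simulate_reads(reference, read_length=76, n_reads=5, seed=0):
    """Sample fixed-length substrings of the reference at random positions."""
    if len(reference) < read_length:
        raise ValueError("reference is shorter than the read length")
    rng = random.Random(seed)
    reads = []
    for i in range(n_reads):
        # Pick a start so the whole read fits inside the reference.
        start = rng.randrange(len(reference) - read_length + 1)
        name = f"read_{i}_pos_{start}"
        reads.append((name, reference[start:start + read_length]))
    return reads


def to_fastq(reads, quality_char="I"):
    """Format reads as FASTQ records (four lines per read).

    quality_char is a placeholder: "I" is Phred 40 in Phred+33 encoding.
    """
    lines = []
    for name, seq in reads:
        lines += [f"@{name}", seq, "+", quality_char * len(seq)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy random "genome" used only for illustration.
    reference = "".join(rng.choice("ACGT") for _ in range(1000))
    print(to_fastq(simulate_reads(reference, n_reads=2)), end="")
```

Real simulators also model sequencing errors, paired ends, and fragment-size distributions, none of which is attempted here.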

dna RNA-Seq illumina fastq • 8.0k views

Updated 10.5 years ago by Ashutosh Pandey 12k • written 10.5 years ago by John Smith ▴ 320

You may also be interested in this tool: https://github.com/lh3/wgsim

Updated 4.9 years ago by Ram 44k • written 10.5 years ago by Biomonika (Noolean) 3.2k

Ram · Accepted Answer · 2014-06-11

Sequencing can only be performed on fairly short strands (100 to 5000 base pairs), so longer sequences must be split into smaller fragments before they can be sequenced.

The main reason is that base quality (the confidence with which an optical or chemical signal can be interpreted as a nucleotide identity) decreases with read length, and beyond a certain point it becomes hard to call the actual base.
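Base quality in a FASTQ file is usually reported as a Phred score, Q = -10 * log10(p), where p is the estimated probability that the base call is wrong. The sketch below decodes a quality string and sums the per-base error probabilities. It assumes Phred+33 encoding, and the example quality string is made up to mimic quality dropping toward the end of a read; the helper names are illustrative only.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to a Phred score."""
    return -10 * math.log10(p)


def decode_quality_string(qual, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]


def expected_errors(qual, offset=33):
    """Sum of per-base error probabilities across a read."""
    return sum(phred_to_error_prob(q) for q in decode_quality_string(qual, offset))


if __name__ == "__main__":
    # Made-up read: four bases at Q40 followed by four bases at Q20.
    qual = "IIII5555"
    print(decode_quality_string(qual))
    print(expected_errors(qual))
```

In this toy string the four Q20 bases contribute about 100 times more expected error than the four Q40 bases, which is why read tails are often trimmed.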

See these links: