Surprisingly short reads in Illumina 1.9 MiSeq runs (2x300)
3 months ago

I have paired-end data (2x300) generated on an Illumina MiSeq 1.9 for 24 samples. The instrument reads a 500-600 bp fragment and runs a paired-end protocol. Looking at the MultiQC report for the FASTQ files, I realized that R1 and R2 of the samples have a length of 175-260 bp (see Image 1), not even 300 bp. I do know how sequencing works, and I am only surprised by the samples lying below 200 bp, for instance. Looking at the FastQC report, the samples with a small average read length (175 or 200 bp) have humps around the middle of the sequence length distribution (see Image 2). I also observed that the reads with the shortest length are more abundant (higher sequencing depth) and also have better Phred quality scores. Can one trust these runs? Any comments?

Thanks

NGS • 389 views
3 months ago
GenoMax 145k

not even 300 bp.

Not surprising. Libraries can have inserts shorter than the length of sequencing. There is a distribution of insert sizes in any given library. Once sequencing runs out of real sequence, the sequencer goes on to read into the adapter at the 3' end. If the experimental people expected these inserts to be longer, then they seem to have missed the mark.
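To make the read-through point concrete, here is a minimal Python sketch of what happens when the insert is shorter than the number of cycles. The adapter string, the filler bases after the adapter, and the function names are assumptions made for illustration only; use the adapter sequence documented for your library kit and a real trimming tool for actual data.

```python
# Illustrative adapter prefix; an assumption for this sketch, not taken from the run.
ADAPTER = "AGATCGGAAGAGC"
READ_LENGTH = 300  # cycles per read in a 2x300 run


def simulate_read(insert, adapter=ADAPTER, read_length=READ_LENGTH):
    """Return the bases reported for one read of a fragment.

    The sequencer reads the insert first, then continues into the 3' adapter.
    The trailing "G" filler is only a placeholder for whatever is called
    after the adapter ends.
    """
    template = insert + adapter + "G" * read_length
    return template[:read_length]


def trim_adapter(read, adapter=ADAPTER):
    """Cut the read at the first exact adapter match, if there is one."""
    idx = read.find(adapter)
    return read if idx == -1 else read[:idx]


short_insert = ("ACGT" * 100)[:180]  # 180 bp insert, shorter than 300 cycles
long_insert = "ACGT" * 125           # 500 bp insert, longer than 300 cycles

short_trimmed = trim_adapter(simulate_read(short_insert))
long_trimmed = trim_adapter(simulate_read(long_insert))
print(len(short_trimmed), len(long_trimmed))
```

Under these assumptions, the trimmed read length is the smaller of the insert size and the cycle count, which is why a library with many short inserts yields reads well below 300 bp once adapters are removed.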

I also observed that those reads with the shortest length have more abundance

Shorter inserts always tend to cluster better. This is probably partly due to physics, since the initial binding/clustering is a random process. That is why one needs to get rid of adapter dimers etc., since they will outcompete real library fragments.

Can one trust these runs ?

Why not? Trim the data and start analyzing the reads. We don't know what this experiment is about, but as long as there was no other issue (hardware/software/kit) with the sequencing, the data should be fine.
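As a quick sanity check before and after trimming, one could tabulate read lengths directly from a FASTQ file. The sketch below uses only the Python standard library and assumes plain four-line FASTQ records (no wrapped sequence lines); the function names are hypothetical.

```python
import gzip
from collections import Counter


def fastq_length_counts(lines):
    """Count read lengths from an iterable of FASTQ lines (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of each record is the sequence
            counts[len(line.strip())] += 1
    return counts


def fastq_length_counts_from_file(path):
    """Same as above, reading a plain or gzip-compressed FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return fastq_length_counts(handle)


# Tiny in-memory example with two reads of length 4 and 6.
example = ["@r1\n", "ACGT\n", "+\n", "IIII\n",
           "@r2\n", "ACGTAC\n", "+\n", "IIIIII\n"]
example_counts = fastq_length_counts(example)
print(sorted(example_counts.items()))
```

Comparing these counts for R1 and R2 per sample should reproduce the length distribution that FastQC plots.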


Thank you for your reply.
