How many reads should I expect for paired end reads when coverage = 30 million?
9.1 years ago

Hello,

My lab ordered paired end sequencing, and we received a reported coverage of 30 million reads per sample.

Just to confirm: does this mean that there are 30 million reads across both directions? That is, 15 million per end of the pair, so that after alignment with TopHat2 and counting with htseq-count, I should expect about 15 million reads (i.e., read pairs) for each sample?

Or should I expect to see 30 million reads, representing 30 million pairs/60 million total ends?

Thank you for the sanity check!

RNA-Seq • 9.0k views

Coverage usually has a different meaning.

Oh, excuse me! I meant that the total number of sequence reads = 30 million. It was unclear from the sequencing company whether this meant 30 million per direction or 30 million altogether.

It's a really important question to ask up front when you get contract sequencing done: "Is that reads, or read pairs?" The number of read pairs is, after all, half the number of reads.
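
To make the two readings of a quote concrete, here is a minimal sketch of the arithmetic. The function name `interpret_quote` and the dictionary keys are made up for illustration; the only input taken from this thread is the quoted figure of 30 million.

```python
def interpret_quote(quoted_millions):
    """Return read and pair counts (in millions) under both readings of a quote."""
    return {
        # The facility counted every individual read (both mates of each pair).
        "quote_is_total_reads": {
            "reads": quoted_millions,
            "pairs": quoted_millions / 2,
        },
        # The facility counted read pairs (fragments), so each one yields two reads.
        "quote_is_read_pairs": {
            "reads": quoted_millions * 2,
            "pairs": quoted_millions,
        },
    }


result = interpret_quote(30)
for reading, counts in result.items():
    print(f"{reading}: {counts['reads']:g}M reads, {counts['pairs']:g}M pairs")
```

Which of the two lines applies is exactly what the facility has to confirm; the code cannot tell you.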

It's most likely 15M per end, which is on the low side. As reads can be of varying lengths, I prefer to measure, and be quoted, in gigabases (Gb) of sequence.
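
As a rough sketch of what quoting in gigabases looks like: total yield is read pairs times two mates times read length. The read length of 100 bp below is purely an assumption for illustration (the thread never states it), and `gigabases` is a hypothetical helper name.

```python
def gigabases(read_pairs, read_length_bp):
    """Total sequenced bases, in Gb, for a paired-end run (two mates per pair)."""
    return read_pairs * 2 * read_length_bp / 1e9


# 15 million pairs at an assumed 100 bp per mate
yield_gb = gigabases(15_000_000, 100)
print(f"{yield_gb:g} Gb")
```

Unlike a bare read count, this figure changes if the facility delivers shorter or longer reads, which is the point of asking for it.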

9.1 years ago

It should be 15 million per end.

Since this post has been revived in 2018, I strongly argue against the word "should" in this context. Instead, call the facility and ask, making sure that everyone is on the same page. I have witnessed so much confusion, even within our group where we typically know each other's vocabulary, when talking about reads, coverage, depth, read number vs. fragment number, etc.

Thanks! Since some time has passed since this post was first made, I can share our experience: there was indeed a miscommunication with the facility. What we meant was thirty million reads for analysis, i.e. sixty million total paired-end reads. We ended up with thirty million total, which gave us only fifteen million of functional coverage. We later re-sequenced the samples at the appropriate depth, and the data made so, so much more sense. So that is two votes for calling your facility and making sure everyone is on the same page!
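
For anyone wanting to check what was actually delivered before aligning, a minimal sketch is to count the records in each mate's FASTQ file. This assumes the common layout of exactly four lines per record (no wrapped sequence lines) and one file per mate; the function name and the file names in the usage comment are hypothetical.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    # Transparently handle gzip-compressed files based on the file extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


# Example usage (hypothetical file names):
#   n_r1 = count_fastq_records("sample_R1.fastq.gz")
#   n_r2 = count_fastq_records("sample_R2.fastq.gz")
#   n_r1 is the number of read pairs; n_r1 + n_r2 is the total number of reads.
```

For an intact paired-end delivery the two counts should match, and comparing them to the quote shows which definition the facility used.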

6.3 years ago

The post is quite old, but I see some confusion here. Read numbers may be defined differently from facility to facility. For example, here at CRG, if you order 30 million paired-end reads, you get 30 million for each mate. I think this approach makes more sense (especially for RNA-seq), since paired-end sequencing reads the same fragment from both ends, so the second mate does not add to the expression levels.

Indeed, that was the case! (see above) Fortunately we were able to re-sequence this dataset at an appropriate depth.

good to hear a happy end :)
