9.7 years ago
Yuu
I have PE125 (paired-end, 125 bp) sequencing data and want to calculate the average length of my reads.
- average length = total yield bases / total reads
- total yield bases =
samtools depth -q 0 -Q 0 my.bam | awk '{total+=$3};END{print total}'
- total reads is also taken from the BAM file
The result shows that the average length of my reads is slightly smaller than 125. Is this because "samtools depth" does not count inserted bases?
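For comparison, the same ratio can be computed from the SEQ column of the alignment records rather than from per-position depth. This is only a minimal sketch: `mean_read_length` is a hypothetical helper name, it reads SAM text on stdin, and it assumes that secondary and supplementary records should be skipped so that each read is counted once. Since `samtools depth` reports coverage per reference position, bases that do not occupy a reference position (insertions, soft-clipped bases) are not included in its sum.

```shell
# Hypothetical helper: mean SEQ length over primary SAM records read from stdin.
mean_read_length() {
  awk -F'\t' '
    /^@/ { next }                                       # skip header lines
    int($2 / 256) % 2 || int($2 / 2048) % 2 { next }    # skip secondary (0x100) and supplementary (0x800)
    $10 == "*" { next }                                 # skip records without a stored sequence
    { bases += length($10); reads++ }
    END { if (reads) printf "%.3f\n", bases / reads }
  '
}

# Usage with a real file (assumes samtools is on PATH):
#   samtools view my.bam | mean_read_length
```

Note that SEQ does not contain hard-clipped bases, so a value below 125 from this approach can still reflect clipping or trimming.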
Thanks for your kind answer. I tried your method to calculate the average length and got 124.946 (with samtools depth the result is 124.31), but that is still not 125. What could cause this? I would also like to know whether samtools depth is unable to count inserted bases.
Add: thanks geek_y, yes, we should take clipped reads into account; that is the reason.
You should take care of soft-clipping and hard-clipping.
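To see how much clipping contributes, one option is to tally the CIGAR operations in the file. The following is a rough sketch, not a definitive method: `cigar_base_counts` is a hypothetical helper name, it reads SAM text on stdin, and it counts every record that has a CIGAR string (including secondary and supplementary alignments, which you may prefer to filter out first).

```shell
# Hypothetical helper: total bases per CIGAR category across SAM records on stdin.
cigar_base_counts() {
  awk -F'\t' '
    /^@/ || $6 == "*" { next }            # skip headers and records without a CIGAR
    {
      cigar = $6
      # Consume one <length><op> pair at a time from the front of the CIGAR string.
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        n  = substr(cigar, 1, RLENGTH - 1) + 0
        op = substr(cigar, RLENGTH, 1)
        count[op] += n
        cigar = substr(cigar, RLENGTH + 1)
      }
    }
    END {
      printf "aligned=%d inserted=%d soft_clipped=%d hard_clipped=%d\n",
             count["M"] + count["="] + count["X"], count["I"], count["S"], count["H"]
    }
  '
}

# Usage with a real file (assumes samtools is on PATH):
#   samtools view my.bam | cigar_base_counts
```

Only the aligned (M, =, X) bases occupy reference positions. Soft-clipped and inserted bases are present in SEQ but not aligned to a reference position, and hard-clipped bases are absent from SEQ altogether, which is consistent with both averages in the thread falling below 125.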