Different output for read length
1
0
Entering edit mode
6 months ago
marco.barr ▴ 150

Hello everyone, I'm calculating the length of reads from a BAM file to create plots later on. I haven't filtered out secondary and/or supplementary alignments, and I can't understand why I'm getting different results with two methods I'll write below. Whom should I trust? It's comforting, though, that the maximum read length is the same in both, but the minimum is different. I'm aware that different methods can use different approximations, but I'm curious and just want to understand. Also, because I'm writing pipelines, I want to be sure of what I'm doing. Thanks a lot for any advice.

samtools stats /home/sorted.bam | grep ^RL | cut -f 2- > /home/stats_RL.txt
RL 121 1
samtools view /home/sorted.bam | head -n 1000000 | cut -f 10 | perl -ne 'chomp;print length($_) . "\n"' | sort -n | uniq -c
1 50
samtools BAM • 869 views
ADD COMMENT
0
Entering edit mode

Post the data at pastebin.com and add a link here or you could a GitHub gist. You were suspended by biostars SPAM bot since you pasted links in quick succession. I have reinstated your account.

ADD REPLY
0
Entering edit mode

Thanks so much GenoMax, I'm sharing the link as you told me

ADD REPLY
0
Entering edit mode

I just published my link on paste, why can't I see my public paste on the site?

ADD REPLY
0
Entering edit mode
6 months ago

I supect your reads are hard clipped. Show us the first lines of

samtools view /home/sorted.bam |  cut -f 2,6,10 | head -n 20
ADD COMMENT
0
Entering edit mode

Here is my result, I hope you can see it.

https://gist.github.com/Macdot3/4650243a10540d8c06f728ffc16a136c

ADD REPLY
0
Entering edit mode

the first read as a length of 1747, i wonder how you can just find '50'

$ wget -qO - "https://gist.githubusercontent.com/Macdot3/4650243a10540d8c06f728ffc16a136c/raw/20adf8c4b64dbf57aa42f081d4b57153c1c1d2db/gistfile1.txt" | tr -s " "| cut -f 3 -d ' ' | perl -ne 'chomp;print length($_) . "\n"' | sort -n | uniq -c
      1 1581
      1 1688
      1 1747
      1 2080
      1 2176
      1 2188
      1 2320
      1 2494
      1 2809
      1 3972
      1 4544
      1 6107
      1 6834
      1 7678
      1 8272
      1 8609
      1 10047
      1 10149
      1 13826
      1 16197
ADD REPLY
0
Entering edit mode

very strange, I double checked the BAM file and it is the same one I used for your request. Are the commands I wrote above correct? How is it possible..

ADD REPLY

Login before adding your answer.

Traffic: 2758 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6