Entering edit mode
8.2 years ago
izzy.yichao.cai
▴
180
Hi all, I am trying to convert a pair-end bam file to bedpe.
I use the code below: (Inspired by James)
samtools sort -@ 16 -n in.bam in.sorted
samtools fixmate in.sorted.bam in.sorted.fixed.bam
samtools view -@ 16 -bf 0x2 in.sorted.fixed.bam | bedtools bamtobed -i stdin -bedpe > in.sorted.fixed.bedpe
I use a bash script to run all the commands. After running the code, the bedpe file generated is empty while there is no warning or error reported during execution.
Using samtools view to check :
samtools view -H in.sorted.fixed.bam
The in.sorted.fixed.bam is sorted: @HD VN:1.0 SO:queryname
Does anyone get a hint of what would be the problem?
Thanks a lot!!!
Does this really work:
-i stdin
? Shouldn't it be-i -
or-i /dev/stdin
?This command is noted on the bedtools usage page. It describes the same situation that I am in, so that I use this command.
Also, in another post James's suggestion actually worked out, which are the same command that I am using.
Some people have problems with
-i stdin
and it could be related to which version you're using, or even your OS. You can direct yoursamtools view
output to a file to see if it's already empty then, or if the problem occurs after the pipe..Hi I try to direct samtools view output to file. The output is not empty, but is small(4.0K compared to bam 32G). Then I use bedtools, again I got the empty bedpe.
I think the problem might not be in bedtools version. I am using 2.26.0 now, which should be the latest(The bedtools usage page link I past on the previous reply is also 2.26.0).
Is it normal that I got such a small output of samtools view? It took quite some time(definitely over 20 minute, I ran it on server) to run while the file size is very small(4.0K).
The problem might be with the samtools conversion, I have never tried to combine extracting reads and bam-sam conversions into one, did you try separating the commands i.e.
samtools view -@ 16 -f 0x2 in.sorted.fixed.bam >pe.bam; samtools view -@ 16 -o out.sam pe.bam
Yeah. Actually, I found that the reads with tag '0x2' is very few or even none. That is the reason why I got empty bedpe using the old command.
Then I try bedtools bamtobed -i in.sorted.fixed.bam -bedpe > in.sorted.fixed.bedpe
The output bedpe file looks okay but the total number of the lines is quite small(around 0.5 million lines compared to the reads number which is definitely larger around billion, I think). I am trying to figure out whether this number is reasonable.
Looking at the result bedpe file, there are good lines that the pair is mapped and the distance looks ok. Then what is the meaning of extracting the read with tag '0x2'? The usage claimed that it can extract "properly-paired" reads. But not using the tag-extract command looks ok, then what is the point of it?