Inconsistent results between SAM filtering and bowtie2
8.4 years ago
Picasa ▴ 650

Hello,

I have a list of contaminants that I want to filter out of my paired-end reads:

bowtie2 -x contaminants -1 pair1.fastq -2 pair2.fastq -S out.sam

Now I want to extract only the unmapped reads from the .sam file, but I want to keep only complete pairs.

So if read 1 maps somewhere and read 2 does not, I want to discard the whole pair.

samtools view -f 4 -bu input.bam | samtools view -f 8 -bu - | java -jar  picard-tools/picard.jar  SamToFastq  I=/dev/stdin F=out1.fq.gz F2=out2.fq.gz  FU=unpaired.fq.gz
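For reference, `samtools view -f N` keeps a record only when every bit of N is set in its FLAG field. In the SAM specification, 4 means "read unmapped" and 8 means "mate unmapped", so chaining `-f 4` and `-f 8` should select the same records as a single `-f 12`. Below is a minimal sketch of that bit test in plain shell; the function name `has_all_bits` is made up for illustration and is not part of samtools.

```shell
# has_all_bits FLAG REQUIRED
# Succeeds when every bit of REQUIRED is set in FLAG,
# which is the test `samtools view -f REQUIRED` applies to each record.
has_all_bits() {
    [ $(( $1 & $2 )) -eq "$2" ]
}

# 77 = 1 (paired) + 4 (read unmapped) + 8 (mate unmapped) + 64 (first in pair)
has_all_bits 77 12 && echo "77: both mates unmapped, kept"

# 73 = 1 (paired) + 8 (mate unmapped) + 64 (first in pair): the read itself is mapped
has_all_bits 73 12 || echo "73: read is mapped, dropped"
```

Under this test a pair survives the filter only if both of its records carry bits 4 and 8, that is, only if neither mate aligned anywhere.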

I have a problem. The output of bowtie2 is:

15317475 reads; of these:
15317475 (100.00%) were paired; of these:
15155323 (98.94%) aligned concordantly 0 times
155208 (1.01%) aligned concordantly exactly 1 time
6944 (0.05%) aligned concordantly >1 times
----
15155323 pairs aligned concordantly 0 times; of these:
95593 (0.63%) aligned discordantly 1 time
----
15059730 pairs aligned 0 times concordantly or discordantly; of these:
30119460 mates make up the pairs; of these:
29853788 (99.12%) aligned 0 times
195884 (0.65%) aligned exactly 1 time
69788 (0.23%) aligned >1 times
2.55% overall alignment rate

which means I should have 15155323 pairs, right?
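One way to sanity-check that expectation: the 15155323 figure counts pairs that did not align concordantly, which still includes pairs that aligned discordantly and pairs in which a single mate aligned on its own. The summary does not print the number of pairs with both mates unaligned, but it can be bounded from the last block. A rough sketch of the arithmetic, assuming that every mate reported as "aligned exactly 1 time" or "aligned >1 times" belongs to a pair that a both-mates-unmapped filter would drop (the variable names are mine):

```shell
pairs_no_pair_alignment=15059730     # pairs aligned 0 times concordantly or discordantly
mates_aligned=$(( 195884 + 69788 ))  # mates from those pairs that aligned on their own

# Fewest fully unaligned pairs: each aligned mate comes from a different pair.
min_unaligned_pairs=$(( pairs_no_pair_alignment - mates_aligned ))

# Most fully unaligned pairs: the aligned mates come two per pair.
max_unaligned_pairs=$(( pairs_no_pair_alignment - mates_aligned / 2 ))

echo "between $min_unaligned_pairs and $max_unaligned_pairs pairs have both mates unaligned"
```

With the numbers from the summary this gives a range of 14794058 to 14926894 pairs, so a count below 15155323 after filtering is not necessarily a sign of a bug.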

But when I do:

grep '@' out1.fastq | wc -l

I get 14798490 pairs.
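A side note on the counting itself: `grep '@'` matches every line that contains `@`, and `@` is also a valid quality character in Phred+33 FASTQ, so it can overcount. A possibly safer sketch, assuming an uncompressed FASTQ file with exactly four lines per record (the helper name `count_fastq_records` is hypothetical):

```shell
# count_fastq_records FILE
# Prints the number of records in an uncompressed, 4-line-per-record FASTQ file.
count_fastq_records() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Tiny demo file: 2 records, and the second quality string starts with '@'.
demo=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n@III\n' > "$demo"
count_fastq_records "$demo"   # prints 2
grep -c '@' "$demo"           # prints 3, because the quality line also matches
rm -f "$demo"
```

For gzipped output such as out1.fq.gz, the file would need to be decompressed first, for example by piping `gzip -dc out1.fq.gz` into `wc -l` and dividing by four.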

1) What's wrong with my command?

2) Moreover, my unpaired.fq.gz is empty.

sam bowtie • 2.0k views

Hello picasa1983!

It appears that your post has been cross-posted to another site: http://seqanswers.com/forums/showthread.php?t=70248

This is typically not recommended as it runs the risk of annoying people in both communities.


Sorry, but I am trying to increase my chances of getting an answer. This problem is annoying me, and it seems like nobody has an answer.
