Hello,
I have read a lot of threads about how to find uniquely mapped reads. I mapped my PE RNA-Seq data to the genome with Bowtie2 and BWA under different conditions (-a, -k, --local for Bowtie2, and BWA-mem and BWA-sampe) to see which one works better. For Bowtie2, I saw that some people filter the alignments with the flag -F 0x100 and others with MAPQ 30. For BWA, most people filter the alignments using MAPQ 1.
However, while running some tests I noticed that I can't use the flag -F 0x100 alone, because unmapped reads pass this filter too. I conclude that to get only the unique reads I should use -F 0x104. Does anyone agree, or has anyone noticed the same?
For example:
$ samtools view -bf 0x4 file.bam > unmapped.bam
$ samtools view -c unmapped.bam
15678472
$ samtools view -cF 0x100 unmapped.bam
15678472
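The counts above can be reasoned about with plain bit arithmetic: samtools view -F MASK drops a record when any bit of MASK is set in its FLAG, and 0x104 is simply 0x100 (secondary alignment) plus 0x4 (unmapped). Here is a minimal sketch of that test. The helper name passes_F and the three FLAG values are only illustrative assumptions, not taken from my data:

```shell
# Hypothetical helper: mimics the test behind `samtools view -F MASK`,
# which keeps a record only if none of the MASK bits are set in its FLAG.
passes_F() {
    # $1 = FLAG of the record, $2 = MASK given to -F
    [ $(( $1 & $2 )) -eq 0 ]
}

# Illustrative FLAG values (assumed for the example):
#   77  = paired, unmapped, mate unmapped, first in pair
#   99  = paired, proper pair, mate on reverse strand, first in pair
#   355 = 99 plus 0x100 (secondary alignment)
for flag in 77 99 355; do
    for mask in 0x100 0x104; do
        if passes_F "$flag" "$mask"; then verdict=kept; else verdict=dropped; fi
        echo "FLAG=$flag -F $mask: $verdict"
    done
done
```

In this sketch the unmapped record (FLAG 77) is kept by -F 0x100 but dropped by -F 0x104, which would match what I see in the counts.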
I also filtered my reads with -q 30, but I noticed that I got more unique reads with the flag -F 0x104 than with -q 30. Could this be due to false positives? And what is the best way to filter the reads from these two aligners? I intend to use the flag -F 0x104 for both.
Thanks, Michele
I don't use Tophat because I'm studying trypanosomatids, and they don't have introns. I'm going to have a look at RNA-Star. I saw some comparison studies saying that the best one for PE is BWA-mem. But I still don't know the best way to filter my alignments. I'm still not convinced about it, but I'm using the flag -F 0x104.