Hello there,
Is there a way to get as a bowtie output ONLY the reads that aligns in more then one region ? It would be the negative of -m
flag.
Thank you for your time. I do aprreciate it.
Hello there,
Is there a way to get as a bowtie output ONLY the reads that aligns in more then one region ? It would be the negative of -m
flag.
Thank you for your time. I do aprreciate it.
After bowtie run, you can fetch non uniquely mapped reads from SAM file. For example, if NH:1:X
is the flag for uniquely mapped reads in SAM file (bowtie output), just do
grep -v 'NH:1:X' <your SAM file>
HTH
Without the NH tag you could try something like that (!!!NOT TESTED!!!)
1.
samtools view -F 4 -f 64 file.bam | cut -f 1 | sort | uniq -c | awk '$1>1 {print $2}' > mate1_readsWithMultipleHits.ids
samtools view -F 4 -f 128 file.bam | cut -f 1 | sort | uniq -c | awk '$1>1 {print $2}' > mate2_readsWithMultipleHits.ids
2.
samtools view -f 64 file.bam | fgrep -w -f mate1_readsWithMultipleHits.ids > mapping_positions_of_mutiple_mapped_reads.sam
samtools view -f 128 file.bam | fgrep -w -f mate2_readsWithMultipleHits.ids >> mapping_positions_of_mutiple_mapped_reads.sam
This is what I want, extracting multiple mapped reads from bowtie alignment. I have tested for SE reads. Also you can extract unique mapped reads.
From Xianjun Dong's post, alternative, we can use
NH:i:x
tag, or theFLAG
tag for this purpose.
awk '{if (and($2,0x100)) print}'
# unique mapped
fgrep -w NH:i:1
# unique mapped
fgrep -vw NH:i:1
# multi mappedBut above solutions are not working for bowtie output.
However, this solution is not working for default bowtie param: -k 1
, the multiple mapped reads reported only once.
fgrep
is so powerful for this purpose. Thanks so much.
This is the log of bowtie with -m 1
, tells 1124533 unique reads, 208288 multiple reads.
$ cat align.m1.bowtie.log
# reads with at least one reported alignment: 1124533 (48.35%)
# reads that failed to align: 993099 (42.70%)
# reads with alignments suppressed due to -m: 208288 (8.96%)
$ samtools view -F 4 -c align.m1.sam
1124533
If I use -k 5
instead: (report up to <5> good alignments per read), 1332821 (1124533 + 208288) mapped reads were reported.
$ cat align.k5.bowtie.log
# reads processed: 2325920
# reads with at least one reported alignment: 1332821 (57.30%)
# reads that failed to align: 993099 (42.70%)
Reported 1569333 alignments
$ samtools view -F 4 -c align.k5.sam
1569333
Test, extract multiple read ids:
$ samtools view -F 4 align.k5.sam | cut -f 1 | sort | uniq -c | awk '$1>1 {print $2}' | wc
208288
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Thank you for your time @Manu.
I could not figure out the right way/flag to fetch ONLY non uniquely mapped reads from a SAM file.
I cannot find this 'NH:1' flag on my files.
Any new insights ?
I think that if it is bowtie 1, this flag is not set and you need to process the file your self with a script, maybe using python and pysam library.