Retrieve The Reads And Fastq From Bam File
2
5
Entering edit mode
12.1 years ago
rehma.ar ▴ 290

hello everyone!

i am using this command

samtools view -bh -F 2 FILE.bam mt > out.bam

to get the discordant reads(one end aligns to mitochondria and the other to chromosome) and also those reads that align to (both ends) mitochondria only. out put file is a bam file. but when i want get fastq out of this bam file using command

bamToFastq -i out.bam -fq out.1.fq -fq2 out.2.fq

it shows that there are so many reads that are marked as pairs but the but the other pair is not in the bam file. the error looks like this

*WARNING: Query HWI-ST170_235:2:22:12110:179082#0 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping.

can anyone plz tell what,s going wrong.

samtools • 17k views
ADD COMMENT
4
Entering edit mode

I also had this error message, and the problem in my case was that my bam file wasn't sorted by read names. To fix the issue I did the following:

1. Sort your bam file by read names:

samtools sort -n myBamFile.bam myBamFile.sortedByName

Note: if you now examine your new bam file (samtools view myBamFile.sortedByName | less -S), you should see that it is sorted by read names and that read pairs are next to each other.

2. Run bamtofastq on your new bam file

bamToFastq -i myBamFile.sortedByName.bam -fq out.1.fq -fq2 out.2.fq
ADD REPLY
1
Entering edit mode

This answer is wrong and doesn't work. Don't use it.

ADD REPLY
0
Entering edit mode

Hi, I sorted my unmapped .bam files using samtools and got the my_file_unmapped.qsort files using the command $ samtools sort -n my_file_unmapped.bam -o my_file_unmapped.qsort -O BAM But, now when I am trying to run bedtools bamtofastq, it is returning similar warning, WARNING: Query is marked as paired, but its mate does not occur next to it in your BAM file.

ADD REPLY
0
Entering edit mode

I am getting this same warning message on namesorted BAM file. I think the issue has to do with multi-mapping reads. bedtools seems to be giving a warning for any multi-mapping read that isn't immediately adjacent to its paired read (like a multi-mapper next to another line with itself mapped to another genomic location). It does look like the reads are ending up in the FASTQ file when both pairs are present, but these warning messages are pretty annoying. I'm guessing this is a bug (/feature) in bedtools.

ADD REPLY
0
Entering edit mode

Not sure this has to do with multi-mapping reads. I've tried filtering out multimapping reads from bwa-generated bams (samtools view -bq 1 bam_file) but the issue recurs regardless.

ADD REPLY
0
Entering edit mode

I had this issue and it turned out to just be multimapped reads which produced the error.

ADD REPLY
0
Entering edit mode

Did it affect the bamtofastq output file? if so how did you solve the problem? Thanks

ADD REPLY
1
Entering edit mode
12.1 years ago
Ido Tamir 5.2k

bamToFastq wants paired end reads to be adjacent to each other. Otherwise either the rows in the fastq files don't match or the program would have to memoize all the reads it encountered until the pair closes or it would have to reread the file multiple times.

If you don't have each read followed by its pair (the bam file is sorted by name) because you have discordant pairs filtered then you have write your own program or find one that does not have this restrictions and produces either 2 unpaired fastq files or inserts some placeholder read where the pair is absent.

ADD COMMENT
0
Entering edit mode

thanks a lot for the suggestion, i have used Hydra to get fastq from bam. seems works perfectly.

ADD REPLY
1
Entering edit mode
12.1 years ago

samtools view -bh -F 2 FILE.bam mt > out.bam

I would worry about that getting all kinds of discordant reads, not just the ones you want.

You can filter for -F12 with samtools, Then filter with awk, or some script, for reads where either the mapped chromsoome or the mate chromosome is MT.

ADD COMMENT
0
Entering edit mode

but don't you think using "mt" would only get those discordant reads where either the mapped chromsoome or the mate chromosome is MT. i also need those reads where "$3 = MT" and "$7 eq =". i think this command would work perfectly. if you don't agree plz tell me how?

ADD REPLY

Login before adding your answer.

Traffic: 1960 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6