Hi. At present I only have realigned BAM files, not FASTQ files. The BAM files contain whole-genome data, but what we need now is the mitochondrial genes. Can I convert a realigned BAM file to FASTQ and extract the mitochondrial genes? I've tried the following:
samtools sort -n input.bam -o input.name.bam
bedtools bamtofastq -i input.name.bam \
-fq out.R1.fq \
-fq2 out.R2.fq
The results are as follows:
*****WARNING: Query D3NJ6HQ1:424:HA49WADXX:1:1103:21194:65285 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping.
*****WARNING: Query D3NJ6HQ1:424:HA49WADXX:1:1107:12296:11434 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping.
*****WARNING: Query D3NJ6HQ1:424:HA49WADXX:1:1113:14106:44374 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping.
FASTQ files were still written to the folder, though, and they are not small. I don't know what I should change, and I could not find an answer on any forum. I'd appreciate any suggestions. Thank you!
It's just a warning. bedtools cannot find the mate of some reads that are flagged as paired. This can happen if the BAM covers only a sub-region, or if some reads were removed.
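To illustrate the sub-region case, here is a minimal sketch of pulling only the mitochondrial reads out of a coordinate-sorted, indexed BAM. The helper name extract_mito_bam and the default contig name chrM are assumptions (Ensembl-style references call it MT); `samtools idxstats input.bam` lists the contig names actually present.

```shell
#!/bin/sh
# Hypothetical helper: copy the reads aligned to the mitochondrial contig
# into a new BAM. The input must be coordinate-sorted and indexed.
extract_mito_bam() {
    in_bam=$1
    out_bam=$2
    # Assumption: the contig is named "chrM" unless a third argument says otherwise.
    contig=${3:-chrM}
    samtools view -b -o "$out_bam" "$in_bam" "$contig"
}
```

Usage would look like `extract_mito_bam input.bam input.mito.bam MT`. A read whose mate aligned outside that contig ends up without its mate in the output, which is one way to get exactly this warning.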
I'm sorry, do you mean I should ignore the warning? When I run bwa mem after the conversion, it won't run. Is there any way to solve this problem?
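One thing worth ruling out before blaming the warning is whether the two FASTQ files still list the same reads in the same order. The sketch below is only a diagnostic, not a fix. The function names are made up for illustration, and it assumes plain 4-line FASTQ records with an optional /1 or /2 suffix on the read names.

```shell
#!/bin/sh
# Print the read names of a FASTQ file, one per line, without the leading "@"
# and without a trailing /1 or /2. Assumes 4 lines per record.
fastq_names() {
    awk 'NR % 4 == 1 {
        name = $1
        sub(/^@/, "", name)
        sub(/\/[12]$/, "", name)
        print name
    }' "$1"
}

# Return 0 if both files contain the same read names in the same order.
fastq_pairs_in_sync() {
    r1_names=$(mktemp) && r2_names=$(mktemp) || return 2
    fastq_names "$1" > "$r1_names"
    fastq_names "$2" > "$r2_names"
    cmp -s "$r1_names" "$r2_names"
    status=$?
    rm -f "$r1_names" "$r2_names"
    return $status
}
```

For example: `fastq_pairs_in_sync out.R1.fq out.R2.fq && echo "in sync" || echo "out of sync"`. If the files are in sync, the bwa mem failure probably has a different cause, and the exact error message would help.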
Maybe try samtools fixmate first, and then bamtofastq?
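A rough sketch of that suggestion, in case the exact order of steps is unclear. The helper name bam_to_fastq_fixmate and the output file names are placeholders. The extra step that drops secondary and supplementary alignments is my own assumption about one possible source of the warning, not something confirmed for this data.

```shell
#!/bin/sh
# Hypothetical helper: name-sort, fix mate information, keep primary
# alignments only, then convert to paired FASTQ with bedtools.
bam_to_fastq_fixmate() {
    in_bam=$1
    prefix=$2
    # fixmate expects mates to sit next to each other, hence the name sort.
    samtools sort -n -o "$prefix.namesorted.bam" "$in_bam" &&
    samtools fixmate "$prefix.namesorted.bam" "$prefix.fixmate.bam" &&
    # Assumption: extra records of the same read (secondary 0x100,
    # supplementary 0x800) may confuse the pairing, so filter them out.
    samtools view -b -F 0x900 -o "$prefix.primary.bam" "$prefix.fixmate.bam" &&
    bedtools bamtofastq -i "$prefix.primary.bam" \
        -fq "$prefix.R1.fq" \
        -fq2 "$prefix.R2.fq"
}
```

Called as `bam_to_fastq_fixmate input.bam out`. Note that fixmate only updates the mate fields. It does not bring back a mate that is missing from the file, so some warnings may remain.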
Thank you! I've tried your method, but the warning is still there.
Has there been any kind of filtering applied to the bam file that you want to convert?
Sorry, I don't quite understand what you mean. I used the above method.
Yes, but after the initial alignment that produced that BAM file, was any filtering applied that might explain why certain mates are missing?
I used BWA and samtools to get the initial BAM file, then Picard (RG, mark duplicates, build BAM index) and GATK (RealignerTargetCreator, IndelRealigner) to get the final BAM file. Now I want to convert the final BAM file to FASTQ (the original FASTQ files have been lost), so I used the method above. No other filters were used.
You obviously lost some of the mates during your processing. My guess is that one of the steps you've used filtered out one of the mates but kept the other one. Without having the whole code it's impossible to know where they got lost. And since it seems you don't have any of the previous BAMs you will have to go with fixing the mate pairs as commented below.
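If the lost mates cannot be recovered, one option is to let the conversion itself set the orphans aside. The sketch below uses samtools collate and samtools fastq instead of bedtools and assumes a reasonably recent samtools release. The helper name bam_to_paired_fastq and the output names are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: group reads by name, then write properly paired reads
# to R1/R2 and reads without a mate to a separate singletons file.
bam_to_paired_fastq() {
    in_bam=$1
    prefix=$2
    # collate groups mates together without a full name sort;
    # -u -O streams uncompressed BAM to stdout.
    samtools collate -u -O "$in_bam" |
    samtools fastq -n \
        -1 "$prefix.R1.fq" \
        -2 "$prefix.R2.fq" \
        -s "$prefix.singletons.fq" \
        -0 /dev/null \
        -
}
```

Called as `bam_to_paired_fastq input.bam out`, this should leave out.R1.fq and out.R2.fq with matching reads, which is what bwa mem expects for paired input. The singletons could be aligned separately as single-end reads if they matter.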