Hi,
I extracted mitochondrial reads from a bam file and remapped them to a reference genome through the following python function (remap2mt). my goal is to take this bam file (remapped) and find the depth of coverage for that bam file (samtools depth -r chrM remapped.bam
) . This requires the bam file to be sorted, which is why I include that at the bottom of this function, however it returns the following error that prevents any depths from being outputted. Can anyone tell me why the error is occurring? I took a look a the header of the remapped bam and got
@SQ SN:gi|251831106|ref|NC_012920.1| LN:16569
@PG ID:bwa PN:bwa VN:0.7.10-r789 CL:bwa mem rCRS.fasta ID18_Proband_exome_L001_001_mtExtract_1.fq ID18_Proband_exome_L001_001_mtExtract_2.fq
THE ERROR: (occurs when I call samtools depth on the sorted remapped bam)
[bam_pileup_core] the input is not sorted (reads out of order)
[bam_plp_destroy] memory leak: 1. Continue anyway.
THE FUNCTION:
def remap2mt(bam,path2currentDirectory):
#remap to the mtGenome (fasta file)
newFastq1 = bam[:-9] + "_1.fq" #strings holding names of the fastq files
newFastq2 = bam[:-9] + "_2.fq"
command = "samtools sort -n " + bam + " " + bam[:-9]+ ".qsort" #bam must be sorted before conversion to fq
print command
job = shlex.split(command)
stdout, stderr = sp.Popen(job,stdout=sp.PIPE).communicate() #execute sort command and send output to stdout
command = "bedtools bamtofastq -i " + bam[:-9]+ ".qsort.bam" + " -fq " + newFastq1 + " -fq2 " + newFastq2 #convert from bam to fastq
errFile = open('bamtofastqError', 'w+')
stdout,stderr = sp.Popen(shlex.split(command),stdout=sp.PIPE, stderr=errFile).communicate()
errFile.close()
#remap the mtDNA extract to the reference genome
command = "bwa mem rCRS.fasta " + newFastq1 +" " + newFastq2
remappedSam =str(bam[:-9] + "remap.sam")
args = shlex.split(command)
newPath = path2currentDirectory + "/" + remappedSam
outfileHandle = open(newPath, 'w+')
remapJob = sp.Popen(args, stdout = outfileHandle)
stdout, stderr = remapJob.communicate()
remappedBam = str(bam[:-9] + "remap.bam")
command = "samtools view -bT rCRS.fasta " + remappedSam + " -o " + remappedBam
args = shlex.split(command)
job = sp.Popen(args, stdout = sp.PIPE, stderr = sp.PIPE)
stdout, stderr = job.communicate()
print "Remap executed"
command = "samtools sort -n " + remappedBam + " " + remappedBam[:-4]+ ".qsort" #resorting
print command
job = shlex.split(command)
stdout, stderr = sp.Popen(job,stdout=sp.PIPE).communicate() #execute sort command and send output to stdout
return remappedBam
Hi Pierre!
I am using samtools depth, not mpileup. Does this change your answer?
It's the same thing: both invoke the mpileup command https://github.com/samtools/samtools/blob/develop/bam2depth.c#L181