Dear all,
I am trying to analyse the quality of reads from a BAM file but the BAM file does not have a header so I get this error:
Rsamtools::scanBam('output/alignment/pacbio/ref/sx.subreads.bam')
Error in value[3L] : failed to open BamFile: SAM/BAM header missing or empty file: 'output/alignment/pacbio/ref/sx.subreads.bam'
Does anybody know how I can get round this, please?
Thanks.
Best wishes,
C.
Is this one of those new BAM files that SMRTanalysis generates as default output?
Yes. This is the subreads.bam file that the pacbio facility sent me.
Have you tried
reheader
insamtools
? http://www.htslib.org/doc/samtools.htmlHi Alex, I don't have the corresponding SAM file for the BAM file. I've tried to convert to SAM and then to BAM again but that did not add a header.
I think, but I'm not 100% certain, that you would just convert from BAM to SAM:
samtools view -h -o old.sam old.bam
. Then runsamtools reheader old.sam new.bam
to get a re-headered BAM.That didn't work. According to the doc:
" samtools reheader [-iP] in.header.sam in.bam
Replace the header in in.bam with the header in in.header.sam. This command is much faster than replacing the header with a BAM→SAM→BAM conversion.
By default this command outputs the BAM or CRAM file to standard output (stdout), but for CRAM format files it has the option to perform an in-place edit, both reading and writing to the same file. No validity checking is performed on the header, nor that it is suitable to use with the sequence data itself. "
So I did:
samtools reheader old.sam old.bam > new.bam
I get the new.bam but it still has no header. The .sam file also has no header when I convert from BAM to SAM.
These BAM files should be in PacBio's BAM format. It appears that sequencing facility that gave you the files may not have generated them properly. You could ask them to regenerate the files.
Hi genomax,
I believe that my BAM file is an unaligned BAM file that contains reads and metadata without the reads having been aligned to the reference sequence I provided them with. That's fine, I can align it myself. So, do you think that even unaligned BAM files should have a header?
I have mostly worked with the older *.h5 PacBio data (we gave up on our RSII) so don't have direct experience with new BAM data (we are waiting on getting access to a local sequel).
PacBio BAM format spec seems to indicate that
usual
headers should be present in those files but clearly your file does not have them.I suggest that you also contact PacBio tech support directly with this question.
Hi,
My bad, my BAM does have a header. I can see it with this command:
The problem still persists with Rsamtools though:
But the file is there:
dir('output/alignment/pacbio/ref/')
[1] "sx.arrow" "sx.fasta" "sx.subreads.bam" "sx.subreads.sam" "sx2.subreads.bam"
And is definitely not empty:
file.size('output/alignment/pacbio/ref/sx.subreads.bam') [1] 5865760774
Do you need to sort and/or index the file?
Follow the directions on this page to convert the PacBio BAM file to fastq reads (see section on subread bam to fastq/a). Then realign (or do whatever you want to do) with the reads.
Hi genomax,
The BAM file I've got contains information about pulse width and inter-pulse width. If I convert this to FASTQ and then align, am I not going to lose this information? I would have liked to extract the pulse information from the BAM file to plot it for some reads.