Is there a known issue with converting SAM to BAM files with samtools using this command line:
$ samtools view testoutput2_paired_reads.sam > testoutput2_paired_reads.bam
RSeQC's bam_stat.py program works with the original SAM file, but not with the BAM file I generated with samtools. Since I'd like an intact BAM file going forward, I was hoping to learn what's wrong with my BAM file. When I run the most recent version of RSeQC on this BAM file:
$ bam_stat.py -i testoutput2_paired_reads.bam
I get the following message:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/qcmodule/SAM.py", line 2311, in __init__
self.samfile = pysam.Samfile(inputFile,'rb')
File "pysam/libcalignmentfile.pyx", line 736, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 985, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False
Is my SAM-to-BAM conversion command line missing some option, resulting in incomplete BAM output? Or is this a known conflict between samtools-generated BAM files and RSeQC? If it matters, my .sam file was generated by HISAT2.
Here are the first few lines of that .bam file:
SRR350718.8385 16 16 69845176 60 103M * 0 0 TTATTGTGGGAATGAACTGAGATGAGGCATGTCAGTCAGAGGGCCTTACACAACGGGAGCACTCGGTGGAAGTGGCAGTGGGTTTTCTTTGTATCTGGAGACT BEDB@EAC5362-,?<:=0C?8A>CBDC.>.EEGFFHHGCEGEHEHHFDDDDD8CEDDFG(FFHHHHHGHHHFBGGFFDHHGHEHGHHFHHFHGHHHHHHEHH AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:103 YT:Z:UU NH:i:1
SRR350718.8387 0 13 106491737 60 103M * 0 0 CTGGGGGGAGGGGGATCTGACCTACCTTTTATAGATAAGGATTCTTTAATAACAACGATGATGATGGAATTACAGCCTGTTTGACCTTTGGCTTTTCAACTTT HHHHHHHHFHEHHHEFHEFHFHHHHFHHHHFEEHFHFHFFEEFEEEBDDBHFHHEHHHHFHHFEBDFEEFFEBEECDEEEE9<EDEDEE.?9=DDB@CC?==A AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:103 YT:Z:UU NH:i:1
SRR350718.8389 0 X 154443297 60 104M * 0 0 GCCAAGCAGAAGGAGGTGGGAAAACGGACCCAAACCCCAGTGTGCCCTGCCCCATGCCTTTCCTTTAGTGGTGGGAAACCCTTATCTTGCAAAGTGAATGTGTC HHHHHHHHHGHGHHHH@HHHGGHGFDHFHHGGHHFHHHFFEFBHCDHEFHGGIGGHHHHHHHHHHHHHHHG<GFH?GGCFFFFF8FHFBHEFA3BEBEB=CCB7 AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:104
Sorry for the eyesore of text and output, but can you spot what's wrong?
Grateful for any light you can shed on the matter!
Which version of samtools are you using? Anyway, I believe you need
samtools view -h
to include the header in the BAM.
That fixed it! Thanks!
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
Thank you!
Thank you for that. I only just began using Biostars three days ago, so I missed that (silly oversight on my part), and I will be sure to do this in the future!
Is there no header in your SAM (and thus BAM) file?
It turns out that was the case. view -h was what I needed.
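For anyone hitting the same "file has no sequences defined" error, here is a minimal sketch of a check that uses only the Python standard library. It reports whether a file is really BGZF-compressed BAM or plain SAM text, and how many reference sequences its header declares. The function name inspect_alignment_file is hypothetical and not part of samtools, pysam, or RSeQC. The BAM branch assumes the layout given in the SAM/BAM specification (magic bytes, header text length, header text, reference count). Note also that, as far as I know, samtools view writes SAM text to standard output unless -b is given, so a file redirected to a .bam name may still be SAM inside.

```python
import gzip
import struct


def inspect_alignment_file(path):
    """Return (format, n_sequences) for a SAM or BAM file.

    format is "BAM" or "SAM"; n_sequences is the number of reference
    sequences declared in the header (0 means no sequences are defined).
    Hypothetical helper for illustration only.
    """
    # gzip (and therefore BGZF) streams start with the bytes 1f 8b.
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"

    if is_gzip:
        # BAM is BGZF-compressed, which the gzip module can decompress.
        with gzip.open(path, "rb") as handle:
            if handle.read(4) != b"BAM\x01":
                raise ValueError("gzip-compressed, but not a BAM file")
            (l_text,) = struct.unpack("<i", handle.read(4))
            handle.read(l_text)  # skip the plain-text header
            (n_ref,) = struct.unpack("<i", handle.read(4))
        return "BAM", n_ref

    # Otherwise treat the file as SAM text and count @SQ header lines.
    n_sq = 0
    with open(path, "r") as handle:
        for line in handle:
            if not line.startswith("@"):
                break  # header lines come before any alignment record
            if line.startswith("@SQ"):
                n_sq += 1
    return "SAM", n_sq
```

Calling inspect_alignment_file("testoutput2_paired_reads.bam") on a file whose first line is an alignment record, like the output shown in the question, should report SAM text with zero sequences, which matches the error pysam raised.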