samtools SAM-to-BAM conversion produces a BAM file that RSeQC's bam_stat.py cannot read. ValueError: file has no sequences defined (mode='rb')
5.9 years ago
RNAseqer ▴ 280

Is there a known issue with converting SAM files to BAM with samtools using this command line:

$ samtools view testoutput2_paired_reads.sam > testoutput2_paired_reads.bam

RSeQC's bam_stat.py program works with the original SAM file, but not with the BAM file I generated with samtools. Since I'd like an intact BAM file going forward, I was hoping to learn what's wrong with my BAM file. When I run the most recent version of RSeQC on this BAM file

$ bam_stat.py  -i testoutput2_paired_reads.bam

I get the following message:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/qcmodule/SAM.py", line 2311, in __init__
    self.samfile = pysam.Samfile(inputFile,'rb')
  File "pysam/libcalignmentfile.pyx", line 736, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 985, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False

Is my SAM-to-BAM conversion command line missing some option, resulting in incomplete BAM output? Or is this a known conflict between samtools-generated BAM files and RSeQC? If it matters, my .sam file was generated by HISAT2.

Here are the first few lines of that .bam file:

SRR350718.8385  16      16      69845176        60      103M    *       0       0       TTATTGTGGGAATGAACTGAGATGAGGCATGTCAGTCAGAGGGCCTTACACAACGGGAGCACTCGGTGGAAGTGGCAGTGGGTTTTCTTTGTATCTGGAGACT BEDB@EAC5362-,?<:=0C?8A>CBDC.>.EEGFFHHGCEGEHEHHFDDDDD8CEDDFG(FFHHHHHGHHHFBGGFFDHHGHEHGHHFHHFHGHHHHHHEHH AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:103        YT:Z:UU NH:i:1
SRR350718.8387  0       13      106491737       60      103M    *       0       0       CTGGGGGGAGGGGGATCTGACCTACCTTTTATAGATAAGGATTCTTTAATAACAACGATGATGATGGAATTACAGCCTGTTTGACCTTTGGCTTTTCAACTTT HHHHHHHHFHEHHHEFHEFHFHHHHFHHHHFEEHFHFHFFEEFEEEBDDBHFHHEHHHHFHHFEBDFEEFFEBEECDEEEE9<EDEDEE.?9=DDB@CC?==A AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:103        YT:Z:UU NH:i:1
SRR350718.8389  0       X       154443297       60      104M    *       0       0       GCCAAGCAGAAGGAGGTGGGAAAACGGACCCAAACCCCAGTGTGCCCTGCCCCATGCCTTTCCTTTAGTGGTGGGAAACCCTTATCTTGCAAAGTGAATGTGTC        HHHHHHHHHGHGHHHH@HHHGGHGFDHFHHGGHHFHHHFFEFBHCDHEFHGGIGGHHHHHHHHHHHHHHHG<GFH?GGCFFFFF8FHFBHEFA3BEBEB=CCB7        AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:104

Sorry for the eyesore of text and output, but can you spot what's wrong?

Grateful for any light you can shed on the matter!

samtools bam RSeQC bam_stats.py (mode='rb') • 2.9k views

Which version of samtools are you using? Anyway, I believe you need samtools view -h to include the header in the BAM.


That fixed it! Thanks!


Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!


Thank you for that. I only began using Biostars three days ago, so I missed it (a silly oversight on my part). I will be sure to do this in the future!


Is there no header in your SAM (and thus BAM) file?


It turns out that was the case. samtools view -h was what I needed.
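
For anyone landing here with the same error: the records pasted in the question are readable plain text, which suggests the .bam file was really headerless SAM text. Below is a minimal standard-library sketch for checking both things at once: whether a file is compressed BAM, and whether it declares any reference sequences (@SQ lines in SAM, a non-zero reference count in BAM). The function name inspect_alignment_file and the tiny demo record are made up for illustration and are not part of samtools, pysam, or RSeQC.

```python
import gzip
import os
import struct
import tempfile


def inspect_alignment_file(path):
    """Return (format, has_reference_sequences) for a SAM or BAM file."""
    with open(path, "rb") as fh:
        magic = fh.read(2)

    if magic == b"\x1f\x8b":  # gzip/BGZF magic bytes, as used by BAM
        with gzip.open(path, "rb") as fh:
            if fh.read(4) != b"BAM\x01":
                return ("gzip, not BAM", False)
            (l_text,) = struct.unpack("<I", fh.read(4))  # header text length
            fh.read(l_text)                              # skip the header text
            (n_ref,) = struct.unpack("<I", fh.read(4))   # number of references
        return ("BAM", n_ref > 0)

    # Otherwise treat the file as SAM text: header lines start with '@'.
    with open(path, "r", errors="replace") as fh:
        for line in fh:
            if not line.startswith("@"):
                break  # first alignment record reached, header is over
            if line.startswith("@SQ\t"):
                return ("SAM text", True)
    return ("SAM text", False)


# Hypothetical demo: the same made-up record with and without a header.
record = "read1\t0\t13\t100\t60\t4M\t*\t0\t0\tACGT\tHHHH\n"
header = "@HD\tVN:1.6\n@SQ\tSN:13\tLN:1000\n"
with tempfile.TemporaryDirectory() as tmp:
    headerless = os.path.join(tmp, "headerless.sam")
    with_header = os.path.join(tmp, "with_header.sam")
    with open(headerless, "w") as fh:
        fh.write(record)
    with open(with_header, "w") as fh:
        fh.write(header + record)
    headerless_result = inspect_alignment_file(headerless)
    with_header_result = inspect_alignment_file(with_header)

print(headerless_result)
print(with_header_result)
```

A result whose second element is False matches the situation the traceback complains about: no sequences are defined for the reader to work with.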

5.9 years ago

Try this:

samtools view testoutput2_paired_reads.sam -o testoutput2_paired_reads.bam
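
If the conversion is scripted, it may be safer to spell the options out rather than rely on defaults, which can differ between samtools versions. Here is a minimal sketch that builds the command with -b (write BAM instead of SAM text), -h (include the header) and -o (output file). The helper names sam_to_bam_command and convert are hypothetical, and the sketch assumes samtools is on the PATH and that the input SAM already carries its @SQ header lines.

```python
import shutil
import subprocess


def sam_to_bam_command(sam_path, bam_path):
    """Build an explicit samtools command for SAM-to-BAM conversion."""
    # -b: BAM output, -h: keep the header, -o: write to this file
    return ["samtools", "view", "-b", "-h", "-o", bam_path, sam_path]


def convert(sam_path, bam_path):
    """Run the conversion, failing loudly if samtools is missing or errors."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools was not found on PATH")
    subprocess.run(sam_to_bam_command(sam_path, bam_path), check=True)


# Show the command for the file names used in the question.
print(" ".join(sam_to_bam_command("testoutput2_paired_reads.sam",
                                  "testoutput2_paired_reads.bam")))
```

Calling convert(...) with the same two paths would run it. Using -o instead of shell redirection also keeps the output path visible to samtools itself.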