Cufflinks invalid BAM binary header
9.3 years ago

I get this warning when running Cufflinks on my RNA-seq data. It does not prevent me from doing my analysis; I am just curious what it is. In short, I have paired-end RNA-seq reads mapped with STAR (v2.4.1d) against GRCm38.

STAR \
  --outSAMtype BAM SortedByCoordinate \
  --runThreadN 8 \
  --outSAMstrandField intronMotif \
  --genomeDir ./GRCm38/star \
  --readFilesIn [fastqfiles] \
  --readFilesCommand zcat \
  --outFileNamePrefix Sample_het1.

Then I quantify with Cufflinks (v2.2.1):

cufflinks \
  -q \
  -p 8 \
  -o ./ \
  -b Mus_musculus.GRCm38.dna.primary_assembly.fa \
  -G Mus_musculus.GRCm38.79.gtf Sample_het1.Aligned.sortedByCoord.out.bam

I get the following message:

[bam_header_read] invalid BAM binary header (this is not a BAM file)

Try samtools view -H to see if the header is present.

You can try to replace the header using samtools reheader.

Check the syntax on the samtools page or in Bam Header Edit.
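
A minimal sketch of the header check, assuming samtools is installed and on the PATH. The helper name show_bam_header is hypothetical and only for illustration; samtools view -H prints the header lines without any alignments.

```shell
# Hypothetical helper: print the first N header lines of a BAM file
# (N defaults to 5 when no second argument is given).
show_bam_header() {
  # -H prints only the header (@HD, @SQ, ... lines), no alignments.
  samtools view -H "$1" | head -n "${2:-5}"
}

# Only run the example if samtools is actually available.
if command -v samtools >/dev/null 2>&1; then
  show_bam_header Sample_het1.Aligned.sortedByCoord.out.bam
fi
```

If this prints @HD and @SQ lines, samtools can read the header, so the file itself is probably fine.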


Indeed there is. Here are the first few lines:

@HD    VN:1.4    SO:coordinate
@SQ    SN:1    LN:195471971
@SQ    SN:10    LN:130694993
@SQ    SN:11    LN:122082543
@SQ    SN:12    LN:120129022
@SQ    SN:13    LN:120421639

Run

file Sample_het1.Aligned.sortedByCoord.out.bam

If it doesn't say it's a "gzip" file, then it's NOT bam.
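
A gzip result is necessary but not sufficient: a BAM file is BGZF (gzip-compatible) data whose decompressed stream starts with the four bytes "BAM\1". As a rough sketch, the function below (has_bam_magic is a hypothetical name, not part of any tool) checks for that magic string, assuming gzip, head -c, and od are available.

```shell
# Hypothetical helper: succeed if the decompressed file starts with
# the BAM magic bytes "BAM\1" (hex 42 41 4d 01).
has_bam_magic() {
  # Decompress, keep the first 4 bytes, and render them as plain hex.
  magic=$(gzip -dc "$1" 2>/dev/null | head -c 4 | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "42414d01" ]
}

if has_bam_magic Sample_het1.Aligned.sortedByCoord.out.bam; then
  echo "BAM magic found"
else
  echo "no BAM magic"
fi
```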


I ran the command, and it says it is a gzip file:

file Sample_het1.Aligned.sortedByCoord.out.bam
Sample_het1.Aligned.sortedByCoord.out.bam: gzip compressed data, extra field