Express v_1.5.1 Error: Transcript IDs from MultiFASTA do not match with (SAM/BAM) file.
3.8 years ago
bioinfo89 ▴ 60

Hi All,

I am using eXpress 1.5.1 to calculate read counts for RNA-Seq data. I have removed rRNA contamination from the raw data and mapped the remaining reads to the reference genome GRCh37. When I run eXpress, it throws the following error:

WARNING: Could not connect to update server to verify current version. Please check at the eXpress website (http://bio.math.berkeley.edu/eXpress).
WARNING: Target 'ENST00000335137.3|ENSG00000186092.4|OTTHUMG00000001094.1|OTTHUMT00000003223.1|OR4F5-001|OR4F5|918|CDS:1-918|' exists in MultiFASTA but not alignment (SAM/BAM) file.

The command I am using is as follows:

express gencode.v19.pc_transcripts.fa tophat_out/accepted_hits.rmdup.bam --output-dir . --calc-covar

I understand from the error message that the transcript file I am using for quantification does not match the reference information in the alignment BAM file. I am using the same reference genome here. Does this step require me to map the reads to the transcript FASTA before quantification? Any help would be highly appreciated.

Thank you!

rna-seq read counts • 1.1k views
3.8 years ago

Hi bioinfo89

The message comes from the eXpress source code (target.cpp).

Apparently, it expects the sequence names in the transcript FASTA headers to match the reference names in the BAM header. Check whether the output of

grep "^>" transcripts.fa

and

samtools view -H your.bam | grep "SQ"

match.
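
Comparing the two listings by eye is tedious for a full transcriptome, so the check can be scripted. This is only a minimal sketch and not part of eXpress. The function names (fasta_names, sam_header_names, missing_from_alignment) are hypothetical, the second argument is assumed to be a text file holding the output of samtools view -H, and a FASTA record's name is assumed to be the header text up to the first whitespace.

```shell
# Names of FASTA records: text after ">" up to the first whitespace
# (assumption: this is the name that should appear in the BAM header).
fasta_names() {
    sed -n 's/^>//p' "$1" | awk '{ print $1 }' | LC_ALL=C sort -u
}

# Reference names taken from the SN: tag of each @SQ line
# in a SAM header saved as plain text.
sam_header_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1" | LC_ALL=C sort -u
}

# FASTA names that do not occur among the @SQ names of the alignment header.
missing_from_alignment() {
    fa_list=$(mktemp)
    sq_list=$(mktemp)
    fasta_names "$1" > "$fa_list"
    sam_header_names "$2" > "$sq_list"
    # comm -23 keeps lines found only in the first (sorted) list.
    LC_ALL=C comm -23 "$fa_list" "$sq_list"
    rm -f "$fa_list" "$sq_list"
}
```

For example, after saving the header with samtools view -H your.bam > header.sam, running missing_from_alignment transcripts.fa header.sam lists every transcript name with no matching @SQ entry. If every FASTA name is listed, the BAM was probably aligned against a different set of reference sequences than the FASTA.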
