Cuffcompare: Unable To Map Reference Gene Names To Cufflinks Output
10.7 years ago

I ran cufflinks on three bacterial RNA-seq samples and want to use cuffcompare to get the "union" of the transcripts and to map the transcripts to the annotated reference genome. I did get the unioned list of transcripts (which I will use with cuffdiff to look for differentially expressed transcripts), but the genes are not annotated. Can anyone suggest what the issue might be?

tl;dr: My cuffcompare output has XLOC_XXXXXX as gene IDs instead of the reference gene names from the annotation file.

The command I tried is:

cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf' 

The genome.gff file looks like the following:

head genome.gff > 
##gff-version 3
#!gff-spec-version 1.20
#!processor NCBI annotwriter
##sequence-region NC_002505.1 1 2961149
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=243277
NC_002505.1    RefSeq    region    1    2961149    .    +    .    ID=id0;Dbxref=taxon:243277;Is_circular=true;biotype=El Tor;chromosome=I;gbkey=Src;genome=chromosome;mol_type=genomic DNA;old-name=Vibrio cholerae O1 biovar eltor str. N16961;serotype=O1;strain=N16961
NC_002505.1    RefSeq    gene    235    402    .    -    .    ID=gene0;Name=VC0001;Dbxref=GeneID:2614109;gbkey=Gene;locus_tag=VC0001
...

And the transcripts.gtf files look like:

head 'A2_cuffout/transcripts.gtf >
gi|15600771|ref|NC_002506.1|    Cufflinks    transcript    286    994    1000    .    .    gene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "4.5531694820"; frac "1.000000"; conf_lo "2.465004"; conf_hi "4.663521"; cov "19.278874";
gi|15600771|ref|NC_002506.1|    Cufflinks    exon    286    994    1000    .    .    gene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1"; FPKM "4.5531694820"; frac "1.000000"; conf_lo "2.465004"; conf_hi "4.663521"; cov "19.278874";
...

My combined.gtf output file looks like:

gi|15600771|ref|NC_002506.1|    Cufflinks    exon    10    4310    .    .    .    gene_id "XLOC_000001"; transcript_id "TCONS_00000457"; exon_number "1"; oId "CUFF.1.1"; class_code "."; tss_id "TSS1";
gi|15600771|ref|NC_002506.1|    Cufflinks    exon    4500    8484    .    .    .    gene_id "XLOC_000002"; transcript_id "TCONS_00000005"; exon_number "1"; oId "CUFF.5.1"; class_code "u"; tss_id "TSS6";

So the gene id is XLOC_000001 instead of VC0001 or something similar.
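
One thing the excerpts above make possible to check is whether the sequence names in column 1 agree between the reference and the assembled transcripts. The following is a minimal sketch, assuming tab-separated GFF/GTF lines and assuming that cuffcompare only pairs features whose sequence names are identical strings; the helper `seqids` and the shortened sample lines are illustrative, not part of any Cufflinks tool. The `head` output shows only the first lines of each file, so the differing accessions (NC_002505.1 vs NC_002506.1) may simply reflect file order; the point of the sketch is the `gi|...|ref|...|` wrapper around the name.

```python
def seqids(lines):
    """Collect the distinct sequence names (column 1) from GFF/GTF lines."""
    names = set()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines such as "##gff-version 3".
        if not line or line.startswith("#"):
            continue
        names.add(line.split("\t")[0])
    return names


# Shortened sample lines modeled on the excerpts in the question.
reference_lines = [
    "##gff-version 3",
    "NC_002505.1\tRefSeq\tgene\t235\t402\t.\t-\t.\tID=gene0;Name=VC0001",
]
transcript_lines = [
    'gi|15600771|ref|NC_002506.1|\tCufflinks\ttranscript\t286\t994\t1000\t.\t.\t'
    'gene_id "CUFF.1"; transcript_id "CUFF.1.1";',
]

ref_names = seqids(reference_lines)
tx_names = seqids(transcript_lines)
shared = ref_names & tx_names

print("reference only:  ", sorted(ref_names - tx_names))
print("transcripts only:", sorted(tx_names - ref_names))
print("shared:          ", sorted(shared))
```

On the real files, the same function can be fed `open('genome.gff')` and `open('A2_cuffout/transcripts.gtf')`. An empty "shared" set would mean no sequence name appears in both files under the assumption stated above.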

cufflinks gff • 4.7k views
10.4 years ago
madkitty ▴ 690

I see a ' missing in your command; see the highlighted part:

cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf'
__________________________________________________________^_____________________________^

Maybe it couldn't read your reference file properly.
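
A quick way to confirm the unbalanced quote without running cuffcompare is to tokenize the command with Python's standard `shlex` module, which approximates POSIX shell quoting (an approximation, not the shell itself). This is a minimal sketch; the helper name `parse` and the `fixed` variant of the command are illustrative.

```python
import shlex


def parse(command):
    """Return the argument list, or None if the quoting is unbalanced."""
    try:
        return shlex.split(command)
    except ValueError:
        # shlex raises ValueError when a quotation is never closed.
        return None


# The command exactly as posted in the question.
posted = ("cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' "
          "EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf'")

# The same command with the missing opening quote restored.
fixed = ("cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' "
         "'EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf'")

print(parse(posted))
print(parse(fixed))
```

With the quote restored, the command splits into the program name, `-r`, the reference file, and the three transcript files as separate arguments.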
