Annotating RNA-seq gene ids (differentially expressed)
10.0 years ago
David_emir ▴ 500

Hello All,

I have a list of differentially expressed genes from RNA-seq data. I compared drug-treated vs. untreated samples, and the resulting list is pasted in the table below.

My question: I use the R/Bioconductor package "cummeRbund" to inspect the number of genes and transcripts that are differentially expressed between the two samples. I loaded the results of cuffdiff's analysis, which reports the differentially expressed genes, but these genes are not annotated (e.g. XLOC_000007). Is there any way to annotate each gene_id with the actual gene name?

Control vs Treated Data

Differentially expressed Genes

gene_id         sample_1     sample_2     status     value_1     value_2     log2_fold_change     test_stat     p_value        q_value          significant
XLOC_000007     treated      control      OK         2.35097     1.35961     -0.790057            -3.82466      .00005         9.84048E-005     yes
XLOC_000009     treated      control      OK         5.11344     0.0453474   -6.81713             0.00005       9.84048E-005   9.84048E-005     yes
XLOC_000010     treated      control      OK         8.49633     1.21619     -0.564379            -32.2551      0.00005        9.84048E-005     yes
XLOC_000011     treated      control      OK         100.369     8.49633     -0.564379            0.00005       9.84048E-005   9.84048E-005     yes
XLOC_000012     treated      control      OK         206.885     0.0453474   -6.81713             0.00005       0.00005        9.84048E-005
XLOC_000013     treated      control      OK         9.63649     1.21619     -0.564379            0.00005       9.84048E-005   9.84048E-005     yes
XLOC_000017     treated      control      OK         18.764      8.49633     -0.564379            -32.2551      0.00005        9.84048E-005     yes
XLOC_000018     treated      control      OK         0.878346    0.0453474   -6.81713             0.00005       9.84048E-005   9.84048E-005     yes
XLOC_000019     control      control      OK         1.21619     1.21619     -0.564379            0.00005       0.00005        9.84048E-005
XLOC_000019     treated      control      OK         1.21619     0.0453474   -6.81713             -32.2551      9.84048E-005   9.84048E-005     yes

Thanks a lot.

-Ateeq Khaliq

RNA-Seq • 4.1k views

Did you include a GFF/GTF file in the cuffdiff run? XLOC IDs are what appear if the gene annotation is missing.


Hi Roy, yes, I included the file, and the output looks like the following. It has the gene name alongside the ID, but I can't get the same when running cummeRbund. I used the following commands:

> gene_diff_data <- diffData(genes(cuff_data))
> sig_gene_data <- subset(gene_diff_data, (significant == 'yes'))

My cuffdiff output looks like this:

test_id         gene_id         gene
XLOC_000001     XLOC_000001     DDX11L1
XLOC_000002     XLOC_000002     OR4F5
XLOC_000003     XLOC_000003     LOC100132062,LOC100133331
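
Since the cuffdiff output already pairs each XLOC ID with a gene name, one possible workaround is to join that mapping onto the significant-gene table by `gene_id`. The sketch below uses base R only. The mapping rows are the three shown above; the values in `sig_gene_data` are made-up placeholders, not results from this thread. In cummeRbund the mapping could probably be pulled from `annotation(genes(cuff_data))` (gene names in `gene_short_name`), but that is an assumption and has not been checked against this dataset.

```r
# Toy stand-in for the table returned by subset(diffData(...)).
# These numbers are placeholders for illustration only.
sig_gene_data <- data.frame(
  gene_id = c("XLOC_000001", "XLOC_000003"),
  log2_fold_change = c(-1.2, 2.5),
  q_value = c(0.001, 0.02),
  stringsAsFactors = FALSE
)

# ID-to-name mapping, copied from the three cuffdiff rows shown above.
id_to_name <- data.frame(
  gene_id = c("XLOC_000001", "XLOC_000002", "XLOC_000003"),
  gene = c("DDX11L1", "OR4F5", "LOC100132062,LOC100133331"),
  stringsAsFactors = FALSE
)

# Left join: keep every significant gene and attach its name where one exists.
annotated <- merge(sig_gene_data, id_to_name, by = "gene_id", all.x = TRUE)
print(annotated)
```

With `all.x = TRUE`, IDs that have no entry in the mapping stay in the result with `NA` in the `gene` column, so missing annotation is easy to spot.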

Did you try removing the first two columns in the cuffdiff output and then running cummeRbund?


Yes, I tried that, but there is no change; it still shows the same thing! :(

10.0 years ago

Use the annotation file (GTF) when running cuffdiff. If you have multiple GTF files, e.g. one from cufflinks and another from a standard annotation, use cuffmerge to generate the final GTF and use that when running cuffdiff.

If you use a GTF file, the gene names will be printed in one of the columns of the final output.

E.g. cuffdiff output from a run with a GTF file:

test_id gene_id gene    locus   sample_1        sample_2        status  value_1 value_2 log2(fold_change)       test_stat       p_value q_value significant
ENSMUST00000000001      ENSMUSG00000000001      Gnai3   3:107910197-107949064   SB      AB      OK      14.2736 14.5269 0.0253732       0.112543        0.8671  0.999385        no
ENSMUST00000000003      ENSMUSG00000000003      Pbsn    X:75083239-75098962     SB      AB      NOTEST  0       0       0       0       1       1       no
ENSMUST00000000010      ENSMUSG00000020875      Hoxb9   11:96132770-96137909    SB      AB      NOTEST  0       0       0       0       1       1       no
ENSMUST00000000028      ENSMUSG00000000028      Cdc45   16:18780539-18835354    SB      AB      OK      0.872629        0.902901        0.0492005       0.0316144       0.9662  0.999385        no
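
As a rough illustration of reading such output without cummeRbund, the base R sketch below parses the four example rows above and keeps the rows that were actually tested, together with their gene names. The variable names (`cuffdiff_txt`, `diff_tab`, `tested`) are made up for this example. For a real run one would presumably point `read.delim()` at the cuffdiff output file (e.g. `gene_exp.diff`) instead of an inline string.

```r
# The four example rows above, whitespace-separated as posted.
cuffdiff_txt <- "test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
ENSMUST00000000001 ENSMUSG00000000001 Gnai3 3:107910197-107949064 SB AB OK 14.2736 14.5269 0.0253732 0.112543 0.8671 0.999385 no
ENSMUST00000000003 ENSMUSG00000000003 Pbsn X:75083239-75098962 SB AB NOTEST 0 0 0 0 1 1 no
ENSMUST00000000010 ENSMUSG00000020875 Hoxb9 11:96132770-96137909 SB AB NOTEST 0 0 0 0 1 1 no
ENSMUST00000000028 ENSMUSG00000000028 Cdc45 16:18780539-18835354 SB AB OK 0.872629 0.902901 0.0492005 0.0316144 0.9662 0.999385 no"

# check.names = FALSE keeps the header "log2(fold_change)" unchanged.
diff_tab <- read.table(text = cuffdiff_txt, header = TRUE,
                       check.names = FALSE, stringsAsFactors = FALSE)

# Keep only rows with status "OK" (i.e. drop NOTEST rows) and the
# columns needed to see which named gene each result belongs to.
keep_cols <- c("gene_id", "gene", "log2(fold_change)", "q_value", "significant")
tested <- diff_tab[diff_tab$status == "OK", keep_cols]
print(tested)
```

Changing the filter to `diff_tab$significant == "yes"` would select the significant genes instead; none of the four example rows is significant, so that filter returns an empty table here.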