Why is cuffdiff only returning insignificant differences?
10.7 years ago

I actually have two problems.

  1. How can I associate the gene_id field in gene_exp.diff with gene names, like P53?
  2. The "significant" column contains all "no"s. Did I mess a step up?
rna-seq cuffdiff • 2.7k views

I took the liberty of rewriting your post for readability.


Thank you for your help.

I downloaded the original annotation file, annotation.gff, which looks like this:

chr27   GLEAN   mRNA    22911205        22948623        0.764683        +       .       ID=goat_GLEAN_10016511;
chr27   GLEAN   CDS     22911205        22911236        .       +       0       Parent=goat_GLEAN_10016511;
chr27   GLEAN   CDS     22912697        22912806        .       +       1       Parent=goat_GLEAN_10016511;
chr27   GLEAN   CDS     22947595        22947791        .       +       2       Parent=goat_GLEAN_10016511;
chr27   GLEAN   CDS     22948513        22948623        .       +       0       Parent=goat_GLEAN_10016511;

Then I ran

gffread -E annotation.gff -T -o- >annotation1.gff

which gives output like this:

chr1    GeneWise        CDS     53005   53166   .       -       0       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     63550   63731   .       -       2       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     64892   64998   .       -       1       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     65581   65706   .       -       1       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     67611   67720   .       -       0       transcript_id "GOAT_ENSBTAP00000002299";

I then ran

cufflinks -g annotation.gff

and the final results look like this:

gene_exp.diff

test_id gene_id gene    locus   sample_1        sample_2        status  value_1 value_2 log2(fold_change)       test_stat       p_value q_value significant
XLOC_000001     XLOC_000001     -       C62390988:0-100 G5      G10     OK      160625  392692  1.2897  10319.5 0.45145 0.865924        no
XLOC_000002     XLOC_000002     -       C62570928:0-99  G5      G10     OK      146434  302521  1.04679 8898.42 0.5856  0.872979        no
XLOC_000003     XLOC_000003     -       C62612218:0-103 G5      G10     OK      152279  200488  0.396804        2210    0.81415 0.930121        no
XLOC_000004     XLOC_000004     -       C62642842:4-99  G5      G10     OK      471867  143846  -1.71386        -22054.5        0.5161  0.870248

isoform_exp.diff

test_id gene_id gene    locus   sample_1        sample_2        status  value_1 value_2 log2(fold_change)       test_stat       p_value q_value significant
TCONS_00000001  XLOC_000001     -       C62390988:0-100 G5      G10     OK      160625  392692  1.2897  10319.5 0.45145 0.868428        no
TCONS_00000002  XLOC_000002     -       C62570928:0-99  G5      G10     OK      146434  302521  1.04679 8898.42 0.5856  0.877033        no
TCONS_00000003  XLOC_000003     -       C62612218:0-103 G5      G10     OK      152279  200488  0.396804        2210    0.81415 0.93174 no

I don't know what to do next. I look forward to your help, thank you.


I don't see any gene_id fields in your annotation...


I didn't have replicates, just two conditions, G5 and G10. In general, how should the gene_id field appear in the annotation?

chr1    GeneWise        CDS     53005   53166   .       -       0       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     63550   63731   .       -       2       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     64892   64998   .       -       1       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     65581   65706   .       -       1       transcript_id "GOAT_ENSBTAP00000002299";
chr1    GeneWise        CDS     67611   67720   .       -       0       transcript_id "GOAT_ENSBTAP00000002299";

This is the annotation file.


Firstly, please add these as comments, not answers. Secondly, as I already mentioned, you have no gene_id information for the entries (as an aside, you want exons, not CDS).
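
To illustrate what a usable record might look like, here is a minimal Python sketch that rewrites the CDS lines of the original GFF into GTF exon lines carrying both gene_id and transcript_id. It rests on assumptions that are not from the thread: the file is tab-separated, each CDS has a Parent= attribute, and the parent transcript ID is reused as the gene_id (one transcript per gene). The function name cds_gff3_to_exon_gtf is hypothetical, and treating CDS segments as exons ignores any UTRs.

```python
def cds_gff3_to_exon_gtf(gff_lines):
    """Yield GTF exon lines built from GFF3 CDS lines.

    Assumption: the Parent ID doubles as gene_id, because the
    annotation shown in the thread has no separate gene records.
    """
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        # Skip comments, malformed lines, and non-CDS features (e.g. mRNA).
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        # Parse "key=value;" pairs from the GFF3 attribute column.
        attrs = dict(
            field.strip().split("=", 1)
            for field in cols[8].split(";")
            if "=" in field
        )
        transcript = attrs.get("Parent")
        if transcript is None:
            continue
        cols[2] = "exon"
        cols[7] = "."  # the frame column is not used for exon features
        cols[8] = f'gene_id "{transcript}"; transcript_id "{transcript}";'
        yield "\t".join(cols)


if __name__ == "__main__":
    example = [
        "chr27\tGLEAN\tmRNA\t22911205\t22948623\t0.764683\t+\t.\tID=goat_GLEAN_10016511;",
        "chr27\tGLEAN\tCDS\t22911205\t22911236\t.\t+\t0\tParent=goat_GLEAN_10016511;",
    ]
    for gtf_line in cds_gff3_to_exon_gtf(example):
        print(gtf_line)
```

If the real annotation does group several transcripts under one gene, the gene_id should come from that grouping rather than from the transcript ID.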

10.7 years ago
  1. Normally one has an annotation file (GTF or GFF) and then supplies that with the -g or -G options. Assuming that it's formatted in a way that cufflinks understands, this should give you more coherent gene IDs.
  2. Perhaps there are simply no significant differences. Realistically, we would need vastly more information to determine how expected this might be (e.g., what was the experimental design, how many samples per condition, what organism, how many reads aligned and what percentage aligned to genes, ...).
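
As a quick sanity check before digging further, one could tabulate the status and significant columns of gene_exp.diff and look at the smallest q-value. The sketch below assumes the tab-separated layout shown in the question; the function name summarize_diff is hypothetical. A status other than OK (cuffdiff also reports values such as NOTEST, LOWDATA, HIDATA and FAIL) would mean those genes were never properly tested.

```python
import csv
from collections import Counter


def summarize_diff(lines):
    """Count status/significant values and find the smallest q_value.

    `lines` is any iterable of tab-separated text lines, header first
    (for example, an open gene_exp.diff file handle).
    """
    status_counts = Counter()
    significant_counts = Counter()
    min_q = None
    for row in csv.DictReader(lines, delimiter="\t"):
        status_counts[row["status"]] += 1
        # A truncated row yields None here, which is counted separately.
        significant_counts[row["significant"]] += 1
        try:
            q = float(row["q_value"])
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric q_value
        if min_q is None or q < min_q:
            min_q = q
    return status_counts, significant_counts, min_q


if __name__ == "__main__":
    # Assumes gene_exp.diff is in the current directory.
    with open("gene_exp.diff", newline="") as handle:
        status, significant, min_q = summarize_diff(handle)
    print("status:", dict(status))
    print("significant:", dict(significant))
    print("smallest q_value:", min_q)
```

If every row is OK and the smallest q-value is still far above the threshold, the result reflects the test itself rather than a skipped step.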
10.7 years ago
Rob 6.9k

Also, Cufflinks is extremely conservative in calling DE. You might consider the new Ballgown package (http://biorxiv.org/content/early/2014/03/30/003665), by some of the same authors, to see if some statistically significant differences exist in your data under a more liberal (but still mathematically sound) model.


Thank you very much.
