Hello, I am relatively new to CummeRbund. Can somebody tell me how to write a tab file containing diffGenes/diffGeneIDs? Thanks G.
Using CummeRbund:
diff_genes=subset(diffData(genes(diff_data)),(significant=='yes'))
where diff_data is the object created by reading the cuffdiff output folder (diff_out) into R with readCufflinks.
Now write out diff_genes (the list of significant DE genes):
write.table(diff_genes,'diff_genes.txt',sep='\t',quote=FALSE,row.names=FALSE,col.names=TRUE)
Using awk in the terminal (in case you just need the list straight from the cuffdiff output, without any R manipulation):
awk '$14=="yes"' diff_out/gene_exp.diff > diff_genes.txt
where diff_out is again the output folder containing the cuffdiff results, and gene_exp.diff contains the list of genes tested for DE. In most cases the 14th column is the one that says whether the gene is significantly differentially expressed; if it is a different column in your file, replace 14 with that column number. Note that this filter also drops the header line.
If you are only interested in the number of DE genes:
awk '$14=="yes"' diff_out/gene_exp.diff | wc -l
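Since the position of the significance column can differ, here is a minimal sketch that looks the column up by its header name instead of hard-coding 14. It assumes the file is tab-separated and that the header contains a column literally named "significant"; the function name sig_rows is hypothetical, just for illustration.

```shell
# Hypothetical helper (the name is illustrative): print the header plus every
# row whose "significant" column is "yes", locating that column by name.
# Assumes a tab-separated file with a header row.
sig_rows() {
    awk -F'\t' '
        NR == 1 {
            # find the index of the column named "significant"
            for (i = 1; i <= NF; i++) if ($i == "significant") col = i
            if (!col) { print "no column named significant in header" > "/dev/stderr"; exit 1 }
            print; next      # keep the header line in the output
        }
        $col == "yes"        # print rows flagged as significant
    ' "$1"
}

# Usage: sig_rows diff_out/gene_exp.diff > diff_genes.txt
```

Unlike the one-liner above, this keeps the header line, so subtract one if you pipe the result to wc -l.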
Cheers
Sorry,
How can I extract columns 2 and 3 only where column 14 (significant) is yes, and only for the comparison between samples C1 and C2 (I have other samples in the lower rows)? I would also like the results written separately: one output for rows where column 10 < 0 and another for rows where column 10 > 0.
Thank you so much
sigGenes=subset(diffData(genes(cuff)),(significant=='yes'))
This gives you sigGenes as a data frame (cuff here is the object returned by readCufflinks), which you can then subset any way you like. I don't completely understand your question, but you can subset it by sample names (C1/C2 etc.) with
subGens=subset(sigGenes,sample_1=="C1" & sample_2=="C2")
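For the awk route, a rough sketch of the split that was asked about: keep columns 2 and 3 for significant rows of one sample pair, and write them to two files depending on the sign of column 10. The column layout assumed here (5 = sample_1, 6 = sample_2, 10 = log2 fold change, 14 = significant) should be checked against your own gene_exp.diff header; the function name and the output file names up_genes.txt / down_genes.txt are made up for the example.

```shell
# Hypothetical helper (function and output file names are illustrative).
# Assumed gene_exp.diff layout: 2 = gene_id, 3 = gene, 5 = sample_1,
# 6 = sample_2, 10 = log2(fold_change), 14 = significant.
split_by_direction() {
    awk -F'\t' -v s1="$2" -v s2="$3" -v OFS='\t' '
        NR > 1 && $14 == "yes" && $5 == s1 && $6 == s2 {
            fc = $10
            # "inf"/"-inf" are matched as text in case the file contains them
            if (fc == "-inf" || fc + 0 < 0)      print $2, $3 > "down_genes.txt"
            else if (fc == "inf" || fc + 0 > 0)  print $2, $3 > "up_genes.txt"
        }
    ' "$1"
}

# Usage: split_by_direction diff_out/gene_exp.diff C1 C2
```

The two files are written to the current directory, and a file is only created if at least one row matches.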
Sukhdeep..
Thanks for a wonderful script. It worked for me, but the output file includes non-significant genes. What am I doing wrong? I want the significant genes only.
Please also include in your answer how to replace the XLOC IDs with the real gene IDs. Thanks.
Fahim
Hi Fahim,
You should either ask a new question or add a comment; don't post these as answers unless what you are writing is a real answer.
To get gene IDs, you have to run cufflinks with
-g
and an appropriate GTF file. The XLOC IDs are the Cufflinks locus IDs, which are mapped to the locus information in the provided GTF file to fetch the gene IDs.
http://seqanswers.com/forums/showthread.php?t=19079
Cheers