I have approximately 2500 lncRNAs and want to find out which of them are differentially expressed. I fetched the data for the lncRNAs from gene_exp.diff. Some of the FPKM values, in both control and stress, are 0. I have read in a paper that you should first normalize the FPKM values by adding 0.0001, then calculate the fold change, and call differentially expressed genes as follows:
upregulated: fold change >= 2 and p-value <= 0.05
downregulated: fold change <= 0.5 and p-value <= 0.05
Yet in another paper I read that you should first filter for genes with FPKM >= 0.1 in any tissue,
and only after filtering add 0.0001 to the FPKM values and then calculate upregulated and downregulated genes.
My question: which way should I proceed, and what is the difference between the two?
I can't really tell given the information you have provided. I am guessing you used Cufflinks. However, FPKM, RPKM and similar measures should always be taken with a pinch of salt. You need to know what tools were used to align the reads, and how the counting was done. It would also help if you could post what samples you have, and what conditions you were testing (different tissues, time series, different treatments?).
The logic behind filtering on a minimum FPKM is that a few reads aligned to a gene don't really mean anything (this is the law of large numbers: better coverage/sequencing depth gives a better approximation of the 'real' expression).
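To make the difference between the two approaches concrete, here is a minimal sketch in Python. The gene names and values are made up for illustration, and the function names are my own; only the thresholds (pseudocount 0.0001, FPKM >= 0.1, fold change 2 / 0.5, p-value 0.05) come from the question.

```python
PSEUDOCOUNT = 0.0001  # value quoted in the question
MIN_FPKM = 0.1        # filter threshold quoted in the question

# Hypothetical records: (gene, control FPKM, stress FPKM, p-value)
genes = [
    ("lnc_A", 0.0, 0.05, 0.01),   # barely expressed in either condition
    ("lnc_B", 5.0, 20.0, 0.001),
    ("lnc_C", 10.0, 2.0, 0.03),
    ("lnc_D", 3.0, 3.5, 0.50),
]

def fold_change(control, stress, pseudocount=PSEUDOCOUNT):
    """Stress/control ratio after adding the pseudocount to both values."""
    return (stress + pseudocount) / (control + pseudocount)

def classify(control, stress, p_value):
    """Apply the fold-change and p-value cutoffs from the question."""
    fc = fold_change(control, stress)
    if p_value <= 0.05 and fc >= 2:
        return "up"
    if p_value <= 0.05 and fc <= 0.5:
        return "down"
    return "unchanged"

def passes_filter(control, stress, min_fpkm=MIN_FPKM):
    """Keep a gene if it reaches min_fpkm in at least one condition."""
    return max(control, stress) >= min_fpkm

# Approach 1: pseudocount only.
no_filter = {g: classify(c, s, p) for g, c, s, p in genes}

# Approach 2: expression filter first, then pseudocount.
with_filter = {g: classify(c, s, p) for g, c, s, p in genes if passes_filter(c, s)}

print(no_filter)
print(with_filter)
```

In this toy data, lnc_A goes from 0 to 0.05 FPKM, so the pseudocount alone turns it into a roughly 500-fold "upregulated" gene, whereas the filter drops it before the fold change is ever computed. That is the practical difference: without the filter, the size of the fold change for near-zero genes is set mostly by the pseudocount you happened to pick.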
Typically, differentially expressed genes are shown on an MA plot: the log fold change (M) against the average expression level (A). If a gene is well expressed and changes a lot, it is a good candidate. Otherwise, you can't draw a conclusion.
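As a rough sketch of what goes into such a plot, the M and A values for one gene can be computed like this. The function name and the example numbers are my own, and the pseudocount is only there so that zero FPKM values don't break the logarithm.

```python
import math

def ma_values(control, stress, pseudocount=0.0001):
    """Return (M, A) for one gene.

    M is the log2 fold change (stress vs control),
    A is the mean of the two log2 expression values.
    """
    log_c = math.log2(control + pseudocount)
    log_s = math.log2(stress + pseudocount)
    return log_s - log_c, 0.5 * (log_s + log_c)

# Hypothetical genes: (name, control FPKM, stress FPKM)
for name, c, s in [("lnc_A", 0.0, 0.05), ("lnc_B", 5.0, 20.0)]:
    m, a = ma_values(c, s)
    print(f"{name}: M={m:.2f} A={a:.2f}")
```

A gene like lnc_A ends up with a very large M but a strongly negative A, i.e. in the low-expression corner of the plot where fold changes are unreliable. A gene like lnc_B has a moderate M at a decent A, which is the kind of candidate worth following up.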
Yes, Cufflinks was used. The samples are 3 rice cultivars under three conditions: control, desiccation and salinity. So how should I proceed in this case?