Hello All,
I have used the Tuxedo pipeline for RNA-seq differential gene expression analysis. For that purpose I used a cuffmerge-generated annotation (reference-based transcriptome assembly), which contains both coding and non-coding genes (~60,000 genes). However, my prep is polyA-enriched (mRNA selection). My judgments are based on multiple-comparison (FDR) corrected P-values (i.e. Q-values). What concerns me a little is that the number of comparisons (all genes) is far greater than the number of genes actually represented in my library (mRNA only), which I thought might unnecessarily skew my Q-values. I was considering using a GTF with only mRNA annotation. Is this a valid concern? I know that cuffdiff doesn't do differential analysis for regions with no coverage (in both conditions), which it reports as 'NO TEST'. Are these excluded from the FDR calculation? In other words, is the FDR correction based exclusively on the number of comparisons that were actually performed?
Thanks
I didn't mean repeating the transcriptome assembly on a simplified GTF - I only wanted to know if NO TEST genes were considered in multiple comparisons...
If that is the case, you have two simple ways to check
(p.adjust in R can do the job for you)
p.adjust looks rather handy. Thanks!
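For anyone who wants to run this kind of check outside R, here is a minimal Python sketch of the idea. It recomputes Benjamini-Hochberg Q-values twice, once over the tested genes only and once over every row, so the results can be compared with the q_value column that cuffdiff reports. The assumptions: the input is tab-separated with `status`, `p_value` and `q_value` columns (as in a typical `gene_exp.diff`), untested rows carry a P-value of 1, and the correction is plain Benjamini-Hochberg. The helper names `bh_adjust` and `compare_corrections` and the toy table are made up for illustration; they are not part of cuffdiff or R.

```python
import csv
import io


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum
    # so the adjusted values stay monotone.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(order):
        rank = m - k  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def compare_corrections(diff_text):
    """Return (reported, ok_only, all_rows) Q-values for rows with status OK.

    reported: q_value column as found in the file
    ok_only:  BH recomputed over status == OK rows only
    all_rows: BH recomputed over every row, then restricted to OK rows
    """
    rows = list(csv.DictReader(io.StringIO(diff_text), delimiter="\t"))
    is_ok = [r["status"] == "OK" for r in rows]
    pvals = [float(r["p_value"]) for r in rows]

    reported = [float(r["q_value"]) for r, ok in zip(rows, is_ok) if ok]
    ok_only = bh_adjust([p for p, ok in zip(pvals, is_ok) if ok])
    all_rows = [q for q, ok in zip(bh_adjust(pvals), is_ok) if ok]
    return reported, ok_only, all_rows


# Toy table with invented numbers, only to show the mechanics.
TOY_DIFF = "\n".join([
    "test_id\tstatus\tp_value\tq_value",
    "g1\tOK\t0.001\t0.004",
    "g2\tOK\t0.01\t0.02",
    "g3\tOK\t0.03\t0.04",
    "g4\tOK\t0.5\t0.5",
    "g5\tNOTEST\t1\t1",
    "g6\tNOTEST\t1\t1",
])

if __name__ == "__main__":
    reported, ok_only, all_rows = compare_corrections(TOY_DIFF)
    for label, values in [("reported", reported),
                          ("OK only", ok_only),
                          ("all rows", all_rows)]:
        print(label, [round(v, 4) for v in values])
```

On a real file you would pass the contents of `gene_exp.diff` instead of the toy table. If the reported Q-values line up with the "OK only" column, the untested genes were left out of the correction; if they line up with "all rows", they were counted. In the toy table the reported values were chosen to match the "OK only" case, so it says nothing about what cuffdiff actually does.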