Hi all, I am trying to perform differential gene expression analysis between two conditions. I am primarily interested in the lncRNAs that are differentially expressed. My workflow is: Salmon -> tximport -> DESeq2. I have run the analysis with two different annotations (both GENCODE v22): the comprehensive gene annotation (all genes, including lncRNAs) and the long non-coding RNA gene annotation (lncRNAs only). The RNA-seq library preparation was unstranded, so the expression of lncRNAs that are antisense to protein-coding genes cannot be determined unambiguously. I have two questions:
If I am primarily interested in differentially expressed lncRNAs, does it matter whether I run the analysis with the comprehensive gene annotation (all genes) or with the long non-coding RNA gene annotation (lncRNA genes only)?
Since the library is unstranded, should I simply exclude the antisense lncRNA genes after the differential expression analysis? Or would it be better to start the analysis with an annotation that contains only long intergenic non-coding RNAs (lincRNAs) that do not overlap any protein-coding gene? I am not sure how to modify the GENCODE annotation to do the latter.
I would appreciate any help. Thanks in advance!
Thanks for your comments. Very helpful!
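For reference, here is a minimal sketch of how the lincRNA-only annotation asked about in the second question could be built. It rests on several assumptions that should be checked against the actual file: that the GENCODE v22 GTF labels these genes with `gene_type "lincRNA"`, that overlap is judged on gene-level coordinates, and that an overlap on either strand counts (since the library is unstranded). The function names are made up for illustration and are not part of any library.

```python
import re
from collections import defaultdict


def parse_genes(gtf_lines):
    """Yield (chrom, start, end, gene_id, gene_type) for each 'gene' record."""
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        # Collect quoted attributes such as gene_id "..."; gene_type "...";
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        yield (fields[0], int(fields[3]), int(fields[4]),
               attrs.get("gene_id"), attrs.get("gene_type"))


def non_overlapping_lincrna_ids(gtf_lines):
    """Return IDs of lincRNA genes that overlap no protein-coding gene.

    Assumption: strand is ignored, because an unstranded library cannot
    separate sense from antisense reads.
    """
    genes = list(parse_genes(gtf_lines))
    coding = defaultdict(list)
    for chrom, start, end, _, gene_type in genes:
        if gene_type == "protein_coding":
            coding[chrom].append((start, end))

    keep = set()
    for chrom, start, end, gene_id, gene_type in genes:
        if gene_type != "lincRNA":  # assumed biotype label in GENCODE v22
            continue
        # GTF intervals are closed: they overlap if each starts before the other ends.
        if not any(s <= end and start <= e for s, e in coding[chrom]):
            keep.add(gene_id)
    return keep


def filter_gtf(gtf_lines, keep_ids):
    """Yield header lines plus every record whose gene_id is in keep_ids."""
    for line in gtf_lines:
        if line.startswith("#"):
            yield line
            continue
        match = re.search(r'gene_id "([^"]+)"', line)
        if match and match.group(1) in keep_ids:
            yield line


# Example usage (the file names are placeholders):
# with open("annotation.gtf") as handle:
#     lines = handle.readlines()
# keep = non_overlapping_lincrna_ids(lines)
# with open("lincRNA_no_overlap.gtf", "w") as out:
#     out.writelines(filter_gtf(lines, keep))
```

The resulting set of gene IDs could equally be used to subset the DESeq2 results after running the analysis on the full annotation, rather than to build a reduced annotation beforehand.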