Hi all,
I would appreciate some advice on a problem I have run into. My RNA-seq pipeline is fastp (QC) -> Salmon -> tximeta -> DESeq2, which I use for gene-level differential expression analysis (DEA). My problem relates to the theme of this article, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5520851/, i.e. protein-coding genes within lincRNA. I am interested in differentially expressed lincRNAs in my study system, and I use RNA-seq to discover candidate lincRNAs that we then follow up individually with functional studies. If a lincRNA gene has both coding and non-coding splice variants (see for example http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000235387;r=9:35909490-35937153), I only see a single gene entry in my Salmon-processed DESeq2 data, so I cannot tell whether the protein-coding transcript or the lincRNA transcript is the one being differentially expressed.
Any ideas on how I can distinguish which splice variants are differentially expressed (padj < 0.05, |log2FC| > 2) in my RNA-seq analysis? Would using a splice-aware aligner like STAR followed by DESeq2 help? Thanks for any input, and sorry if this has already been asked before!
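To make the cutoffs concrete, here is a minimal Python sketch of the filter I have in mind. It assumes a transcript-level results table were available, which my current gene-level output does not give me. The rows, IDs, values and the function names `significant` and `de_biotypes_by_gene` are all hypothetical and only for illustration; only the two thresholds come from the question above.

```python
from collections import defaultdict

# Hypothetical transcript-level results (made-up IDs and values, not real data).
# Field names mimic a DESeq2-style results table plus a biotype annotation.
rows = [
    {"gene": "GENE_A", "tx": "TX_A1", "biotype": "lincRNA",
     "log2FoldChange": 2.8, "padj": 0.001},
    {"gene": "GENE_A", "tx": "TX_A2", "biotype": "protein_coding",
     "log2FoldChange": 0.3, "padj": 0.60},
    {"gene": "GENE_B", "tx": "TX_B1", "biotype": "lincRNA",
     "log2FoldChange": -2.5, "padj": 0.20},
    {"gene": "GENE_B", "tx": "TX_B2", "biotype": "protein_coding",
     "log2FoldChange": -3.1, "padj": 0.004},
]


def significant(row, alpha=0.05, lfc=2.0):
    """Apply the cutoffs from the question: padj < alpha and |log2FC| > lfc."""
    padj = row["padj"]
    # A missing adjusted p-value (None here) is treated as not significant.
    return padj is not None and padj < alpha and abs(row["log2FoldChange"]) > lfc


def de_biotypes_by_gene(rows, alpha=0.05, lfc=2.0):
    """For each gene, list the biotypes of the transcripts that pass the cutoffs."""
    hits = defaultdict(set)
    for row in rows:
        if significant(row, alpha, lfc):
            hits[row["gene"]].add(row["biotype"])
    return {gene: sorted(biotypes) for gene, biotypes in hits.items()}


result = de_biotypes_by_gene(rows)
print(result)
```

In this toy table each gene has one lincRNA and one protein-coding transcript, and the filter reports which of the two passes both thresholds. That per-transcript answer is what a single gene-level entry hides.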
Thanks Kevin, I will look into your suggestions!