Hello!
I'm concerned that I may be taking the wrong approach to this problem. I'm trying to identify transcripts important for day 0 and day 2 in the differentiation process of a particular cell type. I've assembled and quantified transcripts from RNA-Seq data using STAR and StringTie. My data can now be summarized as in the attached picture (a simplified description). In the picture, the green rectangles are genes, and the squiggles below them are their respective transcripts with count information.
Data representation: https://imgur.com/gallery/bPFioZm
Now, keep in mind I'm a beginner. So, I may be wrong on some of the intricacies! Anyway, here is my understanding of the following analysis strategies.
Differential Gene Expression (DGE) analysis tells us that genes 1 and 3 were differentially expressed between day 0 and day 2. A popular tool for this kind of analysis is DESeq.
Differential Transcript Usage (DTU) analysis tells us that on day 2 in gene 3, transcript E was differentially expressed. A popular tool for this kind of analysis is DRIMSeq.
Differential Transcript Expression (DTE) analysis tells us that transcripts A, B, C, and E were differentially expressed between day 0 and day 2.
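To make the three definitions concrete, here is a minimal Python sketch of the quantity each analysis looks at. The counts and the names (geneX, geneY, X1, X2, Y1, Y2) are made up for illustration and are not taken from the picture, and no statistical testing is done here; real tools model replicates and dispersion.

```python
import math

# Hypothetical toy counts: gene -> transcript -> (day 0 count, day 2 count).
counts = {
    "geneX": {"X1": (100, 200), "X2": (50, 100)},
    "geneY": {"Y1": (80, 20), "Y2": (20, 80)},
}

def log2fc(day0, day2, pseudo=1.0):
    """Log2 fold change, with a pseudocount so zero counts do not break it."""
    return math.log2((day2 + pseudo) / (day0 + pseudo))

def summarize(counts):
    """Return one row per transcript with gene-level, transcript-level,
    and usage-level changes. Assumes every gene has a nonzero total on both days."""
    rows = []
    for gene, txs in counts.items():
        gene_d0 = sum(d0 for d0, _ in txs.values())
        gene_d2 = sum(d2 for _, d2 in txs.values())
        for tx, (d0, d2) in txs.items():
            rows.append({
                "gene": gene,
                "transcript": tx,
                # DGE looks at the gene total (sum over its transcripts).
                "gene_log2fc": log2fc(gene_d0, gene_d2),
                # DTE looks at each transcript's own abundance.
                "tx_log2fc": log2fc(d0, d2),
                # DTU looks at the transcript's share of its gene's total.
                "usage_change": d2 / gene_d2 - d0 / gene_d0,
            })
    return rows

for row in summarize(counts):
    print(row)
```

In this toy table, geneX doubles overall while each transcript keeps the same share (a gene-level and transcript-level change with no usage change), whereas geneY keeps the same total while its two transcripts swap shares (a usage and transcript-level change with no gene-level change).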
I am currently following the workflow from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6178912/ and using DESeq and DRIMSeq to examine DGE and DTU in my data. However, I'm wondering whether I really just need to examine DTE to answer the question "What transcripts are important for the differentiation process between day 0 and day 2?" So my question is: do DGE and DTU combined provide the same information as DTE? If not, what tool can I use to analyze DTE in my data?
Thanks in advance for the help.
EDIT: To be more specific about my goal, I have two aims:
- I'm looking for changes in lncRNA expression from day 0 to day 2 of adipogenesis.
- I'm also looking for novel lncRNA transcripts.
I'm predicting that transcript-level changes (in lncRNAs) are driving my process. For that reason, I use de novo assembly guided by an annotation. Next, I filter out any transcripts that aren't lncRNAs by comparing to the Gencode lncRNA GTF. Then I quantify the filtered data with stringtie -e. The point I'm currently at is analyzing the quantification data for differential expression.
Hey, thank you very much for the response! I'm working with mouse RNA-Seq data, so I do have access to a transcriptome. I went and updated my question with more details to make my goal more clear.
The type of DE I should use is confusing me. I've learned I can use DESeq2, which tells me which of the genes that the lncRNAs originate from are DE. But I am also curious about the specific isoforms. DRIMSeq gives me information on DTU, which I believe is useful. But is there a better way to determine which transcripts are DE between days?