7 weeks ago
ZuelTech
Hi,
I have RNA-seq libraries from a non-model species. Is it okay to use a reference de novo transcriptome assembly of the same species that was constructed from different RNA-seq libraries? I ask because de novo transcriptome assemblies are usually generated from the same RNA-seq libraries. Thanks!
For quantitation? It should be fine if it is the same species.
You could also assemble your own data to make a de novo transcriptome.
Thanks for your reply. Is this approach (the one I described in my question) common practice? Do I need to check its robustness?
People routinely use the transcriptomes/genomes available for many organisms (model and non-model) from public databases like NCBI/Ensembl. Reference genome databases exist for that reason.
That said, if you are working with a dataset that is special or condition-specific, transcripts expressed under those conditions may not be present in a public reference dataset. For that reason, aligning to the genome and then quantitating is also a valid option, and it lets you discover novel transcripts.
Thanks! I'm actually using a reference de novo transcriptome that we assembled. Do you mean I need to align our RNA-seq data set to the de novo assembled transcriptome? What data or answers can I get from this method?
If you assembled the transcriptome then you can use it with salmon to estimate gene expression. Was that not the aim of your experiment in the first place?
Thanks! Yes, I will eventually perform quantification and differential expression analysis. What I mean by my question is: why do I have to align my RNA-seq data to the de novo assembled transcriptome?
Otherwise, how are you going to translate the pile of reads you have into the counts that are required for the DE analysis?
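To make the read-to-count step concrete, here is a minimal sketch of the two salmon invocations (build an index from the assembly, then quantify one paired-end sample). The file names `assembly.fasta`, `sample_R1.fastq.gz`, `sample_R2.fastq.gz` and the directory names are placeholders I made up, not anything from this thread. The helper only builds and prints the commands; it does not run salmon.

```python
import shlex


def salmon_commands(transcriptome_fasta, reads_1, reads_2,
                    index_dir="salmon_index", out_dir="salmon_quant"):
    """Return the salmon index and quant commands as argument lists.

    Nothing is executed here. All paths are hypothetical placeholders.
    """
    # Step 1: index the de novo assembled transcripts.
    index_cmd = ["salmon", "index", "-t", transcriptome_fasta, "-i", index_dir]
    # Step 2: quantify one paired-end library against that index.
    # "-l A" asks salmon to infer the library type automatically.
    quant_cmd = ["salmon", "quant", "-i", index_dir, "-l", "A",
                 "-1", reads_1, "-2", reads_2, "-o", out_dir]
    return index_cmd, quant_cmd


index_cmd, quant_cmd = salmon_commands(
    "assembly.fasta", "sample_R1.fastq.gz", "sample_R2.fastq.gz"
)
print(shlex.join(index_cmd))
print(shlex.join(quant_cmd))
```

Each sample's output directory should then contain a `quant.sf` table with per-transcript estimated read counts and TPM, which is what you feed into the DE analysis.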
If you assembled a de novo transcriptome, it is going to be just a set of sequences. There will be no annotation associated with them. There may be redundancies (unless you used cd-hit or some such package to remove those). So to make any logical sense of the DE results you are going to need to spend additional time annotating the transcripts that you assembled.
That makes sense. Thanks. I'm actually concerned about possible redundancy of my transcriptome assembly. Is there a way to check the number of redundant sequences and polymorphisms?
As I noted above, use CD-HIT: https://sites.google.com/view/cd-hit/home
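Before clustering, a quick first-pass check can tell you how many transcripts are exact duplicates of each other. The sketch below is plain Python, not part of CD-HIT. It groups sequences that are identical or exact reverse complements, and the three toy records are made up for illustration. It will not catch near-identical sequences or polymorphic variants; that is where similarity-based clustering comes in.

```python
from collections import defaultdict

# Translation table for complementing DNA (N is left as N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(seq)
            # Keep only the first whitespace-delimited token of the header.
            name, seq = line[1:].split()[0], []
        else:
            seq.append(line.upper())
    if name is not None:
        yield name, "".join(seq)


def canonical(seq):
    """Return the smaller of a sequence and its reverse complement."""
    reverse_complement = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, reverse_complement)


def exact_duplicate_groups(records):
    """Group transcript names whose sequences are identical on either strand."""
    groups = defaultdict(list)
    for name, seq in records:
        groups[canonical(seq)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Toy input: t2 is the reverse complement of t1, t3 is unrelated.
example = """>t1
ACGTTG
>t2
CAACGT
>t3
GGGCCA
""".splitlines()

duplicates = exact_duplicate_groups(read_fasta(example))
print(duplicates)
```

On a real assembly you would pass an open file handle instead of `example`, for instance `read_fasta(open("assembly.fasta"))`, where the file name is again a placeholder.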
Thanks for this!