7.4 years ago
jh
Hi,
With Salmon, reads are mapped against the transcriptome and not the genome. Is it possible to use an extracted transcriptome from the human genome prepared for RSEM?
Thanks!
Rob: Which transcriptome reference file do you recommend I build the index from? And where can I get such files? :) I want to analyze two different outputs: (1) isoforms of known genes in the human genome, and (2) total transcript abundance of known genes in the human genome, with HGNC gene names.
I generally like the GENCODE reference, since it's fairly comprehensive and you can get the transcript sequences directly (rather than having to build them from the genome + GTF). You can also get just the protein-coding transcripts if you want. I generally shy away from, e.g., the complete Ensembl transcriptome, since it includes many exact duplicate sequences (with different names), which should generally be filtered out or collapsed before quantification.
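To illustrate what "filtered out or collapsed" could look like in practice, here is a minimal Python sketch that groups transcripts with exactly identical sequences and keeps the first name seen as the representative. The function names (`read_fasta`, `collapse_duplicates`) and the toy records are made up for illustration; this is not part of Salmon or RSEM, and it assumes the whole FASTA fits in memory.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def collapse_duplicates(records):
    """Group record names by identical (case-insensitive) sequence.

    Returns a list of (representative_name, sequence, other_names),
    in order of first appearance.
    """
    groups = {}  # sequence -> list of names; dicts keep insertion order
    for name, seq in records:
        groups.setdefault(seq.upper(), []).append(name)
    return [(names[0], seq, names[1:]) for seq, names in groups.items()]


# Toy input (hypothetical transcript names): tx1 and tx2 are identical.
example = """>tx1
ACGT
ACGT
>tx2
acgtacgt
>tx3
GGGG
""".splitlines()

collapsed = collapse_duplicates(read_fasta(example))
for rep, seq, dups in collapsed:
    print(rep, len(seq), "duplicates:", ",".join(dups) or "none")
```

For a real file you would pass an open file handle instead of `example` and write the representatives back out as FASTA. Keeping the list of collapsed names is useful if you later need to trace a quantified transcript back to all of its original identifiers.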
Thank you so much!! :D
Some papers have used RSEM to quantify transposable elements/repeats. Does this mean I can run rsem-prepare-reference on the repeat library and feed the result into Salmon?
Users have had success using Salmon for such quantification tasks before. So yes, if you provide it with the proper reference, it should work as well as RSEM for this task.
Hi, I have a question: does the result consist of predicted CDS? I was wondering whether you can use predicted CDS with Salmon instead of, e.g., assembled transcripts. I encountered some chimeric transcripts during de novo assembly, and I expect a large number of collapsed homeologs, too (polyploid species). TransDecoder was able to disentangle the concatenated CDS.