Hi everyone, I am interested in quantifying changes in the transcription of repetitive elements (LTRs here) after treatment, and I have come up with the following ideas:
- Map the RNA-seq data directly to the genome with hisat2 and quantify against the repetitive element annotation from RepeatMasker, then pool elements of the same class to compare them. However, I am not sure how to set the maximum number of alignments allowed per read. (Most RNA-seq analyses require reads to be uniquely mapped, but the value would need to be much higher here, since repetitive elements occur many times in the genome.)
- I got consensus repetitive element sequences (FASTA) from Repbase. Is it possible to treat these repeat elements as a "transcriptome" and use salmon (or a similar transcriptome-based tool) to map reads to them?
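To make the first idea more concrete, here is a rough Python sketch of the "pool elements of the same class" step. It assumes the standard whitespace-separated RepeatMasker .out layout (query name, begin, and end in columns 5 to 7, strand in column 9, repeat name in column 10, class/family in column 11) and a per-element count table keyed by coordinates that some upstream counting step has already produced. The function names and the `counts` dictionary are hypothetical, made up for illustration only.

```python
from collections import defaultdict


def parse_rmsk_out(lines):
    """Yield (chrom, start, end, strand, name, rep_class, rep_family)
    from the lines of a RepeatMasker .out file."""
    for line in lines:
        fields = line.split()
        # Data rows start with an integer Smith-Waterman score;
        # this skips the header lines and blank lines.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        chrom, start, end = fields[4], int(fields[5]), int(fields[6])
        # RepeatMasker writes "C" for the complement strand.
        strand = "-" if fields[8] == "C" else "+"
        name = fields[9]
        # Column 11 looks like "LTR/ERVK"; some classes have no family part.
        rep_class, _, rep_family = fields[10].partition("/")
        yield chrom, start, end, strand, name, rep_class, rep_family or rep_class


def sum_counts_by_family(annotations, counts, keep_class="LTR"):
    """Sum per-element read counts into per-family totals for one class.

    `counts` is a hypothetical dict mapping (chrom, start, end) to a read
    count; elements missing from it contribute zero.
    """
    totals = defaultdict(int)
    for chrom, start, end, _strand, _name, rep_class, rep_family in annotations:
        if rep_class != keep_class:
            continue
        totals[rep_family] += counts.get((chrom, start, end), 0)
    return dict(totals)
```

Calling `sum_counts_by_family(parse_rmsk_out(open_file), counts)` once per sample would give one table of LTR family totals per sample, which could then be compared between treated and control samples with whatever differential expression tool you prefer.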
I am not familiar with this area and would appreciate any suggestions. Thanks for your help!
Update: Since I am only interested in LTRs, I have modified the question. It looks possible to extract uniquely mapped reads and combine them with the RepeatMasker annotation. Direct quantification looks likely to fail, since repetitive elements are abundant in mRNA.
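For what it's worth, here is a minimal Python sketch of what I mean by extracting uniquely mapped reads and combining them with the annotation. It assumes the aligner writes the `NH:i` tag (number of reported alignments) on each SAM record, and that the LTR intervals have already been pulled out of the RepeatMasker annotation as 1-based inclusive (chrom, start, end, family) tuples. All function names here are hypothetical, and the linear scan over intervals is only sensible for toy data; a real run would use an interval index or an existing counting tool.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_unique(sam_line):
    """True if the record is mapped and carries NH:i:1 (a single alignment)."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return False
    if int(fields[1]) & 0x4:  # 0x4 = read unmapped
        return False
    return "NH:i:1" in fields[11:]


def aligned_blocks(pos, cigar):
    """Return 1-based inclusive reference blocks, split at introns (N)."""
    blocks = []
    start = cur = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "MD=X":  # operations that consume the reference
            cur += length
        elif op == "N":  # spliced gap: close the current block
            if cur > start:
                blocks.append((start, cur - 1))
            cur += length
            start = cur
    if cur > start:
        blocks.append((start, cur - 1))
    return blocks


def count_unique_ltr_reads(sam_lines, ltr_intervals):
    """Count uniquely mapped records overlapping each LTR family.

    Each record is assigned to the first overlapping interval only, and
    paired-end mates are counted separately in this simplified version.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@") or not is_unique(line):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        blocks = aligned_blocks(pos, cigar)
        for ichrom, istart, iend, family in ltr_intervals:
            if ichrom == chrom and any(s <= iend and e >= istart for s, e in blocks):
                counts[family] += 1
                break
    return counts
```

Splitting the alignment at `N` operations means a spliced read is only counted when one of its aligned blocks actually touches the LTR, not when the LTR merely sits inside the intron it spans.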
Thanks for your answer. I'm concerned about STAR's memory usage, so I may start with hisat2 and -k 100 to see whether its output can be used by TEtranscripts.
Hi Devon, is this answer still the same given the Telescope paper? It appears that TEtranscripts does not perform as well as SalmonTE and Telescope. https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006453
The answer hasn't changed just because there are now newer methods that may be somewhat more accurate.