5.8 years ago
kwicher
Hello
What would be the best tool to map short-read RNA-seq data to reference transcript HMM profiles? One caveat: the HMM profiles may describe only part of the actual transcripts.
Thanks
K
Thanks all.
The RNA-seq data would probably be anything from 35 bp to 250 bp reads, and the HMM profiles are predominantly longer than 1 kb, so direct mapping does not work.
I was thinking about doing something like this: first use the RNA-seq data for de novo transcript assembly, then use HMMER to identify the transcripts that contain particular HMM profiles.
K
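A rough sketch of the second step of that idea (searching assembled transcripts with the profiles), in case it helps. It assumes HMMER 3's `nhmmer` is installed and that the assembly has already been written to a FASTA file by whichever assembler is used. The file names, the E-value cutoff, and the helper names `nhmmer_command` and `parse_tblout` are all hypothetical, and the column positions follow my reading of the `nhmmer --tblout` layout, so check them against the header of your own output.

```python
from typing import Dict, Iterable, List


def nhmmer_command(hmm_path: str, transcripts_fasta: str, tblout_path: str,
                   evalue: float = 1e-5, cpus: int = 4) -> List[str]:
    """Build an nhmmer command line: profiles as query, transcripts as target.

    The default cutoff and thread count are placeholders, not recommendations.
    """
    return [
        "nhmmer",
        "--tblout", tblout_path,   # tabular per-hit summary
        "-E", str(evalue),         # report hits at or below this E-value
        "--cpu", str(cpus),
        hmm_path,
        transcripts_fasta,
    ]


def parse_tblout(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Parse nhmmer --tblout lines into one dict per hit.

    Assumed column order: target, accession, query, accession, hmmfrom,
    hmmto, alifrom, alito, envfrom, envto, sqlen, strand, E-value, score,
    bias, description.
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split(maxsplit=15)
        if len(fields) < 15:
            continue  # not a complete hit line
        hits.append({
            "transcript": fields[0],
            "profile": fields[2],
            "strand": fields[11],
            "evalue": float(fields[12]),
            "score": float(fields[13]),
        })
    return hits
```

The command list can be passed to `subprocess.run(cmd, check=True)`, and the resulting table read back with `parse_tblout(open(tblout_path))`. Because a profile may cover only part of a transcript, the `hmmfrom`/`hmmto` columns are worth keeping as well if profile coverage matters.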
Why would direct alignment not work?
As I said, I would not have a reference sequence to map to, only an HMM profile of the sequences.
Can you elaborate on what you mean by the 'reference transcript HMM profiles'?
The idea is that I would not map to the reference genome but to a set of HMM profiles (describing the nucleotide sequences).
Why would you want to do that, as opposed to mapping it to the actual transcriptome?
Yes, it is a good question, and I cannot really go into the details. In general, I would not know the exact transcriptome; I would only have the HMM profiles built from the homologous sequences.
Thanks
K
Do you have access to the homologous sequences themselves? It'd be simple enough to just align to them. With an HMM I suspect you'll have to invent a scoring method for assessing the quality of an "alignment" (really, the probability of a given HMM producing an observed sequence compared to all other HMMs producing said sequence).
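To make that scoring idea concrete, here is a minimal sketch of one possible approach, not an established method. It assumes each read (or transcript) has been scored against every profile with HMMER-style bit scores, which are log2 odds against a null model, that the same null model is used for all profiles, and that all profiles are equally likely a priori. The function name `profile_posteriors` is hypothetical.

```python
from typing import Dict


def profile_posteriors(bit_scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-profile bit scores for ONE sequence into relative probabilities.

    With a shared null model and equal priors (both assumptions), the
    probability that profile i produced the sequence, relative to all the
    other profiles, is 2**S_i / sum_j 2**S_j.
    """
    if not bit_scores:
        return {}
    # Subtract the maximum score before exponentiating to avoid overflow.
    top = max(bit_scores.values())
    weights = {name: 2.0 ** (score - top) for name, score in bit_scores.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}
```

For example, `profile_posteriors({"profileA": 11.0, "profileB": 10.0})` assigns two thirds of the probability to `profileA`, since a one-bit difference corresponds to a factor of two in likelihood. A threshold on this value could then play a role similar to a mapping quality.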
No I do not have access to that.
How about searching for the HMM profiles in the reads directly?
The RNA-seq data are tens of millions of short reads, and the HMM profiles are several hundred bp long.
Would that work?