4.4 years ago
fullmooninu
I'm trying to figure out a way to find viral subsequences inside a eukaryotic genome.
I'm not sure how to do this. I was thinking I should blastn all subsequences of the eukaryotic genome and see if they match a virus.
Any help would be greatly appreciated.
If you have the sequence of the virus you are looking for, you will be better off blasting that sequence against your eukaryotic genome (instead of vice versa).
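A minimal sketch of that direction (virus as query, genome as database), assuming NCBI BLAST+ is installed. The file names `genome.fa` and `virus.fa`, the database name, and the e-value cutoff are placeholders, not recommendations. The helper only builds the two command lines; running them (for example with `subprocess.run`) is left to you.

```python
def build_blastn_commands(genome_fasta, virus_fasta, db_name="genome_db",
                          out_path="virus_vs_genome.tsv", evalue="1e-5"):
    """Return the makeblastdb and blastn command lines as argument lists."""
    # Index the eukaryotic genome as a nucleotide BLAST database.
    make_db = ["makeblastdb", "-in", genome_fasta,
               "-dbtype", "nucl", "-out", db_name]
    # Search the viral sequence against that database; -outfmt 6 is tabular output.
    search = ["blastn", "-query", virus_fasta, "-db", db_name,
              "-evalue", evalue, "-outfmt", "6", "-out", out_path]
    return make_db, search


# Hypothetical file names, for illustration only.
make_db, search = build_blastn_commands("genome.fa", "virus.fa")
print(" ".join(make_db))
print(" ".join(search))
```

With `-outfmt 6`, each hit is one tab-separated line whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore, so sstart/send give the location of the match in the genome.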
I can't think of any right now, but there should be tools around that specifically check for viral sequences.
If you are searching for endogenous viral elements (EVEs), you can use any BLAST analysis that compares proteins. Some studies use blastx with the genome as the query (translating all genomic regions to proteins) against a viral protein database.
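A rough sketch of the blastx route, assuming BLAST+ is installed and a viral protein database (here called `viral_proteins`, a hypothetical name) has already been built with `makeblastdb -dbtype prot`. Cutting the genome into overlapping windows before querying is my own assumption to keep individual queries manageable, not something the studies mentioned above necessarily do; the window and overlap sizes are arbitrary.

```python
def split_into_windows(sequence, window=10000, overlap=1000):
    """Yield (start, subsequence) pairs; consecutive windows share `overlap` bases."""
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    step = window - overlap
    for start in range(0, len(sequence), step):
        yield start, sequence[start:start + window]
        # Stop once a window has reached the end of the sequence.
        if start + window >= len(sequence):
            break


def build_blastx_command(query_fasta, protein_db,
                         out_path="eve_hits.tsv", evalue="1e-5"):
    """blastx translates the nucleotide query in six frames and searches proteins."""
    return ["blastx", "-query", query_fasta, "-db", protein_db,
            "-evalue", evalue, "-outfmt", "6", "-out", out_path]


# Toy sequence standing in for a chromosome.
for start, chunk in split_into_windows("ACGT" * 6, window=10, overlap=2):
    print(start, chunk)
print(" ".join(build_blastx_command("genome_windows.fa", "viral_proteins")))
```

The start offset of each window needs to be added back to the hit coordinates to recover positions on the original chromosome.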
If you use the genome as the database (and don't use tblastx), keep in mind that nucleotide comparisons are, in general, less sensitive than amino acid comparisons, and if you are working with EVEs, the sensitivity of your search is a key factor.
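One reason for the difference in sensitivity is that synonymous substitutions change the DNA without changing the protein. The toy example below uses two made-up 15-base sequences (not real viral data) and the standard genetic code to show a pair that differs at the nucleotide level but translates to the same peptide.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate frame 1 of an uppercase DNA string, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Made-up sequences that differ only at third codon positions.
virus_gene = "ATGGCTCTGAAAGGT"
host_copy = "ATGGCACTTAAGGGC"

print("nucleotide identity:", round(identity(virus_gene, host_copy), 2))
print("protein identity:", identity(translate(virus_gene), translate(host_copy)))
```

Old insertions tend to have accumulated many such changes, so a protein-level search can still recognise them when the nucleotide similarity has decayed.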
Other strategies that can be used are PSI-BLAST and HMM profiles (but this is really hard to do across a large diversity of viruses, since you need an HMM file for each protein of each viral group that you are looking for).