6.1 years ago
karthic
Hi Everyone,
I have a set of viral coding sequences. I BLASTed them against the UniProt virus database but did not get any hits. The sequences appear to be highly divergent: when I then BLASTed them against NR, I still got very few hits, and the hits I did get have very low query coverage. As a result, I am having difficulty determining the significance of these sequences.
Please suggest any other approaches I could try.
Thanks, KK
Do you want to find out what the sequences match to, or just predict coding sequences?
You could try prokka with a different translation table for de novo annotation. It iteratively checks HMMs and various other databases to annotate many different features.
No. We have already predicted the coding sequences but are unable to get proper annotations, with most of them coming back as uncharacterized proteins. There is no plan to do wet-lab studies on them. I am wondering if there are any in silico approaches to determine their significance.
We have also used prokka, without much luck.
Regards, KK
Is this a new or obscure organism?
You will likely have the best luck with HMM-based approaches. Perhaps try HHpred or HMMER with low-stringency cutoffs (even as low as 20% identity or so).
Unfortunately, your results are only ever going to be as good as the databases, and if it's not a well-represented organism, you're going to struggle no matter what.
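As a rough illustration of the HMMER suggestion above, here is a minimal Python sketch that builds an `hmmscan` command with a permissive E-value and then summarises the per-domain table it writes. It is only a sketch under stated assumptions: the file names (`Pfam-A.hmm`, `orfs.faa`, `orfs.domtbl`) are hypothetical, the profile database is assumed to have been prepared with `hmmpress`, and the helper functions are made up for this example. Weak hits found this way are hypotheses to follow up, not annotations.

```python
def build_hmmscan_cmd(hmm_db, proteins_fasta, domtbl_path, evalue=10.0, cpus=4):
    """Return an hmmscan argument list with a permissive reporting cutoff.

    hmm_db is assumed to be a profile database already indexed with hmmpress.
    """
    return [
        "hmmscan",
        "--cpu", str(cpus),
        "-E", str(evalue),         # per-sequence reporting E-value threshold
        "--domE", str(evalue),     # per-domain reporting E-value threshold
        "--domtblout", domtbl_path,
        hmm_db,
        proteins_fasta,
    ]


def parse_domtblout(lines):
    """Parse hmmscan --domtblout lines into a list of dicts."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        f = line.split(None, 22)  # 22 fixed columns, then free-text description
        if len(f) < 22:
            continue  # malformed or truncated row
        qlen = int(f[5])
        ali_from, ali_to = int(f[17]), int(f[18])
        hits.append({
            "query": f[3],
            "family": f[0],
            "i_evalue": float(f[12]),   # independent (per-domain) E-value
            "score": float(f[13]),      # per-domain bit score
            "query_coverage": (ali_to - ali_from + 1) / qlen,
            "description": f[22].strip() if len(f) > 22 else "",
        })
    return hits


def best_hit_per_query(hits):
    """Keep the hit with the lowest independent E-value for each query."""
    best = {}
    for h in hits:
        current = best.get(h["query"])
        if current is None or h["i_evalue"] < current["i_evalue"]:
            best[h["query"]] = h
    return best


if __name__ == "__main__":
    # Hypothetical file names; run the printed command yourself,
    # then parse the resulting table with parse_domtblout().
    cmd = build_hmmscan_cmd("Pfam-A.hmm", "orfs.faa", "orfs.domtbl")
    print(" ".join(cmd))
```

Reporting query coverage next to the E-value makes it easier to tell a short conserved motif from a match that spans most of the protein, which matters when most hits are marginal.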
It is not new, but it is definitely a less-studied organism (a virus).
I will try the approaches you have suggested, and will see what we can get with them.
Thank You
KK