Question

Annotate coding sequences


6.1 years ago

karthic ▴ 130

Hi Everyone,

I have a set of viral coding sequences. I BLASTed them against the UniProt virus database but could not get any hits. The sequences appear to be highly divergent: when I then BLASTed them against NR, I still got very few hits, and the ones that did hit had very low query coverage. So I am having difficulty working out the significance of these sequences.

Please suggest if there are any other approaches.

Thanks, KK

annotation sequence analysis • 1.4k views



Do you want to find out what the sequences match to, or just predict coding sequences?

You could try Prokka with a different translation table for de novo annotation. It iteratively checks HMMs and various other databases to annotate many different features.
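As a rough sketch of what "a different translation table" could look like in practice, the snippet below only assembles a Prokka command line and prints it; nothing is executed. It assumes the genome or contigs are in a nucleotide FASTA file (Prokka predicts genes itself rather than taking pre-called CDS). The file name, output directory, prefix, and the table number are placeholders, and the helper `build_prokka_command` is made up for illustration.

```python
import shlex


def build_prokka_command(contigs_fasta, outdir, gcode, kingdom="Viruses", prefix="virus"):
    """Return a Prokka argument list (hypothetical helper; it does not run Prokka)."""
    return [
        "prokka",
        "--kingdom", kingdom,   # selects which of Prokka's bundled databases are searched
        "--gcode", str(gcode),  # NCBI translation table number
        "--outdir", outdir,
        "--prefix", prefix,
        contigs_fasta,          # nucleotide FASTA of the genome/contigs (placeholder name)
    ]


# Placeholder values; pick the translation table that fits the virus and its host.
cmd = build_prokka_command("contigs.fasta", "prokka_gcode1", gcode=1)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list passed to `subprocess.run(cmd, check=True)` on a machine where Prokka is installed.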

6.1 years ago by Joe 21k


No. We have already predicted the coding sequences but are unable to get proper annotations; most of them come back as uncharacterized proteins. There is no plan to do wet-lab studies on them. I am wondering whether there are any in silico approaches to assess their significance.

We have also tried Prokka, without much luck.

Regards, KK

6.1 years ago by karthic ▴ 130


Is this a new or obscure organism?

You will likely have the best luck with HMM-based approaches. Perhaps try HHpred or HMMER with low-stringency cutoffs (even as low as 20% identity or so).
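To make the "low stringency" idea a bit more concrete: HMMER reports E-values and bit scores rather than percent identity, so one way to relax the search is to keep hits up to a permissive E-value and inspect the weak ones by hand. The sketch below assumes a table written with something like `hmmscan --tblout hits.tbl -E 10 Pfam-A.hmm proteins.faa` (file names are placeholders) and filters it in Python. The sample rows, the `Hit` type, and the `filter_tblout` helper are invented for illustration, and the cutoff of 1.0 is an arbitrary starting point, not a recommendation.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    target: str
    evalue: float
    score: float
    description: str


def filter_tblout(lines, max_evalue=1.0):
    """Keep rows of an HMMER --tblout table whose full-sequence E-value is <= max_evalue."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the commented header/footer
        # 18 whitespace-separated columns, then a free-text description
        fields = line.split(maxsplit=18)
        if len(fields) < 18:
            continue  # not a data row
        evalue = float(fields[4])  # full-sequence E-value
        if evalue <= max_evalue:
            description = fields[18].strip() if len(fields) > 18 else ""
            hits.append(Hit(fields[2], fields[0], evalue, float(fields[5]), description))
    # Strongest (lowest E-value) hits first
    return sorted(hits, key=lambda h: h.evalue)


# Made-up rows in the tblout column layout, for illustration only.
SAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias  ...
FAKE_domA FAKE0001.1 orf_0001 - 3.2e-02 12.4 0.1 5.1e-02 11.8 0.1 1.2 1 0 0 1 1 1 0 Hypothetical example family A
FAKE_domB FAKE0002.1 orf_0002 - 1.0e-05 25.0 0.0 2.0e-05 24.1 0.0 1.0 1 0 0 1 1 1 1 Hypothetical example family B
FAKE_domC FAKE0003.1 orf_0003 - 2.5e+01 2.1 0.3 3.0e+01 1.9 0.3 1.1 1 0 0 1 1 0 0 Hypothetical example family C
"""

hits = filter_tblout(SAMPLE.splitlines(), max_evalue=1.0)
for hit in hits:
    print(hit.query, hit.target, hit.evalue, hit.description)
```

Hits near the cutoff are weak evidence on their own; they are more convincing when several ORFs from the same genome point to related families, or when HHpred agrees.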

Unfortunately, your results are only ever going to be as good as the databases, and if it's not a well-represented organism, you're going to struggle no matter what.

6.1 years ago by Joe 21k


It is not new, but it is definitely a less-studied organism (a virus).

I will try the approaches you have suggested, and will see what we can get with them.

Thank You

KK

6.1 years ago by karthic ▴ 130