Question

Identify gene from sequence at scale in non-model organisms

11 days ago

matthewmerkin32 • 0

I have some FASTA files containing thousands of sequences (both cDNA and amino acid) for multiple species of non-model lepidopterans.

I now want to find gene names or some other identifier (e.g. a UniProt ID) for these sequences that I can later use to find the most common Gene Ontology (GO) terms.

So far, I have been able to do this at a small scale using tblastn and looking at the names of the hits in model species (Drosophila) where the genes have been identified. However, this method does not scale, even with command-line BLAST, because I have to manually inspect the hits to find those with a usable name rather than "PREDICTED: species uncharacterised mRNA".
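The manual step described above (skipping hits with uninformative titles) could in principle be scripted. Below is a minimal sketch, assuming BLAST was run with tabular output that includes the subject title, for example `-outfmt "6 qseqid sseqid evalue bitscore stitle"`. The function name, the column order, and the list of "uninformative" keywords are illustrative assumptions, not part of the original workflow.

```python
import io
import re

# Titles matching this pattern are treated as carrying no usable gene name.
# The keyword list is an illustrative assumption; extend it as needed.
UNINFORMATIVE = re.compile(
    r"uncharacteri[sz]ed|hypothetical|unnamed|unknown", re.IGNORECASE
)


def best_informative_hits(handle):
    """Return {query_id: (subject_id, bitscore, title)} for the
    highest-bitscore hit per query whose title looks informative.

    Expects tab-separated lines: qseqid, sseqid, evalue, bitscore, stitle.
    """
    best = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        qseqid, sseqid, _evalue, bitscore, stitle = fields[:5]
        if UNINFORMATIVE.search(stitle):
            continue  # e.g. "PREDICTED: species uncharacterised mRNA"
        score = float(bitscore)
        if qseqid not in best or score > best[qseqid][1]:
            best[qseqid] = (sseqid, score, stitle)
    return best


# Tiny made-up example standing in for a real BLAST output file.
example = io.StringIO(
    "q1\tXM_001\t1e-50\t200\tPREDICTED: species uncharacterised mRNA\n"
    "q1\tNM_002\t1e-40\t180\tDrosophila melanogaster UDP-glucose 6-dehydrogenase mRNA\n"
    "q2\tXM_003\t1e-10\t60\tPREDICTED: species uncharacterised mRNA\n"
)
hits = best_informative_hits(example)
for query, (subject, score, title) in hits.items():
    print(query, subject, score, title, sep="\t")
```

Queries with no informative hit (here `q2`) are simply absent from the result, so they can be listed separately and handled another way.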

Does anyone have any suggestions on how to identify my genes? Any help would be very much appreciated.

sequence gene • 399 views

updated 8 days ago by Ram 45k • written 11 days ago by matthewmerkin32 • 0

score 2 · Accepted Answer · 2025-04-13

10 days ago

shelkmike ★ 1.5k

I don't understand why you think that you need some gene names to perform GO annotation. Just upload your proteins to PANNZER. It is an online tool that does GO annotation. Also, it will give your proteins descriptions like "UDP-glucose 6-dehydrogenase".
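Once each protein has GO terms assigned, finding the most common terms is a simple tally. Here is a minimal sketch, assuming the annotations have been exported to a two-column, tab-separated table of protein ID and GO ID. That layout and the function name are assumptions for illustration, not PANNZER's actual output format, so the parsing would need adapting to the real file.

```python
from collections import Counter


def most_common_go_terms(lines, top_n=10):
    """Count how many distinct proteins carry each GO term.

    `lines` is an iterable of tab-separated strings: protein_id, go_id.
    Header lines and malformed lines are skipped.
    """
    pairs = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[1].startswith("GO:"):
            continue  # header or malformed line
        pairs.add((fields[0], fields[1]))  # de-duplicate repeated annotations
    counts = Counter(go_id for _protein, go_id in pairs)
    return counts.most_common(top_n)


# Made-up annotations with placeholder protein IDs and example GO IDs.
annotations = [
    "protein_id\tgo_id",
    "p1\tGO:0003979",
    "p2\tGO:0003979",
    "p2\tGO:0005829",
    "p1\tGO:0003979",  # duplicate row, counted once
]
top_terms = most_common_go_terms(annotations)
for go_id, n_proteins in top_terms:
    print(go_id, n_proteins, sep="\t")
```

Counting distinct proteins per term, rather than raw rows, avoids inflating terms that are reported more than once for the same protein.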

Thanks for the suggestion, I had not heard of PANNZER. However, when I click on the link given in their paper (http://ekhidna2.biocenter.helsinki.fi/sanspanz/), it won't load and the web page times out. Is this just a temporary outage or is there another way to access it?

9 days ago by matthewmerkin32 • 0

I've just tried the link and it works.

9 days ago by shelkmike ★ 1.5k

Yep, it was just down earlier, but I managed to get the local installation to work anyway, so thanks for the great suggestion.

9 days ago by matthewmerkin32 • 0