What percent identity is recommended for detecting protein homology within a species?
2.4 years ago
O.rka ▴ 740

I have species X with established gene models. I have a de novo assembly of a different strain of species X where I performed gene calls. I want to figure out which genes are unique in the strain compared to the reference assembly.

I planned on using MMseqs2 to cluster both gene sets, but I need to pick a good percent identity. What percent identity cutoff should I use?

alignment genomics proteomics • 778 views

If it is a strain of the same species, shouldn't the cutoff be set as high as possible to start with?

Are you hoping that genes unique to each strain will fail to cluster with anything (so that you can avoid pairwise comparisons)?


Yes, I would want a unique gene to not cluster. Do you think 95% is too low?
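
For illustration, here is a minimal sketch of how "unique genes do not cluster" could be checked once clustering is done. It assumes the clusters come from something like `mmseqs easy-cluster` run with `--min-seq-id 0.95` on the combined protein sets, and that the result is the usual two-column TSV (representative, then member, tab-separated). The function name `strain_only_clusters` and the gene IDs are hypothetical, and 0.95 is just the value under discussion, not a recommendation.

```python
from collections import defaultdict


def strain_only_clusters(tsv_lines, strain_ids):
    """Return strain genes whose cluster contains no reference gene.

    tsv_lines  : iterable of "representative<TAB>member" lines
                 (assumed layout of the cluster TSV)
    strain_ids : set of gene IDs that belong to the new strain (hypothetical)
    """
    clusters = defaultdict(set)
    for line in tsv_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rep, member = line.split("\t")
        clusters[rep].add(rep)      # the representative is part of its own cluster
        clusters[rep].add(member)

    unique = []
    for members in clusters.values():
        # Keep a cluster only if every member comes from the strain.
        if members <= strain_ids:
            unique.extend(members)
    return sorted(unique)


# Toy input with made-up IDs: strain_g1 clusters with a reference gene,
# strain_g2 clusters only with itself.
example = [
    "ref_g1\tref_g1",
    "ref_g1\tstrain_g1",
    "strain_g2\tstrain_g2",
    "ref_g3\tref_g3",
]
print(strain_only_clusters(example, {"strain_g1", "strain_g2"}))
```

Rerunning the clustering at a few identity cutoffs and comparing the resulting lists would show how sensitive the "unique" set is to the choice of threshold.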
