Extracting Protein Sequences
13.2 years ago
Hari ▴ 280

Hi, I am trying to extract all the protein sequences of RhoGEF in humans. I need to avoid redundancy while still retrieving every sequence. How do I achieve this? I have already tried UniProt, but how can I check that all the sequences are complete, valid, and non-redundant? Is there a way to do this automatically?

Thank you very much

protein sequence extraction • 2.3k views
13.2 years ago
Chris ★ 1.6k

As for 'non-redundant', this depends on the level of redundancy you are willing to accept. One measure is sequence identity, i.e. discard each sequence that already has a homologue (as defined by an identity threshold) in the set. CD-HIT is one of the most popular tools for this.
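
To illustrate the idea of an identity threshold, here is a minimal Python sketch of a greedy filter: sequences are visited from longest to shortest, and one is kept only if no already-kept sequence reaches the threshold. This is not CD-HIT's actual algorithm. The function names (`parse_fasta`, `approx_identity`, `greedy_filter`) are hypothetical, and `difflib.SequenceMatcher` is used as a rough stand-in for a real alignment, so treat the numbers it produces as approximate. It is also far too slow for large sets.

```python
from difflib import SequenceMatcher


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def approx_identity(a, b):
    """Matched residues divided by the length of the shorter sequence.

    Assumption: difflib's matching blocks approximate an alignment;
    a proper aligner would give a more meaningful identity value.
    """
    if not a or not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_filter(records, threshold=0.9):
    """Keep a sequence only if no already-kept sequence reaches the threshold."""
    kept = []
    # Longest first, so the longest member of each group becomes its representative.
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        if all(approx_identity(seq, rep) < threshold for _, rep in kept):
            kept.append((header, seq))
    return kept


# Made-up toy sequences: A and B are identical, C is unrelated.
example = """>A
MKTAYIAKQRQISFVKSHFSRQ
>B
MKTAYIAKQRQISFVKSHFSRQ
>C
MSDNGELEDKPPAPPVRMSSTI
"""
representatives = greedy_filter(parse_fasta(example))
print([header for header, _ in representatives])
```

With CD-HIT itself, the identity threshold is set with the `-c` option (for example `-c 0.9` for 90% identity), and the choice of threshold decides how aggressively near-identical entries are collapsed.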


Thanks for your response. How do I know whether I have all the RhoGEF sequences? By redundancy I mean that I want all the human sequences, without homologs or repeats, but at the same time without excluding any other sequence.


Hi, CD-HIT is really useful. I download all the available protein sequences for RhoGEF and cluster them into groups, and the dissimilar or irrelevant proteins are filtered out by this process. Is this the correct approach?
