Is there a way to search all human or mouse protein sequences for a short (~5 amino acids) degenerate amino acid sequence?
You can do this easily by installing FAST, the Fast Analysis of Sequences Toolbox (publication, GitHub).
You can install it by using this command (only use sudo if you need to):
(sudo) perl -MCPAN -e 'install FAST'
You will also need the fasta files containing human and mouse proteins. You can get these from NCBI.
Let's say that the sequences you are looking for are ASNNF, ASNLF, or ASNKF. To search for these you can use fasgrep, which is a utility of FAST:
fasgrep -is "ASN[NLK]F" <fasta-file-proteins>
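This will return the sequences that contain ASNNF, ASNLF, or ASNKF. The bracketed class [NLK] is a regular expression that matches any one of N, L, or K at the fourth position.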
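If you would rather not install anything, roughly the same search can be sketched in plain Python with the standard `re` module. This is only a minimal illustration, not how fasgrep works internally: the function names `read_fasta` and `find_motif` are made up for this example, and it assumes an ordinary FASTA file with `>` header lines.

```python
import re


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_motif(handle, pattern):
    """Yield (header, hits) for every record whose sequence matches pattern.

    hits is a list of (1-based start position, matched text) tuples.
    Matching is case-insensitive, and overlapping matches are not reported.
    """
    motif = re.compile(pattern, re.IGNORECASE)
    for header, seq in read_fasta(handle):
        hits = [(m.start() + 1, m.group()) for m in motif.finditer(seq)]
        if hits:
            yield header, hits
```

To use it, open your protein FASTA file and loop over the results, for example `for header, hits in find_motif(open("proteins.fasta"), "ASN[NLK]F"): print(header, hits)`, where `proteins.fasta` is a placeholder for whatever file you downloaded. Unlike fasgrep, this prints the match positions rather than the full FASTA records.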
Great, thanks! Also found a website that can do it: http://www.genome.jp/tools/motif/MOTIF2.html
What is the degenerate amino acid sequence?