Hello,
I am trying to find enzymes that contain two stretches of amino acids, SLTK and TGH. The first is highly conserved and is characteristic of a family of serine proteases. I have identified several ways of going about this, but I can't figure out how to actually carry any of them out.
Option 1: BLAST the two amino acid sequences together, but somehow designate them as non-contiguous.
Option 2: BLAST the first sequence (SLTK), then run a second BLAST on those results to find proteins containing TGH.
Option 3: BLAST serine proteases to find the ones containing TGH.
I feel like this should be a fairly easy thing to do. Maybe it is; I don't have much experience with bioinformatics.
Any help is appreciated.
Best, Michael
Find where? Do you have an assembled transcriptome, or predicted genes from a draft genome? Or do you want to find which proteins in a large database (like UniProt) contain the two motifs?
Isn't this a motif-finding problem? Why do you need to use BLAST for this? Or am I missing some information?
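To make the motif-finding suggestion concrete, here is a minimal sketch that assumes you already have the candidate proteins as a FASTA file. It keeps only the sequences containing both SLTK and TGH, in any order and at any distance from each other, and reports where each motif starts. The function names and the toy records are made up for illustration; they are not from any particular library or database.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def motif_positions(seq: str, motif: str) -> List[int]:
    """Return 1-based start positions of every match, overlapping ones included."""
    # The lookahead matches at each start position without consuming the motif.
    return [m.start() + 1 for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def proteins_with_all_motifs(
    lines: Iterable[str], motifs: Tuple[str, ...] = ("SLTK", "TGH")
) -> Dict[str, Dict[str, List[int]]]:
    """Map header -> {motif: positions} for sequences that contain every motif."""
    hits = {}
    for header, seq in read_fasta(lines):
        found = {m: motif_positions(seq, m) for m in motifs}
        if all(found.values()):  # an empty position list means the motif is absent
            hits[header] = found
    return hits


# Made-up toy records, for illustration only (not real proteins).
example = """>toy_protein_1
MKASLTKAAGGTGHLL
>toy_protein_2
MKASLTKAAGGAAALL
>toy_protein_3
MTGHAAAA
""".splitlines()

for name, found in proteins_with_all_motifs(example).items():
    print(name, found)
# prints: toy_protein_1 {'SLTK': [4], 'TGH': [12]}
```

For a real file you could pass an open file handle instead of the toy list, e.g. `with open("proteins.fasta") as fh: hits = proteins_with_all_motifs(fh)`, where `proteins.fasta` is a placeholder name. A three-residue motif like TGH will match many unrelated proteins by chance, so restricting the input to serine proteases first (your Option 3) is probably sensible.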