Question

Q: Extracting dibasic PC Cleavage sites

9.9 years ago

ddofer ▴ 30

I want to get a dataset of all known, validated cleavage sites in prepro-hormone/protein precursors (e.g. insulin, neuropeptide precursors) that are cleaved by proprotein convertases (PCs, e.g. furin).

I'm looking for only annotated or verified sites, so I can't just extract a window from any sequence containing a [KR][KR].
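
For reference, the naive scan I want to avoid looks roughly like the sketch below. The function name `dibasic_candidates` and the `flank` parameter are only illustrative. It reports every [KR][KR] pair, including overlapping ones, so it finds candidates rather than verified sites.

```python
import re

# A lookahead is used so overlapping pairs (e.g. "KRR" -> KR and RR) are all found.
DIBASIC = re.compile(r"(?=([KR][KR]))")

def dibasic_candidates(seq, flank=4):
    """Return (1-based position, motif, surrounding window) for every [KR][KR] pair."""
    hits = []
    for match in DIBASIC.finditer(seq):
        i = match.start()  # 0-based index of the first basic residue
        window = seq[max(0, i - flank): i + 2 + flank]
        hits.append((i + 1, match.group(1), window))
    return hits
```

Every hit here is unverified, which is exactly why annotation-based filtering is needed.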

UniProt has a keyword for "dibasic cleavage", but I don't know whether it is applied to pro-hormones that have slightly different (non-canonical) cleavage patterns, which are what I'm interested in.

I thought of using the search criterion for "polypeptides" and looking in the sequence annotations for "gaps", but that approach is problematic: some sequences have the degraded cleavage site in between the polypeptide and a peptide or chain, but not always. I don't mind filtering them out in advance, but I don't know what to filter for.
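
To make the "gaps" idea concrete, here is a minimal sketch. It assumes the annotated features are already available as 1-based, inclusive (start, end) pairs; the function name `annotation_gaps` is hypothetical. It returns every stretch of sequence not covered by any feature and flags whether the stretch consists only of K/R.

```python
def annotation_gaps(features, seq):
    """Find residues between annotated features that no feature covers.

    features: iterable of (start, end), 1-based inclusive coordinates.
    Returns a list of (gap_start, gap_end, residues, all_basic).
    """
    ordered = sorted(features)
    if not ordered:
        return []
    gaps = []
    reach = ordered[0][1]  # furthest residue covered so far
    for start, end in ordered[1:]:
        if start > reach + 1:
            residues = seq[reach:start - 1]  # residues reach+1 .. start-1
            all_basic = all(res in "KR" for res in residues)
            gaps.append((reach + 1, start - 1, residues, all_basic))
        reach = max(reach, end)
    return gaps
```

A feature spanning the whole precursor (such as a "chain" over the full proprotein) hides all the gaps, so this only works if such features are left out. That is part of why the approach is unreliable.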

So, how can I get a good, large dataset of dibasic cleavage locations on prepropolypeptides?

(I am aware of cutDB and MEROPS, but I've never worked with them before, and don't know how to download and extract cleavage sites. The datasets of ProP and NeuroPred are out of date or very small and buggy).

Tips on how to easily get the cleavage sites (and their locations) would also be great: what's the easiest format to use when downloading from UniProt, and how do I parse it for the cleavage location on the sequence?

Thanks!

sequence Uniprot PTM Cleavage furin • 2.0k views

Ram · Answer 1 · 2015-01-09

What do you mean by "validated"? If you mean experimentally proven propeptide cleavage sites, you could try this query: http://www.uniprot.org/uniprot/?query=annotation%3A%28type%3Apropep+evidence%3Aexperimental%29&sort=score

or, with the additional keyword "hormone":

http://www.uniprot.org/uniprot/?query=annotation%3A%28type%3Apropep+evidence%3Aexperimental%29+AND+keyword:KW-0372&sort=score

And this query returns hormones with an experimentally proven signal sequence:

http://www.uniprot.org/uniprot/?query=annotation%3A%28type%3Asignal+evidence%3Aexperimental%29+keyword%3AHormone&sort=score

The definition of the UniProt keyword "Cleavage on pair of basic residues" is

Protein which is posttranslationally modified by the cleavage on at least one pair of basic residues, in order to release one or more mature active peptides (such as hormones).

(http://www.uniprot.org/keywords/KW-0165)

and this query returns all UniProtKB entries that were annotated with this keyword:

http://www.uniprot.org/uniprot/?query=keyword:KW-0165
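
If you would rather script the download than click through the website, a minimal sketch for building the request URL follows. The base URL `https://rest.uniprot.org/uniprotkb/search` and the `format` and `size` parameters are assumptions about the current UniProt REST service, not something taken from the links above, so check them against the UniProt help pages before relying on them. The helper name `uniprot_query_url` is made up for this example.

```python
from urllib.parse import urlencode

# Assumed endpoint of the current UniProt REST service; verify before use.
BASE_URL = "https://rest.uniprot.org/uniprotkb/search"

def uniprot_query_url(query, fmt="gff", size=500):
    """Build a search URL; urlencode takes care of escaping ':' and spaces."""
    params = {"query": query, "format": fmt, "size": size}
    return BASE_URL + "?" + urlencode(params)

url = uniprot_query_url("keyword:KW-0165")
```

The resulting string can be passed to `urllib.request.urlopen` or any HTTP client to fetch the entries.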

You may want to explore the GFF format for your downloads: it lists each annotated feature together with its start and end position on the sequence.
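
As a rough sketch of how such a download could be parsed, the function below collects the processing-related features per accession. It assumes the usual tab-separated GFF layout (accession in column 1, feature type in column 3, start and end in columns 4 and 5) and assumes the feature type labels are spelled "Signal peptide", "Propeptide", "Peptide" and "Chain". The `TOY1` rows are hypothetical, not a real entry.

```python
from collections import defaultdict

# Feature types assumed to describe proteolytic processing.
PROCESSING_FEATURES = {"Signal peptide", "Propeptide", "Peptide", "Chain"}

def read_processing_features(gff_text):
    """Return {accession: sorted [(start, end, feature_type), ...]}."""
    features = defaultdict(list)
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        cols = line.split("\t")
        if len(cols) < 5 or cols[2] not in PROCESSING_FEATURES:
            continue
        try:
            start, end = int(cols[3]), int(cols[4])
        except ValueError:
            continue  # skip features with uncertain positions such as '?'
        features[cols[0]].append((start, end, cols[2]))
    return {acc: sorted(rows) for acc, rows in features.items()}

# Hypothetical example rows, for illustration only.
SAMPLE = "\n".join([
    "##gff-version 3",
    "\t".join(["TOY1", "UniProtKB", "Signal peptide", "1", "20", ".", ".", "."]),
    "\t".join(["TOY1", "UniProtKB", "Propeptide", "43", "60", ".", ".", "."]),
    "\t".join(["TOY1", "UniProtKB", "Peptide", "21", "40", ".", ".", "."]),
    "\t".join(["TOY1", "UniProtKB", "Disulfide bond", "25", "30", ".", ".", "."]),
    "\t".join(["TOY1", "UniProtKB", "Peptide", "?", "70", ".", ".", "."]),
])

parsed = read_processing_features(SAMPLE)
```

In this toy entry the peptide ends at residue 40 and the propeptide starts at 43, so residues 41 and 42 would be the candidate cleavage site to check against the sequence.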