3.7 years ago
suresh.wrc
I am new to bioinformatics. I have positive and negative protein sequences for the acetylation PTM (examples below). I want to train a classifier, say an SVM. What is the next step? How can I convert these sequences into features the classifier can use? Any information or links would help.
>P31327_55|1|testing
KAQTAHIVLEDGTKMKGYSFGHPSSVA
>P31327_57|1|testing
QTAHIVLEDGTKMKGYSFGHPSSVAGE
>P31327_119|1|testing
APDTTALDELGLSKYLESNGIKVSGLL
>P31327_157|1|testing
LATKSLGQWLQEEKVPAIYGVDTRMLT
You could start by extracting features such as pseudo amino acid composition (PseAAC):
https://en.wikipedia.org/wiki/Pseudo_amino_acid_composition
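To make that concrete, here is a rough Python sketch of a simplified PseAAC-style encoding, not a reference implementation. It makes several assumptions that are not in the question: it uses a single physicochemical property (the Kyte-Doolittle hydropathy scale) where the original formulation averages several properties, the `lam` and `w` defaults are arbitrary starting points, and the header is guessed to follow an `id|label|split` layout. The names `parse_fasta` and `pseaac` are made up for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. Assumption: one property stands in for
# the several properties that the full PseAAC formulation combines.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (scale[a] - mean) / std for a in AMINO_ACIDS}


H = _standardize(KD)


def pseaac(seq, lam=3, w=0.05):
    """Return 20 composition terms plus `lam` sequence-order terms.

    `lam` (number of correlation tiers) and `w` (their weight) are
    hypothetical defaults that should be tuned for the data at hand.
    """
    residues = [a for a in seq.upper() if a in H]  # skip non-standard residues
    n = len(residues)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    # Plain amino acid composition: fraction of each residue type.
    freqs = [residues.count(a) / n for a in AMINO_ACIDS]
    # Tier-k correlation: mean squared property difference k positions apart.
    thetas = [
        sum((H[residues[i]] - H[residues[i + k]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


example = """>P31327_55|1|testing
KAQTAHIVLEDGTKMKGYSFGHPSSVA
>P31327_57|1|testing
QTAHIVLEDGTKMKGYSFGHPSSVAGE
"""

for header, seq in parse_fasta(example):
    # Assumed header layout: site id, class label, data split.
    site_id, label, split = header.split("|")
    features = pseaac(seq)
    print(site_id, label, split, len(features))
```

Each sequence becomes a fixed-length numeric vector (20 + `lam` values), which is the form an SVM implementation such as scikit-learn's `SVC` expects as input. For short fixed-width windows like these 27-residue peptides, position-aware encodings may also be worth comparing, since composition alone discards where each residue sits relative to the central lysine.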