Hello out there.
I was wondering if there is a simple way using R to calculate the coverage of a protein when you have a list of peptides from it and its initial sequence.
For example, let's say that we have this protein sequence taken from UniProt:
MAFSAEDVLKEYDRRRRMEALLLSLYYPNDRKLLDYKEWSPPRVQVECPKAPVEWNNPPS
EKGLIVGHFSGIKYKGEKAQASEVDVNKMCCWVSKFKDAMRRYQGIQTCKIPGKVLSDLD
AKIKAYNLTVEGVEGFVRYSRVTKQHVAAFLKELRHSKQYENVNLIHYILTDKRVDIQHL
EKDLVKDFKALVESAHRMRQGHMINVKYILYQLLKKHGHGPDGPDILTVKTGSKGVLYDD
SFRKIYTDLGWKFTPL
and we have a list of some of its peptides, which may or may not overlap one another:
pepts = c("DRRRRMEALLLSLY", "YPNDRKLL",
          "DYKEWSPPRVQVECPKAPVEWNNPPSEKGLIVGHFSGIKYKGEKAQA",
          "SEVDVNK", "MCCWVSKFKDAMRRYQGIQ", "TCKIPGK",
          "VLSDLDAKIKAYNLTVEGVEGFVRYSRVTK",
          "DRRRRMEALLLSLYYPNDRKLL", "SEVDVNKMCCWVSKFK")
Can we somehow calculate the coverage?
Thank you.
While this is not an R solution, have you thought of doing multiple-sequence alignment?
I tried Clustal Omega, but I don't know how to get its results into R, and it doesn't seem to return a percentage of coverage.
This is not my field of work, but I found two possible solutions by searching Google. I have not tested them myself, so try them and see if either fits your case.
For MS data: the isobar R package does the work; check the PDF.
I also found a tool called Protein Coverage Summarizer, but it's not an R package.
Thank you, but neither of them seems to help me.
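For what it's worth, the calculation asked for above can be sketched in base R without any alignment tool. This is only a minimal sketch under some assumptions: `protein_coverage` is a hypothetical helper name (not from any package), peptides are matched as exact substrings of the protein (no mismatches or modifications), and any whitespace or line breaks in the pasted sequences are stripped first. Overlapping peptides are counted once, because each residue is only flagged as covered or not covered.

```r
# Protein sequence from the question, with the line breaks removed.
protein <- paste0(
  "MAFSAEDVLKEYDRRRRMEALLLSLYYPNDRKLLDYKEWSPPRVQVECPKAPVEWNNPPS",
  "EKGLIVGHFSGIKYKGEKAQASEVDVNKMCCWVSKFKDAMRRYQGIQTCKIPGKVLSDLD",
  "AKIKAYNLTVEGVEGFVRYSRVTKQHVAAFLKELRHSKQYENVNLIHYILTDKRVDIQHL",
  "EKDLVKDFKALVESAHRMRQGHMINVKYILYQLLKKHGHGPDGPDILTVKTGSKGVLYDD",
  "SFRKIYTDLGWKFTPL"
)

pepts <- c("DRRRRMEALLLSLY", "YPNDRKLL",
           "DYKEWSPPRVQVECPKAPVEWNNPPSEKGLIVGHFSGIKYKGEKAQA",
           "SEVDVNK", "MCCWVSKFKDAMRRYQGIQ", "TCKIPGK",
           "VLSDLDAKIKAYNLTVEGVEGFVRYSRVTK",
           "DRRRRMEALLLSLYYPNDRKLL", "SEVDVNKMCCWVSKFK")

# Hypothetical helper: percentage of residues covered by at least one peptide.
# Assumption: a peptide counts only if it matches the protein exactly.
protein_coverage <- function(protein, peptides) {
  # Remove whitespace/newlines left over from copy-pasting.
  protein  <- gsub("\\s+", "", protein)
  peptides <- gsub("\\s+", "", peptides)

  n <- nchar(protein)
  covered <- logical(n)  # one flag per residue, all FALSE initially

  for (pep in peptides) {
    k <- nchar(pep)
    if (k == 0 || k > n) next
    # Compare the peptide against every window of length k in the protein,
    # so repeated and self-overlapping occurrences are all found.
    first  <- seq_len(n - k + 1)
    starts <- first[substring(protein, first, first + k - 1) == pep]
    for (s in starts) covered[s:(s + k - 1)] <- TRUE
  }

  100 * sum(covered) / n
}

coverage <- protein_coverage(protein, pepts)
print(coverage)  # 51.5625 (residues 13 to 144 of 256 are covered)
```

A peptide that does not occur in the sequence simply contributes nothing here. If you need to tolerate mismatches, an alignment-based approach would be required instead of this exact matching.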