How to find the overlapping peptide from a list.
5.1 years ago

I have the following sequence:

IRCIGVSNRDFVEGMSGGTWVDVVLEHGGCVTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGKGSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHSGMI

I used three tools/methods from the Immune Epitope Database (IEDB) to predict antigenic peptides from this sequence. Each method generated its own list of peptides, so I have three lists for the given sequence. The goal is to find the peptide that is common to, or overlaps in, all three lists.

In the linked table, I found that the peptide DSRCPTQ is common to (overlaps in) all three lists, but I did this manually.

Is it possible to find such a peptide with a UNIX command in the Ubuntu terminal? If so, could you please suggest how?

Predicted Peptides

5.1 years ago
Mensur Dlakic ★ 28k

You may get some useful ideas from this post. For it to work, you would need to concatenate all peptides from the same prediction method into one string, with space characters separating them:

pept_combo1 GVSNRDFVEGMSGGTWVDVVLEHGGCVTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGKGSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHS
pept_combo2 MAQDKPTV MASDSRCPTQGEAYLDKQSDT KSIQPENLEYR
pept_combo3 WVDVVLEHGGCVTVM KPTVDIELVTTT VRSYCYEAS DSRCPTQ TQYVCKRTLVDR KGSLVTCAKFACSK RIMLSVHGSQ
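If the peptides from each method are kept as separate lists, the space-separated strings above can be built rather than typed by hand. This is only a minimal sketch: the `predictions` dictionary and the `combine` helper are hypothetical names used for illustration, and only two of the three methods are shown.

```python
# Hypothetical container: one list of predicted peptides per IEDB method.
predictions = {
    "pept_combo2": ["MAQDKPTV", "MASDSRCPTQGEAYLDKQSDT", "KSIQPENLEYR"],
    "pept_combo3": ["WVDVVLEHGGCVTVM", "KPTVDIELVTTT", "VRSYCYEAS", "DSRCPTQ",
                    "TQYVCKRTLVDR", "KGSLVTCAKFACSK", "RIMLSVHGSQ"],
}


def combine(peptides):
    """Join peptides with single spaces, skipping empty entries.

    The space marks each peptide boundary, so residues from neighbouring
    peptides are never read as one contiguous stretch.
    """
    return " ".join(p.strip() for p in peptides if p.strip())


combined = {name: combine(peps) for name, peps in predictions.items()}
for name, text in combined.items():
    print(name, text)
```

The values of `combined` are the strings to hand to a longest-common-substring search.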

To use this with the accepted Python script from the post I referenced above, skip the pept_comboX labels and paste only the sequences.

def long_substr(data):
    # Return the longest substring of data[0] that occurs in every string in data.
    substr = ''
    if len(data) > 1 and len(data[0]) > 0:
        for i in range(len(data[0])):
            for j in range(len(data[0])-i+1):
                # Only test candidates longer than the best match found so far.
                if j > len(substr) and is_substr(data[0][i:i+j], data):
                    substr = data[0][i:i+j]
    return substr

def is_substr(find, data):
    # An empty candidate or an empty list can never count as a match.
    if len(data) < 1 or len(find) < 1:
        return False
    for i in range(len(data)):
        if find not in data[i]:
            return False
    return True

# Python 3: print is a function.
print(long_substr(['GVSNRDFVEGMSGGTWVDVVLEHGGCVTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGKGSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHS',
                   'MAQDKPTV MASDSRCPTQGEAYLDKQSDT KSIQPENLEYR',
                   'WVDVVLEHGGCVTVM KPTVDIELVTTT VRSYCYEAS DSRCPTQ TQYVCKRTLVDR KGSLVTCAKFACSK RIMLSVHGSQ']))

This script will print out DSRCPTQ.
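The longest-common-substring script reports only the single longest match. If every region shared by all three lists is of interest, one possible alternative (a different technique from the script above, not part of the referenced post) is to map each peptide back onto the parent sequence and keep the positions covered by every method. A minimal sketch, assuming each peptide is an exact substring of the parent sequence; `covered_positions`, `shared_regions` and `min_len` are illustrative names:

```python
def covered_positions(parent, peptides):
    """Return the set of indices in parent covered by at least one peptide."""
    covered = set()
    for pep in peptides:
        start = parent.find(pep)
        while start != -1:  # record every occurrence, not only the first
            covered.update(range(start, start + len(pep)))
            start = parent.find(pep, start + 1)
    return covered


def shared_regions(parent, peptide_lists, min_len=1):
    """Return stretches of parent covered by every list, in sequence order."""
    common = set(range(len(parent)))
    for peptides in peptide_lists:
        common &= covered_positions(parent, peptides)

    regions, run_start = [], None
    # Iterate one step past the end so a run touching the last residue is closed.
    for i in range(len(parent) + 1):
        if i in common:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_len:
                regions.append(parent[run_start:i])
            run_start = None
    return regions


parent = "IRCIGVSNRDFVEGMSGGTWVDVVLEHGGCVTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGKGSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHSGMI"
method1 = ["GVSNRDFVEGMSGGTWVDVVLEHGGCVTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGKGSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHS"]
method2 = ["MAQDKPTV", "MASDSRCPTQGEAYLDKQSDT", "KSIQPENLEYR"]
method3 = ["WVDVVLEHGGCVTVM", "KPTVDIELVTTT", "VRSYCYEAS", "DSRCPTQ",
           "TQYVCKRTLVDR", "KGSLVTCAKFACSK", "RIMLSVHGSQ"]

# min_len filters out very short overlaps; 4 is an arbitrary choice here.
print(shared_regions(parent, [method1, method2, method3], min_len=4))
```

On the data from this thread with `min_len=4`, this reports KPTV in addition to DSRCPTQ. To run it from the Ubuntu terminal, save it to a file (for example a hypothetical `shared_regions.py`) and call `python3 shared_regions.py`.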


It is working nicely. Thank you very much.
