Question

BioGRID REST - How to retrieve all the protein interactions?

8 months ago

dbykov • 0

Hello! I am using the BioGRID REST service to fetch the PPIs for Homo sapiens, following the instructions in the BioGRID repository and using this script as a template: https://github.com/BioGRID/BIOGRID-REST-EXAMPLES/blob/master/get_interactions_for_pandas.py. However, when I check the shape of the DataFrame, it always shows a maximum of 10K interactions. I understand that this is due to a restriction of the REST service, which returns at most 10K interactions per request. I would like to know if someone has faced the same problem before, and how to solve it.

Here is my script:

```python
import pandas as pd
import requests

# Define the URL for fetching protein interactions from BioGRID
BIOGRID_URL = "https://webservice.thebiogrid.org/interactions/?"
# BioGRID API key
API_KEY = "myKEY"
# Species ID
SPECIES_ID = "9606"  # Homo sapiens

def fetch_protein_interactions():
    params = {
        "taxId": SPECIES_ID,
        "accesskey": API_KEY,
        "format": "json",
        # "interSpeciesExcluded":"true",
        # "selfInteractionsExcluded":"true",
        # "includeEvidence":"true",
        # "throughputTag":"true"
    }

    try:
        response = requests.get(BIOGRID_URL, params=params)
        response.raise_for_status()
        interactions_data = response.json()
        return interactions_data
    except requests.exceptions.RequestException as e:
        print("Error fetching data:", e)
        return None

def transform_to_dataframe(interactions_data):
    # Extract relevant data fields
    interactions = []
    for interaction_id, interaction_info in interactions_data.items():
        interaction_entry = {
            "BioGrid ID_A": interaction_info["BIOGRID_ID_A"],
            "BioGrid ID_B": interaction_info["BIOGRID_ID_B"],
            "Organism A": interaction_info["ORGANISM_A"],
            "Organism B": interaction_info["ORGANISM_B"],
            "SymbInter_A": interaction_info["OFFICIAL_SYMBOL_A"],
            "SymbInter_B": interaction_info["OFFICIAL_SYMBOL_B"],
            "Gen A": interaction_info["ENTREZ_GENE_A"],
            "Gen B": interaction_info["ENTREZ_GENE_B"],
            "Experimental System": interaction_info["EXPERIMENTAL_SYSTEM"],
            "Experimental System Type": interaction_info["EXPERIMENTAL_SYSTEM_TYPE"],
            "Throughput": interaction_info["THROUGHPUT"],
            "Quantitation": interaction_info["QUANTITATION"],
            "Qualification": interaction_info["QUALIFICATIONS"],
            "Pubmed Author": interaction_info["PUBMED_AUTHOR"]
        }
        interactions.append(interaction_entry)

    # Convert to Pandas DataFrame
    interactions_df = pd.DataFrame(interactions)
    return interactions_df

if __name__ == "__main__":
    interactions_data = fetch_protein_interactions()
    # print(interactions_data)  # Debug: print interactions_data
    if interactions_data:
        interactions_df = transform_to_dataframe(interactions_data)
        print("DataFrame Size:")
        print(interactions_df.shape)  # shape of df
        print("Number of Interactions:")
        print(len(interactions_df))  # number of interactions
        # print(interactions_df.head(10))  # dataframe head
```

BioGRID Python REST • 544 views

ADD COMMENT • link updated 8 months ago by Pierre Lindenbaum 164k • written 8 months ago by dbykov • 0

How about 'just' using the XML dump? https://wiki.thebiogrid.org/doku.php/psi-mi_xml_version_2.5

ADD REPLY • link 8 months ago by Pierre Lindenbaum 164k

Thank you for your answer. It is a good option; however, working with such files for further filtering is a more complicated task in my opinion, or at least I have not been able to do it. That's why I was looking for the REST service option.

ADD REPLY • link 8 months ago by dbykov • 0
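
For anyone who wants to stay with the REST service: the usual way around a per-request cap is to page through the results. The sketch below assumes the service accepts `start` and `max` query parameters, as I understand the BioGRID REST documentation to describe, and that the JSON response is a dict keyed by interaction ID (as in the script above). The names `fetch_page` and `fetch_all` are made up for illustration, and it uses only the standard library so it is self-contained; `requests.get(url, params=...)` would do the same job.

```python
import json
import urllib.parse
import urllib.request

BIOGRID_URL = "https://webservice.thebiogrid.org/interactions/"
PAGE_SIZE = 10000  # assumed per-request ceiling for the `max` parameter


def fetch_page(start, access_key, tax_id="9606", page_size=PAGE_SIZE):
    """Fetch one page of interactions, starting at row offset `start`."""
    query = urllib.parse.urlencode({
        "accesskey": access_key,
        "taxId": tax_id,
        "format": "json",
        "start": start,    # assumed offset parameter
        "max": page_size,  # assumed page-size parameter
    })
    with urllib.request.urlopen(f"{BIOGRID_URL}?{query}", timeout=120) as resp:
        return json.load(resp)


def fetch_all(get_page, page_size=PAGE_SIZE):
    """Call get_page(start) until an empty or short page signals the end."""
    merged = {}
    start = 0
    while True:
        page = get_page(start)
        if not page:
            break  # nothing left to fetch
        merged.update(page)
        if len(page) < page_size:
            break  # a short page means this was the last one
        start += page_size
    return merged
```

Something like `fetch_all(lambda start: fetch_page(start, access_key="myKEY"))` should then return one merged dict that can be passed to `transform_to_dataframe` unchanged. For a whole organism this can mean a large number of requests, so the bulk download may still be the more practical route.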

score 1 · Answer 1 · 2024-03-29

Years ago I wrote an XSLT stylesheet to convert a PSI-MI XML file into a SQLite3 database. I'm not sure it still works, but it should give you an idea of how to convert such an XML file into another format.

xsltproc -o tmp.sql psi2sql.xslt BIOGRID-ALL-3.4.129.psi.xml
sqlite3 db.sqlite3 < tmp.sql

https://github.com/lindenb/xslt-sandbox/blob/master/stylesheets/bio/psi/psi2sql.xslt
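
If XSLT is not your thing, the same kind of conversion can be roughed out in Python by streaming the file with `xml.etree.ElementTree.iterparse`. This is only a sketch under assumptions: it assumes the PSI-MI 2.5 namespace is `net:sf:psidev:mi` and that the file is in compact form (interactors listed once in an `interactorList`, interactions pointing at them through `interactorRef`). The function name `iter_interactions` and the toy document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

NS = "{net:sf:psidev:mi}"  # assumed PSI-MI 2.5 namespace


def iter_interactions(source):
    """Yield (interaction id, [interactor short labels]) from a compact-form file."""
    labels = {}  # interactor id -> shortLabel
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == NS + "interactor":
            labels[elem.get("id")] = elem.findtext(f"{NS}names/{NS}shortLabel")
            elem.clear()  # free memory once the interactor has been recorded
        elif elem.tag == NS + "interaction":
            refs = [ref.text for ref in elem.iter(NS + "interactorRef")]
            yield elem.get("id"), [labels.get(ref, ref) for ref in refs]
            elem.clear()


# Made-up toy document, only to show the expected shape of the input.
SAMPLE = b"""<entrySet xmlns="net:sf:psidev:mi">
  <entry>
    <interactorList>
      <interactor id="1"><names><shortLabel>GENE_A</shortLabel></names></interactor>
      <interactor id="2"><names><shortLabel>GENE_B</shortLabel></names></interactor>
    </interactorList>
    <interactionList>
      <interaction id="10">
        <participantList>
          <participant id="100"><interactorRef>1</interactorRef></participant>
          <participant id="101"><interactorRef>2</interactorRef></participant>
        </participantList>
      </interaction>
    </interactionList>
  </entry>
</entrySet>"""

pairs = list(iter_interactions(io.BytesIO(SAMPLE)))
print(pairs)
```

`iter_interactions` also accepts a file path, so the tuples it yields could be collected into a list and handed to `pd.DataFrame` for filtering.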