Parsing a KEGG page
7.4 years ago
Quak ▴ 520

I was wondering how I can parse a KEGG page like this one, given the identifier "fpr:FP2_11210":

http://www.kegg.jp/dbget-bin/www_bget?fpr:FP2_11210

I already looked into two Bioconductor packages, "pathway" and "Kegg Graph", but didn't find anything relevant. I was hoping somebody might already have a script in their toolbox. Thank you.

kegg • 2.2k views

Do you know about the KEGG REST API?

What exactly do you want to do? The page you linked doesn't seem intended to be parsed; it is meant for human consumption.
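For what it's worth, here is a minimal sketch of going through the KEGG REST API instead of scraping the HTML page. It assumes the `get` operation at `https://rest.kegg.jp/get/<entry>` returns the usual KEGG flat-file text, with the field label in the first 12 columns and continuation lines indented. The function names `fetch_kegg_entry` and `parse_kegg_flat` are made up for illustration, and nested sub-labels are not treated specially.

```python
from urllib.request import urlopen


def fetch_kegg_entry(entry_id):
    """Download one entry as flat-file text (assumed endpoint: rest.kegg.jp/get)."""
    url = "https://rest.kegg.jp/get/" + entry_id
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def parse_kegg_flat(text):
    """Group flat-file lines by field label.

    Assumption: the label occupies the first 12 columns; lines whose first
    12 columns are blank continue the previous field; '///' ends the entry.
    Returns a dict mapping each label to a list of value lines.
    """
    fields = {}
    key = None
    for line in text.splitlines():
        if line.startswith("///"):
            break
        if not line.strip():
            continue
        label = line[:12].strip()
        if label:
            key = label
        if key is None:
            continue
        fields.setdefault(key, []).append(line[12:].strip())
    return fields
```

With network access, `parse_kegg_flat(fetch_kegg_entry("fpr:FP2_11210"))` should give a dictionary whose keys are the field labels of that entry, which is usually easier to work with than the HTML.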

7.4 years ago

Use TogoWS (http://togows.dbcls.jp/):

$ wget -q -O - "http://togows.org/entry/kegg-genes/fpr:FP2_11210.json" | python -m json.tool
[
    {
        "aafasta": ">FP2_11210 (GenBank) hypothetical protein\nMDRKLYPTAQKRHQNSKILFKSFPFYITQKVSKMQGFAQKISNKFGDLPDGIDVKFWYNSLAFTEKCLSYDKQSNDKTAVLRASGYHLGGKHA",
        "aalen": 93,
        "aaseq": "MDRKLYPTAQKRHQNSKILFKSFPFYITQKVSKMQGFAQKISNKFGDLPDGIDVKFWYNSLAFTEKCLSYDKQSNDKTAVLRASGYHLGGKHA",
        "chromosome": null,
        "classes": [],
        "codon_usage": {
            "aaa": null,
            "aac": null,
            "aag": null,
            "aat": null,
            "aca": null,
            "acc": null,
            "acg": null,
            "act": null,
            "aga": null,
            "agc": null,
            "agg": null,
            "agt": null,
            "ata": null,
            "atc": null,
            "atg": null,
            "att": null,
            "caa": null,
            "cac": null,
            "cag": null,
            "cat": null,
            "cca": null,
            "ccc": null,
            "ccg": null,
            "cct": null,
            "cga": null,
            "cgc": null,
            "cgg": null,
            "cgt": null,
            "cta": null,
            "ctc": null,
            "ctg": null,
            "ctt": null,
            "gaa": null,
            "gac": null,
            "gag": null,
            "gat": null,
            "gca": null,
            "gcc": null,
            "gcg": null,
            "gct": null,
            "gga": null,
            "ggc": null,
            "ggg": null,
            "ggt": null,
            "gta": null,
            "gtc": null,
            "gtg": null,
            "gtt": null,
            "taa": null,
            "tac": null,
            "tag": null,
            "tat": null,
            "tca": null,
            "tcc": null,
            "tcg": null,
            "tct": null,
            "tga": null,
            "tgc": null,
            "tgg": null,
            "tgt": null,
            "tta": null,
            "ttc": null,
            "ttg": null,
            "ttt": null
        },
        "dblinks": {
            "NCBI-ProteinID": [
                "CBK98661"
            ],
            "UniProt": [
                "D4JXB2"
            ]
        },
        "definition": "(GenBank) hypothetical protein",
        "division": "CDS",
        "eclinks": [],
        "entry_id": "FP2_11210",
        "gbposition": "complement(1100614..1100895)",
        "genes_id": "fpr:FP2_11210",
        "modules": {},
        "motifs": {
            "Pfam": [
                "Nod1"
            ]
        },
        "nafasta": ">FP2_11210 (GenBank) hypothetical protein\natggatagaaagttgtatccaactgcccaaaaacggcatcagaactcgaaaatcctcttcaaatctttcccattttatattacccaaaaagtgagtaaaatgcaagggtttgcacaaaaaatatcaaacaaattcggtgatttgcccgatggaatagacgtgaaattttggtataatagtttagcattcacagagaaatgtttatcgtatgacaagcagagcaacgataagaccgccgtgcttcgcgccagcggctaccatttgggaggtaaacacgcatga",
        "nalen": 282,
        "name": "",
        "names": [],
        "naseq": "atggatagaaagttgtatccaactgcccaaaaacggcatcagaactcgaaaatcctcttcaaatctttcccattttatattacccaaaaagtgagtaaaatgcaagggtttgcacaaaaaatatcaaacaaattcggtgatttgcccgatggaatagacgtgaaattttggtataatagtttagcattcacagagaaatgtttatcgtatgacaagcagagcaacgataagaccgccgtgcttcgcgccagcggctaccatttgggaggtaaacacgcatga",
        "ntfasta": ">FP2_11210 (GenBank) hypothetical protein\natggatagaaagttgtatccaactgcccaaaaacggcatcagaactcgaaaatcctcttcaaatctttcccattttatattacccaaaaagtgagtaaaatgcaagggtttgcacaaaaaatatcaaacaaattcggtgatttgcccgatggaatagacgtgaaattttggtataatagtttagcattcacagagaaatgtttatcgtatgacaagcagagcaacgataagaccgccgtgcttcgcgccagcggctaccatttgggaggtaaacacgcatga",
        "ntlen": 282,
        "ntseq": "atggatagaaagttgtatccaactgcccaaaaacggcatcagaactcgaaaatcctcttcaaatctttcccattttatattacccaaaaagtgagtaaaatgcaagggtttgcacaaaaaatatcaaacaaattcggtgatttgcccgatggaatagacgtgaaattttggtataatagtttagcattcacagagaaatgtttatcgtatgacaagcagagcaacgataagaccgccgtgcttcgcgccagcggctaccatttgggaggtaaacacgcatga",
        "organism": "Faecalibacterium prausnitzii L2-6",
        "organism_code": "fpr",
        "organism_id": "T02591",
        "orthologs": {},
        "pathways": {},
        "position": "complement(1100614..1100895)",
        "structure": []
    }
]
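If you would rather do this from a script than from the shell, the same request can be sketched in Python. This is only an illustration: it assumes the JSON keeps the shape shown above (a list with one object per entry), and the names `fetch_togows_json` and `summarize_genes` are hypothetical helpers, not part of TogoWS.

```python
import json
from urllib.request import urlopen

# URL pattern taken from the wget command above.
TOGOWS_URL = "http://togows.org/entry/kegg-genes/{}.json"


def fetch_togows_json(entry_id):
    """Download the JSON text for one KEGG GENES entry via TogoWS."""
    with urlopen(TOGOWS_URL.format(entry_id)) as resp:
        return resp.read().decode("utf-8")


def summarize_genes(json_text):
    """Pick a few fields out of the TogoWS response.

    Assumes the response is a JSON list of objects with the keys
    seen in the output above; missing keys fall back to None or [].
    """
    summaries = []
    for rec in json.loads(json_text):
        dblinks = rec.get("dblinks") or {}
        motifs = rec.get("motifs") or {}
        summaries.append({
            "genes_id": rec.get("genes_id"),
            "organism": rec.get("organism"),
            "definition": rec.get("definition"),
            "position": rec.get("position"),
            "aalen": rec.get("aalen"),
            "uniprot": dblinks.get("UniProt", []),
            "pfam": motifs.get("Pfam", []),
        })
    return summaries
```

With network access, `summarize_genes(fetch_togows_json("fpr:FP2_11210"))` would return one small dictionary per entry, so you can ignore fields such as the all-null `codon_usage` table.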