I have downloaded Homo_sapiens.ags.gz from NCBI's FTP site, ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ASN_BINARY/Mammalia/ . Homo_sapiens.ags contains all the information included in a gene's full report (e.g. MTF1's full report) for every human gene. Homo_sapiens.ags is in XML format, for example:
gene {
locus "A1BG" ,
desc "alpha-1-B glycoprotein" ,
maploc "19q13.4" ,
db {
{
db "HGNC" ,
tag
id 5 } ,
{
db "Ensembl" ,
tag
str "ENSG00000121410" } ,
{
db "HPRD" ,
tag
str "00726" } ,
{
db "MIM" ,
tag
id 138670 } } ,
syn {
"A1B" ,
"ABG" ,
"GAB" ,
"HYST2477" ,
"DKFZp686F0970" } } ,
Is there a convenient tool to extract a given field, such as 'maploc', for every gene? I would like to supply only the field name 'maploc' and have the tool write a file containing all gene names and their corresponding maploc values.
This is not XML but ASN.1 (Abstract Syntax Notation One), shown here in its text form.
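If a dedicated ASN.1 tool is not at hand, a line-based scan may be enough for simple quoted fields. The following is only a minimal sketch, assuming the file is already in the text form shown in the question, that each gene block has a `locus "..."` line before the field of interest, and that the field holds a single quoted string on one line. The function name `extract_field` and the `SAMPLE` text are illustrative, not part of any NCBI tool, and this is a regex scan rather than a real ASN.1 parser.

```python
import re
from typing import Iterable, Iterator, Tuple


def extract_field(lines: Iterable[str], field: str = "maploc") -> Iterator[Tuple[str, str]]:
    """Yield (locus, value) pairs for a simple quoted field.

    Assumption: lines look like `locus "A1BG" ,` and `maploc "19q13.4" ,`
    as in the question's excerpt. Nested or multi-valued fields (db, syn)
    are not handled.
    """
    locus_re = re.compile(r'^\s*locus\s+"([^"]*)"')
    field_re = re.compile(r'^\s*' + re.escape(field) + r'\s+"([^"]*)"')
    current_locus = None
    for line in lines:
        match = locus_re.match(line)
        if match:
            # Remember the gene symbol until its field turns up.
            current_locus = match.group(1)
            continue
        match = field_re.match(line)
        if match and current_locus is not None:
            yield current_locus, match.group(1)
            # Reset so a later stray match is not paired with this gene.
            current_locus = None


# Excerpt taken from the question, used only to try the function out.
SAMPLE = '''gene {
locus "A1BG" ,
desc "alpha-1-B glycoprotein" ,
maploc "19q13.4" ,
db {
{
db "HGNC" ,
tag
id 5 } } ,
syn {
"A1B" ,
"ABG" } } ,
'''

for locus, value in extract_field(SAMPLE.splitlines(), "maploc"):
    print(f"{locus}\t{value}")  # prints: A1BG	19q13.4
```

For a full file, an open file handle can be passed in place of `SAMPLE.splitlines()`, since the function only iterates over lines, and the tab-separated pairs can be written to an output file instead of printed. Genes without the requested field are simply skipped.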