Question

Determine if a gene is involved in a specific developmental process

10.1 years ago

Robert Sicko ▴ 630

I have a list of ~2000 genes that fall within copy number variations (CNVs). We suspect these CNVs are related to the observed phenotype. I am trying to determine whether, and how many of, the overlapped genes are involved in a particular developmental process.

Currently I'm trying to hack together a Python script that uses wget to text-mine iHOP (getLatestSymbolInformation) for every gene in my list. I would then search each XML response for processY and output 1 for each gene where processY was found in the iHOP response and 0 where it was not. I could then run the same script on a list of genes in CNVs from control subjects and see whether more of the case genes than control genes are associated with processY.

# Get all XML responses from iHOP
import subprocess
import sys

fname = input('Enter the gene list filename: ')
try:
    fhand = open(fname)
except IOError:
    # Only a failed open is reported here; other errors are left to surface.
    print('File cannot be opened:', fname)
    sys.exit(1)

subprocess.call("mkdir -p iHOP_results", shell=True)
with fhand:
    for line in fhand:
        gene_symbol = line.strip()
        if not gene_symbol:
            continue  # skip blank lines
        iHOP_url = "http://ws.bioinfo.cnio.es/iHOP/cgi-bin/getLatestSymbolInformation?synonym=%s&ncbiTaxId=9606" % gene_symbol
        shell_cmd = "wget -O iHOP_results/%s %s" % (gene_symbol, iHOP_url)
        # print(repr(shell_cmd))
        subprocess.call(shell_cmd, shell=True)

This works if my gene file has only a couple of entries, but with a large file it fails with:

unable to resolve host address `ws.bioinfo.cnio.es' failed: Name or service not known.

Does anyone have any ideas on 1) how to fix my Python script, or 2) a better method of testing whether the case gene list is more closely associated with a specific developmental process than a control gene list?
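One possible cause, offered as a guess rather than a confirmed diagnosis: the script builds the wget command as a single string and runs it through a shell, and the URL contains an unquoted `&`. A shell treats `&` as its background operator, so each wget would be launched in the background without the `ncbiTaxId` parameter, and a large gene list would start thousands of downloads (and DNS lookups) almost at once. The sketch below avoids the shell entirely and fetches sequentially with a pause and a few retries. The helper names (`build_url`, `read_symbols`, `http_get`, `fetch_all`) and the delay and retry values are illustrative assumptions, and it assumes the iHOP endpoint from the script is still reachable.

```python
import os
import time
import urllib.parse
import urllib.request

# Endpoint taken from the script above; its availability is an assumption.
IHOP_ENDPOINT = "http://ws.bioinfo.cnio.es/iHOP/cgi-bin/getLatestSymbolInformation"


def build_url(gene_symbol, tax_id="9606"):
    """Build the query URL with properly escaped parameters."""
    query = urllib.parse.urlencode({"synonym": gene_symbol, "ncbiTaxId": tax_id})
    return IHOP_ENDPOINT + "?" + query


def read_symbols(lines):
    """Return non-empty, stripped gene symbols from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def http_get(url, timeout=30):
    """Fetch one URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(symbols, out_dir="iHOP_results", fetch=http_get, delay=1.0, retries=3):
    """Fetch one response per symbol, one at a time; return symbols that failed."""
    os.makedirs(out_dir, exist_ok=True)
    failed = []
    for symbol in symbols:
        for attempt in range(retries):
            try:
                body = fetch(build_url(symbol))
            except OSError:
                # URLError and socket timeouts are OSError subclasses; back off and retry.
                time.sleep(delay * (attempt + 1))
                continue
            with open(os.path.join(out_dir, symbol), "wb") as out:
                out.write(body)
            break
        else:
            # Every attempt raised, so record the symbol for a later re-run.
            failed.append(symbol)
        time.sleep(delay)  # be polite to the server between genes
    return failed
```

A possible usage would be `failed = fetch_all(read_symbols(open(fname)))`, then re-running only the symbols returned in `failed`.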

text-mining CNV python • 1.8k views

ADD COMMENT • link updated 2.9 years ago by Ram 44k • written 10.1 years ago by Robert Sicko ▴ 630

See the post "Retrieve All Genes Associated With A Go Term".

ADD REPLY • link 10.1 years ago by Pierre Lindenbaum 164k
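Once the set of genes annotated to the process has been retrieved (for example via the GO-term approach linked above), the case/control comparison from the question could be run as a one-sided Fisher's exact test on a 2x2 table. This is only a minimal sketch: the function names and the toy gene lists are hypothetical, it assumes the case and control lists have been de-duplicated, and it uses only the standard library (`scipy.stats.fisher_exact` would be an alternative if SciPy is available).

```python
from math import comb


def count_hits(gene_list, process_genes):
    """Count unique genes in gene_list that are annotated to the process."""
    return len(set(gene_list) & set(process_genes))


def fisher_one_sided(case_hits, case_total, control_hits, control_total):
    """P-value for seeing at least case_hits annotated genes among the cases.

    Uses the hypergeometric null: annotated genes are spread at random
    across the combined case + control gene pool.
    """
    total = case_total + control_total
    hits = case_hits + control_hits
    denom = comb(total, case_total)
    upper = min(hits, case_total)
    numer = 0
    for k in range(case_hits, upper + 1):
        numer += comb(hits, k) * comb(total - hits, case_total - k)
    return numer / denom


# Toy data (hypothetical symbols, not real annotations).
process_genes = {"G1", "G2", "G3", "G4"}
case_genes = ["G1", "G2", "G3", "X1"]
control_genes = ["G4", "X2", "X3", "X4"]

p_value = fisher_one_sided(
    count_hits(case_genes, process_genes), len(set(case_genes)),
    count_hits(control_genes, process_genes), len(set(control_genes)),
)
print("one-sided p-value:", p_value)
```

A small p-value would suggest the case list is enriched for the process relative to the control list, though with CNV data the genes within one CNV are not independent, so the result should be read with caution.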