Hi, I am just starting out exploring Entrez E-utilities with Biopython. I need to find the amino acid change for each rsID in a list of missense SNPs, but I cannot figure out how to do that. I guess the answer lies in the XML returned by this query:
handle = Entrez.efetch(db="snp", id="6046", retmode="xml")
But when I try
record = Entrez.read(handle)
it gives me an error like: The Bio.Entrez parser cannot handle XML data that make use of XML namespaces.
I don’t know why this is happening. Maybe I am missing something obvious here…
Is it even possible to get this information using E-utilities? If not, can you suggest another way (other than doing it manually for every SNP)?
Thanks in advance.
A SNP may have more than one associated amino acid change, but you can get the annotated ones from your response by looking in the RsStruct elements (or from the HGVS descriptions on NP references in the hgvs elements). For example, calling .getElementsByTagName('hgvs') on the parsed document (e.g. one parsed with xml.dom.minidom) could be a first step. Consult general documentation on XML DOM navigation if you need more information.
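To make the getElementsByTagName suggestion concrete, here is a minimal sketch using xml.dom.minidom from the standard library. The XML string is a made-up stand-in, not real dbSNP output: the namespace URI, the accessions and the change are placeholders, and the only assumption taken from the answer is that HGVS descriptions sit in hgvs elements. The helper name protein_hgvs is hypothetical.

```python
from xml.dom import minidom

# Made-up stand-in for a dbSNP response; accessions and the change are placeholders.
SAMPLE_XML = """<ExchangeSet xmlns="http://example.org/placeholder">
  <Rs rsId="0">
    <hgvs>NM_000000.0:c.100G&gt;A</hgvs>
    <hgvs>NP_000000.0:p.Arg34Gln</hgvs>
  </Rs>
</ExchangeSet>"""


def protein_hgvs(xml_text):
    """Return the text of every <hgvs> element given on an NP_ (protein) reference."""
    doc = minidom.parseString(xml_text)
    found = []
    for node in doc.getElementsByTagName("hgvs"):
        # In the DOM, an element's text lives in its child text nodes.
        text = "".join(
            child.data
            for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        ).strip()
        if text.startswith("NP_"):
            found.append(text)
    return found


print(protein_hgvs(SAMPLE_XML))
```

With a live query you would skip Entrez.read and pass the raw text from handle.read() to the function instead. Check the element names against what the server actually returns, since the layout of the dbSNP XML may not match this sketch.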
Thanks for the tip! It seems like etree can also do the job. But then back to my original question: how do I get the amino acid change from this XML? I am not very familiar with XML and was relying on the Entrez parser to do the job for me. I have no experience with etree or minidom.
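One possible way to go from the XML to the change itself with etree is sketched below. It assumes, as suggested above, that the protein-level HGVS strings are in hgvs elements and use three-letter amino acid codes such as p.Arg34Gln. The sample document, its namespace URI and the function name amino_acid_changes are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Made-up stand-in for a dbSNP response; accessions and the change are placeholders.
SAMPLE_XML = """<ExchangeSet xmlns="http://example.org/placeholder">
  <Rs rsId="0">
    <hgvs>NM_000000.0:c.100G&gt;A</hgvs>
    <hgvs>NP_000000.0:p.Arg34Gln</hgvs>
  </Rs>
</ExchangeSet>"""

# Three-letter HGVS protein substitution, e.g. "p.Arg34Gln".
# Note: this also matches "Ter" (stop), so filter that out if you only want missense.
SUBSTITUTION = re.compile(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def amino_acid_changes(xml_text):
    """Return (reference, position, alternate) tuples found in <hgvs> elements."""
    root = ET.fromstring(xml_text)
    changes = []
    for element in root.iter():
        # Namespaced tags look like "{uri}hgvs"; keep only the local part.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "hgvs" or not element.text:
            continue
        match = SUBSTITUTION.search(element.text.strip())
        if match:
            ref, position, alt = match.groups()
            changes.append((ref, int(position), alt))
    return changes


print(amino_acid_changes(SAMPLE_XML))
```

Comparing only the local part of each tag sidesteps the namespace that the Entrez parser complained about, at the cost of ignoring which namespace an element belongs to.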