Getting java.lang.NumberFormatException when calling "Rs rs=unmarshaller.unmarshal(reader, Rs.class).getValue();"
10.1 years ago
burcakotlu ▴ 40

Hi,

I have downloaded the docsum_3.4.xsd from ftp://ftp.ncbi.nlm.nih.gov/snp/specs/

After generating classes from the xsd file, I sent lists of rsIDs in batches and tried to get rs information such as start and end positions.

However, while unmarshalling the Rs class (after it had worked successfully for a number of rsIDs), I got a java.lang.NumberFormatException:

java.lang.NumberFormatException: Not a number: u
    at com.sun.xml.internal.bind.DatatypeConverterImpl._parseInt(Unknown Source)
    at com.sun.xml.internal.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl$17.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl$17.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.reflect.TransducedAccessor$CompositeTransducedAccessorImpl.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StructureLoader.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.InterningXmlVisitor.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXEventConnector.handleStartElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXEventConnector.bridge(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)

What can be the reason? Any ideas?

The relevant code is below. The exception is thrown by the call Rs rs=unmarshaller.unmarshal(reader, Rs.class).getValue();

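String uri = "http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=" + commaSeparatedRsIdList + "&retmode=xml";

XMLEventReader reader = xmlInputFactory.createXMLEventReader(new StreamSource(uri));

while (reader.hasNext())
{
    // Peek instead of consuming, so the <Rs> start element stays in the stream for the unmarshaller.
    XMLEvent evt = reader.peek();

    // Skip every event that is not a start element.
    if (!evt.isStartElement())
    {
        reader.nextEvent();
        continue;
    }

    StartElement start = evt.asStartElement();
    String localName = start.getName().getLocalPart();

    // Skip start elements other than <Rs>.
    if (!localName.equals("Rs"))
    {
        reader.nextEvent();
        continue;
    }

    // Unmarshal one <Rs> element; this is the call that throws.
    Rs rs = unmarshaller.unmarshal(reader, Rs.class).getValue();
......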

Thanks in advance,
Burçak

efetch unmarshal • 4.1k views

Marshalling is about converting data between two programming environments, so every variable has to be converted from one format to the other. When the two sides don't line up, you get strange errors with no obvious resolution. You'll have to find out what each end of the code actually takes as input and produces as output.

When the data is invalid, then such a Java Exception is the correct and expected behavior. Your code has to handle the event that an unmarshal fails. Right now your code assumes everything is clean, which is nonsense for XML over the web. You should expect broken data.

Inspect the returned XML: it apparently has a letter 'u' where a number was expected, hence the NumberFormatException.
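One way to do that inspection is to scan the response with StAX before handing it to the unmarshaller. The sketch below is only an illustration: the class name NonNumericAttributeFinder is hypothetical, and the caller has to supply the attribute names that the schema declares as integers (the code does not read the xsd). It uses Integer.parseInt as an approximation of what JAXB accepts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper; not part of JAXB or of the generated docsum classes.
final class NonNumericAttributeFinder {

    // Returns one "element@attribute=value" entry for every listed attribute
    // whose value cannot be parsed as an int.
    static List<String> find(String xml, Set<String> intAttributes) {
        List<String> problems = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (!event.isStartElement()) {
                    continue;
                }
                StartElement start = event.asStartElement();
                Iterator<?> attributes = start.getAttributes();
                while (attributes.hasNext()) {
                    Attribute attribute = (Attribute) attributes.next();
                    String name = attribute.getName().getLocalPart();
                    if (intAttributes.contains(name) && !isInt(attribute.getValue())) {
                        problems.add(start.getName().getLocalPart()
                                + "@" + name + "=" + attribute.getValue());
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }
        return problems;
    }

    // Approximates the integer check with Integer.parseInt.
    private static boolean isInt(String value) {
        try {
            Integer.parseInt(value.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

Calling find(xml, names) on a saved copy of the response, with names containing for example "start" and "end", lists every offending value, so you can tell whether the bad character is really in the data or appears somewhere between download and parsing.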

10.1 years ago

NCBI Efetch uses 3.3 NOT 3.4:

(...)

<ExchangeSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.ncbi.nlm.nih.gov/SNP/docsum" xsi:schemaLocation="http://www.ncbi.nlm.nih.gov/SNP/docsum ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.3.xsd" >

Can you please compile your xsd using 3.3 instead of 3.4?

If it still fails, can you tell us the rs#?

EDIT:

The problem comes from NCBI: the XML returned for the rs# you provided does not validate against the schema:

$  curl -s "http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=rs9488822,rs867186,rs9390459,rs10953541,rs11556924,rs11669133,rs12190287,rs1231206,rs12413409,rs1412444&retmode=xml" | xmllint --format --schema <(curl -s "ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.3.xsd") --noout -
-:2: element MapLoc: Schemas validity error : Element '{http://www.ncbi.nlm.nih.gov/SNP/docsum}MapLoc', attribute 'refAllele': The attribute 'refAllele' is not allowed.
-:2: element FxnSet: Schemas validity error : Element '{http://www.ncbi.nlm.nih.gov/SNP/docsum}FxnSet', attribute 'soTerm': The attribute 'soTerm' is not allowed.

and

$ xmllint --format --schema <(curl -s "ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.4.xsd") --noout "http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=rs9488822,rs867186,rs9390459,rs10953541,rs11556924,rs11669133,rs12190287,rs1231206,rs12413409,rs1412444&retmode=xml"

http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=rs9488822,rs867186,rs9390459,rs10953541,rs11556924,rs11669133,rs12190287,rs1231206,rs12413409,rs1412444&retmode=xml:2: element FxnSet: Schemas validity error : Element '{http://www.ncbi.nlm.nih.gov/SNP/docsum}FxnSet': Missing child element(s).
http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=rs9488822,rs867186,rs9390459,rs10953541,rs11556924,rs11669133,rs12190287,rs1231206,rs12413409,rs1412444&retmode=xml:2: element FxnSet: Schemas validity error : Element '{http://www.ncbi.nlm.nih.gov/SNP/docsum}FxnSet': Missing child element(s).
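The same kind of check can also be run from Java with the JDK's javax.xml.validation package, which avoids depending on a particular xmllint version. This is a minimal sketch, assuming the response and the xsd have already been read into strings; the class name XsdCheck is hypothetical, and the messages come from the JDK's validator, so their wording will differ from xmllint's.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper that collects schema validity errors instead of stopping at the first one.
final class XsdCheck {

    static List<String> validate(String xml, String xsd) {
        final List<String> errors = new ArrayList<>();
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    // Recoverable validity error: record it and keep validating.
                    errors.add(e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    // Not well-formed input: nothing more can be validated.
                    throw e;
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("Could not validate the document", e);
        }
        return errors;
    }
}
```

An empty list means the document is valid against the given schema. Running it once with docsum_3.3.xsd and once with docsum_3.4.xsd would reproduce the comparison made with xmllint above.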

I've sent a bug report to the NCBI.

EDIT: and here's the answer:

Hi Pierre,

We'll update eutils for the next Entrez release cycle which could take a couple of weeks. In the meantime, please use docsum_3.4.xsd (ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.4.xsd).

In regards to xmllint, it has a known issue and needs to be updated to the v2.9.1 to work correctly.

http://stackoverflow.com/questions/18409365/is-xmllint-handling-a-nillable-compextype-correctly

Regards,

Lon Phan

You said there is a problem with an integer for rs1412444

java.lang.NumberFormatException: Not a number: f33787421

However I see no problem in the XML when grep'ing for 787421

curl "http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=1412444&retmode=xml" | xmllint --format - | grep 787421

      <Component componentType="contig" accession="NT_030059.14" chromosome="10" start="41693521" end="133787421" orientation="fwd" gi="568815276" groupTerm="NC_000010.11" contigLabel="GCF_000001405.26">

Hi,

Question 1) How can I find out which docsum version a given NCBI eutil uses?

Question 2) I have generated classes using docsum_3.3.xsd, and I'm sending a comma-separated list of rsIDs such as:

rs9488822,rs867186,rs9390459,rs10953541,rs11556924,rs11669133,rs12190287,rs1231206,rs12413409,rs1412444

I got an error for rs1412444:

java.lang.NumberFormatException: Not a number: f33787421

    at com.sun.xml.internal.bind.DatatypeConverterImpl._parseInt(Unknown Source)
    at com.sun.xml.internal.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl$17.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.model.impl.RuntimeBuiltinLeafInfoImpl$17.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.reflect.TransducedAccessor$CompositeTransducedAccessorImpl.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StructureLoader.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.InterningXmlVisitor.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXEventConnector.handleStartElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXEventConnector.bridge(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)

Thanks in advance


However, when I ran the code again, it worked fine for rs1412444 and failed at a different rsID, rs7221109 (so it seems non-deterministic):

java.lang.NumberFormatException: Not a number: b5859386
    at com.sun.xml.internal.bind.DatatypeConverterImpl._parseInt(Unknown Source)
    at gov.nih.nlm.ncbi.snp.docsum.Ss_JaxbXducedAccessor_ssId.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StructureLoader.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.InterningXmlVisitor.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXEventConnector.handleStartElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXEventConnector.bridge(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)
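Since the failing rsID changes from run to run, one possible workaround is to retry a failed batch rather than abort. This is a sketch under an assumption that nothing in this thread confirms: that a fresh request may return clean data. The helper name Retry.onNumberFormat is hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: re-runs a task when it fails with NumberFormatException.
final class Retry {

    static <T> T onNumberFormat(Callable<T> task, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NumberFormatException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (NumberFormatException e) {
                // Remember the failure and try again.
                last = e;
            } catch (RuntimeException e) {
                // Any other unchecked failure is not retried.
                throw e;
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
        // Every attempt failed: surface the last NumberFormatException.
        throw last;
    }
}
```

The task passed in should open a new request and a new reader for the whole batch each time. Retrying the unmarshal call on the same XMLEventReader would not help, because part of the stream has already been consumed.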

Answer to question 1)

When you open the NCBI eutils link in a web browser, the first line of the XML output shows which docsum schema it uses.

<ExchangeSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.ncbi.nlm.nih.gov/SNP/docsum" xsi:schemaLocation="http://www.ncbi.nlm.nih.gov/SNP/docsum ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.3.xsd" >
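The same lookup can be done in code by reading the xsi:schemaLocation attribute of the root element with StAX. This is a minimal sketch, assuming the response text is already available as a string; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper that reports which schema a response declares.
final class SchemaLocationReader {

    private static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    // Returns the xsi:schemaLocation value of the root element, or null if it is absent.
    static String rootSchemaLocation(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // The first start element encountered is the root element.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getAttributeValue(XSI, "schemaLocation");
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }
    }

    // Returns the file name at the end of the last whitespace-separated token.
    static String schemaFileName(String schemaLocation) {
        String[] tokens = schemaLocation.trim().split("\\s+");
        String last = tokens[tokens.length - 1];
        return last.substring(last.lastIndexOf('/') + 1);
    }
}
```

Comparing the result of schemaFileName with the xsd the classes were generated from gives an early warning when the two versions differ.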

Each time I run the program, it fails on different rsIDs.

Do we have to report all of these rsIDs to NCBI?

And will they resolve them in a short time?

Is there another way to get the start and end positions and the observed alleles of given rsIDs?

Thanks
