John,
You could consider rentrez (which I wrote!), a fairly low-level R library for the Entrez API.
In this case you would want to find links to other databases from your gene IDs, and then get the summary file for the refseq record:
library(rentrez)
links <- entrez_link(db = "all", ids = 353174, dbfrom = "gene")
names(links)
Once you have the ID for the refseq record, you can fetch an (unparsed) XML file with entrez_summary:
e_sum <- entrez_summary(db = "nuccore", id = links$gene_nuccore_refseqrna)
print(e_sum)
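Since the summary comes back as XML, pulling a single field out of it is an XPath query. Here is a small offline sketch of that step, assuming the XML package is installed. The record in `sample_xml` is made up for illustration (placeholder ID, accession and length), shaped like the `Item` elements an eSummary reply contains:

```r
library(XML)

# Hypothetical record shaped like an eSummary reply; all values are placeholders.
sample_xml <- '
<eSummaryResult>
  <DocSum>
    <Id>123456</Id>
    <Item Name="Caption" Type="String">NM_000000</Item>
    <Item Name="Length" Type="Integer">1234</Item>
  </DocSum>
</eSummaryResult>'

# Parse the string into an internal document that XPath can query
doc <- xmlParse(sample_xml, asText = TRUE)

# Select every <Item> whose Name attribute is "Length" and take its text
len <- as.integer(xpathSApply(doc, "//Item[@Name='Length']", xmlValue))
len
```

Swapping 'Length' for another Name value in the query should pull out the other fields in the same way.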
Presuming this is the process you would go through, wrapping it up in a function (and using XPath to extract information from the XML) is straightforward:
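library(XML)  # provides xpathSApply() and xmlValue()

# Return the length(s) of the RefSeq RNA record(s) linked to an Entrez gene ID
get_entrez_length <- function(e_id) {
  links <- entrez_link(db = "nuccore", ids = e_id, dbfrom = "gene")
  summary_xml <- entrez_summary(db = "nuccore", id = links$gene_nuccore_refseqrna)
  # Pull the text of each <Item Name="Length"> node out of the summary XML
  len <- xpathSApply(summary_xml, "//Item[@Name='Length']", xmlValue)
  return(as.integer(len))
}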
get_entrez_length(353174)
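If you have a whole list of gene IDs, something like the following might be a starting point. It is only a sketch: `get_lengths`, `fetch` and `pause` are names invented here, not part of rentrez. It assumes `get_entrez_length` is defined as above, waits between genes because NCBI asks for no more than about three requests per second without an API key, and returns NA for any gene whose lookup fails rather than stopping.

```r
# Hypothetical helper (not part of rentrez): look up lengths for many gene IDs.
# `fetch` is the per-gene lookup function; `pause` is the wait in seconds
# before each lookup, to stay polite to the NCBI servers.
get_lengths <- function(gene_ids, fetch = get_entrez_length, pause = 0.5) {
  out <- lapply(gene_ids, function(g) {
    Sys.sleep(pause)
    # A failed lookup becomes NA instead of aborting the whole run
    tryCatch(fetch(g), error = function(e) NA_integer_)
  })
  # Name each result by its gene ID; a gene may link to several records,
  # so every element can hold more than one length
  setNames(out, gene_ids)
}
```

Calling `get_lengths(353174)` would then give a one-element list named "353174". A list is used rather than a vector because one gene can link to more than one RefSeq RNA.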
If you do use rentrez and have any questions, suggestions or bug reports, let me know. I'm still developing it, so I'm happy to make it more useful.