Hi All,
I am trying to document the histone modification data sets available in GEO for mammals. For each data set, I want to retrieve the following from GEO:
1. Species
2. Tissue
3. Disease/normal state (if diseased, which disease)
4. SRA ID (if available)
5. Link to the analysed data (if available)
6. Cell line or not
7. Type of experiment (ChIP-chip or ChIP-seq)
I have looked at E-utilities (e-utils), but I found it poorly documented and hard to use. Has anybody written an API in Perl or Python for this kind of task, or should I write a web-scraping program instead? I am sure somebody must have done this already. In future I also intend to document which DNA methylation data sets (MeDIP-seq/MIRA-seq) are available for mammals, and for which tissues.
thanks
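For reference, here is a minimal Python sketch of what an E-utilities search against GEO could look like. It assumes the GEO DataSets database (`db=gds`) and the ESearch endpoint. The helper names (`build_esearch_url`, `parse_esearch_ids`, `search_geo`) and the example search term are illustrative only, and the field tags in the term should be checked against the GEO query documentation before relying on them.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL of the NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL against the GEO DataSets database (db=gds)."""
    query = urllib.parse.urlencode({"db": "gds", "term": term, "retmax": retmax})
    return f"{EUTILS}/esearch.fcgi?{query}"


def parse_esearch_ids(xml_text):
    """Return the UIDs listed in the <IdList> of an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("./IdList/Id")]


def search_geo(term, retmax=20):
    """Run the search over the network and return the matching UIDs."""
    with urllib.request.urlopen(build_esearch_url(term, retmax)) as resp:
        return parse_esearch_ids(resp.read().decode("utf-8"))


# Example call (needs network access; the term is only an illustration):
# search_geo('histone[All Fields] AND "Mus musculus"[Organism] AND gse[Entry Type]')
```

The returned UIDs can then be passed to ESummary (`esummary.fcgi`) with the same `db=gds` to get the per-record summaries. Whether those summaries cover every field in the list above would still need checking.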
Yeah, SRAdb and GEOmetadb may not be able to satisfy my requirements. But I believe somebody must have worked on this already, at least someone who relies heavily on public data. It would be great if someone could share a web crawler they have already written for this or something similar, which I could modify rather than going through each entry manually.
Is there some information available on a GEO web page that is not in GEOmetadb? If there is, perhaps we can try to add those details. Otherwise, I suspect that a web crawler for GEO is not likely to yield more results than GEOmetadb.
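To make the comparison concrete, here is a rough Python sketch of querying a local copy of the GEOmetadb SQLite file for sample-level fields. The table and column names (`gsm`, `organism_ch1`, `source_name_ch1`, `characteristics_ch1`, `supplementary_file`) are assumptions based on the GEOmetadb schema and should be verified with `PRAGMA table_info(gsm)`. `find_samples` is a hypothetical helper, and SRA accessions are not covered here.

```python
import sqlite3

# Assumed columns of the GEOmetadb "gsm" table; verify against your copy.
QUERY = """
SELECT gsm, series_id, organism_ch1, source_name_ch1,
       characteristics_ch1, supplementary_file
FROM gsm
WHERE lower(title) LIKE ? OR lower(characteristics_ch1) LIKE ?
"""


def find_samples(conn, keyword):
    """Return samples whose title or characteristics mention the keyword."""
    pattern = f"%{keyword.lower()}%"
    cur = conn.execute(QUERY, (pattern, pattern))
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


# Example call (assumes GEOmetadb.sqlite has been downloaded locally):
# conn = sqlite3.connect("GEOmetadb.sqlite")
# for row in find_samples(conn, "h3k4me3"):
#     print(row["gsm"], row["organism_ch1"], row["source_name_ch1"])
```

If species, tissue and the supplementary file links come back from a query like this, the remaining question is which of the other fields (disease state, cell line, SRA ID) are only visible on the GEO web pages.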