As SRAdb is outdated, I would like to create my own SRAdb. What is the best way to do that?
I am aware of the files I would need to make one: ftp://ftp.ncbi.nlm.nih.gov/sra/reports/
However, I am not an expert in SQL and don't know how to work with XML files.
As far as I understand, Sean Davis seems to be working on a replacement for SRAdb. As I have already set up my tool to work with SRAdb, I would like to have an updated version.
There are functions in R for dealing with both SQL and XML. The ideal situation would be to pull the data in real time from SRA, no? This is how GEOquery (also by Sean Davis) functions.
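If you want to try the XML-to-SQL route yourself, here is a minimal sketch of the idea in Python (the same thing can be done in R), using only the standard library. It assumes a heavily simplified `EXPERIMENT_SET` layout; the real files in the NCBI reports directory are far larger and more deeply nested. The `experiment` table and its column names are made up for illustration and are not the SRAdb schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Toy stand-in for an SRA experiment XML file (assumed, simplified layout).
SAMPLE_XML = """
<EXPERIMENT_SET>
  <EXPERIMENT accession="SRX000001">
    <TITLE>Example experiment one</TITLE>
    <STUDY_REF accession="SRP000001"/>
  </EXPERIMENT>
  <EXPERIMENT accession="SRX000002">
    <TITLE>Example experiment two</TITLE>
    <STUDY_REF accession="SRP000001"/>
  </EXPERIMENT>
</EXPERIMENT_SET>
"""


def parse_experiments(xml_text):
    """Return (experiment_accession, study_accession, title) tuples."""
    root = ET.fromstring(xml_text)
    rows = []
    for exp in root.iter("EXPERIMENT"):
        study = exp.find("STUDY_REF")
        rows.append((
            exp.get("accession"),
            study.get("accession") if study is not None else None,
            exp.findtext("TITLE"),
        ))
    return rows


def load_experiments(conn, rows):
    """Create a hypothetical `experiment` table and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS experiment ("
        "experiment_accession TEXT PRIMARY KEY, "
        "study_accession TEXT, "
        "title TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO experiment VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
load_experiments(conn, parse_experiments(SAMPLE_XML))

# Query the table with plain SQL, as you would query any SQLite file.
for row in conn.execute(
    "SELECT experiment_accession, title FROM experiment "
    "WHERE study_accession = ? ORDER BY experiment_accession",
    ("SRP000001",),
):
    print(row)
```

The hard part in practice is not this step but keeping such a database complete and up to date, which is why pulling from SRA directly is attractive.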
Unfortunately, I do not have direct access to SRA. I mine the data from multiple publicly available locations at NCBI in what is a non-trivial exercise.
Bold text and ALL CAPS can come across as rude in online communication. For what it's worth, it definitely helps to learn SQL (and even XML). It doesn't take too long and a learning opportunity seems to have presented itself to you!
Due to the growing size of the SRA metadata, we have been transitioning to a new approach for building, updating, and distributing it. As a first step, the data are available as a public resource on Google BigQuery.
BigQuery can be accessed from essentially any programming language, from the bq command line tool, or from the cloud console. It requires a Google Cloud Platform account, but the free credits will cover thousands of queries.
If you are simply interested in searching SRA programmatically, you can try NCBI's E-utilities (Eutils). They are documented on the NCBI website.
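As a rough illustration of the Eutils route, here is a small Python sketch that builds an ESearch request against the `sra` database and parses the XML that comes back. The helper names (`build_esearch_url`, `parse_esearch`, `search_sra`) are my own, and the example response is hand-written to show the expected shape rather than copied from a real query.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL that searches the `sra` Entrez database."""
    query = urllib.parse.urlencode({"db": "sra", "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


def parse_esearch(xml_text):
    """Return (total_hit_count, list_of_uids) from an ESearch XML response."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count", default="0"))
    ids = [el.text for el in root.findall("./IdList/Id")]
    return count, ids


def search_sra(term, retmax=20):
    """Run the search over the network (needs internet access)."""
    with urllib.request.urlopen(build_esearch_url(term, retmax), timeout=30) as resp:
        return parse_esearch(resp.read())


# Offline demonstration with a hand-written response of the expected shape.
EXAMPLE_RESPONSE = """<eSearchResult>
  <Count>2</Count><RetMax>2</RetMax><RetStart>0</RetStart>
  <IdList><Id>111</Id><Id>222</Id></IdList>
</eSearchResult>"""

print(build_esearch_url("Homo sapiens[Organism]", retmax=5))
print(parse_esearch(EXAMPLE_RESPONSE))
```

Note that ESearch returns Entrez UIDs, not SRA accessions; you would pass them on to ESummary or EFetch to get the actual records. For a live query, call `search_sra(...)` and keep the request rate modest, as NCBI asks of Eutils users.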
Contact Sean Davis and try to work with his team. Is there any point in developing a new one? It seems like a waste of time and research funds.
https://github.com/seandavi/SRAdb/issues/24
The ideal situation would be to pull the data in real time from SRA: Yes!
I looked for the code that Sean Davis and his team wrote to build this database, but couldn't find it (that would have been an easy solution)...
They have direct access to SRA, so you might just ask them to update the file. I've sent Sean Davis a tweet to ask for his input here.
If SRAdb is no longer being maintained, can a note be added to the GitHub/Bioconductor page to that effect? There is currently no indication there.
That's incredibly annoying. I had presumed that, since you were at NIH, you'd have more direct access.
Sorry, I updated my comment!