6.7 years ago
Arsenal
Dear colleagues,
I have a list of ~1000 alphanumeric IDs, and I need to enter them one by one to retrieve information from a web database (CANTATAdb, a database of plant non-coding RNAs). I'm wondering whether it's possible to retrieve them all at once using web crawling. I suspect it's not simple, because after entering an ID I need to click on a link to go to the page containing the info I want. Any suggestions? Thanks in advance!
http://yeti.amu.edu.pl/CANTATA/
Thanks for the reply. But the guys at the lab and I already got all the "quick" data from there. What we have now is only the IDs and basic information about them. Specifically, I need to answer "what are the interaction targets of my IDs?"; I can only retrieve this by entering the IDs one by one and clicking on the "details" link.
Pierre added the link because you didn't include it in your question, which you should have.
Oh. I'm sorry. Thanks!
There is a download button. How about using it instead of web-scraping everything?
Thanks for the reply. Unfortunately, the info in the download option is not very useful. I want exactly the RNAs which are targets of interaction.
Are you referring to blast/search functions? Perhaps you should include some additional detail about what you are doing on the site and why you think web scraping is the right option instead.
Let me try to be clearer: at the lab we already got all the "quick" data from there. What we have now is only the IDs and basic information about them. Specifically, I need to answer "what are the interaction targets of my (RNA) IDs?"; I can only retrieve this by entering the IDs one by one and then clicking on the "details" link...
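For what it's worth, the two-step lookup described here (search for an ID, then follow the "details" link) could be sketched roughly as below, using only the Python standard library. Everything site-specific is an assumption: `search_url`, the query parameter name `param`, and the idea that the link text contains the word "details" are hypothetical and would have to be checked against the site's actual search form and result pages.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


class DetailsLinkFinder(HTMLParser):
    """Collect the href of every <a> whose visible text contains 'details'."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if "details" in "".join(self._text).lower():
                self.links.append(self._href)
            self._href = None


def find_details_links(html, page_url):
    """Return absolute URLs of all 'details' links found in one result page."""
    finder = DetailsLinkFinder()
    finder.feed(html)
    return [urljoin(page_url, href) for href in finder.links]


def fetch(url):
    """Download one page and return it as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def details_pages(ids, search_url, param="id", fetch=fetch):
    """Map each ID to the HTML of its first 'details' page (None if no link).

    search_url and param are placeholders (assumptions): take the real
    values from the site's search form.
    """
    pages = {}
    for rna_id in ids:
        result_url = search_url + "?" + urlencode({param: rna_id})
        links = find_details_links(fetch(result_url), result_url)
        pages[rna_id] = fetch(links[0]) if links else None
    return pages
```

The returned HTML would still need to be parsed for the interaction targets, which depends on how the details page is laid out. If the search form submits via POST or relies on JavaScript, this approach would need adjusting.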
Follow @Ram's suggestion and contact the lab to ask if they can give you the data you need. If you are able to see it by doing some clicking on the web page, then it is there in their DB somewhere.
I'd recommend against web crawling unless no other option is feasible. Have you tried asking the developers whether they can help with bulk querying? They might have an API under development. Web servers do not take kindly to crawlers, and you might find your IP getting blocked.
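If crawling does turn out to be the only option, spacing the requests out lowers the chance of being blocked. A minimal sketch, assuming you already have some function that takes a URL and returns the page text; the name `throttled` and the 2-second default are illustrative choices, not anything the site documents:

```python
import time


def throttled(fetch, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
    """Wrap fetch so consecutive requests start at least min_interval seconds apart.

    min_interval=2.0 is an arbitrary, conservative assumption; ask the
    database maintainers what request rate they are comfortable with.
    """
    last_call = None  # time the previous request started

    def wrapper(url):
        nonlocal last_call
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        return fetch(url)

    return wrapper
```

With ~1000 IDs and two requests per ID, a 2-second interval works out to a bit over an hour of run time, which is usually an acceptable price for not hammering a small academic server.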
Could you perhaps select a more descriptive title?
Do you have any examples of 'interacting transcripts'?