Question

Is there an easier way to collect information from Cantatadb target interacting transcripts?

0

6.7 years ago

Arsenal ▴ 160

Dear colleagues,

I have a list of ~1000 alphanumeric IDs, and at the moment I have to enter them one by one to retrieve information from a web database (Cantatadb, a database of plant non-coding RNAs). I'm wondering whether it's possible to retrieve them all at once using web crawling. I guess it's not simple, because after entering an ID I need to click on a link to reach the page containing the info I want. Any suggestions? Thanks in advance!

long non coding RNA web crawling • 1.3k views

ADD COMMENT • link 5.7 years ago by Arsenal ▴ 160

1

http://yeti.amu.edu.pl/CANTATA/

ADD REPLY • link 6.7 years ago by Pierre Lindenbaum 164k

0

Thanks for the reply. But my lab colleagues and I have already got all the "quick" data from there. What we have now is only the IDs and basic information about them. Specifically, I need to answer "what are the interaction targets of my IDs?"; I can only retrieve this by entering the IDs one by one and clicking on the "details" link.

ADD REPLY • link 6.7 years ago by Arsenal ▴ 160

2

The reason Pierre added the link is that you didn't include it in your question, which you should have.

ADD REPLY • link 6.7 years ago by Ram 44k

0

Oh, I'm sorry. Thanks!

ADD REPLY • link 6.7 years ago by Arsenal ▴ 160

1

There is a download button. How about using it instead of web-scraping everything?

ADD REPLY • link 6.7 years ago by Pierre Lindenbaum 164k

0

Thanks for the reply. Unfortunately, the info in the download option is not very useful. I want exactly the RNAs which are targets of interaction.

ADD REPLY • link 6.7 years ago by Arsenal ▴ 160

1

Are you referring to blast/search functions? Perhaps you should include some additional detail about what you are doing on the site and why you think web scraping is the right option instead.

ADD REPLY • link 6.7 years ago by GenoMax 147k

0

Let me try to be clearer: at the lab we have already got all the "quick" data from there. What we have now is only the IDs and basic information about them. Specifically, I need to answer "what are the interaction targets of my (RNA) IDs?"; I can only retrieve this by entering the IDs one by one and then clicking on the "details" link...

ADD REPLY • link 6.7 years ago by Arsenal ▴ 160
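For readers weighing the scraping route, the two-step lookup described above (submit an ID, then follow the "details" link) could be sketched roughly as below. This is only a sketch under stated assumptions: `SEARCH_URL` and `QUERY_PARAM` are placeholders, not the real Cantatadb endpoint or form field, and the assumption that the link text is exactly "details" would need to be checked against the actual page source.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Placeholders (assumptions): replace with the real search URL and form field.
SEARCH_URL = "https://example.org/search"
QUERY_PARAM = "id"


class DetailsLinkFinder(HTMLParser):
    """Collect the href of every <a> whose visible text is 'details'."""

    def __init__(self):
        super().__init__()
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if "".join(self._text).strip().lower() == "details":
                self.links.append(self._href)
            self._href = None


def fetch(url):
    """Download a page and return it as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def details_page(rna_id, get=fetch):
    """Step 1: search for the ID. Step 2: follow the first 'details' link.

    Returns the HTML of the details page, or None if no such link is found.
    """
    search_url = SEARCH_URL + "?" + urlencode({QUERY_PARAM: rna_id})
    finder = DetailsLinkFinder()
    finder.feed(get(search_url))
    if not finder.links:
        return None
    return get(urljoin(search_url, finder.links[0]))
```

The `get` argument is there so the parsing logic can be tried on saved HTML without touching the server. Extracting the interaction targets from the returned details page would still depend on that page's actual layout.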

1

Follow @Ram's suggestion and contact the lab to ask if they can give you the data you need. If you are able to see it by doing some clicking on the web page, then it is there in their DB somewhere.

ADD REPLY • link 6.7 years ago by GenoMax 147k

1

I'd recommend against web crawling unless no other option is feasible. Did you try asking the developers whether they can help with bulk querying? They might have an API under development. Web servers do not take kindly to crawlers, and you might find your IP getting blocked.

ADD REPLY • link 6.7 years ago by Ram 44k
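If crawling really were the last resort, the blocking risk mentioned above is usually reduced by spacing requests out and stopping as soon as the server objects. A minimal sketch, assuming a hypothetical `lookup` function that fetches one ID and raises `urllib.error.HTTPError` on failure; the 5-second default delay and the choice of 403/429 as "stop" signals are assumptions for illustration, not anything Cantatadb documents:

```python
import time
from urllib.error import HTTPError


def crawl_politely(ids, lookup, delay_s=5.0, sleep=time.sleep):
    """Look up IDs one at a time, pausing between requests.

    Stops early if the server answers 403 or 429, which often means the
    crawler is being refused or rate-limited. Other HTTP errors are
    recorded as None and the loop continues.
    """
    results = {}
    for i, rna_id in enumerate(ids):
        if i:
            sleep(delay_s)  # wait before every request except the first
        try:
            results[rna_id] = lookup(rna_id)
        except HTTPError as err:
            if err.code in (403, 429):
                break  # back off entirely rather than risk an IP block
            results[rna_id] = None
    return results
```

With about 1000 IDs and a 5-second pause, a full run would take roughly an hour and a half, which is one more reason to ask the database maintainers for a bulk export first.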

1

Could you perhaps select a more descriptive title?

ADD REPLY • link 6.7 years ago by WouterDeCoster 47k

1

Do you have any example of 'interacting transcripts'?

ADD REPLY • link 6.7 years ago by Pierre Lindenbaum 164k

score 1 · Accepted Answer · 2019-03-12

1

5.7 years ago

Arsenal ▴ 160

We contacted CantataDB and after some emails we got exactly the info we needed. Thanks to the CantataDB team!

ADD COMMENT • link 5.7 years ago by Arsenal ▴ 160