Question

Get pages of DNA from ncbi

0

6.6 years ago

orelsmail • 0

Hello everybody. Not long ago I asked a question here, and the answers helped me make progress with my project. Lately I ran into a new problem. I am using information from https://www.ncbi.nlm.nih.gov/gene/: I run a search, parse the page source to get the information I need from the results table, and use it.

    try {
        URL url = new URL(Vars.searchLink + Vars.searchWord);
        conn = url.openConnection();
        BufferedReader br = new BufferedReader(
                new InputStreamReader(conn.getInputStream()));
        String inputLine;
        // Read the page line by line and stop at the first line mentioning "gene-id"
        while ((inputLine = br.readLine()) != null) {
            if (inputLine.contains("gene-id")) {
                break;
            }
        }

After running this, inputLine will contain the line that contains the information I need.

The problem is that I cannot get past the first page of results, because every page has an identical URL. There is no page indicator in the URL, so I am stuck.

I need to either find a way to get the following pages or use another trick to get the information. (I tried comparing the source of page 1 and page 2, but both contain a hexadecimal code that cannot be guessed.)

Thanks ahead!

NCBI DNA • 976 views

1

Hello orelsmail,

what kind of information are you trying to extract? NCBI provides the Entrez Programming Utilities (E-utilities) for accessing their data programmatically. So I guess using this API would make it much easier than parsing the source code of the web page.
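As a rough illustration of that suggestion, here is a minimal sketch of paging through Gene search results with the ESearch endpoint of the E-utilities instead of scraping the HTML. With ESearch, paging is done through the `retstart` (offset of the first record) and `retmax` (records per request) parameters, so each page has its own URL. The class name `EsearchPaging`, its helper methods, the example term `BRCA1` and the page size of 20 are assumptions made up for this example, not anything from the original post, and the ID extraction uses a simple regular expression rather than a proper XML parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class EsearchPaging {
    // Public ESearch endpoint of the NCBI E-utilities.
    static final String ESEARCH =
            "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

    // ESearch returns XML by default, with one <Id> element per hit.
    private static final Pattern ID = Pattern.compile("<Id>(\\d+)</Id>");

    // Builds the ESearch URL for one zero-based page of Gene results.
    static String buildSearchUrl(String term, int page, int pageSize) {
        try {
            return ESEARCH + "?db=gene"
                    + "&term=" + URLEncoder.encode(term, "UTF-8")
                    + "&retstart=" + (page * pageSize)
                    + "&retmax=" + pageSize;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Collects the numeric Gene IDs found in an ESearch XML response.
    static List<String> extractIds(String xml) {
        List<String> ids = new ArrayList<>();
        Matcher m = ID.matcher(xml);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    // Downloads the body of the given URL as a single string.
    static String fetch(String url) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new URL(url).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String term = args.length > 0 ? args[0] : "BRCA1"; // example term
        int pageSize = 20;                                  // assumed page size
        // Fetch at most the first three pages; stop early when a page is empty.
        for (int page = 0; page < 3; page++) {
            List<String> ids =
                    extractIds(fetch(buildSearchUrl(term, page, pageSize)));
            if (ids.isEmpty()) {
                break;
            }
            System.out.println("page " + page + ": " + ids);
            Thread.sleep(400); // stay below NCBI's request rate limit
        }
    }
}
```

The IDs returned this way could then be passed to ESummary or EFetch to retrieve the per-gene details that the web table shows; which fields are needed depends on what you are trying to extract.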

fin swimmer

ADD REPLY • link 6.6 years ago by finswimmer 16k

0

Agreed. The Entrez Programming Utilities should work perfectly well for most tasks.

ADD REPLY • link 6.6 years ago by Sishuo Wang ▴ 230

0

I run a search, parse the source code to get the information I need from the table and use it.
After running this, inputLine will contain the line that contains the information I need.

It is not clear what you are trying to achieve. You will need to explain with examples what you are trying to do.

ADD REPLY • link 6.6 years ago by GenoMax 147k