How to download the sequences of a gene of all species?
5.5 years ago
sunyeping ▴ 110

Dear everyone,

How can I download the sequences of one gene from all species? For example, dUTPase is an enzyme found in almost all organisms. How can I download the coding DNA sequences of dUTPase from all organisms? I searched GenBank with the keyword "dUTPase" and it returned 73570 results. Many of these results are complete genome sequences that contain the dUTPase gene. Is there a way to extract only the dUTPase gene sequence from each result? Or is there a better way to get the sequences of a given gene from all species?

Another question: once I have the sequences of a given gene from all species, how should I analyze the evolutionary relationships among them? That is, how do I know which sequences are ancestral and which are derived?

I will be grateful if anyone can help.

Best regards,

Arthur

sequence • 1.5k views

Which GenBank database did you search? You need to search a specific database such as 'gene' or 'protein'; otherwise the results will usually be whole-genome records.

5.5 years ago
MatthewP ★ 1.4k

Hello, as far as I know, NCBI's E-utilities (e-utils) can help here. Downloading every species may take some scripting or web-scraping skills (which I will assume you have). First get all the UIDs for your gene, then fetch the sequence for each UID.
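A minimal Python sketch of that two-step approach (ESearch for the UIDs, then EFetch for the sequences), using only the standard library. The search term `dUTPase[Protein Name]`, the page size, the delay, and the function names are my own assumptions for illustration, not something tested against this exact query. It fetches protein FASTA; for coding DNA, the EFetch documentation lists a `fasta_cds_na` return type for the `nuccore` database, which would be worth checking before relying on it.

```python
import json
import time
import urllib.parse
import urllib.request

# Public E-utilities base URL.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="protein", retstart=0, retmax=200):
    """Build an ESearch URL that returns one page of UIDs as JSON."""
    params = {"db": db, "term": term, "retstart": retstart,
              "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urllib.parse.urlencode(params)}"


def parse_esearch(payload):
    """Return (total hit count, UIDs on this page) from an ESearch JSON reply."""
    result = json.loads(payload)["esearchresult"]
    return int(result["count"]), result["idlist"]


def efetch_url(uids, db="protein", rettype="fasta"):
    """Build an EFetch URL that returns the given UIDs as FASTA text."""
    params = {"db": db, "id": ",".join(uids),
              "rettype": rettype, "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urllib.parse.urlencode(params)}"


def download(term, out_path, db="protein", rettype="fasta", page=200, delay=0.4):
    """Page through all hits for `term` and write their FASTA records to a file."""
    start, total = 0, None
    with open(out_path, "w") as out:
        while total is None or start < total:
            # Step 1: get the next page of UIDs for the search term.
            with urllib.request.urlopen(esearch_url(term, db, start, page)) as r:
                total, uids = parse_esearch(r.read().decode())
            if not uids:
                break
            time.sleep(delay)  # stay under NCBI's 3 requests/second limit
            # Step 2: fetch the sequences for those UIDs.
            with urllib.request.urlopen(efetch_url(uids, db, rettype)) as r:
                out.write(r.read().decode())
            start += len(uids)
            time.sleep(delay)
```

Calling `download("dUTPase[Protein Name]", "dutpase_proteins.fasta")` would then walk through every hit in the protein database. Searching `protein` (or `gene`) rather than all of GenBank is what avoids the whole-genome records mentioned in the comment above. The hit list will still need filtering, since a keyword search can return partial or mislabeled entries.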
