search for the published sequences
8.5 years ago
Learner ▴ 280

I would like to know which approaches I can use to search for a specific published sequence. I know that I can search NCBI manually and that I can use the seqinr package in R. My questions:
- Is there any other way to search?
- What keywords would you use to find the proper database in NCBI?
- How would you recognise the proper accessions?

Let's say I want to find RNA-seq data (not mature RNA, just transcripts from DNA) for cancer (leukemia) in Homo sapiens.

Looking forward to your comments.

sequence • 1.7k views

Your searches are only going to be as good as the metadata that has been submitted to NCBI or annotated by NCBI. If the relevant terms are missing, your search is going to be much harder.

The main search engine on NCBI (formerly called Entrez, now GQuery) is still the best place to start. Try different combinations of keywords. NCBI also has a help document on searching.
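If you would rather script the keyword search than click through the website, NCBI's E-utilities expose the same Entrez search over HTTP. Here is a minimal sketch that only builds an ESearch URL and does not send any request. The helper name `build_esearch_url` is hypothetical, and the choice of `db="sra"` and the example keywords are assumptions taken from the question, not a recommendation.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(keywords, db="sra", retmax=20):
    """Build an ESearch URL for a list of keywords (hypothetical helper).

    db     -- Entrez database to search (assumed here: "sra")
    retmax -- maximum number of record IDs to return
    """
    # Keywords are joined with spaces, just as you would type them
    # into the search box on the website.
    term = " ".join(keywords)
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url(["leukemia", "RNA-seq", "Homo sapiens"])
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return a JSON document with the matching record IDs. Trying different keyword combinations then just means calling the helper with a different list.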

If you know you are looking for next-generation sequencing (NGS) data, then going to SRA/GEO makes sense.

For "gold standard" (for lack of a better word) data about reference genomes, RefSeq is the place to be.


@genomax2 thanks for your comment; however, this is very basic! I am more interested in a very specific search strategy (look at the example I gave in my question).


Basics still apply. Searching NCBI via GQuery with your example terms yields plenty of SRA/GEO datasets. You could also try the same search at ENA.

As to which accessions are "right", that you will have to decide. NCBI (especially GenBank) is an archival DB, and lots of inaccurate records continue to exist there.
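On recognising accessions: the prefix at least tells you which archive and record type you are looking at, even though it says nothing about whether the record is any good. A rough sketch of a prefix check follows. The pattern list is deliberately incomplete, the function name `classify_accession` is hypothetical, and the identifiers in the usage lines are only there to show the format.

```python
import re

# Prefix patterns for a few common accession families (not exhaustive).
# S/E/D as the first letter marks records submitted via NCBI, ENA or DDBJ.
ACCESSION_PATTERNS = [
    ("SRA run", re.compile(r"^[SED]RR\d+$")),
    ("SRA experiment", re.compile(r"^[SED]RX\d+$")),
    ("SRA study", re.compile(r"^[SED]RP\d+$")),
    ("GEO series", re.compile(r"^GSE\d+$")),
    ("GEO sample", re.compile(r"^GSM\d+$")),
    ("RefSeq mRNA", re.compile(r"^[NX]M_\d+(\.\d+)?$")),
    ("RefSeq non-coding RNA", re.compile(r"^[NX]R_\d+(\.\d+)?$")),
    ("RefSeq genomic", re.compile(r"^N[CG]_\d+(\.\d+)?$")),
]


def classify_accession(accession):
    """Return a label for the accession family, or "unknown"."""
    cleaned = accession.strip()
    for label, pattern in ACCESSION_PATTERNS:
        if pattern.match(cleaned):
            return label
    return "unknown"


for acc in ["SRR1234567", "GSE12345", "NM_000546.6", "foo"]:
    print(acc, "->", classify_accession(acc))
```

A label like "SRA run" or "GEO series" narrows down what kind of data sits behind the accession. Whether it is the "right" record still has to be judged from its metadata.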


Something funny I found: on their website they say you should use the boolean "AND" if you want to make the search specific, but it actually doesn't matter whether you use it or not; look at your example.


They know people don't read/follow directions, so they are probably doing a boolean search behind the scenes anyway :-)


It's actually pretty hard to tell from your example. If I wanted to search for cancer, leukaemia, Homo and RNA-seq, I would go to GEO, enter Leukaemia, then tick the filters for Homo and RNA-seq.
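Ticking filters on the website roughly corresponds to adding field-tagged terms to the query. As a sketch of that idea for an SRA search, the snippet below joins a free-text keyword with an explicit AND to an organism and a library strategy restriction. The helper `build_term` is hypothetical, and the `[Organism]` and `[Strategy]` tags are assumed to be accepted by the SRA database; other databases such as GEO DataSets use different field names, so check the advanced search builder of the database you query.

```python
def build_term(keyword, organism=None, strategy=None):
    """Combine a keyword with optional field-restricted filters (hypothetical helper)."""
    parts = [keyword]
    if organism:
        # Restrict matches to the organism field instead of any text field.
        parts.append(f'"{organism}"[Organism]')
    if strategy:
        # Library strategy as recorded in the SRA metadata (assumed field tag).
        parts.append(f'"{strategy}"[Strategy]')
    # Explicit AND makes the intended logic visible in the query string.
    return " AND ".join(parts)


term = build_term("leukemia", organism="Homo sapiens", strategy="rna seq")
print(term)
# leukemia AND "Homo sapiens"[Organism] AND "rna seq"[Strategy]
```

Field tags make the search much less dependent on where a word happens to appear in the submitted metadata, which is where plain keyword searches tend to go wrong.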
