How can I find genomes that have linked transcriptome data? (bacteria)
3.4 years ago
hugo.avila ▴ 530

Dear colleagues, for a project on ncRNA in bacteria I need to find genomes that have transcriptome data available in addition to the genome sequence. I have downloaded a set of genomes of interest and would like to know which of them have linked transcriptome data. From the NCBI documentation, it seems this should be possible with the E-utilities. However, I have not managed to build a query that tells me which (if any) of my genomes have linked RNA data. Is it possible to do this with the NCBI tools, or is there another tool or database that would let me do it?

Here is my attempt:

cat NCBI_genome_IDS.txt | \
    parallel -j 1 'esearch -db nucleotide -query {} | efetch -format docsum | xtract -pattern DocumentSummary -element BioSample' | \
    parallel -j 1 'esearch -db assembly -query {} | efetch -format docsum | xtract -pattern GB_BioProjects -element BioprojectAccn' | \
    parallel -j 1 'esearch -db bioproject -query {} | efetch -format docsum | xtract -pattern DocumentSummary -element Project_Target_Material'

This command chains the nucleotide > assembly > bioproject databases: it takes the BioSample of each nucleotide record, looks up the matching assembly to get its BioProject accession, and prints the target material of that project. My idea was that if a genome had linked RNA data, the output would include "transcriptome" in addition to "genome". But the result is always just "genome". I don't know whether that is because the genomes I'm looking at have no linked RNA data, because my query is wrong, or because what I'm trying to do isn't possible.
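One alternative that may be worth checking, sketched here under assumptions rather than as a confirmed solution: instead of reading the project-level target material, look at the SRA runs attached to each BioSample and keep those whose library metadata says they are transcriptomic. The sketch assumes the SRA runinfo CSV (as produced by `efetch -format runinfo`) contains the columns `Run`, `LibraryStrategy`, `LibrarySource` and `BioSample`. The function name `filter_rna_runs` and the file `biosample_ids.txt` are hypothetical names used only for illustration.

```shell
# Filter an SRA runinfo CSV (read from stdin) down to transcriptomic runs.
# Column positions are looked up from the header row, so the function does
# not depend on the exact column order of the CSV.
# Assumption: fields contain no embedded commas.
filter_rna_runs() {
    awk -F',' '
        NR == 1 {
            # Map each header name to its column index.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $(col["LibrarySource"]) ~ /^TRANSCRIPTOMIC/ || $(col["LibraryStrategy"]) == "RNA-Seq" {
            # Print: BioSample, run accession, library strategy (tab-separated).
            print $(col["BioSample"]) "\t" $(col["Run"]) "\t" $(col["LibraryStrategy"])
        }
    '
}

# Hypothetical usage (requires EDirect and network access), with one
# BioSample accession per line in biosample_ids.txt:
#
# while read -r biosample; do
#     esearch -db sra -query "$biosample" </dev/null \
#         | efetch -format runinfo \
#         | filter_rna_runs
# done < biosample_ids.txt
```

A BioSample that produces no output here simply has no SRA run annotated as transcriptomic under that same BioSample. RNA data deposited under a different BioSample of the same strain or project would not be found this way.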

Corrections to my query or suggestions of other methods would both be welcome answers.

bacteria esearch ncbi • 538 views