Download FASTA sequences for known viral reference genomes
3.1 years ago
adam • 0

I am trying to align the unmapped reads from a human whole-genome sequencing dataset to viral genomes. I came across NCBI's repository of reference genomes at the link below:

https://www.ncbi.nlm.nih.gov/genome/browse/#!/overview/

Is there a way I can download the FASTA sequences for each of these reference genomes? I only need DNA-based viruses that infect humans.

reference-genome virus • 1.5k views

Thank you for the feedback. I just went back and validated all the helpful answers to my previous questions.

3.1 years ago
GenoMax 147k

Take a look at this report file for viral genomes.

"I would only need DNA based viruses, and ones that infect humans"

You can filter out the entries you need from it, then download each genome sequence using Entrez Direct (EDirect):

$ efetch -db nuccore -id NC_030449.1 -format fasta
>NC_030449.1 Unidentified circular ssDNA virus, complete genome
AAGTTCTGAGCGTAGGGCTAGTAGATAACGCACGTAGCTCAGATGACCGACGAAGACGGTCCTCGAGTAA
GAAACATAGTATTCATCATCAACGCAATCGATGGGGACGACCTTGTGTCCCAATTGCGTTTGCTCGATTT
CCAACACCCCACGTGGAAGCACGTAAAGTACTGTATCTATCAGCGAGAGTGCGGGGGTGAGGAGAATCGA
ATCCACTTCCAGGGCTACATGGAGTTCCACATCCAGCAGACCTATAAGCAGATTCATGCCATGGAAGGGA
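
The filter-then-download step could be scripted roughly as below. This is only a sketch: it assumes the report is tab-separated with a header line, and the column names `Host` and `Accession` are assumptions, so check them against the header of the file you actually download. The function names `select_accessions` and `fetch_fasta` and the `FETCH_DELAY` variable are made up for illustration. Only the host is filtered here; restricting to DNA viruses would need a similar pattern on whichever column records the genome type.

```shell
#!/usr/bin/env bash
# Sketch only: column names below are assumptions about the report layout.

# Print the accession of every row whose host column mentions "human".
# Column positions are looked up by name from the header line.
select_accessions() {
    local report="$1" host_col="${2:-Host}" acc_col="${3:-Accession}"
    awk -F'\t' -v hc="$host_col" -v ac="$acc_col" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                name = $i
                sub(/^#/, "", name)          # header line may start with "#"
                if (name == hc) h = i
                if (name == ac) a = i
            }
            if (!h || !a) { print "column not found" > "/dev/stderr"; exit 1 }
            next
        }
        tolower($h) ~ /human/ { print $a }
    ' "$report"
}

# Read accessions from stdin and write one FASTA file per accession.
fetch_fasta() {
    local outdir="$1" acc
    mkdir -p "$outdir"
    while IFS= read -r acc; do
        [ -n "$acc" ] || continue
        # </dev/null keeps efetch from consuming the loop input on stdin
        efetch -db nuccore -id "$acc" -format fasta < /dev/null > "$outdir/$acc.fasta"
        sleep "${FETCH_DELAY:-1}"            # pause between requests
    done
}

# Example (hypothetical file and directory names):
#   select_accessions viral_report.tsv | fetch_fasta viral_fasta
```

Looking columns up by name rather than by position keeps the sketch usable if the report's column order differs from what is assumed here.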

Thank you!
