I am trying to download the Drosophila database to annotate variants with ANNOVAR. I have followed the instructions found here: http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ using the following commands:
annotate_variation.pl -downdb -buildver dm6 gene drosdb
annotate_variation.pl --buildver dm6 --downdb seq drosdb/dm6_seq
retrieve_seq_from_fasta.pl drosdb/dm6_refGene.txt -seqdir drosdb/dm6_seq -format refGene -outfile drosdb/dm6_refGeneMrna.fa
However, the second command fails to download some of the annotation databases. Here is the output:
NOTICE: Web-based checking to see whether ANNOVAR new version is available ... Done
NOTICE: Downloading annotation database http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/chromFa.zip ... Failed
NOTICE: Downloading annotation database http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/chromFa.tar.gz ... Failed
NOTICE: Downloading annotation database http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/dm6.chromFa.tar.gz ... Failed
NOTICE: Downloading annotation database http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/dm6.fa.gz ... OK
NOTICE: Uncompressing downloaded files
NOTICE: Finished downloading annotation files for dm6 build version, with files saved at the 'drosdb/dm6_seq' directory
WARNING: Some files cannot be downloaded, including http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/chromFa.tar.gz, http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/chromFa.zip, http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/dm6.chromFa.tar.gz
Consequently, the third command gives me a warning like this for each position:
WARNING: Cannot identify sequence for NR_124579 (starting from chr2R:16812244)
This results in a FASTA file with 0 genomic regions.
I tried going directly to the FTP site to download the files manually, but the paths there are different, so I don't know how to address this problem. Many thanks.
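One way to narrow this down is to check whether the genome file that did download actually contains the chromosome named in the warning (chr2R). The following is only a minimal sketch: the function name fasta_sequence_names is made up for illustration, and the path drosdb/dm6_seq/dm6.fa is an assumption about where the uncompressed file ends up, so adjust it to your setup.

```python
import gzip
from pathlib import Path


def fasta_sequence_names(path):
    """Return the sequence names (first word of each '>' header) in a FASTA file."""
    path = Path(path)
    # Read the file transparently whether or not it is still gzip-compressed.
    opener = gzip.open if path.suffix == ".gz" else open
    names = []
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                names.append(fields[0] if fields else "")
    return names


if __name__ == "__main__":
    # Assumed location of the uncompressed genome; change as needed.
    genome = Path("drosdb/dm6_seq/dm6.fa")
    if genome.exists():
        names = fasta_sequence_names(genome)
        print(len(names), "sequences; chr2R present:", "chr2R" in names)
    else:
        print("File not found:", genome)
```

The same function can be pointed at drosdb/dm6_refGeneMrna.fa to count how many records the third command actually wrote.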
An update to this for anyone else with this problem:
We contacted Kai Wang (the developer of ANNOVAR) about this, and he stated that the only required file is http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/dm6.fa.gz, i.e., the only one that downloads successfully. He also said that the subsequent warning message can be ignored.
Kevin