Hi,
I am trying to download the sequence reads behind the reference genome for more than 100 animal species. Using EBI's REST URLs I can get FTP links from taxon names, but this returns all reads for the taxon. Is it possible to use an assembly accession to get only the raw reads that were used in that specific assembly?
e.g. Gorilla gorilla
This returns many FTP links, but I only want the raw reads for a specific assembly, e.g. GCA_000167515.2
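For context, the taxon-based lookup described above could look roughly like the sketch below. It assumes the ENA Portal API search endpoint with the `read_run` result type and the `fastq_ftp` field, and it uses an NCBI taxon ID (9593 for Gorilla gorilla, which is my assumption) rather than a taxon name. The function names are made up for illustration. It shows the problem: every run for the taxon comes back, with nothing tying a run to one assembly.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the ENA Portal API search service.
ENA_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def build_read_run_url(taxon_id):
    """Build a search URL for all read runs of one taxon (hypothetical helper)."""
    params = {
        "result": "read_run",
        "query": f"tax_eq({taxon_id})",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    }
    return f"{ENA_SEARCH}?{urlencode(params)}"


def parse_fastq_links(tsv_text):
    """Map each run accession to its list of FASTQ paths.

    Assumes the fastq_ftp column holds ';'-separated paths and may be empty.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    links = {}
    for row in reader:
        paths = [p for p in (row.get("fastq_ftp") or "").split(";") if p]
        links[row["run_accession"]] = paths
    return links


def fetch_fastq_links(taxon_id):
    """Download and parse the run table for a taxon (needs network access)."""
    with urlopen(build_read_run_url(taxon_id)) as response:
        return parse_fastq_links(response.read().decode("utf-8"))
```

Calling `fetch_fastq_links(9593)` would then return one entry per run for the whole taxon, which is exactly the "too many links" situation.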
Edit: The example assembly identifier I gave is from NCBI. I retrieve the raw fastq files from the EBI database, which does not recognise this identifier directly.
Thanks for your help,
R
The python tag was to indicate that a solution in python is fine.
The example assembly identifier I gave was from NCBI. I retrieve the raw fastq files from the EBI database, which does not recognise this.
You have an ID mapping problem on your hands then. Hmmm.