download files from NCBI SRA
14 months ago

Hello all,

I want to download data (BAM) from NCBI. There is an AWS link available, along with the note "Use Cloud Data Delivery". How can I download this data to my local system?

Thanks, Anitha

NCBI AWS SRA • 1.8k views

I tend to use ENA, as you just need to enter the accession number into the search bar and it provides FTP links for all the files. For example, see the spreadsheet for a random project with accession PRJNA493853. If you download the TSV, it provides the full FTP links. In most cases, you should be able to find NCBI datasets on ENA.
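
To make the TSV step concrete, here is a minimal sketch of fetching the same kind of report programmatically and pulling out the FASTQ links. It assumes the ENA Portal API `filereport` endpoint with the `read_run` result and the `run_accession` and `fastq_ftp` fields; the helper names and the sample row (including the run accession SRR0000001) are made up for illustration and are not taken from the project above.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint: ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build the file-report URL for a study or run accession."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def fastq_links(tsv_text):
    """Map each run accession to its list of FASTQ FTP links."""
    links = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Multiple files (e.g. paired reads) are separated by ';' and
        # listed without a scheme, so ftp:// is prepended here.
        files = [f for f in row["fastq_ftp"].split(";") if f]
        links[row["run_accession"]] = ["ftp://" + f for f in files]
    return links


# Made-up sample row in the same layout as the downloaded TSV.
sample = (
    "run_accession\tfastq_ftp\n"
    "SRR0000001\t"
    "ftp.sra.ebi.ac.uk/vol1/fastq/SRR000/001/SRR0000001/SRR0000001_1.fastq.gz;"
    "ftp.sra.ebi.ac.uk/vol1/fastq/SRR000/001/SRR0000001/SRR0000001_2.fastq.gz\n"
)

print(filereport_url("PRJNA493853"))
for run, urls in fastq_links(sample).items():
    print(run, urls)
```

The text returned by the URL can be passed to `fastq_links` in place of `sample`, and the resulting links can be handed to any FTP-capable downloader.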


Data on SRA is most often retrieved in FASTQ format, not BAM, using e.g. sratools fastq-dump or other related tools. The "original file that was uploaded" may be a BAM file in some cases, and the site asks you to use Cloud Data Delivery to access those (which requires you to set up an AWS S3 bucket, etc.), but you can use fastq-dump without Cloud Data Delivery. An example of Cloud Data Delivery can be seen here: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR26529247&display=data-access which lists s3://sra-pub-src-12/SRR26529247/WBS_HiFi_3.bam.1 (Note: this is just my perspective that it is more common to get the FASTQ. If other people commonly get the BAM files, let me know, but doing your own alignment from the FASTQ seems more common.)
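
As a rough sketch of the fastq-dump route, the snippet below only assembles the command line for the run mentioned above and prints it. The helper name and the `fastq` output directory are placeholders I made up; `--split-files`, `--gzip` and `--outdir` are standard fastq-dump options. Nothing is downloaded unless you uncomment the last line, which assumes the SRA Toolkit is installed and on your PATH.

```python
import shlex
import subprocess


def fastq_dump_command(run, outdir="fastq"):
    """Return the fastq-dump argument list for one SRA run accession."""
    return [
        "fastq-dump",
        "--split-files",     # write separate files per read of a spot
        "--gzip",            # compress the output files
        "--outdir", outdir,  # hypothetical output directory
        run,
    ]


cmd = fastq_dump_command("SRR26529247")
print(shlex.join(cmd))

# Uncomment to actually run it (this can be a large download):
# subprocess.run(cmd, check=True)
```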

14 months ago
GenoMax 148k

If you specifically want to use the AWS links, you will need to do so with the AWS CLI tool (or the AWS console, if you need a GUI program). Check help page here. Be aware that the "egress" of data from the cloud (required to download) may be restricted or not free.
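
For illustration, here is a small sketch of what the AWS CLI route could look like for the S3 link quoted earlier in the thread. It splits the URI into bucket and key and assembles an `aws s3 cp` command. The helper names are made up, and the `anonymous` switch (which adds the CLI's `--no-sign-request` option) is only an assumption: whether a given file can be fetched without credentials, or at all without Cloud Data Delivery, depends on the bucket and is not something this thread confirms.

```python
import shlex
from urllib.parse import urlsplit


def split_s3_uri(uri):
    """Split s3://bucket/key into (bucket, key)."""
    parts = urlsplit(uri)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parts.netloc, parts.path.lstrip("/")


def aws_cp_command(uri, dest=".", anonymous=False):
    """Return the 'aws s3 cp' argument list for one object."""
    cmd = ["aws", "s3", "cp", uri, dest]
    if anonymous:
        # Assumption: only works if the bucket allows unsigned requests.
        cmd.append("--no-sign-request")
    return cmd


uri = "s3://sra-pub-src-12/SRR26529247/WBS_HiFi_3.bam.1"
bucket, key = split_s3_uri(uri)
print(bucket, key)
print(shlex.join(aws_cp_command(uri, dest="WBS_HiFi_3.bam")))
```

The printed command can be pasted into a shell where the AWS CLI is configured; check the egress terms first, as noted above.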

The following does not apply in your case (since you are asking about BAM files, which are generally found in the Data Access tab of SRA records and may be the original data format submitted, e.g. for 10x). If you were interested in getting FASTQ sequence files, then in most instances you would find it easier to use the HTTP links from the Data Access tab of the SRA record (or the FTP links from ENA) to get the data. Use a tool like sra-explorer to get ENA links. See: sra-explorer : find SRA and FastQ download URLs in a couple of clicks

14 months ago

Alternatively, you can use the fetchngs pipeline.


OP is specifically asking for

I wanted to download data (BAM) from NCBI

This pipeline is not designed for that purpose, correct?


True, it escaped me that the OP was asking for BAM files. Might be worth creating an issue for that, though, if that feature is needed more frequently.


I don't know if the location of the BAMs is accessible via a command-line query. The only place where I see the links is in the Data Access tab on the web. Like here.
