Question

fastq dump usage of microRNA-seq data

2


6.1 years ago

K.patel5 ▴ 150

Hello biostars,

I am using sratoolkit for the first time and want to know whether the command I am using is appropriate for my data. I am using the same command for several different types of sequencing experiments and am not sure this is optimal. I'll lay out the sequencing types and the command below. I think --split-3 is irrelevant for the single-end data, but I do not know of any negative consequences of using it on single-end runs. I also think there may be a parameter I am missing when applying fastq-dump to miRNA-seq data.

Example command:

./fastq-dump --outdir /fastq --split-3 -I -F -B --skip-technical SRR7663647
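For reference, the same options can be spelled out with their long names, which makes each flag easier to check against the fastq-dump help text. This is only a sketch: it assumes fastq-dump is on the PATH (rather than called as ./fastq-dump), and it prints the assembled command instead of running it.

```shell
#!/usr/bin/env bash
# Sketch: the command above rewritten with long option names.
# Assumption: fastq-dump (sratoolkit) is on PATH; nothing is executed here.

acc="SRR7663647"   # run accession from the example command

cmd=(
  fastq-dump
  --outdir /fastq      # -O: directory the FASTQ files are written to
  --split-3            # paired reads go to _1/_2 files, unpaired reads to a single file
  --readids            # -I: append the read id after the spot id on the defline
  --origfmt            # -F: defline contains only the original sequence name
  --dumpbase           # -B: write sequence in base space
  --skip-technical     # dump only biological reads
  "${acc}"
)

# Print the command for inspection; run it with: "${cmd[@]}"
printf '%s ' "${cmd[@]}"
echo
```

Keeping the command in an array means the accession can be swapped in a loop without re-quoting the options.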

This code is being used on the following studies:

Study | Assay Type | Library Layout | Instrument
SRP052803 | RNA-seq | PAIRED | Illumina HiSeq 2000
SRP156883 | miRNA-seq | SINGLE | NextSeq 500
SRP156882 | RNA-seq | SINGLE | NextSeq 500
SRP047031 | miRNA-seq | SINGLE | Illumina HiSeq 2000

Any help or advice will be very appreciated.

RNA-Seq microRNA fastq • 2.4k views

ADD COMMENT • link 6.1 years ago by K.patel5 ▴ 150

3


I recommend Phil Ewels' sra-explorer tool (https://ewels.github.io/sra-explorer/#). Search using your study accessions. It uses a shopping-cart model: add the runs you want to the cart, then get direct URLs for downloading the FASTQ files from EBI-ENA in one click. You can also get a bash script that downloads all the files.

Use those links with this guide: Fast download of FASTQ files from the European Nucleotide Archive (ENA)

ADD REPLY • link 6.1 years ago by GenoMax 152k

0


Hi @genomax, thanks for pointing out this tool. I am running the files I intend to download through a loop in R and would like to understand the fastq-dump options in more depth, so I will not be using sra-explorer for now. I may do in the future, though, as it looks very quick and easy to use.

ADD REPLY • link 6.1 years ago by K.patel5 ▴ 150

0


The link posted by @Santosh Anand gives a nice overview of the options for the fastq-dump command.

ADD REPLY • link 6.1 years ago by GenoMax 152k

2


There is no problem with using --split-3: if the SRA entry doesn't contain paired-end reads, the parameter is ignored. I recommend first caching the SRA files with prefetch and then running fastq-dump.
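The prefetch-then-dump order could look roughly like the sketch below. It assumes prefetch and fastq-dump are both on the PATH; the output directory name and the plan_run helper are hypothetical, and the script only prints the commands so they can be reviewed before anything is downloaded.

```shell
#!/usr/bin/env bash
# Sketch: cache each run with prefetch, then convert it with fastq-dump.
# Assumptions: both tools are on PATH; "fastq" is a hypothetical output directory.

outdir="fastq"
accessions=(SRR7663647)   # run accession from the question; add more as needed

# plan_run is a hypothetical helper: it prints the two commands for one accession.
plan_run() {
  local acc="$1"
  echo "prefetch ${acc}"
  echo "fastq-dump --outdir ${outdir} --split-3 --skip-technical ${acc}"
}

for acc in "${accessions[@]}"; do
  plan_run "${acc}"
done
```

Piping the script's output to `bash` would execute the printed commands in order, one accession at a time.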

ADD REPLY • link 6.1 years ago by Arup Ghosh 3.3k

0


Hi @arup, I had a quick Google to see what prefetch does. Am I right to think it is a way to speed up the download and avoid downloading the same SRR file multiple times? I am running my code in a loop in R, so I don't think I will encounter this issue.

ADD REPLY • link 6.1 years ago by K.patel5 ▴ 150

0


fastq-dump is used to download the data. Can you provide more details about what you mean by

a parameter I am missing when applying fastq-dump to miRNA-seq data

ADD REPLY • link 6.1 years ago by Sej Modha 5.3k

0


Since miRNAs are so much smaller than mRNAs, I was thinking there may be an option to target smaller reads.