SRAtoolkit --split-files output
2 hours ago
tony_88888 • 0

Hi,

I've been using sratoolkit for a while now but still get confused by the output at times. For example, I am trying to download the accession SRR12386358. This is paired-end data, and judging by the 'data access' tab, the data appear to have been deposited correctly, with FASTQ files for read 1 and read 2.

Link to accession: https://trace.ncbi.nlm.nih.gov/Traces/index.html?view=run_browser&page_size=10&acc=SRR12386358&display=data-access

When I use fastq-dump --split-files --readids I get 3 output files. Please see the headers in the attached picture. Could someone please explain what each of these files is? Is there a way to get these files in a format that I can use for cellranger? I have also tried --split-3, but that produces a single file, and fasterq-dump gives similar output.

[Attached image: headers of the three output FASTQ files]

sratoolkit sra fastq accession • 42 views
2 hours ago
GenoMax 146k

The _1 file is the Illumina barcode (index read) for the sample, which is not used by cellranger.
The _2 file is the cell barcode + UMI.
The _3 file is the RNA read.
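
To make the mapping above concrete, here is a minimal Python sketch that plans (and optionally performs) a rename of the three split files to bcl2fastq-style names, which is the naming pattern cellranger normally looks for. The helper names `cellranger_name` and `rename_split_files`, the `S1` sample number, and lane `1` are assumptions made for illustration; the _1/_2/_3 assignment is taken from the explanation above and should be checked against the read lengths in your own files before use.

```python
from pathlib import Path

# Assumed mapping, taken from the explanation above:
# _1 = sample index read, _2 = cell barcode + UMI, _3 = RNA read.
READ_TYPES = {"1": "I1", "2": "R1", "3": "R2"}


def cellranger_name(accession: str, suffix: str, lane: int = 1) -> str:
    """Build a bcl2fastq-style file name for one split file (hypothetical helper)."""
    return f"{accession}_S1_L{lane:03d}_{READ_TYPES[suffix]}_001.fastq"


def rename_split_files(directory: str, accession: str, dry_run: bool = True) -> dict:
    """Map <accession>_<n>.fastq files to bcl2fastq-style names.

    With dry_run=True (the default) nothing is renamed; the planned
    mapping is only returned so it can be inspected first.
    """
    plan = {}
    for suffix in READ_TYPES:
        src = Path(directory) / f"{accession}_{suffix}.fastq"
        if src.exists():
            dst = src.with_name(cellranger_name(accession, suffix))
            plan[src.name] = dst.name
            if not dry_run:
                src.rename(dst)
    return plan


if __name__ == "__main__":
    # Print the planned renames for the current directory without touching files.
    for old, new in rename_split_files(".", "SRR12386358").items():
        print(f"{old} -> {new}")
```

Running it as a script only prints the plan; pass `dry_run=False` once the mapping looks right.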

If you had used -F (--origfmt) with regular fastq-dump, the @SRR part would have been dropped and you would have ended up with normal Illumina FASTQ headers.
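
As a rough illustration of that suggestion, the sketch below assembles a fastq-dump command line that combines --split-files with --origfmt (the long form of -F). The function name `fastq_dump_command` and the default output directory are assumptions for the example; the script only builds and prints the command, it does not run it.

```python
import shlex


def fastq_dump_command(accession: str, outdir: str = ".") -> list:
    """Return the argument list for a split, original-header fastq-dump run."""
    return [
        "fastq-dump",
        "--split-files",   # one output file per read in the spot
        "--origfmt",       # same as -F: keep only the original sequence name
        "--outdir", outdir,
        accession,
    ]


if __name__ == "__main__":
    cmd = fastq_dump_command("SRR12386358")
    # Show the command as it would be typed in a shell.
    print(shlex.join(cmd))
```

To actually execute it, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where sratoolkit is installed.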

1 hour ago
tony_88888 • 0

Thank you very much for the response.

Do you have any idea why I got that output and not the expected read 1 and read 2? Can these files still be used as they are in cellranger, or do I need to download them again with sratoolkit?
