Question

NCBI fastq-dump with multiple lanes and cellranger count

0

3.1 years ago

jys01012 • 0

I'm trying to obtain FASTQ (or fastq.gz) files to use as input for cellranger count.

I found 7 SRA entries.

For example, in https://www.ncbi.nlm.nih.gov/sra/SRX12574893[accn], the original FASTQ files appear to span multiple lanes and to be paired-end (16 files):

patient_A-1_S1_L003_R1_001.fastq.gz.1
patient_A-1_S1_L003_R2_001.fastq.gz.1
patient_A-2_S1_L004_R1_001.fastq.gz.1
patient_A-2_S1_L004_R2_001.fastq.gz.1

...

But after downloading the SRA file and splitting it into paired-end files with fastq-dump --split-files, I got only two files:

SRX12574893_1.fastq
SRX12574893_2.fastq

Is there a way to split the data correctly, back into the original per-lane files?

2. Running cellranger on those multiple files

2-1) If I use SRX12574893_1.fastq, SRX12574893_2.fastq, ...., renamed like this:
- SRX12574893_1.fastq -> SAMPLE_L001_R1_001.fastq
- SRX12574893_2.fastq -> SAMPLE_L001_R2_001.fastq
- SRX12574894_1.fastq -> SAMPLE_L002_R1_001.fastq
- SRX12574894_2.fastq -> SAMPLE_L002_R2_001.fastq
- SRX12574895_1.fastq -> SAMPLE_L003_R1_001.fastq
- SRX12574895_2.fastq -> SAMPLE_L003_R2_001.fastq

Is this the right way? ...

2-2) If there is a way to split the data into files like these:
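The renaming described in 2-1 can be sketched in Python as below. This is only an illustration under stated assumptions: the helper name `cellranger_names` and the sample name `SAMPLE` are made up, each accession is given its own lane number in sorted order, and `_1`/`_2` are assumed to correspond to R1/R2 (this can shift if the submitter also uploaded index reads). The sketch uses the bcl2fastq-style pattern `[Sample]_S1_L00[Lane]_[Read]_001.fastq`, so it includes the `_S1_` field that the names listed in 2-1 leave out.

```python
import re

# fastq-dump --split-files output is assumed to look like <accession>_<1|2>.fastq
DUMP_NAME = re.compile(r"^(?P<acc>[A-Za-z0-9]+)_(?P<read>[12])\.fastq(?P<gz>\.gz)?$")


def cellranger_names(dump_files, sample="SAMPLE"):
    """Map fastq-dump file names to bcl2fastq-style names.

    Each accession gets its own lane number (L001, L002, ...), assigned in
    sorted accession order. `sample` is a placeholder sample name.
    """
    parsed = []
    for name in dump_files:
        m = DUMP_NAME.match(name)
        if m is None:
            raise ValueError(f"unexpected file name: {name}")
        parsed.append((name, m))

    # One lane number per accession, in sorted order.
    accessions = sorted({m["acc"] for _, m in parsed})
    lane_of = {acc: i for i, acc in enumerate(accessions, start=1)}

    mapping = {}
    for name, m in parsed:
        lane = lane_of[m["acc"]]
        gz = m["gz"] or ""
        mapping[name] = f"{sample}_S1_L{lane:03d}_R{m['read']}_001.fastq{gz}"
    return mapping
```

The returned dictionary only describes the renaming; the files themselves could then be renamed or symlinked with `os.rename` or `os.symlink` once the mapping looks right.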
- patient_A-1_S1_L003_R1_001.fastq.gz
- patient_A-1_S1_L003_R2_001.fastq.gz
- patient_A-2_S1_L004_R1_001.fastq.gz
- patient_A-2_S1_L004_R2_001.fastq.gz ....
- patient_B-1_S1_L003_R1_001.fastq.gz
- patient_B-1_S1_L003_R2_001.fastq.gz
...
- patient_C-2_S1_L004_R1_001.fastq.gz
- patient_C-2_S1_L004_R2_001.fastq.gz
...

How should I rename those files?

fastq-dump ncbi cellranger • 1.4k views

ADD COMMENT • link updated 3.1 years ago by swbarnes2 14k • written 3.1 years ago by jys01012 • 0

0

Unfortunately, NCBI's handling of 10x data submissions is something of a wild west: submitters are allowed to do what they want. This appears to be 10x scRNA-seq data, which means read 1 should be only 26 or 28 bp long. Under the Data Access tab, some submitters provide the original cellranger BAM files, which can be used to reconstitute the original data. That does not seem to be the case here.
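One way to check the read 1 length after dumping is to tabulate sequence lengths over the first few thousand records of each file. The following is a minimal sketch, assuming standard four-line FASTQ records (no wrapped sequences); the function name `read_length_counts` and the `max_reads` default are illustrative choices, not part of any tool.

```python
import gzip
from collections import Counter
from itertools import islice


def read_length_counts(path, max_reads=10000):
    """Count sequence lengths in the first `max_reads` records of a FASTQ file."""
    # Handle both plain and gzip-compressed files.
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        # Each FASTQ record is 4 lines; the sequence is the 2nd line of each.
        for seq in islice(handle, 1, max_reads * 4, 4):
            counts[len(seq.rstrip("\n"))] += 1
    return counts
```

If the file expected to be read 1 shows lengths other than 26 or 28, the dumped files may not be in the layout cellranger expects, for example because the submitter trimmed reads or uploaded them in a different order.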

ADD REPLY • link 3.1 years ago by GenoMax 147k

0

If all the different lane data has been concatenated, that's not a problem at all.
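As a rough illustration of why concatenated lanes are fine: cellranger count is pointed at a directory and a sample name rather than at individual lane files, so a single R1/R2 pair per run is enough. The sketch below only assembles the command line; `cellranger_count_cmd` is a hypothetical helper, and the paths and names in the usage note are placeholders. `--id`, `--transcriptome`, `--fastqs` and `--sample` are standard cellranger count options, but newer releases may require additional ones, so check `cellranger count --help` for the installed version.

```python
import shlex


def cellranger_count_cmd(run_id, transcriptome, fastq_dir, sample):
    """Build (but do not run) a cellranger count command as an argument list."""
    return [
        "cellranger",
        "count",
        f"--id={run_id}",                   # name of the output directory
        f"--transcriptome={transcriptome}",  # path to the reference package
        f"--fastqs={fastq_dir}",             # directory holding the renamed FASTQ files
        f"--sample={sample}",                # sample prefix used in the file names
    ]


def as_shell_line(cmd):
    """Render the argument list as a single, safely quoted shell line."""
    return " ".join(shlex.quote(part) for part in cmd)
```

For example, `as_shell_line(cellranger_count_cmd("patient_A", "/path/to/refdata", "fastqs/", "SAMPLE"))` gives a line that can be pasted into a terminal or job script; the list itself can also be passed to `subprocess.run`.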

ADD REPLY • link 3.1 years ago by swbarnes2 14k