Question

Unable to decide R1 and R2 from SRA data for Cell Ranger 10X count

0

2.0 years ago

Srinka ▴ 20

The SRA entry for this SRR ID shows 4 reads.

When we download the run with fastq-dump --split-files, we get 4 files: SRR*_1.fq, SRR*_2.fq, SRR*_3.fq and SRR*_4.fq. Which of these files should be used as R1 and R2 for the cellranger count command?

SRA CellRanger • 2.2k views

ADD COMMENT • link updated 2.0 years ago by Ming Tommy Tang ★ 4.5k • written 2.0 years ago by Srinka ▴ 20

score 3 · Answer 1 · 2022-12-28

3

2.0 years ago

GenoMax 148k

This appears to be 10x scATACseq data.

From 10x:

For the Single Cell ATAC chemistry, the barcode is sequenced as part of the i5 index read. Both mkfastq and bcl2fastq conventionally associate R2 with the i5 index read, and R3 with read2. Thus read 1, barcode, read 2, sample index are associated with R1, R2, R3, I1 respectively. This is reflected in the output files shown in the output examples in this guide.

Read 1 --> R1 file
10x ATAC barcode --> R2 file
Read 2 --> R3 file
Illumina index --> I1 file
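
The mapping above does not say which of the SRR*_1 to SRR*_4 files holds which read, and that order can differ between submissions. One way to check is to look at read lengths. The sketch below is only a rough heuristic, not an official 10x or SRA tool: it assumes an 8 bp sample index (I1) and a 16 bp barcode (R2), and it assumes that the first remaining file in name order is Read 1 (R1) and the next is Read 2 (R3). The function names and the sample name are hypothetical. The output name pattern follows the bcl2fastq-style Sample_S1_L001_R1_001.fastq.gz convention.

```python
import gzip
from collections import Counter
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def typical_read_length(path, n_reads=1000):
    """Return the most common sequence length among the first n_reads records."""
    lengths = Counter()
    with open_fastq(path) as handle:
        # In a 4-line FASTQ record the sequence is the 2nd line of each record.
        for seq in islice(handle, 1, n_reads * 4, 4):
            lengths[len(seq.strip())] += 1
    return lengths.most_common(1)[0][0] if lengths else 0


def guess_scatac_roles(length_by_file, index_len=8, barcode_len=16):
    """Guess I1/R1/R2/R3 for each file from its typical read length.

    ASSUMPTIONS: 8 bp sample index, 16 bp barcode, and the two remaining
    files are Read 1 and Read 2 in file-name order. Verify before use.
    """
    roles = {}
    genomic = []
    for name in sorted(length_by_file):
        length = length_by_file[name]
        if length == index_len and "I1" not in roles.values():
            roles[name] = "I1"
        elif length == barcode_len and "R2" not in roles.values():
            roles[name] = "R2"
        else:
            genomic.append(name)
    # Any files beyond the first two genomic candidates are left unassigned.
    for name, role in zip(genomic, ("R1", "R3")):
        roles[name] = role
    return roles


def cellranger_name(sample, role, lane=1, ext="fastq.gz"):
    """Build a bcl2fastq-style file name; the file itself must match ext."""
    return f"{sample}_S1_L{lane:03d}_{role}_001.{ext}"
```

For example, lengths of 8, 50, 16 and 50 for files _1 to _4 would be guessed as I1, R1, R2 and R3. If the run really is scATAC, the renamed files would go to cellranger-atac count rather than cellranger count.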

ADD COMMENT • link 2.0 years ago by GenoMax 148k

1

Side track: the 10X FASTQ conventions seem to be pretty complicated. I've run into the following so far:

I1,R1,R2 - scRNA
I1,I2,R1,R2 - Visium Spatial RNA
I1,R1,R2,R3 - scATAC

Do they have a page listing these conventions somewhere?
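
The three combinations listed above can be written down as a small lookup. This is only a sketch of the conventions mentioned in this thread, not an official 10x table; the function name and the file name pattern are assumptions, and other chemistries (for example dual-indexed libraries that also produce an I2 file) could match the same set of files.

```python
import re

# Only the combinations mentioned in this thread; not an exhaustive list.
CONVENTIONS = {
    frozenset({"I1", "R1", "R2"}): "scRNA",
    frozenset({"I1", "I2", "R1", "R2"}): "Visium Spatial RNA",
    frozenset({"I1", "R1", "R2", "R3"}): "scATAC",
}

# Assumes bcl2fastq-style names such as Sample_S1_L001_R1_001.fastq.gz
ROLE_PATTERN = re.compile(r"_([IR][1-3])_\d{3}\.fastq(?:\.gz)?$")


def assay_from_filenames(filenames):
    """Guess the assay from the set of I/R files present, else 'unknown'."""
    roles = set()
    for name in filenames:
        match = ROLE_PATTERN.search(name)
        if match:
            roles.add(match.group(1))
    return CONVENTIONS.get(frozenset(roles), "unknown")
```

Files that do not follow the assumed naming pattern are ignored, so a folder of raw SRR*_1.fq style files would come back as "unknown".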

ADD REPLY • link 2.0 years ago by Ram 44k

score 2 · Answer 2 · 2023-01-01

2

2.0 years ago

Ming Tommy Tang ★ 4.5k

Take a look at my blog post: https://divingintogeneticsandgenomics.rbind.io/post/understand-10x-scrnaseq-and-scatac-fastqs/

This R package can help you too: https://github.com/Nusob888/fasterqParseR

ADD COMMENT • link 2.0 years ago by Ming Tommy Tang ★ 4.5k

1

Your blog post has been super useful to me in understanding what goes in each of the [IR][1-3] files. The images make it easy to visualize the process. Thank you so much!