When we download with fastq-dump --split-files, 4 files are produced, named SRR*_1.fq, SRR*_2.fq, SRR*_3.fq and SRR*_4.fq. I would like suggestions on which of these files to use as R1 and R2 for the cellranger count command.
This appears to be 10x scATAC-seq data.
From the 10x documentation:
For the Single Cell ATAC chemistry, the barcode is sequenced as part of the i5 index read. Both mkfastq and bcl2fastq conventionally associate R2 with the i5 index read, and R3 with read2. Thus read 1, barcode, read 2, sample index are associated with R1, R2, R3, I1 respectively. This is reflected in the output files shown in the output examples in this guide.
Read 1 --> R1 file
10x ATAC barcode --> R2 file
Read 2 --> R3 file
Illumina index --> I1 file
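The SRR*_1 to SRR*_4 suffixes do not say which file plays which role, so one way to work it out is to look at read lengths. Below is a minimal Python sketch of that idea, not an official 10x or SRA tool. It assumes the usual 10x scATAC layout of a 16 bp barcode read and an 8 bp sample index read, and it guesses that the first of the two longer files (in name order) is Read 1. The function names median_read_length and assign_roles are made up for this illustration.

```python
import gzip
from itertools import islice
from statistics import median


def median_read_length(path, n_reads=1000):
    """Median sequence length over the first n_reads records of a FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line FASTQ record.
        seqs = islice(handle, 1, 4 * n_reads, 4)
        return int(median(len(s.strip()) for s in seqs))


def assign_roles(lengths, barcode_len=16, index_len=8):
    """Map file name -> R1/R2/R3/I1 from a dict of file name -> read length.

    Assumption: barcode reads are 16 bp and sample index reads are 8 bp.
    The two remaining files are taken as R1 and R3 in sorted name order,
    which is only a guess and should be checked against the SRA record.
    """
    roles = {}
    genomic = []
    for name in sorted(lengths):
        if lengths[name] == barcode_len:
            roles[name] = "R2"  # 10x ATAC barcode
        elif lengths[name] == index_len:
            roles[name] = "I1"  # Illumina sample index
        else:
            genomic.append(name)
    if len(genomic) != 2 or set(roles.values()) != {"R2", "I1"}:
        raise ValueError(f"Unexpected read-length pattern: {lengths}")
    roles[genomic[0]] = "R1"
    roles[genomic[1]] = "R3"
    return roles
```

In practice you would build the lengths dict by calling median_read_length on each of the four files, then rename each file to the bcl2fastq-style pattern <sample>_S1_L001_<role>_001.fastq.gz (after gzipping) so the 10x pipeline can find them. For scATAC data the pipeline to run is cellranger-atac count rather than cellranger count.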
Take a look at my blog post: https://divingintogeneticsandgenomics.rbind.io/post/understand-10x-scrnaseq-and-scatac-fastqs/
This R package can help you too: https://github.com/Nusob888/fasterqParseR
Side track: 10X FASTQ convention seems to be pretty complicated - I've run into the following conventions so far:
Do they have a page listing these conventions somewhere?