Question

Cell Ranger count pipeline: use files with same name as input

0

3.1 years ago

S. Batalha • 0

Hello, I have a question regarding the input of several fastq files into the 'CellRanger count' pipeline.

I performed scRNA-seq of different samples at a partner institute and the sequencing facility started by sequencing all the samples at a lower depth (to test the quality of the libraries) and only then performed a second sequencing at higher depth. For this reason, I ended up with two sets of files for each sample, which they assured me could be merged during data analysis (and in this manner the final sequencing depth would be the sum of that obtained in each sequencing event). The issue is that these 2 sets of files have the exact same name (e.g. 'sample12_S11_L002_R1_001.fastq.gz'), and I don't know if I can give them all as input for 'CellRanger count' or if the software will be confused by having duplicated file names. I am also not sure if I can just change the name of fastq.gz files to make them unique and solve this issue.

Did anyone ever run into this issue or do you have any idea of how the pipeline will deal with this?

Additionally, would it be more correct to run the 2 sets of files separately through 'CellRanger count' and then analyse them together in Seurat? I'm not very keen on using 'CellRanger aggr' because their normalization is not the same as performed by Seurat, and I would prefer to process (filter, normalize...) all the count matrices in Seurat.

Thank you!

ranger cell 10x cellranger scRNA-seq • 2.4k views

ADD COMMENT • link updated 3.1 years ago by GenoMax 147k • written 3.1 years ago by S. Batalha • 0

score 0 · Answer 1 · 2021-10-18

0

3.1 years ago

swbarnes2 14k

If the two sets of files are from the exact same 10x library, they shouldn't be integrated in any fancy way; the reads should just be pooled as input to a single cellranger count run, not combined afterwards. The simplest way is to cat the fastqs together and give the result a legitimate Illumina-format name. There might be a way to tell cellranger count to get files from multiple locations, but catting them together is probably simpler.
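For what it's worth, here is a minimal Python sketch of the "cat them together" route. It assumes both runs sit in separate directories with identical file names; the directory layout and the function names (`concat_fastq_gz`, `merge_runs`) are made up for illustration. It relies on the fact that gzip files can be appended byte-for-byte and still decompress as one stream. Processing every file in the same run order keeps R1 and R2 in sync.

```python
import shutil
from pathlib import Path


def concat_fastq_gz(parts, out_path):
    """Append gzip-compressed FASTQ files byte-for-byte into out_path."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return out_path


def merge_runs(run1_dir, run2_dir, merged_dir):
    """Merge same-named FASTQs from two runs (hypothetical directory layout).

    Run 1 always comes first, so R1/R2/I1 files keep the same read order.
    """
    run1_dir, run2_dir, merged_dir = (Path(d) for d in (run1_dir, run2_dir, merged_dir))
    merged = []
    for first in sorted(run1_dir.glob("*.fastq.gz")):
        second = run2_dir / first.name
        if not second.exists():
            raise FileNotFoundError(f"no matching file in second run: {first.name}")
        merged.append(concat_fastq_gz([first, second], merged_dir / first.name))
    return merged
```

Something like `merge_runs("run1", "run2", "merged")` would then leave one file per original name in `merged/`, which can be passed to cellranger count as usual. As far as I know, `--fastqs` also accepts a comma-separated list of directories, so check the 10x documentation before copying large files around.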

I guess the other way you could do it is to change the file names so it looks like they are the same library in different lanes. Cellranger count will naturally grab all the data together.
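A rough sketch of the renaming route, in Python. It assumes the files follow the `<sample>_S<n>_L<lane>_<read>_001.fastq.gz` pattern from the question; the helper names (`with_lane`, `lane_of`, `stage_as_lanes`) and the idea of shifting the second run's lane numbers past the highest lane of the first run are my own illustration, not anything cellranger prescribes.

```python
import re
import shutil
from pathlib import Path

# Splits an Illumina-style name around its three-digit lane number.
LANE_RE = re.compile(
    r"^(?P<prefix>.+_S\d+_L)(?P<lane>\d{3})(?P<suffix>_[RI][12]_001\.fastq\.gz)$"
)


def lane_of(name):
    """Return the lane number encoded in an Illumina-style FASTQ name."""
    m = LANE_RE.match(name)
    if m is None:
        raise ValueError(f"not an Illumina-style FASTQ name: {name}")
    return int(m["lane"])


def with_lane(name, lane):
    """Return the same file name with its lane number replaced."""
    m = LANE_RE.match(name)
    if m is None:
        raise ValueError(f"not an Illumina-style FASTQ name: {name}")
    return f"{m['prefix']}{lane:03d}{m['suffix']}"


def stage_as_lanes(run1_dir, run2_dir, out_dir):
    """Copy both runs into out_dir, shifting run 2 to unused lane numbers."""
    run1_dir, run2_dir, out_dir = (Path(d) for d in (run1_dir, run2_dir, out_dir))
    out_dir.mkdir(parents=True, exist_ok=True)
    run1 = sorted(run1_dir.glob("*.fastq.gz"))
    # Offset by the highest lane seen in run 1 so the names cannot collide.
    offset = max((lane_of(p.name) for p in run1), default=0)
    staged = [Path(shutil.copy2(src, out_dir / src.name)) for src in run1]
    for src in sorted(run2_dir.glob("*.fastq.gz")):
        new_name = with_lane(src.name, lane_of(src.name) + offset)
        staged.append(Path(shutil.copy2(src, out_dir / new_name)))
    return staged
```

With the example from the question, the second copy of 'sample12_S11_L002_R1_001.fastq.gz' would be staged as 'sample12_S11_L004_R1_001.fastq.gz'. Copying is used here for portability; symlinks would avoid duplicating large files.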

ADD COMMENT • link 3.1 years ago by swbarnes2 14k

0

File names should be in this format after combining them (tip: the S* numbers don't mean much, but make sure they are present in your final files): https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/fastq-input
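A quick way to sanity-check the final names before running the pipeline, sketched in Python. The regular expression reflects my reading of the naming scheme on the linked page (`[Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz`), and `parse_fastq_name` is a hypothetical helper, so treat it as an assumption rather than the official check.

```python
import re

# Assumed Illumina/bcl2fastq-style pattern: sample, S number, lane, read type.
ILLUMINA_RE = re.compile(
    r"^(?P<sample>.+)_S(?P<s_number>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>[RI][12])_001\.fastq\.gz$"
)


def parse_fastq_name(name):
    """Return the name's fields as a dict, or None if it does not match."""
    m = ILLUMINA_RE.match(name)
    return m.groupdict() if m else None
```

A name missing the S number, such as 'sample12_L002_R1_001.fastq.gz', comes back as `None`, which is the case the tip above warns about.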

ADD REPLY • link 3.1 years ago by GenoMax 147k