I'm new to CellRanger and am doing genome alignments on a set of .fastq files which I did not generate myself. The files are in a folder structure with 10 folders in total: each of the five samples L1-L5 (or SIGAA-SIGAE) has two distinct folders ending in _s1 and _s2. The only difference between the _s1 and _s2 folders seems to be the numbering of the lanes, which makes me wonder whether they are from the same library after all. So in the _s1 folder I have the file SIGAA10_S1_L001_I1_001.fastq and in the _s2 folder the file SIGAA10_S1_L002_I1_001.fastq, and so on.
Hence my questions are:
- Should one always run cellranger count on all fastq files from the same GEX well, or can this be corrected later using cellranger aggr? So far I have run cellranger count twice, once on the _s1 files and once on the _s2 files. If I run cellranger aggr, will it automatically take into account reads that are mapped to the same barcodes and correct for this (assuming the same barcodes arise in the two datasets)?
- Is there a simple way of checking whether the _s1 and _s2 folders indeed correspond to different lanes from the same library?
1) If the samples are the same, you can specify multiple lane numbers to cellranger count using the --lanes option.
2) Ask the person who demultiplexed the data or prepared the libraries.

You don't have to do that. The default behavior of cellranger is to use all the lanes with the desired sample name.
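
For what it's worth, here is a minimal sketch of a single cellranger count run over both folders, assuming they really are two lanes of the same library. The --fastqs option accepts a comma-separated list of directories. The function name, the RUN dry-run switch, and every path in the usage comment are hypothetical placeholders, not something from the original data. As far as I know, cellranger aggr is meant for combining different GEM wells and keeps the barcodes of each input run apart, so it would not merge the two lanes after the fact.

```shell
# RUN=echo previews the command; set RUN= (empty) to actually execute it.
RUN=echo

# Arguments: run id, reference path, first FASTQ dir, second FASTQ dir, sample name.
count_both_lanes() {
    $RUN cellranger count \
        --id="$1" \
        --transcriptome="$2" \
        --fastqs="$3,$4" \
        --sample="$5"
}

# Example call (all values are placeholders):
# count_both_lanes SIGAA10_combined /path/to/reference /path/to/SIGAA_s1 /path/to/SIGAA_s2 SIGAA10
```

With no --lanes given, all lanes found for that sample name in both directories should be used, which matches the default behavior described above.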
The person who prepped the libraries might not have made the fastqs.
Hi, I suggested the 2nd point to make sure that the person who performed the experiment didn't have a specific reason for putting the lane-split reads in two different folders.
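
If nobody can be asked, the read headers give a rough check. This is only a sketch and assumes standard Illumina headers of the form @instrument:run:flowcell:lane:tile:x:y. The function name and the paths in the usage comment are hypothetical. If the two files report the same instrument, run number and flowcell ID and differ only in the lane field, they were sequenced in the same run, which is consistent with (but does not prove) one library spread over two lanes.

```shell
# Print instrument:run:flowcell:lane taken from the first read header
# of an uncompressed FASTQ file (Illumina header layout assumed).
fastq_run_info() {
    head -n 1 "$1" | sed 's/^@//' | cut -d ' ' -f 1 | cut -d ':' -f 1-4
}

# Example calls (paths are placeholders):
# fastq_run_info /path/to/SIGAA_s1/SIGAA10_S1_L001_I1_001.fastq
# fastq_run_info /path/to/SIGAA_s2/SIGAA10_S1_L002_I1_001.fastq
```

For gzipped files the same idea works if the file is piped through a decompressor first.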