I am trying to analyze single-cell RNA-seq data with cellranger, using publicly available datasets from published studies. I am downloading the SRA files via NCBI GEO; the dataset currently under consideration is GSE168388. It consists of 5 samples: Healthy donor 1, Healthy donor 2, Covid 1, Covid 2 and Covid 3. Each sample has 4 SRA runs, so there are 20 SRA files in total (5 samples × 4 runs each).
I understand from the cellranger website and tutorials that cellranger count needs to be run separately for every sample, and that the cellranger count outputs can then be merged using cellranger aggr. My problem is this: for a single sample, for example Healthy donor 1, there are 4 SRA files (one per run): SRR13870501, SRR13870500, SRR13870499 and SRR13870498, so I have 4 sets of FASTQ files for this sample. When I run cellranger count for Healthy donor 1, do I have to run it 4 times, once per run, or can I run it once, specifying all 4 sets of FASTQ files in the --sample argument (as shown below)?
So basically, can my cellranger count command look something like this?

    cellranger count --id=healthy_donor1 \
        --fastqs=/path/to/fastqs \
        --sample=SRR13870498,SRR13870499,SRR13870500,SRR13870501 \
        --transcriptome=path/to/refdata-gex-GRCh38-2020-A
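For context, my understanding is that the names given to --sample have to match the prefixes of the FASTQ file names, which cellranger expects in the bcl2fastq-style form [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz. Below is a minimal sketch of how the SRA-derived files might be renamed to fit that pattern. The tenx_name helper is hypothetical (not part of cellranger or the SRA toolkit), and it assumes each run was dumped as <run>_1.fastq.gz (R1, barcode + UMI) and <run>_2.fastq.gz (R2, cDNA); that mapping would need to be checked against the actual read lengths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an SRA-style FASTQ name to the bcl2fastq-style
# name cellranger looks for: <sample>_S1_L001_<read type>_001.fastq.gz
# Assumption: <run>_1.fastq.gz is R1 and <run>_2.fastq.gz is R2.
tenx_name() {
    local file base run mate rtype
    file=$1
    base=${file##*/}          # strip any leading directory
    run=${base%%_*}           # run accession, e.g. SRR13870498
    mate=${base#"${run}_"}    # e.g. 1.fastq.gz
    mate=${mate%%.*}          # e.g. 1
    case $mate in
        1) rtype=R1 ;;
        2) rtype=R2 ;;
        *) echo "unexpected mate '$mate' in $file" >&2; return 1 ;;
    esac
    # Lane is fixed to L001 here, on the assumption of one file pair per run.
    printf '%s_S1_L001_%s_001.fastq.gz\n' "$run" "$rtype"
}

# Example use (commented out so nothing is renamed by accident):
# for f in /path/to/fastqs/SRR*_[12].fastq.gz; do
#     mv -n "$f" "$(dirname "$f")/$(tenx_name "$f")"
# done
```

With files named this way, each SRR accession would act as a sample-name prefix, which is what the comma-separated --sample list in my command relies on.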
I would really appreciate any help on this matter. Cheers :)