Dear all, maybe someone can help me with this matter:
- I downloaded scRNA-seq data in bulk from here: https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA984257&o=experiment_s%3Aa%253Bacc_s%253Bacc_s%253Bacc_s%3Bacc_s%3Aa with the SRA Toolkit, using the following command (cat SRR_Acc_List.txt | xargs -I{} bin/fastq-dump -I --split-files {})
- As one can see, the scRNA data have 3 runs per sample
- Each of the runs has four reads per spot, so when I download the data with the SRA command I get four files per run; so far, all good.
Now I would like to start the nf-core scRNA pipeline (https://nf-co.re/scrnaseq/3.0.0/docs/usage/), and for this I need to write the sample sheet following a specific naming convention. The BioProject states that the library is "paired reads", and this is where the confusion starts. How do I write the sample sheet following the usual Illumina naming convention? My assumption is the following:
- S (the sample number) is always the same for a given sample
- L001 for run 1, L002 for run 2 and L003 for run 3 --> so the lane number changes for each run?
- The main confusion is with R1 and R2, since I have 3 runs per experiment and four files per run.

I would really appreciate some help with this. Many thanks
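
To make my assumption concrete, here is a minimal Python sketch of the renaming I have in mind. It only builds a mapping from the files written by fastq-dump --split-files (SRRxxxx_1.fastq, SRRxxxx_2.fastq, ...) to Illumina-style names of the form Sample_S1_L001_R1_001.fastq; it does not rename anything. The sample name, the SRR accessions and above all the read_map (which of the four files is I1, I2, R1 or R2) are placeholders and guesses on my side, not something I have verified for this dataset.

```python
def illumina_name(sample, sample_number, lane, read, ext="fastq"):
    """Build an Illumina-style file name, e.g. sampleA_S1_L001_R1_001.fastq."""
    return f"{sample}_S{sample_number}_L{lane:03d}_{read}_001.{ext}"


def rename_plan(sample, sample_number, runs, read_map, ext="fastq"):
    """Map fastq-dump output names to Illumina-style names.

    runs     : SRR accessions belonging to one sample; each run is
               treated as its own lane (L001, L002, ...) -- my assumption.
    read_map : fastq-dump file index -> read label (I1, I2, R1, R2);
               this mapping is a guess and must be checked per dataset.
    """
    plan = {}
    for lane, run in enumerate(runs, start=1):
        for index, read in read_map.items():
            source = f"{run}_{index}.fastq"
            plan[source] = illumina_name(sample, sample_number, lane, read, ext)
    return plan


if __name__ == "__main__":
    # Hypothetical accessions and a guessed read mapping, for illustration only.
    example_runs = ["SRR0000001", "SRR0000002", "SRR0000003"]
    guessed_read_map = {1: "I1", 2: "I2", 3: "R1", 4: "R2"}
    plan = rename_plan("sampleA", 1, example_runs, guessed_read_map)
    for source, target in plan.items():
        print(source, "->", target)
```

If the lane-per-run idea is wrong, only the enumerate over runs would need to change; if the read assignment is wrong, only read_map would.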