How to run STARsolo over multiple samples like cellranger
19 months ago
gogeni5529 ▴ 50

I have multiple fastq files from 10x 3' v2.

I have already run cellranger count on these samples and got the output folder with the barcodes, features, and matrix files, as well as the BAM file (I need this one specifically).

Now I would like to do a similar mapping run with STARsolo, but I'm not sure whether I can use all the fastq files in one go or need to run them separately.

In cellranger I used the command below to run the counting for all 4 samples, each sequenced on two lanes (see the complete list below):

cellranger count --transcriptome=refgenome4cellranger/ --id=testRun --chemistry=SC3Pv2 --fastqs=rawData/ --sample=SI-GA-B4_1,SI-GA-B4_2,SI-GA-B4_3,SI-GA-B4_4

Can I run all of them also with STAR, or do I need to do it one after the other? How can I create the mtx file from all samples afterwards, if they need to be run separately?

thanks

my samples are:

rawData/SI-GA-B4_1_S9_L001_R1_001.fastq
rawData/SI-GA-B4_1_S9_L001_R2_001.fastq
rawData/SI-GA-B4_1_S9_L002_R1_001.fastq
rawData/SI-GA-B4_1_S9_L002_R2_001.fastq

rawData/SI-GA-B4_2_S10_L001_R1_001.fastq
rawData/SI-GA-B4_2_S10_L001_R2_001.fastq
rawData/SI-GA-B4_2_S10_L002_R1_001.fastq
rawData/SI-GA-B4_2_S10_L002_R2_001.fastq

rawData/SI-GA-B4_3_S11_L001_R1_001.fastq
rawData/SI-GA-B4_3_S11_L001_R2_001.fastq
rawData/SI-GA-B4_3_S11_L002_R1_001.fastq
rawData/SI-GA-B4_3_S11_L002_R2_001.fastq

rawData/SI-GA-B4_4_S12_L001_R1_001.fastq
rawData/SI-GA-B4_4_S12_L001_R2_001.fastq
rawData/SI-GA-B4_4_S12_L002_R1_001.fastq
rawData/SI-GA-B4_4_S12_L002_R2_001.fastq
cellranger starsolo • 2.3k views

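From the STARsolo manual: Read 2 (the cDNA read) needs to go first, followed by Read 1 (barcode + UMI). Lanes are comma-separated, in the same order in both lists: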

--readFilesIn Read2_Lane1.fastq.gz,Read2_Lane2.fastq.gz,Read2_Lane3.fastq.gz  Read1_Lane1.fastq.gz,Read1_Lane2.fastq.gz,Read1_Lane3.fastq.gz

See also the thread "Aligning multiple runs using STARSolo".
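As a minimal sketch of that ordering for a single sample, using the SI-GA-B4_1 file names from the question: `join_by_comma` is a hypothetical helper introduced here for illustration, not part of STAR. It only builds and prints the `--readFilesIn` argument; nothing is aligned.

```shell
#!/bin/sh
# Sketch: build the two comma-separated lists for --readFilesIn for ONE sample.

# join_by_comma (hypothetical helper): joins its arguments with commas.
# The subshell keeps the IFS change from leaking into the caller.
join_by_comma() {
  ( IFS=,; printf '%s\n' "$*" )
}

# cDNA reads (R2), one file per lane
cdna_list=$(join_by_comma \
  rawData/SI-GA-B4_1_S9_L001_R2_001.fastq \
  rawData/SI-GA-B4_1_S9_L002_R2_001.fastq)

# barcode + UMI reads (R1), lanes in the same order as above
barcode_list=$(join_by_comma \
  rawData/SI-GA-B4_1_S9_L001_R1_001.fastq \
  rawData/SI-GA-B4_1_S9_L002_R1_001.fastq)

# cDNA list first, barcode list second
read_files_in="$cdna_list $barcode_list"
echo "--readFilesIn $read_files_in"
```

The two lists must have the same number of entries, with lane N of R2 lined up against lane N of R1.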


They will also need a loop since there are 4 separate samples.


Is it not possible to add them all into one run, like this?

--readFilesIn rawData/SI-GA-B4_1_S9_L001_R2_001.fastq,rawData/SI-GA-B4_1_S9_L002_R2_001.fastq,rawData/SI-GA-B4_2_S10_L001_R2_001.fastq,rawData/SI-GA-B4_2_S10_L002_R2_001.fastq rawData/SI-GA-B4_1_S9_L001_R1_001.fastq,rawData/SI-GA-B4_1_S9_L002_R1_001.fastq,rawData/SI-GA-B4_2_S10_L001_R1_001.fastq,rawData/SI-GA-B4_2_S10_L002_R1_001.fastq ...

That depends on whether these are the same "biological" sample sequenced over multiple lanes, in which case go ahead and combine them, or different samples that need to be kept separate. If the latter, then loop.


Can I assume that, if it was done with cellranger in one go and not separated into groups, it can be done the same way with STAR?


While cellranger uses STAR (LINK) it does not use STARsolo so you can't assume anything.


thanks. Didn't know that.

Do I then need to use the Seurat package afterwards to combine the three data sets?

Or is there another way of combining the output of STARsolo into one big matrix?


Thanks for the fast reply. Yes, I know that read 2 must come first. I was just not sure if it is OK to put all the samples together in one big run.

Can I use the --readFilesManifest option also for 10x samples, or is it reserved only for SMART-Seq runs?

19 months ago

If you have separate samples, you need to loop through them:

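# Unique sample prefixes, e.g. rawData/SI-GA-B4_1 (assumes gzipped files; adjust the extension for plain .fastq)
SAMPLES=($(find rawData -type f -name "*.fastq.gz" | sed -E 's/_S[0-9]+_L00[1-9]_R[12]_001\.fastq\.gz$//' | sort -u))

for sample in "${SAMPLES[@]}"; do
  # Comma-separated, lane-sorted lists; "_S" after the prefix stops a prefix such as B4_1 from also matching B4_10
  R1=$(find rawData -type f -wholename "${sample}_S*_R1_*.fastq.gz" | sort -V | paste -sd, -)
  R2=$(find rawData -type f -wholename "${sample}_S*_R2_*.fastq.gz" | sort -V | paste -sd, -)
  echo "$R1 $R2"
done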

The R1 and R2 variables hold the comma-delimited file lists for each sample, so you can pass them to a STARsolo command within the loop (R2 first, e.g. --readFilesIn "$R2" "$R1").
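To make the per-sample loop concrete, here is a hedged dry-run sketch that prints one STARsolo command per sample instead of running it. Several things are assumptions not stated in the thread: the `GENOME_DIR` and `WHITELIST` placeholders (a STAR index and the 10x v2 barcode whitelist), the `starsolo_<sample>/` output prefix, and the helper names `list_reads`, `starsolo_cmd`, and `print_all`. The STAR options shown are the usual ones for 10x 3' v2 with a sorted BAM carrying CB/UB tags; check them against the manual for your STAR version. Gzipped input would additionally need `--readFilesCommand zcat`.

```shell
#!/bin/sh
# Sketch: print (not run) one STARsolo command per sample found in a directory.
# GENOME_DIR and WHITELIST are placeholders; point them at your own files.
GENOME_DIR=${GENOME_DIR:-star_index}
WHITELIST=${WHITELIST:-737K-august-2016.txt}

# list_reads DIR SAMPLE READ -> comma-separated, lane-sorted files (READ is R1 or R2)
list_reads() {
  find "$1" -type f -name "${2}_S*_${3}_001.fastq*" | sort | paste -sd, -
}

# starsolo_cmd DIR SAMPLE -> print the command line for one sample
starsolo_cmd() {
  r1=$(list_reads "$1" "$2" R1)
  r2=$(list_reads "$1" "$2" R2)
  # cDNA reads (R2) go first, barcode + UMI reads (R1) second
  printf '%s ' STAR --genomeDir "$GENOME_DIR" \
    --soloType CB_UMI_Simple --soloCBwhitelist "$WHITELIST" --soloUMIlen 10 \
    --readFilesIn "$r2" "$r1" \
    --outSAMtype BAM SortedByCoordinate \
    --outSAMattributes NH HI nM AS CR UR CB UB \
    --outFileNamePrefix "starsolo_${2}/"
  printf '\n'
}

# print_all DIR -> one command per unique sample name in DIR
print_all() {
  find "$1" -type f -name '*_R1_001.fastq*' \
    | sed -E 's|.*/||; s/_S[0-9]+_L[0-9]+_R1_001\.fastq.*$//' | sort -u \
    | while read -r sample; do starsolo_cmd "$1" "$sample"; done
}

# Dry run over the question's directory, if it exists
if [ -d rawData ]; then print_all rawData; fi
```

Once the printed commands look right, they can be executed one by one (or piped to `sh`). Each sample then gets its own output folder with a separate matrix and BAM file.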


thanks for that
