I was given three files by a collaborator who is now on holiday, so I'm looking for a quick answer from those who are not on holiday :)
I have three FASTQs from a 10x v2 scRNA-seq run.
The file with I1
contains what I assume is the sample index (8mer)
@E00527:118:HW5HWCCXY:7:1101:3315:1643 1:N:0:ACATTACT
ACATTACT
+
AA---<A-
@E00527:118:HW5HWCCXY:7:1101:3579:1643 1:N:0:ACATTACT
ACATTACT
+
AA<<--F<
The FASTQ with R1
seems to contain the cell barcodes and UMI
@E00527:118:HW5HWCCXY:7:1101:3315:1643 1:N:0:ACATTACT
GGACGTCCACATCCGGGCGGGTCGTCT
+
<AFF-AFFFJJ<FFJ-J<77FAAJ-A7
@E00527:118:HW5HWCCXY:7:1101:3579:1643 1:N:0:ACATTACT
GGTGAAGGTCATACGGTGTTTCTTTTT
+
The FASTQ with R2
seems to be the actual cDNA bit that was sequenced.
Am I correct with this so far?
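As a quick sanity check on that reading of R1, here is a minimal sketch that splits an R1 sequence into cell barcode and UMI. It assumes the usual 10x v2 layout of a 16 bp cell barcode followed by a 10 bp UMI. The name split_r1 and the two constants are made up for illustration and are not part of any 10x tool. The reads above are 27 bases long, so anything past position 26 is simply ignored here.

```python
# Assumed 10x v2 read 1 layout: 16 bp cell barcode, then 10 bp UMI.
BARCODE_LEN = 16
UMI_LEN = 10


def split_r1(seq):
    """Split an R1 sequence into (cell_barcode, umi); extra trailing bases are dropped."""
    if len(seq) < BARCODE_LEN + UMI_LEN:
        raise ValueError("R1 read is shorter than barcode + UMI")
    barcode = seq[:BARCODE_LEN]
    umi = seq[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi
```

For the first R1 read shown above, split_r1("GGACGTCCACATCCGGGCGGGTCGTCT") gives the barcode GGACGTCCACATCCGG and the UMI GCGGGTCGTC.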
So given the three files, how can I create sample-specific FASTQ files? Sometimes the sequenced sample index contains N
nucleotides, so it's not as simple as making a new FASTQ for each sample index.
I am expecting two samples in this FASTQ file... Thanks!
I'm not sure whether what you received is the output of
cellranger mkfastq
or just plain bcl2fastq2
? It sounds like the latter if there is more than one sample present. You can look at this page for further analysis options. Eventually you will want to use
cellranger count/aggr
to do further analysis.
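If you do need to split the reads by sample index yourself, the N problem from the question can be handled by allowing a mismatch when comparing each I1 read to the expected indexes. Below is a minimal sketch of that idea, not a replacement for the 10x or Illumina demultiplexers. Its assumptions: the three FASTQs list reads in the same order, you know which index sequences belong to which sample (as far as I know, each 10x sample index set is a group of four 8-mers, so a sample maps to a list), and an N simply counts as one mismatch. All function names and the output file naming are made up for illustration.

```python
import gzip
from itertools import islice


def hamming(a, b):
    """Number of positions at which two equal-length strings differ (N counts as a mismatch)."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_seq, sample_indexes, max_mismatches=1):
    """Return the one sample with an index within max_mismatches of index_seq, else None."""
    hits = set()
    for sample, indexes in sample_indexes.items():
        for idx in indexes:
            if len(idx) == len(index_seq) and hamming(index_seq, idx) <= max_mismatches:
                hits.add(sample)
    # Ambiguous reads (matching more than one sample) are treated as unassigned.
    return hits.pop() if len(hits) == 1 else None


def open_text(path):
    """Open a plain or gzipped FASTQ for reading as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "r")


def read_fastq(handle):
    """Yield FASTQ records as lists of four raw lines."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield record


def demultiplex(i1_path, r1_path, r2_path, sample_indexes, out_prefix, max_mismatches=1):
    """Write per-sample R1/R2 FASTQs and return the number of reads per sample."""
    counts = {name: 0 for name in sample_indexes}
    counts["undetermined"] = 0
    outputs = {}
    with open_text(i1_path) as i1, open_text(r1_path) as r1, open_text(r2_path) as r2:
        try:
            records = zip(read_fastq(i1), read_fastq(r1), read_fastq(r2))
            for rec_i, rec_1, rec_2 in records:
                index_seq = rec_i[1].strip()
                sample = assign_sample(index_seq, sample_indexes, max_mismatches)
                sample = sample or "undetermined"
                counts[sample] += 1
                if sample not in outputs:
                    outputs[sample] = (
                        open(f"{out_prefix}{sample}_R1.fastq", "w"),
                        open(f"{out_prefix}{sample}_R2.fastq", "w"),
                    )
                outputs[sample][0].writelines(rec_1)
                outputs[sample][1].writelines(rec_2)
        finally:
            for handles in outputs.values():
                for handle in handles:
                    handle.close()
    return counts
```

You would call it with something like demultiplex("I1.fastq.gz", "R1.fastq.gz", "R2.fastq.gz", {"sampleA": [...], "sampleB": [...]}, "demux_"), filling in the real index sequences for your two samples. Note that cellranger expects its own FASTQ file naming, so the output names here would need adjusting before running cellranger count on them.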