Dear Community,
I am trying to align single-cell data from the SmartSeq platform using STARsolo. The data are paired-end, with FASTQ files of 19 GB and 21 GB. The reads look like this:
Read 1
@A00877:307:H5H77DSXY:2:1101:1325:1000 1:N:0:TGACCAAT
GNAGGGAGACGTCTACATCTGCCAAGTGGAGCACACCAGCCTGGACAGTCCTGTCACCGTGGAGTGGAAGGCACAGTCTGATTCTGCCCGGAGTAAGACATTGACGGGAGCTGGGGGCTTCATGCTGGGGCTCATCATCTGTGGAGTGGG
+
F#FFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFF,FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFF,:FFFFFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFFFF:F:F
@A00877:307:H5H77DSXY:2:1101:1380:1000 1:N:0:TGACCAAT
TNCTAGCAGCATTGGCCTTGGCAAGTCACTGGTAACTGTTTTCTGTAAAGCAGAGGTTGCCCACTTCATTAGACTGTAAGAACTGAATGAGAAAAGAGTAGGAGAGTACTCTGTAAACACAAGTGATAGGGAAGTTACCATCACCACTCC
+
F#FF:FFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFFFF
@A00877:307:H5H77DSXY:2:1101:1524:1000 1:N:0:TGACCAAT
GNGATGGGTCTTGCTATGTTGCACAGGCTGGTCTTGAACTCCTGGATTTAAGTGATCTTTTTACTCTAAAATGTAATCTAAATAATAACAAAATAAATATTGAGCTGAAGAAAAGAAAAAGGAAACAGTGATTTTCATGTCTGCTATGTG
Read 2
@A00877:307:H5H77DSXY:2:1101:1325:1000 2:N:0:TGACCAAT
TGGCTTCATGCAGGAGTTTTTTTTTTTTTTTTTTTTTTTTTTTGGTGTTTTTTATTTTTTGTGGTAAAAATAAGGGAAAAAGGTTGTAGTCAAAGTGTTAGTTAAAGTGGATTGATAAAAAAAGCAAAAATTTATAAAATAAGATAATAG
+
FFFFFFFFFFFFFFF:F,F:,FF::FFF,F:FFF:FF:F:FF,F:F,FFF::F:FF::F,FFF,,,,,F,,FF,:F,F,,FF,:::,FF,F,,F,:,,,:,,,F,,,F,,,FF,,,FF:F:,F,FF,,,,,,:F,:,FF,:,,,,,,,:,
@A00877:307:H5H77DSXY:2:1101:1380:1000 2:N:0:TGACCAAT
GATAGACAGTGACGCCTTTTTTTTTTTTTTTTTTTTTTTTTTTTAAAGTTTTGGGGTTGGTAACTAACATAAAAATTGTTTTTAAATTGTAATAAACAAAAATTTTAAAATAAAATAATTACATTAAAATTAATGTGCCAACCAATGGTT
+
FFFFFFFFFFFFFFFFF:FFFFF,FFFF:,FFFFFFFFF:FFF,,F,,,,,,:F,,,,,,,F:,:,,,,,F,,,,:F:,FFF:,,,,:,,,,,,F,F,,,,,F,FF,,F,:,,,:,,,::,,,:,FF,F,,,,,,F:,:F,,,,,,,,,F
@A00877:307:H5H77DSXY:2:1101:1524:1000 2:N:0:TGACCAAT
GGACTATATTTACATTGTGGCTTTGCCATTTTCTAGATTTTTTTTACTTTGGACAAATTATTTAAACTCTTTGAACCTCATTGTTCTCATCTGTGCAGATGATGCTCACTTCAGAGAAGATGACGCACATACAACACATTAAACCTAGTG
I am running the alignment with STARsolo using the following command:
STAR --runThreadN 16 --genomeDir ~/HumanGenomes/ --readFilesCommand zcat --readFilesIn Read1.fq.gz Read2.fq.gz --soloType SmartSeq --outSAMtype BAM Unsorted --outBAMcompression -1 --soloUMIdedup Exact --outSAMattrRGline ID:sample1
The program runs without error, but I get only one cell instead of a few thousand. I am new to both SmartSeq and STARsolo, so I may have missed something; please pardon my ignorance.
Looking forward to your input/guidance.
Thank you
Yes, that is correct. Since the SmartSeq protocol is plate-based rather than cell-barcode-based, every FASTQ file pair is treated as one cell. Check whether your FASTQ headers contain different index sequences. In the example above, every read shown carries the same index:
TGACCAAT
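One way to run that check is to tally the index field at the end of each header line. The following is only a rough sketch, not part of STAR: the function name `count_index_sequences` and the `max_reads` parameter are made up for illustration, and it assumes standard four-line FASTQ records with Illumina-style headers like the ones posted above, where the index is the last colon-separated field. If more than one index turns up, the file probably holds reads from several cells and would likely need to be split per cell before alignment.

```python
import gzip
from collections import Counter


def count_index_sequences(fastq_path, max_reads=None):
    """Tally the index sequence found at the end of each FASTQ header.

    Assumes four-line FASTQ records and headers of the form
    '@<read id> 1:N:0:<index>' (as in the reads posted above).
    Hypothetical helper for illustration; not part of STAR.
    """
    # Read gzip-compressed files transparently, plain text otherwise.
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Header lines are every fourth line, starting with the first.
            if line_number % 4 != 0:
                continue
            # Optionally stop early; the full files are around 20 GB.
            if max_reads is not None and line_number // 4 >= max_reads:
                break
            fields = line.rstrip("\n").split(" ")
            if len(fields) < 2:
                # No comment field after the read id, so no index to read.
                continue
            # The index is the last colon-separated field of the comment.
            counts[fields[-1].split(":")[-1]] += 1
    return counts
```

For example, `count_index_sequences("Read1.fq.gz", max_reads=1_000_000)` would sample the first million reads of the Read 1 file from the command above. A single key in the result suggests the file pair really is one cell.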