SmartSeq2 with STARsolo gives only one cell
2.8 years ago
Dataminer ★ 2.8k

Dear Community,

I am trying to align single-cell data from the SmartSeq platform using STARsolo. The data is paired-end, with FASTQ files of 19 GB and 21 GB. The reads look like the following:

Read 1

@A00877:307:H5H77DSXY:2:1101:1325:1000 1:N:0:TGACCAAT
GNAGGGAGACGTCTACATCTGCCAAGTGGAGCACACCAGCCTGGACAGTCCTGTCACCGTGGAGTGGAAGGCACAGTCTGATTCTGCCCGGAGTAAGACATTGACGGGAGCTGGGGGCTTCATGCTGGGGCTCATCATCTGTGGAGTGGG
+
F#FFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFF,FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFF,:FFFFFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFFFF:F:F
@A00877:307:H5H77DSXY:2:1101:1380:1000 1:N:0:TGACCAAT
TNCTAGCAGCATTGGCCTTGGCAAGTCACTGGTAACTGTTTTCTGTAAAGCAGAGGTTGCCCACTTCATTAGACTGTAAGAACTGAATGAGAAAAGAGTAGGAGAGTACTCTGTAAACACAAGTGATAGGGAAGTTACCATCACCACTCC
+
F#FF:FFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFFFF
@A00877:307:H5H77DSXY:2:1101:1524:1000 1:N:0:TGACCAAT
GNGATGGGTCTTGCTATGTTGCACAGGCTGGTCTTGAACTCCTGGATTTAAGTGATCTTTTTACTCTAAAATGTAATCTAAATAATAACAAAATAAATATTGAGCTGAAGAAAAGAAAAAGGAAACAGTGATTTTCATGTCTGCTATGTG

Read 2

@A00877:307:H5H77DSXY:2:1101:1325:1000 2:N:0:TGACCAAT
TGGCTTCATGCAGGAGTTTTTTTTTTTTTTTTTTTTTTTTTTTGGTGTTTTTTATTTTTTGTGGTAAAAATAAGGGAAAAAGGTTGTAGTCAAAGTGTTAGTTAAAGTGGATTGATAAAAAAAGCAAAAATTTATAAAATAAGATAATAG
+
FFFFFFFFFFFFFFF:F,F:,FF::FFF,F:FFF:FF:F:FF,F:F,FFF::F:FF::F,FFF,,,,,F,,FF,:F,F,,FF,:::,FF,F,,F,:,,,:,,,F,,,F,,,FF,,,FF:F:,F,FF,,,,,,:F,:,FF,:,,,,,,,:,
@A00877:307:H5H77DSXY:2:1101:1380:1000 2:N:0:TGACCAAT
GATAGACAGTGACGCCTTTTTTTTTTTTTTTTTTTTTTTTTTTTAAAGTTTTGGGGTTGGTAACTAACATAAAAATTGTTTTTAAATTGTAATAAACAAAAATTTTAAAATAAAATAATTACATTAAAATTAATGTGCCAACCAATGGTT
+
FFFFFFFFFFFFFFFFF:FFFFF,FFFF:,FFFFFFFFF:FFF,,F,,,,,,:F,,,,,,,F:,:,,,,,F,,,,:F:,FFF:,,,,:,,,,,,F,F,,,,,F,FF,,F,:,,,:,,,::,,,:,FF,F,,,,,,F:,:F,,,,,,,,,F
@A00877:307:H5H77DSXY:2:1101:1524:1000 2:N:0:TGACCAAT
GGACTATATTTACATTGTGGCTTTGCCATTTTCTAGATTTTTTTTACTTTGGACAAATTATTTAAACTCTTTGAACCTCATTGTTCTCATCTGTGCAGATGATGCTCACTTCAGAGAAGATGACGCACATACAACACATTAAACCTAGTG

I am running STARsolo with the following command:

STAR --runThreadN 16 --genomeDir ~/HumanGenomes/ --readFilesCommand zcat --readFilesIn Read1.fq.gz Read2.fq.gz --soloType SmartSeq  --outSAMtype BAM Unsorted --outBAMcompression -1 --soloUMIdedup Exact --outSAMattrRGline ID:sample1

The program runs without error, but I get only one cell instead of a few thousand. I am new to both SmartSeq and STARsolo, so I may have missed something; pardon my ignorance.

Looking forward to your input/guidance.

Thank you

scRNA-seq SmartSeq STARsolo • 1.8k views
2.6 years ago

Hi Dataminer,

I've not worked with Smart-seq scRNA-seq data before, but from what I can gather from the documentation, you need separate FASTQ files for each cell, combined with a manifest file that maps each cell's FASTQ files to a cell ID.
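As a rough illustration of the manifest idea, here is a minimal sketch that builds a three-column, tab-separated manifest (read 1 file, read 2 file, cell ID) from a directory of per-cell FASTQ pairs. The function name build_manifest and the <cell>_R1.fq.gz / <cell>_R2.fq.gz naming scheme are assumptions made for this example, not something taken from your data. The STAR call in the trailing comment follows my reading of the STARsolo documentation for SmartSeq, so please check it against the version you have installed.

```shell
#!/usr/bin/env bash
# Sketch: build a STARsolo-style manifest from per-cell FASTQ pairs.
# ASSUMPTION: files are named <cell>_R1.fq.gz and <cell>_R2.fq.gz
# (hypothetical naming; adjust the suffixes to match your files).
build_manifest() {
    local dir="$1" r1 r2 cell
    for r1 in "$dir"/*_R1.fq.gz; do
        [ -e "$r1" ] || continue              # glob matched nothing
        r2="${r1%_R1.fq.gz}_R2.fq.gz"         # expected mate file
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2   # report and skip unpaired files
            continue
        fi
        cell="$(basename "$r1" _R1.fq.gz)"    # cell ID taken from the file name
        printf '%s\t%s\t%s\n' "$r1" "$r2" "$cell"
    done
}

# Usage (not executed here; per_cell_fastq is a hypothetical directory):
#   build_manifest per_cell_fastq > manifest.tsv
#   STAR --runThreadN 16 --genomeDir ~/HumanGenomes/ --readFilesCommand zcat \
#        --soloType SmartSeq --readFilesManifest manifest.tsv \
#        --soloUMIdedup Exact --soloStrand Unstranded \
#        --outSAMtype BAM Unsorted
```

With a manifest, the read files are given through --readFilesManifest rather than --readFilesIn, and each row becomes one cell in the output.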

The paired-end reads you posted carry an index sequence in their headers, so you might need to split these files by barcode. Several methods are possible; see, for example, the answer from finswimmer in the thread "Split fastq according to barcodes".
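One possible way to do the split, shown only as a sketch: the awk snippet below reads an uncompressed FASTQ stream and appends each four-line record to a file named after the last colon-separated field of its header (TGACCAAT in the reads you posted). The function name split_by_index, the output naming <index>_<mate>.fq, and the assumption that every record spans exactly four lines are mine. It closes the output file after every record to stay under open-file limits, which is slow on files of this size, so a dedicated demultiplexing tool is probably the better choice for the real data.

```shell
#!/usr/bin/env bash
# Sketch: split a FASTQ stream into one file per index sequence.
# ASSUMPTIONS: 4 lines per record; the index is the last ':'-separated
# field of the header line; output names are <index>_<mate>.fq.
split_by_index() {
    local outdir="$1" mate="$2"
    mkdir -p "$outdir"
    awk -v dir="$outdir" -v mate="$mate" '
        NR % 4 == 1 {                       # header line of a record
            n = split($0, parts, ":")
            out = dir "/" parts[n] "_" mate ".fq"
        }
        { print >> out }                    # append all four lines of the record
        NR % 4 == 0 { close(out) }          # avoid running out of file handles
    '
}

# Usage (not executed here; "split" is a hypothetical output directory):
#   zcat Read1.fq.gz | split_by_index split R1
#   zcat Read2.fq.gz | split_by_index split R2
```

Because the files are opened in append mode, start from an empty output directory when rerunning. Both mates share the same index in their headers, so running the function once per read file keeps the pairs in matching order.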

Let me know if this worked.

Kind regards, Thomas

Yes, that is correct. Since the SmartSeq protocol is plate-based rather than cellular-barcode-based, every FASTQ file pair is one cell. Check whether your FASTQ files contain different index sequences; in the example above the index is TGACCAAT.
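A quick way to run that check, as a sketch: tally the last colon-separated header field over the first chunk of reads. The function name count_indexes and the one-million-read sample size in the usage comment are arbitrary choices for this example.

```shell
#!/usr/bin/env bash
# Sketch: count how often each index sequence occurs in a FASTQ stream.
# ASSUMPTIONS: 4 lines per record; the index is the last ':'-separated
# field of each header line.
count_indexes() {
    awk '
        NR % 4 == 1 {                       # header lines only
            n = split($0, parts, ":")
            count[parts[n]]++
        }
        END {
            for (idx in count) printf "%d\t%s\n", count[idx], idx
        }
    ' | sort -k1,1nr                        # most frequent index first
}

# Usage (not executed here): sample the first million reads.
#   zcat Read1.fq.gz | head -n 4000000 | count_indexes
```

If TGACCAAT is the only index that shows up, apart from a few variants caused by sequencing errors, the file pair most likely holds a single cell or sample, which would fit STARsolo reporting one cell.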
