Hello everyone,
I am analyzing my 10x scRNA-seq data with STARsolo.
Mapping appeared to work, and the output files, including the run summary (Summary.csv), were created. However, some of the results look strange.
The summary report from my run is as follows:
Number of Reads 352455678
Reads With Valid Barcodes 0.903825
Sequencing Saturation 0.755087
Q30 Bases in CB+UMI 0.964867
Q30 Bases in RNA read 0.914494
Reads Mapped to Genome: Unique+Multiple 0.922609
Reads Mapped to Genome: Unique 0.646798
**Reads Mapped to Transcriptome: Unique+Multipe Genes 0.0555096
Reads Mapped to Transcriptome: Unique Genes 0.0438785**
Estimated Number of Cells 3764
Reads in Cells Mapped to Unique Genes 13472579
Fraction of Reads in Cells 0.896591
Mean Reads per Cell 3579
Median Reads per Cell 2896
UMIs in Cells 3325376
Mean UMI per Cell 883
Median UMI per Cell 713
Mean Genes per Cell 493
Median Genes per Cell 423
Total Genes Detected 18869
The values of "Reads Mapped to Transcriptome: Unique+Multipe Genes" and "Reads Mapped to Transcriptome: Unique Genes" seem far too low. (When my colleague analyzed the same data with CellRanger, there were no problems with the sequencing or alignment quality.)
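To put the two transcriptome values in context, they can be compared with the genome mapping rate from the same report. The following is a minimal Python sketch, not part of STARsolo: `parse_summary` and `gene_assignment_ratio` are hypothetical helper names, and the parser assumes each line is a metric name followed by a number, separated by whitespace (as pasted above) or by a comma.

```python
# A few lines copied from the report above.
REPORT = """\
Number of Reads 352455678
Reads Mapped to Genome: Unique 0.646798
Reads Mapped to Transcriptome: Unique Genes 0.0438785
Estimated Number of Cells 3764
Reads in Cells Mapped to Unique Genes 13472579
UMIs in Cells 3325376
"""

def parse_summary(text):
    """Parse 'name value' or 'name,value' lines into a dict of floats.

    Assumption: the number is the last field on each line.
    """
    metrics = {}
    for raw in text.splitlines():
        line = raw.strip().strip("*").strip()
        if not line:
            continue
        key, _, value = line.replace(",", " ").rpartition(" ")
        try:
            metrics[key.strip()] = float(value)
        except ValueError:
            continue  # skip lines that do not end in a number
    return metrics

def gene_assignment_ratio(metrics):
    """Fraction of uniquely mapped reads that were also assigned to a unique gene."""
    return (metrics["Reads Mapped to Transcriptome: Unique Genes"]
            / metrics["Reads Mapped to Genome: Unique"])

metrics = parse_summary(REPORT)
ratio = gene_assignment_ratio(metrics)
mean_reads = metrics["Reads in Cells Mapped to Unique Genes"] / metrics["Estimated Number of Cells"]
mean_umis = metrics["UMIs in Cells"] / metrics["Estimated Number of Cells"]
print(f"gene-assigned / uniquely mapped: {ratio:.3f}")
print(f"mean reads per cell: {mean_reads:.0f}, mean UMIs per cell: {mean_umis:.0f}")
```

With the numbers above, the ratio comes out at roughly 0.07, meaning only a small share of the uniquely mapped reads was assigned to a gene. The per-cell means computed this way agree with the reported 3579 and 883, so the report is at least internally consistent.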
The command I ran was:
STAR --runThreadN 16 --genomeDir reference --soloCBwhitelist 737K-august-2016.txt --outFileNamePrefix patient1_ --readFilesCommand gzcat --soloBarcodeReadLength 1 --clip5pNbases 39 0 --soloType CB_UMI_Simple --soloCBstart 1 --soloCBlen 16 --soloUMIstart 17 --soloUMIlen 10 --readFilesIn Fastq/patient1_GEX/patient1_GEX_S4_L003_R2_001.fastq.gz Fastq/patient1_GEX/patient1_GEX_S4_L003_R1_001.fastq.gz
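The `--soloCB*` and `--soloUMI*` options in this command describe where the cell barcode and UMI sit in the barcode read: 1-based start positions and lengths, giving a 16-base barcode followed by a 10-base UMI. A minimal Python illustration of that layout (`split_barcode_read` is a hypothetical helper, not STAR code, and the example sequence is made up):

```python
def split_barcode_read(seq, cb_start=1, cb_len=16, umi_start=17, umi_len=10):
    """Return (cell_barcode, umi) from a barcode read.

    Start positions are 1-based, mirroring the values passed to
    --soloCBstart / --soloCBlen / --soloUMIstart / --soloUMIlen above.
    """
    needed = max(cb_start + cb_len, umi_start + umi_len) - 1
    if len(seq) < needed:
        raise ValueError(f"barcode read has {len(seq)} bases, need at least {needed}")
    cb = seq[cb_start - 1 : cb_start - 1 + cb_len]
    umi = seq[umi_start - 1 : umi_start - 1 + umi_len]
    return cb, umi

# Made-up read: 16-base barcode, 10-base UMI, then extra bases.
example_read = "ACGTACGTACGTACGT" + "TTTTGGGGCC" + "AAAA"
cb, umi = split_barcode_read(example_read)
print(cb, umi)
```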
If anyone knows a possible cause or solution, I would appreciate your advice.
Best regards,
Just checking that you used the correct whitelist. The list you used seems to be from an old 10x/cellranger release.
Thank you very much for your kind advice.
I have re-checked the whitelist based on your comment, and it appears to be correct.
My 10x analysis used the Single Cell 5' v1 and v2 chemistry, and according to the 10x website (https://kb.10xgenomics.com/hc/en-us/articles/115004506263-What-is-a-barcode-whitelist), the whitelist 737K-august-2016.txt corresponds to this assay.
Note: when I tried another whitelist, "3M-5pgex-jan-2023.txt" (for Single Cell 5' v3), the command failed with the following error message:
EXITING because of FATAL ERROR in input CB whitelist file: Fastq/3M-5pgex-jan-2023.txt.gz the total length of barcode sequence is 31 not equal to expected 26
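The path in the error message ends in `.gz`, so one thing that may be worth ruling out is whether the whitelist file is still compressed, and what line lengths it actually contains. The sketch below is only a diagnostic under that assumption and does not reproduce STAR's own check. `open_maybe_gzip` and `barcode_length_counts` are hypothetical helper names, and the demo writes a tiny made-up whitelist to a temporary directory rather than touching real files.

```python
import gzip
import os
import tempfile
from collections import Counter

def open_maybe_gzip(path):
    """Open a text file, transparently handling gzip (detected by magic bytes)."""
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")

def barcode_length_counts(path, max_lines=100_000):
    """Count line lengths in the first max_lines lines of a whitelist."""
    counts = Counter()
    with open_maybe_gzip(path) as fh:
        for i, line in enumerate(fh):
            if i >= max_lines:
                break
            counts[len(line.strip())] += 1
    return counts

# Demo with a made-up, gzip-compressed whitelist of three 16-base barcodes.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_whitelist.txt.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("AAACCTGAGAAACCAT\nAAACCTGAGAAACCGC\nAAACCTGAGAAACCTA\n")
    demo_counts = barcode_length_counts(demo_path)

print(dict(demo_counts))
```

If every line of the real file has the length given to `--soloCBlen` once decompressed, the whitelist contents themselves are probably not the issue.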
If you know of a source for an updated whitelist, please let me know.
Sincerely,