Reliable method to detect 10x single-cell chemistry?
4 months ago
morovatunc ▴ 560

Hello folks,

I am looking for a reliable thought process to determine the 10x Chromium chemistry when the publication does not mention it (3 prime vs 5 prime, and v1/v2/v3). 3p vs 5p can be detected from whether reads map mostly forward or mostly reverse, but I have seen cases where the forward and reverse read counts are similar (45% vs 55% rather than 10% vs 90%). I guess the versions are the easiest part, because v1/v2/v3 of 3p have distinct cell barcode and UMI lengths.
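The strand check described above can be sketched as a small decision rule. This is only an illustration: the function name, the `margin` threshold and the "3p"/"5p" labels are assumptions, and the counts are assumed to come from whatever tool you use to tally reads overlapping genes in sense vs antisense orientation.

```python
def call_library_end(sense_reads: int, antisense_reads: int, margin: float = 0.3) -> str:
    """Guess 3' vs 5' from the strand balance of gene-overlapping reads.

    Assumption (from the question): 3' libraries map mostly forward (sense),
    5' libraries mostly reverse (antisense). `margin` is a hypothetical
    threshold: how far from 50/50 the split must be before making a call.
    """
    total = sense_reads + antisense_reads
    if total == 0:
        raise ValueError("no gene-overlapping reads were counted")
    sense_fraction = sense_reads / total
    # A 45/55 split stays inside the margin and is reported as ambiguous
    # instead of being forced into a call.
    if abs(sense_fraction - 0.5) < margin:
        return "ambiguous"
    return "3p" if sense_fraction > 0.5 else "5p"
```

With the numbers from the question, a 90/10 split gives a call and a 45/55 split returns "ambiguous", which is the case where other evidence (barcode lengths, index layout) is needed.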

I know cellranger samples a couple hundred thousand reads and tries multiple configurations. I found the following repo, which does this quite reliably (I tip my hat to the owner):

https://github.com/cellgeni/reprocess_public_10x/blob/main/scripts/starsolo_10x_auto.sh

For reference, this is the dataset: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139495

and this is the sample: SRX7065207

My question is quite vague, but I would like to understand what fellow bioinformaticians think and what their thought process is. Thank you very much,

T.

4 months ago
ATpoint 86k

It's Chromium v2 I think.

Here is why: the 10x BAM file is available at https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR10355805&display=data-access. You can let wget run for a few seconds and then use samtools view that.bam | less to have a look at the tags. There is the CB (cell barcode) tag, for example CB:Z:CATCAGAGTAGTACCT-1, which is 16 bp long. Then there is the UMI sequence in UR, for example UR:Z:GCCCAGATTG, which is 10 bp long. Double-checking against the STARsolo settings for 10x Chromium v1, v2 and v3, or against the 10x documentation, narrows this down to v2, as it matches the expected CB and UMI lengths. By the way, the easiest route is to download that BAM file and run the 10x bamtofastq application to get FASTQ files.
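The tag inspection above can also be done programmatically. The following is a minimal sketch that reads SAM text lines (for example the output of samtools view) and reports the barcode/UMI length layout. The function names and the `LAYOUTS` table are illustrative assumptions: 16/10 for 3' v2 and for the 5' kits comes from this thread, while the v1 and v3 entries are filled in from the commonly documented values and should be checked against the 10x documentation.

```python
import re
from collections import Counter

# (cell barcode length, UMI length) -> chemistries sharing that layout.
# Assumed lookup table; verify against the 10x or STARsolo documentation.
LAYOUTS = {
    (14, 10): ["3' v1"],
    (16, 10): ["3' v2", "5' v1", "5' v2"],
    (16, 12): ["3' v3"],
}

# Matches e.g. CB:Z:CATCAGAGTAGTACCT-1 or UR:Z:GCCCAGATTG; the optional
# "-1" GEM-well suffix is excluded from the captured sequence.
TAG = re.compile(r"^(CB|CR|UR):Z:([ACGTN]+)(?:-\d+)?$")

def barcode_layout(sam_line):
    """Return (CB length, UMI length) for one SAM record, or None if untagged."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # optional tags start after the 11 mandatory columns
        match = TAG.match(field)
        if match:
            tags[match.group(1)] = match.group(2)
    cb = tags.get("CB") or tags.get("CR")  # corrected barcode, else raw barcode
    umi = tags.get("UR")
    if cb is None or umi is None:
        return None
    return len(cb), len(umi)

def summarize(sam_lines):
    """Count layouts over many records, skipping header and untagged lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        layout = barcode_layout(line)
        if layout is not None:
            counts[layout] += 1
    return counts
```

A 16/10 result is consistent with 3' v2 but, as the follow-up below points out, lengths alone do not separate it from the 5' kits, which is why the table returns a list of candidates.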


Thank you for your fast response. How about 3p-v2 vs 5p-v1 or 5p-v2? As far as I know, the 5p chemistry also uses 16/10 barcode/UMI lengths.

"By the way, it is the easiest to download that bam file and run the 10x bam2fastq application to get fastq."

I agree. I also found that bam2fastq sometimes splits the output into so many small FASTQ files that it doubles the storage. I found that STARsolo can take a BAM file as input: all you have to do is set the BAM tags for the cell barcode and UMI sequences and qualities, and it works like a charm. (Sorry, selfishly: we are trying to generate spliced/unspliced counts, so we must use STARsolo/alevin to reprocess the FASTQ/BAM.)
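A sketch of what such a STARsolo call on a BAM file might look like, written as a Python helper that only assembles the argument list. The helper name, its parameters and the placeholder paths are hypothetical; the option names are as I recall them from the STAR manual (BAM input via CR/UR and CY/UY tags, Velocyto for spliced/unspliced counts), so check them against the manual for your STAR version before running anything.

```python
def starsolo_bam_args(bam, genome_dir, whitelist, cb_len=16, umi_len=10, strand="Forward"):
    """Build a STARsolo command line that reads barcodes from BAM tags.

    Assumptions: single-end cDNA reads, raw barcode/UMI in the CR/UR tags with
    qualities in CY/UY (as in Cell Ranger BAMs), and a 16/10 layout by default.
    """
    return [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", bam,
        "--readFilesType", "SAM", "SE",
        # Stream the BAM as SAM text, dropping secondary alignments.
        "--readFilesCommand", "samtools", "view", "-F", "0x100",
        "--soloType", "CB_UMI_Simple",
        "--soloInputSAMattrBarcodeSeq", "CR", "UR",
        "--soloInputSAMattrBarcodeQual", "CY", "UY",
        "--soloCBwhitelist", whitelist,
        "--soloCBlen", str(cb_len),
        "--soloUMIstart", str(cb_len + 1),  # UMI follows the barcode directly
        "--soloUMIlen", str(umi_len),
        "--soloStrand", strand,  # "Reverse" if the library turns out to be 5'
        "--soloFeatures", "Gene", "Velocyto",
    ]
```

The returned list can be passed to `subprocess.run(args, check=True)`. Keeping the layout and strand as parameters makes it easy to rerun with a different guess when the chemistry is uncertain.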


Aren't these kits dual-indexed? This one here is single-index. You see this in the bam as well. R1/R2/I1 only.


Yes, you are quite right. Thank you very much! I had totally ignored the indexes.
