How to include the lane index in STARsolo count matrix barcodes.tsv?
4 weeks ago
mk ▴ 310

How can I make sure that the lane index is included in the STARsolo count matrix barcodes list?

If I run CellRanger and STARsolo on the same set of FASTQs, the CellRanger output contains barcodes with a "-1" suffix:

(base) [mkarikom@gl1515 mkarikom]$ zcat 2024-05-14_rxn1_S1_Cellranger/outs/raw_feature_bc_matrix/barcodes.tsv.gz|head
AAACCAAAGAAACCAA-1
AAACCAAAGAAACCCC-1
AAACCAAAGAAACGGT-1
AAACCAAAGAAAGGTC-1
AAACCAAAGAAAGTCA-1
AAACCAAAGAAATCAG-1
AAACCAAAGAAATGGC-1
AAACCAAAGAAATTGG-1
AAACCAAAGAACAGAC-1
AAACCAAAGAACAGGT-1

But STARsolo is only reporting the nucleotide sequence without the lane index:

(base) [mkarikom@gl1515 mkarikom]$ head 2024-05-14_rxn1_S1_Solo.out/Gene/raw/barcodes.tsv
AAACCAAAGAGTACGG
AAACCAAAGATTCAGT
AAACCAGCAAGCTGGG
AAACCAGCACCGTTAG
AAACCATTCAATGCGC
AAACCCATCAATATTG
AAACCCATCCACTACT
AAACCCATCCCTGATT
AAACCCATCCGACTAA
AAACCCATCGGCTCTC
10x starsolo singlecell star alignment • 498 views

Why do you want to do this? Does Illumina even make instruments anymore where you can have one sample in one lane, and a different sample in a separate lane of the same run?


That's a great question! I'm not trying to play dumb, it's just that my background is more on the theoretical side. I was instructed to use STARsolo instead of CellRanger and noticed the "-1"s were missing on the STARsolo barcodes so I was trying to address the discrepancy.

4 weeks ago

Where are you getting the info that the -1 corresponds to the lane? Because it doesn't.

From the 10X site:

"The cell barcode CB tag includes a suffix with a dash separator followed by a number: AAACCCAAGGAGAGTA-1. This number denotes the GEM well, and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM well channel runs. Normally, this number will be "1" across all barcodes when analyzing a sample generated from a single GEM well channel. It can either be left in place and treated as part of a unique barcode identifier, or explicitly parsed out to leave only the barcode sequence itself."

So more than likely, that suffix (or lack thereof) isn't providing you any additional info.
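
The two options in the quoted passage (keep the suffix as part of the identifier, or parse it out) can be sketched in a few lines of Python. This is only an illustration: `split_barcode` is a hypothetical helper name, not something provided by Cell Ranger or STARsolo, and it assumes the only dash in the barcode is the one separating the GEM well number.

```python
from typing import Optional, Tuple


def split_barcode(cb: str) -> Tuple[str, Optional[int]]:
    """Split a barcode into (sequence, GEM well number).

    Returns None for the GEM well when there is no '-N' suffix,
    as is the case for the STARsolo barcodes shown in the question.
    """
    seq, sep, well = cb.partition("-")
    return seq, (int(well) if sep else None)


print(split_barcode("AAACCCAAGGAGAGTA-1"))  # ('AAACCCAAGGAGAGTA', 1)
print(split_barcode("AAACCAAAGAGTACGG"))    # ('AAACCAAAGAGTACGG', None)
```

Stripping the suffix this way is one option for comparing the CellRanger and STARsolo barcode lists on the sequence alone.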


Thanks. So how do I make STARsolo report the GEM well index, i.e. NNNNNNNN-i for barcode NNNNNNNN?


You can add the suffix yourself if you really want to. STARsolo will not add it for you, given that it's pointless in most cases.

More often, people will add the sample name or ID as a prefix for the barcodes on import to whatever downstream analysis tools they're using to distinguish between samples clearly.
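
As a rough sketch of both approaches in Python: the function names, the `_` separator, and the sample ID `rxn1` below are illustrative assumptions, not STARsolo or Cell Ranger conventions. The file helper assumes an uncompressed, one-barcode-per-line `barcodes.tsv` like the one shown in the question.

```python
from pathlib import Path


def add_gem_well_suffix(barcodes, gem_well=1):
    """Append a Cell Ranger style '-N' suffix to each barcode."""
    return [f"{bc}-{gem_well}" for bc in barcodes]


def add_sample_prefix(barcodes, sample_id):
    """Prepend a sample ID; the '_' separator is an arbitrary choice."""
    return [f"{sample_id}_{bc}" for bc in barcodes]


def rewrite_barcodes_file(src, dst, gem_well=1):
    """Read a plain-text barcodes.tsv and write a suffixed copy to dst."""
    barcodes = Path(src).read_text().split()
    suffixed = add_gem_well_suffix(barcodes, gem_well)
    Path(dst).write_text("\n".join(suffixed) + "\n")


starsolo_barcodes = ["AAACCAAAGAGTACGG", "AAACCAAAGATTCAGT"]
print(add_gem_well_suffix(starsolo_barcodes))
print(add_sample_prefix(starsolo_barcodes, "rxn1"))
```

Writing to a separate `dst` rather than overwriting the STARsolo output keeps the original matrix directory intact.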

4 weeks ago

If you combine two samples, the -1 and -2 at the end of the cell barcodes will matter, because they tell you which cells came from which sample. This is especially important because there is a non-zero chance that two different samples will have a few cells that happen to share the same cell barcode.
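
A toy illustration of that collision, in Python. The two short lists stand in for the barcodes of two samples and are made up for the example (the sequences are borrowed from the STARsolo output in the question); `merge_with_suffix` is a hypothetical helper, not part of any tool.

```python
# Toy barcode lists for two samples; one sequence appears in both.
sample_a = ["AAACCAAAGAGTACGG", "AAACCAAAGATTCAGT"]
sample_b = ["AAACCAAAGAGTACGG", "AAACCCATCAATATTG"]


def merge_with_suffix(samples):
    """Concatenate barcode lists, tagging each with its 1-based sample index."""
    merged = []
    for well, barcodes in enumerate(samples, start=1):
        merged.extend(f"{bc}-{well}" for bc in barcodes)
    return merged


shared = set(sample_a) & set(sample_b)
merged = merge_with_suffix([sample_a, sample_b])

print(sorted(shared))                   # barcodes that would collide without a suffix
print(len(merged) == len(set(merged)))  # True: suffixed barcodes are all unique
```

Without the suffix, the shared sequence would be treated as one cell after merging; with it, the two cells stay distinct.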
