Question

Editing/adding tags in a BAM file to match the 10X format

2.0 years ago

ashwini • 0

Hi everyone,

I have generated an alignment file from a scRNA-seq demultiplexing pipeline and now need to make the tags in my BAM file match those in files generated by CellRanger. Specifically, I need to append a sample identifier to the cell barcode (CB) tag and duplicate the UMI tag, because 10X files have both UR and UB.

Current BAM output:

01_01_14__R__49_1_14__CTGCTTTG_AACGTGAT_AACCGAGA__GGCGCTTTTT__221014Su_CAGATC   0   hg38_2111123897 255 20S94M  *   0   0   GTGGTATCAACGCAGAGTGAAAGGGGACAGCTGCCCCCACGGCAGCCCTCAGGGCCCGCTGGCCCCACCTGCCAGCCCTGGCCCTTTTGCTACCAGATCCCCGCTTTTCATCTT  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:92 nM:i:0  GX:Z:ENSG00000153094    GN:Z:BCL2L11    pN:Z:GGCGCTTTT CR:Z:CTGCTTTG_AACGTGAT_AACCGAGA  CB:Z:01_01_14   pB:Z:49_1_14    pS:Z:   RE:A:I

I need to append _s1 to the end of the CB tag (giving CB:Z:01_01_14_s1) and copy the value of the pN tag into two new tags, UR and UB.

Is there a simple way to do this using either sed in Bash or pysam? Thanks!
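To make the intended edit concrete, here is a minimal sketch of the transformation on a single SAM text line, using only the Python standard library. The function name `retag_sam_line`, the `sample_suffix` parameter, and the abbreviated demo record are made up for illustration. It assumes the input records carry a `pN:Z:` tag and do not already contain UR or UB tags.

```python
def retag_sam_line(line, sample_suffix="_s1"):
    """Append a suffix to CB and copy the pN value into new UR and UB tags.

    Hypothetical helper: assumes tab-separated SAM text and no existing UR/UB.
    """
    if line.startswith("@"):
        return line  # header lines pass through unchanged

    fields = line.rstrip("\n").split("\t")
    mandatory, tags = fields[:11], fields[11:]  # SAM has 11 mandatory columns

    umi = None
    out_tags = []
    for tag in tags:
        if tag.startswith("CB:Z:"):
            tag += sample_suffix  # CB:Z:01_01_14 -> CB:Z:01_01_14_s1
        elif tag.startswith("pN:Z:"):
            umi = tag[len("pN:Z:"):]  # remember the UMI, keep pN itself
        out_tags.append(tag)

    if umi is not None:
        # Both tags get the same value here; no UMI correction is performed.
        out_tags += ["UR:Z:" + umi, "UB:Z:" + umi]

    return "\t".join(mandatory + out_tags) + "\n"


# Abbreviated, made-up record that reuses the tag values from the post.
record = "\t".join([
    "read1", "0", "hg38_21", "100", "255", "4M", "*", "0", "0", "ACGT", "FFFF",
    "pN:Z:GGCGCTTTT", "CB:Z:01_01_14", "RE:A:I",
]) + "\n"
print(retag_sam_line(record), end="")
```

Wrapping the function in a loop over `sys.stdin` would turn it into a filter that could sit between `samtools view -h` and `samtools view -b`. With pysam, the same logic could presumably be written with `AlignedSegment.get_tag` and `AlignedSegment.set_tag` while iterating over the input file. Note that in CellRanger output UR is the raw UMI and UB the corrected one, so copying pN into both only reproduces the tag layout, not the correction step.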

cellbarcode pysam BAM bamtags • 594 views
