Hi Guys,
I have a question about my BAM file (single-cell RNA-seq data). The first 6 letters of each read name are the cell barcode and the next 6 are the UMI; the two are separated by an underscore, and a hash separates them from the rest of the name. For example:
ACAGTG_GAGAAG#K182:79:HV56XX:6:1101:11160:388737da9
I would like to move this information into the BAM tags CB (CB:Z:ACAGTG) and UB (UB:Z:GAGAAG), so that the final read name becomes K182:79:HV56XX:6:1101:11160:388737da9.
Could you please suggest an awk one-liner or some kind of tool that can do this?
Thanks a lot!
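One way this could be done, as a minimal sketch rather than a tested pipeline: stream the alignments as SAM text, rewrite each record, and convert back to BAM. The script name move_tags.py, the function name move_barcodes_to_tags, the "-" argument convention and the demo record's alignment fields are all made up for illustration; the sketch assumes every read name has exactly the BARCODE_UMI#NAME layout described above and leaves any other line untouched.

```python
import sys


def move_barcodes_to_tags(sam_line):
    """Move a 'BARCODE_UMI#' read-name prefix into CB:Z: and UB:Z: tags."""
    line = sam_line.rstrip("\n")
    # Header lines start with '@' and are passed through unchanged.
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    # Split 'ACAGTG_GAGAAG#K182:...' at the first '#', then at the first '_'.
    prefix, hash_sep, rest = fields[0].partition("#")
    cell_barcode, underscore, umi = prefix.partition("_")
    if not hash_sep or not underscore:
        # Read name does not follow the assumed layout: leave the record alone.
        return line
    fields[0] = rest
    # Optional tags go at the end of the record (column 12 onwards).
    fields.append("CB:Z:" + cell_barcode)
    fields.append("UB:Z:" + umi)
    return "\t".join(fields)


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "-":
        # Filter mode: SAM text in on stdin, rewritten SAM text out on stdout.
        for sam_line in sys.stdin:
            print(move_barcodes_to_tags(sam_line))
    else:
        # Demo on the read name from the question; the other fields are invented.
        demo = "\t".join([
            "ACAGTG_GAGAAG#K182:79:HV56XX:6:1101:11160:388737da9",
            "0", "chr1", "100", "60", "6M", "*", "0", "0", "ACGTAC", "IIIIII",
        ])
        print(move_barcodes_to_tags(demo))
```

It would be used roughly like this (in.bam and out.bam are placeholder file names): samtools view -h in.bam | python move_tags.py - | samtools view -b -o out.bam -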
this works perfectly! thanks a lot!
update 2021: samtools 1.13: samtools view --add-flags/--remove-flags
Can you provide more information about how the new samtools feature --add-flags works? I would like to be able to use this feature, but for me it doesn't add a tag. For example, I tried to add a CB tag "AAAAAAAAAA" to each read in a small test BAM:
This results in the original BAM read (i.e., with no new CB tag):
Nice catch. --add-flags is about the mapping flag (the integer FLAG field), not attributes (tags), so my solution above was wrong.
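To make the difference concrete, here is a small sketch working on one SAM text record: the FLAG is the integer in column 2, where setting flags means OR-ing bits in, while a tag such as CB is an extra optional field appended after column 11. The helper names add_flag_bits and add_string_tag and the example record are hypothetical, and this only illustrates the two concepts, not how samtools implements them.

```python
def add_flag_bits(sam_line, bits):
    """Set bits in the FLAG field (column 2) of one SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    fields[1] = str(int(fields[1]) | bits)
    return "\t".join(fields)


def add_string_tag(sam_line, tag, value):
    """Append a string-typed optional field such as CB:Z:... to one SAM record."""
    return sam_line.rstrip("\n") + "\t" + tag + ":Z:" + value


# Invented record: reverse-strand read (FLAG 16) with no optional tags.
record = "r1\t16\tchr1\t100\t60\t6M\t*\t0\t0\tACGTAC\tIIIIII"

# Changing the FLAG: OR in 0x400 (the duplicate bit); no tag is added.
flagged = add_flag_bits(record, 0x400)

# Adding a tag: the FLAG stays as it was, a new trailing field appears.
tagged = add_string_tag(record, "CB", "AAAAAAAAAA")

print(flagged)
print(tagged)
```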