Question

How can I count the barcodes of aligned reads at each position?

4.5 years ago

naeem40thju ▴ 10

Hi, I have moved the 18-nt random barcode from the 5' end of each read into the FASTQ header, as shown below. Now I want to map the reads with Bowtie 2, skipping 18 nt from the 5' end and 10 nt from the 3' end. After aligning to the reference sequence, I want to use the SAM file to count the number of reads at each position and the number of unique barcodes among those reads. For example, I might get 100 reads at position 15, and those 100 reads might come from only 2 unique barcodes.

Does anyone have a Python script that does this, or can anyone help me do it? Thanks in advance.

@ST-E00205:943:HCF3YCCX2:4:1101:11495:1678 1:N:0:NCCACGCG+NGATCTCG CCAGCCCAAAGCCACCCG
ACCGGATGGTAGACCTGGAGGAGGGGAAAGCCGAGGTGGTGACGGGAGCGGCTGGGGGGGGAGTCCGGGATGGTAGGCGGAGCGGGCAGAGCACAGCAGCTCGTGTAGAAATGG
+
7-<--7--7-7F-----77----7---7-------------------7----77-7-----7------7---------7-7------7--7----77----------77-7---

@ST-E00205:943:HCF3YCCX2:4:1101:1012:1696 1:N:0:NCTTGACC+NGATCTCG CANCCTCCCAAGGCGCCC
AATAAACAGTTGCAGCCCCAGATCGGAAGAGCGGTTCAGCAGGATGCCCGAAAACGATTTGGTTTGTCTTCTCAGCATTGAAAAAAAATAAGAATTAAGGCTTAATTCGGAACA
+
-FJ<JJ-JAFJ-F-AF<AJJJ<AFJFFFJFJFJJFJ-FFJ<JJF--777-7----7-----------7-7-7--7---7--7A-7---7--7-------7--7-----------

next-gen sequencing • 1.5k views

ADD COMMENT • link updated 4.5 years ago by Pierre Lindenbaum 164k • written 4.5 years ago by naeem40thju ▴ 10

What do CCAGCCCAAAGCCACCCG and CANCCTCCCAAGGCGCCC in the fastq header represent?

Is the barcode still in the reads, or have you moved it to the header, so that the oligo shown above is the barcode?

ADD REPLY • link 4.5 years ago by GenoMax 148k

The barcode is no longer in the read. I moved it from the read to the header to keep a record of which barcode came from which read.

ADD REPLY • link 4.5 years ago by naeem40thju ▴ 10

Did you use UMI-tools to do that? Those UMIs are going to be very tricky to handle, since most aligners will simply drop or ignore them when they write the BAM files. They would have to be transferred to the alignment using a custom SAM tag.
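One way to keep the barcode attached to each alignment, sketched here under assumptions, is to move it into the read name itself before mapping. Aligners generally copy only the part of the FASTQ header before the first whitespace into the SAM QNAME field, so a barcode joined to the read name with an underscore (the convention UMI-tools uses for extracted UMIs) should survive alignment. The function names and the assumption that the barcode is the last whitespace-separated field of the header are illustrative, not taken from the thread.

```python
def barcode_to_read_name(header: str, sep: str = "_") -> str:
    """Rewrite '@NAME COMMENT BARCODE' as '@NAME_BARCODE COMMENT'.

    Assumption: the barcode is the LAST whitespace-separated field of the
    header, as in the FASTQ records shown in the question.
    """
    fields = header.rstrip("\n").split()
    if len(fields) < 2:
        # No extra field to move; return the header unchanged.
        return header.rstrip("\n")
    name, *middle, barcode = fields
    return " ".join([f"{name}{sep}{barcode}", *middle])


def rewrite_fastq(lines):
    """Yield FASTQ lines with the barcode moved into each read name.

    Assumes strict 4-line FASTQ records (header, sequence, '+', quality)
    with no blank lines between records.
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield barcode_to_read_name(line)
        else:
            yield line.rstrip("\n")
```

Applied to the first record in the question, the header would become `@ST-E00205:943:HCF3YCCX2:4:1101:11495:1678_CCAGCCCAAAGCCACCCG 1:N:0:NCCACGCG+NGATCTCG`, and the barcode can later be recovered from the SAM read name by splitting on the last underscore.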

"mapping skipping 18nt from 5'- end and 10nt from 3'-end using bowtie -2"

I don't understand this requirement. Most aligners are not going to have an option to let you do that. If you don't want those bases considered for alignment, you would need to remove them before aligning. bbduk.sh from the BBMap suite can hard-crop reads like that.
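For illustration, the hard crop described above can also be sketched in a few lines of Python. The function name and the default crop lengths (18 nt from the 5' end, 10 nt from the 3' end, taken from the question) are assumptions for this example. It may also be worth checking the Bowtie 2 manual, which documents `-5/--trim5` and `-3/--trim3` options for trimming bases from each read end before alignment.

```python
def crop_record(seq: str, qual: str, left: int = 18, right: int = 10):
    """Remove `left` bases from the 5' end and `right` bases from the 3' end.

    Returns the cropped (sequence, quality) pair, or two empty strings when
    the read is too short to leave any bases after cropping.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq) - right
    if end <= left:
        return "", ""
    # Crop sequence and quality identically so they stay in register.
    return seq[left:end], qual[left:end]
```

Reads that come back empty would need to be dropped (or filtered by a minimum length) before alignment.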

ADD REPLY • link 4.5 years ago by GenoMax 148k

Thanks for your advice. I've also found UMI-tools and am checking whether I can achieve my goal with it.

ADD REPLY • link 4.5 years ago by naeem40thju ▴ 10

Hi there. As far as I understand UMI-tools, it works very well for deduplicating reads. After deduplication, I can count the number of mapped reads at each position of the reference sequence. However, my aim is to count the number of UMIs at each position. For example, if I have 100 reads at a particular position and those 100 reads carry 1, 2, or 3 distinct UMIs, I would like to get that UMI count for each position. Any ideas, please? Thanks.

ADD REPLY • link 4.5 years ago by naeem40thju ▴ 10
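A minimal sketch of the per-position count asked about above, assuming the SAM file has NOT been deduplicated and that each read name ends in `_<UMI>` (an assumption about how the barcode was carried through alignment, not something the thread confirms). The function name is hypothetical. It reads plain SAM text, groups primary mapped alignments by reference name and POS, and reports the read count and the number of distinct UMIs for each group.

```python
from collections import defaultdict


def count_umis_per_position(sam_lines, sep="_"):
    """Return {(rname, pos): (n_reads, n_unique_umis)} from SAM text lines.

    Assumptions: the UMI is the part of QNAME after the last `sep`;
    positions are the SAM POS field (1-based leftmost mapped base).
    """
    reads = defaultdict(int)
    umis = defaultdict(set)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4 or flag & 0x900:
            continue  # skip unmapped, secondary, and supplementary records
        if sep not in qname:
            continue  # no UMI found in the read name
        key = (rname, pos)
        reads[key] += 1
        umis[key].add(qname.rsplit(sep, 1)[1])
    return {key: (reads[key], len(umis[key])) for key in sorted(reads)}


if __name__ == "__main__":
    import sys

    # Usage: python script.py < alignments.sam
    for (rname, pos), (n_reads, n_umis) in count_umis_per_position(sys.stdin).items():
        print(rname, pos, n_reads, n_umis, sep="\t")
```

Two caveats: POS is the leftmost coordinate, so for reverse-strand reads it is not the read's 5' end, and UMIs that differ only by a sequencing error are counted as distinct here, whereas UMI-tools can group near-identical UMIs.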