UMI read consensus calling
4.8 years ago
SemiQuant ▴ 80

Hi

I have amplicon sequence data where I included a UMI (unique molecular identifier) on my reads to allow me to correct sequencing errors. I have removed the UMIs from the reads and added them to a tag in the fastq files. I have then aligned the reads to my reference and would now like to make consensus reads for those with the same UMI, i.e., that arose from the same DNA molecule. The sequence data is very noisy and there are many indels in the reads.

I have tried fgbio, but it cannot handle indels. I have also tried gencore, which is designed for paired-end data but should work with UMIs on single-end reads; however, it did nothing to the data, even on the least stringent settings. Does anyone know of a tool that can do what I need?
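To make the goal concrete, here is a minimal Python sketch of the two steps described above: grouping reads by UMI and taking a per-column majority vote within each group. It is an illustration under stated assumptions, not a replacement for a dedicated tool. It assumes the UMI is the text after the last underscore in the read name (the convention UMI-tools `extract` uses), and that the reads in each group have already been gap-padded to equal length by some multiple-alignment step, which is the hard part with indel-rich data. The function names `group_by_umi` and `majority_consensus` are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_umi(records):
    """Group sequences by UMI.

    `records` is an iterable of (read_name, sequence) pairs.
    ASSUMPTION: the UMI is the text after the last '_' in the read name.
    """
    groups = defaultdict(list)
    for name, seq in records:
        umi = name.rsplit("_", 1)[-1]
        groups[umi].append(seq)
    return dict(groups)


def majority_consensus(aligned_seqs, gap="-"):
    """Column-wise majority vote over gap-padded sequences of equal length.

    ASSUMPTION: the sequences were aligned to each other beforehand, so each
    column is comparable. Columns where the gap character wins are dropped,
    which removes minority insertions and keeps majority bases.
    """
    if not aligned_seqs:
        return ""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all sequences must be padded to the same length")
    consensus = []
    for column in zip(*aligned_seqs):
        base, _count = Counter(column).most_common(1)[0]
        if base != gap:
            consensus.append(base)
    return "".join(consensus)
```

With three reads from one UMI padded as `AC-GT`, `ACTGT` and `AC-GA`, the single-read insertion and the single-read substitution are both outvoted. Ties are resolved arbitrarily here (first base seen wins), and base qualities are ignored; a real caller would want to handle both.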

next-gen umi unique molecular identifier nanopore • 3.0k views
4.8 years ago

You might try looking at Calib (https://academic.oup.com/bioinformatics/article-abstract/35/11/1829/5142725). We have applied for funding to employ someone to implement this in UMI-tools, but as of now it is not implemented.


Thanks, but Calib can only deal with paired-end reads (I wasn't clear in my initial question), so I can't use it without a lot of customization (or maybe I could filter by length and then just split the FASTQs?). I hope you get the funding to implement it in UMI-tools; I'm sure it would be used a lot with the increase in nanopore assays!
