Question

Extracting UMIs from NEXTFLEX Small RNA-Seq reads using UMI_tools

0

10 months ago

MH85 ▴ 20

Hello, everyone

I have single-end miRNA reads from a library prepared with the NEXTFLEX Small RNA-Seq Kit. After trimming the Illumina adaptors, the read structure is:

| UMI 1 (4 nt) | miRNA | UMI 2 (4 nt) | Adaptor | Remaining sequence |

I want to use `umi_tools extract` to add the UMI sequences to the read name. Can I achieve this with the following regex?

`--bc-pattern='(?P<umi_1>.{4}).+(?P<umi_2>.{4})(?P<discard_1>TGGAATTCTCGGGTGCCAAGG){s<=1}(?P<discard_2>.+)'`

Thanks a lot in advance

smallRNAseq UMI NEXTflex UMI_tools miRNA • 1.2k views

ADD COMMENT • link updated 9 months ago by i.sudbery 21k • written 10 months ago by MH85 ▴ 20

score 2 · Answer 1 · 2024-06-18

2

10 months ago

Kevin ▴ 100

A few words of caution: make sure you're using the v3 kit, as the v4 NEXTFLEX kit doesn't use randomized bases. Also, keep in mind that the distribution of the 8 randomized bases (8N) for a given miRNA is non-random, since each miRNA has a preference for which subsequences it ligates with, so be careful when using this sequence as a UMI.

ADD COMMENT • link 10 months ago by Kevin ▴ 100

1

Thank you very much! I wasn't aware that version 4 doesn't use randomized bases.

> each miRNA exhibits a preference regarding which subsequences it ligates with

Isn't this a common issue with any ligation-based protocol? We currently use the QIAseq small RNA kit and have also observed ligation bias for certain miRNAs.

ADD REPLY • link 10 months ago by MH85 ▴ 20

1

Yes, every small RNA-seq method introduces bias, since miRNAs have their own secondary structures (which may block the miRNA ends), and there's also miRNA-adapter cofolding to consider. I just meant that the randomized bases themselves can't be assumed to follow a random distribution, since each miRNA has its own preference for particular "random" bases. Qiagen adds its UMIs during reverse transcription, so those UMIs should be closer to randomly distributed.
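One rough way to see whether the randomized bases look random for a given miRNA is to tabulate base frequencies at each UMI position. The sketch below is only an illustration: `base_frequencies` is a hypothetical helper, the UMI list is made up, and it assumes you have already collected the extracted UMIs for the reads assigned to one miRNA. Under a uniform distribution each base would sit near 0.25 at every position.

```python
from collections import Counter


def base_frequencies(umis):
    """Per-position base frequencies for a list of equal-length UMI strings."""
    length = len(umis[0])
    freqs = []
    for pos in range(length):
        counts = Counter(umi[pos] for umi in umis)
        total = sum(counts.values())
        freqs.append({base: counts[base] / total for base in "ACGT"})
    return freqs


# Made-up UMIs for reads assigned to a single miRNA (illustration only).
umis = ["ACGTGGCA", "ACGTTGCA", "ACTTGGCA", "GCGTGGCA"]
for pos, freq in enumerate(base_frequencies(umis)):
    print(pos, freq)
```

Strong skew at some positions, relative to other miRNAs in the same library, would be consistent with the ligation preference described above, although this check alone cannot separate ligation bias from PCR duplication.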

ADD REPLY • link 9 months ago by Kevin ▴ 100

1

To be fair, I've never seen a UMI dataset where the UMI usage looks random.

ADD REPLY • link 9 months ago by i.sudbery 21k

score 1 · Answer 2 · 2024-06-18

1

10 months ago

i.sudbery 21k

Yes, that should work.
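If you want to sanity-check the pattern on a few reads before running it on the whole file, something like the sketch below may help. It is not how UMI-tools itself is implemented: it uses Python's built-in `re` module, which does not support the fuzzy `{s<=1}` suffix, so the adaptor is matched exactly here. The `split_read` helper and the example read are made up for illustration, and the sketch assumes the two UMI groups are concatenated in order.

```python
import re

# Adaptor sequence taken from the pattern in the question.
ADAPTOR = "TGGAATTCTCGGGTGCCAAGG"

# Exact-match variant of the pattern. The fuzzy {s<=1} suffix is dropped
# because the built-in re module does not support it.
PATTERN = re.compile(
    r"(?P<umi_1>.{4}).+(?P<umi_2>.{4})"
    rf"(?P<discard_1>{ADAPTOR})(?P<discard_2>.+)"
)


def split_read(seq):
    """Return (umi, insert) for a read, or None if the pattern does not match."""
    m = PATTERN.match(seq)
    if m is None:
        return None
    # Assumption: the UMI groups are joined in numerical order.
    umi = m.group("umi_1") + m.group("umi_2")
    # Bases between the two UMI groups (the miRNA insert) are what remains.
    insert = seq[m.end("umi_1"):m.start("umi_2")]
    return umi, insert


# Made-up read: 4 nt UMI + insert + 4 nt UMI + adaptor + trailing bases.
read = "ACGT" + "TGAGGTAGTAGGTTGTATAGTT" + "GGCA" + ADAPTOR + "AACTCC"
print(split_read(read))  # ('ACGTGGCA', 'TGAGGTAGTAGGTTGTATAGTT')
```

Note that because `discard_2` is written as `.+`, a read that ends exactly at the last base of the adaptor does not match in this sketch; `.*` would also accept such reads.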

ADD COMMENT • link 10 months ago by i.sudbery 21k