Question

Tool to analyse bulk RNAseq data with UMIs

3

5.2 years ago

Sara ▴ 270

I have bulk RNAseq data from a protocol that also used UMIs. I am looking for a tool that can deal with UMIs in bulk RNAseq data but have not found any (they all seem to be made for single-cell RNAseq). Is there a tool available for working with bulk RNAseq data with UMIs?

next-gen • 5.8k views

Updated 5.2 years ago by i.sudbery 20k • written 5.2 years ago by Sara ▴ 270

GenoMax · Answer 1 · 2019-10-31

5

5.2 years ago

i.sudbery 20k

UMI-tools can handle any UMI-tagged sequencing data where deduplication happens after mapping: https://umi-tools.readthedocs.io/en/latest/index.html

The process is to extract the UMIs from the read sequence and add them to the read names. There are two ways to do this, and between them they provide the flexibility to handle any read configuration I can think of (see https://umi-tools.readthedocs.io/en/latest/regex.html).
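A minimal sketch of what the extraction step does conceptually, written in plain Python rather than with UMI-tools itself. The function names, the 6-base UMI and the example read are made up for illustration. The pattern convention (N marks a UMI base, X a base left in the read) and the `umi_1`/`discard_1` group names are how I understand the UMI-tools documentation, so check the pages linked above before relying on them.

```python
import re

def extract_by_pattern(name, seq, qual, pattern):
    """Move bases under 'N' in `pattern` to the read name; keep bases under 'X'."""
    n = len(pattern)
    if len(seq) < n:
        raise ValueError("read shorter than barcode pattern")
    umi = "".join(b for b, p in zip(seq, pattern) if p == "N")
    kept = [i for i, p in enumerate(pattern) if p == "X"]
    new_seq = "".join(seq[i] for i in kept) + seq[n:]
    new_qual = "".join(qual[i] for i in kept) + qual[n:]
    return f"{name}_{umi}", new_seq, new_qual

def extract_by_regex(name, seq, qual, regex):
    """Same idea with a regex: group umi_1 is the UMI, group discard_1 is dropped."""
    m = re.match(regex, seq)
    if m is None:
        return None  # read does not match the expected layout
    umi = m.group("umi_1")
    drop = set(range(*m.span("umi_1"))) | set(range(*m.span("discard_1")))
    new_seq = "".join(b for i, b in enumerate(seq) if i not in drop)
    new_qual = "".join(q for i, q in enumerate(qual) if i not in drop)
    return f"{name}_{umi}", new_seq, new_qual

# Hypothetical read: 6-base UMI, fixed GGG linker, then the cDNA insert.
read = ("read1", "ACGTTTGGGCATCATCAT", "I" * 18)
by_pattern = extract_by_pattern(*read, pattern="NNNNNN")
by_regex = extract_by_regex(*read, regex=r"(?P<umi_1>.{6})(?P<discard_1>GGG)")
print(by_pattern)  # UMI moved to the name, linker still in the read
print(by_regex)    # UMI moved to the name, linker removed
```

The fixed pattern is enough when the UMI always sits at the same offset; the regex form is the one to reach for when there are linkers or other fixed sequences to strip out.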

You then map your reads with your favourite mapper.

The next step depends on whether your technique fragments the cDNA before or after PCR. If fragmentation happens after PCR, then the next step is to assign reads to features (e.g. genes) using featureCounts. If PCR happens after fragmentation, then you do the read assignment/quantification after deduping.

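Then you group/dedup/count (depending on your downstream application). If fragmentation happens after PCR, then you need to do this on a per-gene basis, because copies of the same original molecule can be fragmented at different points and so map to different positions.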
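To make the two cases concrete, here is a toy sketch in plain Python (not UMI-tools code). The reads are hypothetical (gene, mapping position, UMI) tuples, and duplicates are defined by exact UMI match only, whereas UMI-tools can also merge UMIs that differ by sequencing errors.

```python
from collections import defaultdict

# Hypothetical reads after mapping and gene assignment: (gene, position, UMI).
reads = [
    ("geneA", 100, "ACGT"),
    ("geneA", 100, "ACGT"),  # same position, same UMI: duplicate in both modes
    ("geneA", 250, "ACGT"),  # same UMI, different position
    ("geneA", 100, "TTAG"),
    ("geneB", 900, "ACGT"),
]

def count_molecules(reads, per_gene):
    """Count unique molecules per gene.

    per_gene=False: duplicates share position and UMI; this suits protocols
    where PCR happens after fragmentation.
    per_gene=True: duplicates share gene and UMI regardless of position; this
    suits protocols where fragmentation happens after PCR.
    """
    seen = defaultdict(set)
    for gene, pos, umi in reads:
        key = umi if per_gene else (pos, umi)
        seen[gene].add(key)
    return {gene: len(keys) for gene, keys in seen.items()}

by_position = count_molecules(reads, per_gene=False)
by_gene = count_molecules(reads, per_gene=True)
print(by_position)
print(by_gene)
```

In per-gene mode the read at position 250 is treated as another copy of the molecule seen at position 100, which is why reads have to be assigned to genes before deduplication in that case.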

Written 5.2 years ago by i.sudbery 20k

0

Hi, I have collected single-end HTS data for the E. coli ribosome (full) on the Illumina platform. I find UMI-tools very interesting and useful. I used an 18 nt random barcode at the 5' end to deal with read duplication. I want to count the number of UMIs and reads at each position after mapping to a reference sequence. I have read the UMI-tools manual but couldn't figure out a solution. Could you please suggest how I can proceed? Below is an example showing what I am aiming for and how much I have understood:

Say I have used UMI-tools to extract the 18 nt random barcode from the 5' end of each read and append it to the read name ('_' separated), as in the example below. I will then map the reads to the reference sequence using Bowtie 2. From the SAM/BAM file, I want to count the number of reads at each position of the reference and the number of unique barcodes among those reads. In other words, I want the number of molecules at each position and their UMIs. For example, if 100 reads map at position 15 and those 100 reads contain 75 unique barcodes, I want to get the number of reads (100) and the number of unique barcodes (75) for that position.

@ST-E00205:943:HCF3YCCX2:4:1101:11495:1678_CCAGCCCAAAGCCACCCG 1:N:0:NCCACGCG+NGATCTCG
ACCGGATGGTAGACCTGGAGGAGGGGAAAGCCGAGGTGGTGACGGGAGCGGCTGGGGGGGGAGTCCGGGATGGTAGGCGGAGCGGGCAGAGCACAGCAGCTCGTGTAGAAATGG
+
7-<--7--7-7F-----77----7---7-------------------7----77-7-----7------7---------7-7------7--7----77----------77-7---
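The counting described above could be sketched roughly as follows, in plain Python working on SAM text lines. This is an illustration under assumptions, not a UMI-tools feature: the function name and the example records are hypothetical, UMIs are compared by exact match only, and the position used is the SAM POS column (1-based leftmost mapped base), which is not the 5' end of a read aligned to the reverse strand.

```python
from collections import defaultdict

def umi_counts_per_position(sam_lines):
    """Return {(reference, position): (n_reads, n_unique_umis)} from SAM text lines."""
    reads = defaultdict(int)
    umis = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        name, flag, ref, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 4:
            continue  # unmapped read
        umi = name.rsplit("_", 1)[-1]  # UMI is the part after the last underscore
        reads[(ref, pos)] += 1
        umis[(ref, pos)].add(umi)
    return {key: (reads[key], len(umis[key])) for key in reads}

# Hypothetical SAM records; only the first four columns matter here.
example = [
    "@SQ\tSN:rRNA\tLN:1542",
    "r1_AAAC\t0\trRNA\t15\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2_AAAC\t0\trRNA\t15\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r3_GGTA\t0\trRNA\t15\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r4_GGTA\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
counts = umi_counts_per_position(example)
print(counts)
```

Here three mapped reads at position 15 carry two distinct UMIs, so the entry for that position holds 3 reads and 2 unique barcodes; the unmapped fourth read is ignored.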

Updated 4.5 years ago by GenoMax 148k • written 4.5 years ago by naeem40thju ▴ 10

0

This is a separate question. Could you please start a new post?

Written 4.5 years ago by i.sudbery 20k

0

Okay, thank you very much.

Written 4.5 years ago by naeem40thju ▴ 10

score 1 · Answer 2 · 2019-10-30

1

5.2 years ago

swbarnes2 14k

Have you looked at umi_tools?

Written 5.2 years ago by swbarnes2 14k

0

@swbarnes2: I think umi_tools is only for scRNAseq, right?

Written 5.2 years ago by Sara ▴ 270

2

scRNA-seq data is still ordinary sequence data. Depending on the scheme you are using for your UMIs, you should be able to apply umi_tools. See the FAQ for examples of regular expressions you can use.

Written 5.2 years ago by GenoMax 148k

1

UMI-tools was actually first created to analyse iCLIP data! There is absolutely no reason it shouldn't work with bulk RNA-seq; in fact, we are analysing some UMI-tagged bulk RNA-seq data with it ourselves right now.

Written 5.2 years ago by i.sudbery 20k

1

It pulls the UMI out of a read and puts it in the read name; that's not specific to scRNAseq.

Written 5.2 years ago by swbarnes2 14k