Identifying UMIs from fastq files
6.5 years ago
hothriananya ▴ 70

Hi, I am looking for software that performs similarly to QIAGEN's GeneGlobe to identify UMI sequences in miRNA sequencing data. I am also trying to understand the difference between unique molecular identifiers and unique molecular indexes. Do both terms refer to the same thing? I searched around, but the answers I found left me confused. Any suggestions for software or references to papers would be appreciated. Thanks in advance, Hothri

sequencing miRNA seq • 6.5k views

Thank you for the suggestion. I have used fastp, but it doesn't report the number of reads with UMIs. I was confused by UMI-tools because it asks for a whitelist of barcodes to identify UMIs, and we don't have such a list. Also, I am looking for UMIs, so how would a list of barcodes help to identify them?

Generally speaking, Unique Molecular Identifiers (UMIs), Unique Molecular Indexes (UMIs) and Random Molecular Tags (RMTs) refer to the same thing. However, as they are not well-defined terms, it is possible you will find someone using them for different purposes.

For future reference: if the OP means Unique Molecular Indices (UMIs) and sample barcodes, then these are different things. UMIs are also called molecular barcodes, and they serve a different purpose than sample barcodes. Sample barcodes are used for demultiplexing, whereas molecular barcodes are used to keep track of PCR duplicates. This link by Qiagen explains it well.

Do you know the library prep with which the sequencing was performed?

Libraries were prepared with the QIAseq miRNA kit. They have their own software, called GeneGlobe, for identifying various small RNAs; it also gives the number of reads with UMIs.

UMI-Tools will extract the UMIs from reads and then, once the reads are mapped, can be used to deduplicate based on the UMIs in an error-correcting manner. But what I can't find in the QIAGEN material is how long the UMI is and which end of the read it is on.
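
To make "deduplicate in an error-correcting manner" concrete, here is a minimal Python sketch of the general idea: among reads mapped to the same position, a rare UMI that differs by one base from a much more abundant UMI is treated as a sequencing or PCR error of that UMI rather than as a separate molecule. This is a simplified illustration, not UMI-Tools' own code; the function names `hamming` and `collapse_umis` and the example UMIs are made up for this sketch, and the abundance rule (parent count at least twice the child count minus one) is an assumption about a reasonable threshold.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umis):
    """Group the UMIs seen at ONE mapping position into likely source molecules.

    A UMI is absorbed into a more abundant UMI when the two differ by exactly
    one base and count(parent) >= 2 * count(child) - 1 (assumed threshold).
    Returns a dict mapping each representative UMI to its total read count.
    """
    counts = Counter(umis)
    # Visit UMIs from most to least abundant (ties broken alphabetically).
    ordered = sorted(counts, key=lambda u: (-counts[u], u))
    assigned = set()
    groups = {}
    for seed in ordered:
        if seed in assigned:
            continue
        assigned.add(seed)
        total = counts[seed]
        stack = [seed]
        while stack:
            parent = stack.pop()
            for child in ordered:
                if child in assigned or len(child) != len(parent):
                    continue
                if (hamming(parent, child) == 1
                        and counts[parent] >= 2 * counts[child] - 1):
                    assigned.add(child)
                    total += counts[child]
                    stack.append(child)
        groups[seed] = total
    return groups


# Toy input: 8 reads of one UMI, 1 read that is one mismatch away, 3 unrelated.
reads_at_one_position = ["ACGTACGTAC"] * 8 + ["ACGTACGTAT"] + ["TTTTGGGGCC"] * 3
print(collapse_umis(reads_at_one_position))
```

In this toy input the single `ACGTACGTAT` read is merged into `ACGTACGTAC`, so the position is counted as two molecules rather than three. For real data, the mapped reads would first have to be grouped by mapping position.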

Manual says

The reverse-transcription (RT) primer contains an integrated UMI.

Yeah, but which end is the sequencing adaptor on? And how many bases is the UMI?

Adapters are on both ends, and the UMIs are added during reverse transcription. UMIs are 10 bases in QIAseq.

But the manual says to only do single-end sequencing.

Yes, from there:

Next-generation sequencing on Illumina NGS systems: miRNA sequencing libraries prepared with the QIAseq miRNA Library Kit can be sequenced using an Illumina NGS system (MiSeq® Personal Sequencer, NextSeq 500, HiSeq® 1000, HiSeq 1500, HiSeq 2000, HiSeq 2500 and GAIIx). QIAseq miRNA Library Kit derived libraries require 75 bp single reads. A 50 bp single read protocol can be used if there is not a desire to include the UMIs.

Second read could be ignored then. Hopefully sequencing was done for 75 bp.
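
For the "number of reads with UMIs" that the original question asks about, a rough Python sketch along these lines could be a starting point. It rests on assumptions that should be checked against the kit handbook: that in a 75 bp single read the UMI sits immediately after the 3' adapter (which would fit the manual's note that 50 bp reads do not include the UMIs), that the adapter sequence in `ADAPTER` is the right one for this kit, and that the UMI length is the 10 bases mentioned earlier in this thread. The function names are made up for this example.

```python
from collections import Counter

# ASSUMPTION: 3' adapter sequence; verify against the kit handbook before use.
ADAPTER = "AACTGTAGGCACCATCAAT"
# UMI length as stated earlier in this thread; adjust if the handbook differs.
UMI_LEN = 10


def read_fastq(lines):
    """Yield (name, sequence) from FASTQ lines, assuming 4 lines per record.

    An incomplete trailing record is silently dropped.
    """
    for header, seq, _plus, _qual in zip(*[iter(lines)] * 4):
        yield header.strip()[1:], seq.strip()


def extract_umi(seq, adapter=ADAPTER, umi_len=UMI_LEN):
    """Return (insert, umi), or None if no adapter or no full-length UMI is found."""
    pos = seq.find(adapter)  # exact match only; no mismatch tolerance
    if pos == -1:
        return None
    start = pos + len(adapter)
    umi = seq[start:start + umi_len]
    if len(umi) < umi_len or "N" in umi:
        return None
    return seq[:pos], umi


def count_umi_reads(lines, adapter=ADAPTER, umi_len=UMI_LEN):
    """Return (total reads, reads with a UMI, Counter of UMI sequences)."""
    total = with_umi = 0
    umis = Counter()
    for _name, seq in read_fastq(lines):
        total += 1
        hit = extract_umi(seq, adapter, umi_len)
        if hit is not None:
            with_umi += 1
            umis[hit[1]] += 1
    return total, with_umi, umis


# Toy records: one read long enough to reach the UMI, one that stops early.
long_read = "TGAGGTAGTAGGTTGTATAGTT" + ADAPTER + "ACGTACGTAC" + "GGGG"
short_read = "TGAGGTAGTAGGTTGTATAGTT"
example = [
    "@read1", long_read, "+", "I" * len(long_read),
    "@read2", short_read, "+", "I" * len(short_read),
]
total, with_umi, umis = count_umi_reads(example)
print(total, with_umi, dict(umis))
```

On real data the same function can be given an open file handle, for example one from `gzip.open(path, "rt")` for a compressed FASTQ. Because the adapter is matched exactly, reads with a sequencing error in the adapter are counted as having no UMI, so the result is a lower bound.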
