6.5 years ago
hothriananya
Hi, I am looking for software that performs similarly to QIAGEN's GeneGlobe software to identify UMI sequences in miRNA sequencing data. I am also trying to understand the difference between unique molecular identifiers and unique molecular indexes. Do both terms refer to the same thing? I googled and searched a bit, but all the answers left me confused. Any suggestions of software or references to papers would be appreciated. Thanks in advance, Hothri
Past thread that may help: Use fastp to preprocess FASTQ data with unique molecular identifer (UMI) integrated
There is also UMI-tools from @Ian Sudbery: https://github.com/CGATOxford/UMI-tools
Thank you for the suggestion. I have used fastp, but it doesn't give the number of reads with UMIs. I was confused by UMI-tools, as it asks for a whitelist of barcodes for identifying UMIs and we don't have that list. Also, I am looking for UMIs, so how would a list of barcodes help to identify them?
Generally speaking, Unique Molecular Identifiers (UMIs), Unique Molecular Indexes (UMIs) and Random Molecular Tags (RMTs) refer to the same thing. However, as these are not well-defined terms, it is possible you will find someone using them for different purposes.
For future reference, if OP here means Unique Molecular Indices (UMIs) versus sample barcodes, then those are different things. UMIs are also called molecular barcodes, and they serve a different purpose than sample barcodes. Sample barcodes are used for demultiplexing, whereas molecular barcodes are used to keep track of PCR duplicates. This link by Qiagen explains it well.
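To make the distinction concrete, here is a minimal Python sketch with made-up toy reads. The barcode, UMI and insert sequences are all invented for illustration, and the function names are hypothetical. It only shows the two roles: the sample barcode decides which sample a read belongs to, while the UMI decides whether two identical reads within a sample came from the same original molecule.

```python
from collections import defaultdict

# Toy reads as (sample_barcode, umi, insert_sequence); all values are made up.
reads = [
    ("ACGT", "AAACCCGGGT", "TGAGGTAGTAGGTTGTATAGTT"),
    ("ACGT", "AAACCCGGGT", "TGAGGTAGTAGGTTGTATAGTT"),  # PCR duplicate of the first read
    ("ACGT", "TTTGGGCCCA", "TGAGGTAGTAGGTTGTATAGTT"),  # same miRNA, different molecule
    ("TGCA", "AAACCCGGGT", "TGAGGTAGTAGGTTGTATAGTT"),  # same UMI, but another sample
]


def demultiplex(reads):
    """Group reads by sample barcode (the job of the sample barcode)."""
    by_sample = defaultdict(list)
    for sample_barcode, umi, insert in reads:
        by_sample[sample_barcode].append((umi, insert))
    return dict(by_sample)


def count_molecules(sample_reads):
    """Count distinct (UMI, insert) pairs, i.e. collapse exact PCR duplicates."""
    return len(set(sample_reads))


by_sample = demultiplex(reads)
molecule_counts = {sample: count_molecules(r) for sample, r in by_sample.items()}
print(molecule_counts)
```

Sample "ACGT" has three reads but only two distinct molecules, which is the kind of correction UMIs are meant to provide.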
Do you know the library prep with which the sequencing was performed?
Libraries were prepared with the QIAseq miRNA kit. They have their own software called GeneGlobe for identifying various small RNAs, which also gives the number of reads with UMIs.
UMI-tools will extract the UMIs from reads, and then, once the reads are mapped, it can be used to deduplicate based on the UMIs in an error-correcting manner. But what I can't find in the QIAGEN material is how long the UMI is and which end of the read it is on.
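For anyone wondering what "error-correcting" deduplication means in practice, below is a rough Python sketch. It is not the algorithm UMI-tools uses; it is a simplified greedy stand-in that merges a UMI into a more abundant one when they differ by at most one base, on the assumption that the rarer UMI is a sequencing or PCR error. The function names, the toy UMIs and the `max_distance=1` threshold are all assumptions made for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("UMIs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umis, max_distance=1):
    """Greedily merge UMIs seen at one mapping position.

    UMIs are visited from most to least abundant; each one is merged into the
    first already-kept UMI within max_distance mismatches, otherwise it is
    kept as a new representative. Returns {representative_umi: read_count}.
    """
    representatives = {}
    for umi, n in Counter(umis).most_common():
        for rep in representatives:
            if hamming(umi, rep) <= max_distance:
                representatives[rep] += n
                break
        else:
            representatives[umi] = n
    return representatives


# Toy UMIs from reads mapped to the same position (made-up data).
umis_at_one_position = ["AAACCCGGGT"] * 5 + ["AAACCCGGGA"] + ["TTTGGGCCCA"] * 2
collapsed = collapse_umis(umis_at_one_position)
print(collapsed)
```

Here the single "AAACCCGGGA" read is absorbed into "AAACCCGGGT", so eight reads are counted as two molecules rather than three.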
Manual says
Yeah, but which end is the sequencing adaptor on? And how many bases is the UMI?
Adapters are on both ends and the UMIs are added during reverse transcription. UMIs are 10 bases in QIAseq.
But the manual says to only do single-end sequencing.
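Since the original question was about getting the number of reads with UMIs, here is a hedged Python sketch of how that count could be obtained once the read layout is known. It assumes a layout of insert, then adapter, then UMI within a single read, which this thread has not confirmed. The adapter string below is a made-up placeholder and not the real kit adapter, and the function names are hypothetical. The UMI length of 10 is taken from the statement above.

```python
def split_read(sequence, adapter, umi_length=10):
    """Return (insert, umi), or None if no adapter plus full-length UMI is found.

    Assumed layout (not confirmed for any specific kit): insert + adapter + UMI.
    """
    start = sequence.find(adapter)
    if start == -1:
        return None
    umi_start = start + len(adapter)
    umi = sequence[umi_start:umi_start + umi_length]
    # Reject truncated UMIs and UMIs containing ambiguous base calls.
    if len(umi) < umi_length or "N" in umi:
        return None
    return sequence[:start], umi


def count_reads_with_umi(sequences, adapter, umi_length=10):
    """Return (reads_with_umi, total_reads) for an iterable of read sequences."""
    total = with_umi = 0
    for seq in sequences:
        total += 1
        if split_read(seq, adapter, umi_length) is not None:
            with_umi += 1
    return with_umi, total


ADAPTER = "GATTACAGATTACA"  # placeholder sequence, NOT the real kit adapter
toy_reads = [
    "TGAGGTAGTAGGTTGTATAGTT" + ADAPTER + "AAACCCGGGT" + "ACGTAC",  # full UMI
    "TGAGGTAGTAGGTTGTATAGTT" + ADAPTER + "AAAC",                   # read too short
    "TGAGGTAGTAGGTTGTATAGTTCCCCCCCC",                              # no adapter
]
print(count_reads_with_umi(toy_reads, ADAPTER))
```

Under this assumed layout the read has to be long enough to run through the insert and the adapter and still cover the whole UMI, otherwise the read is counted as having no UMI.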
https://www.qiagen.com/gb/resources/resourcedetail?id=e6795a4b-4aba-4ed3-82f9-4be98f885c1c&lang=en from this kit manual?
Yes, from there:
Second read could be ignored then. Hopefully sequencing was done for 75 bp.