5.9 years ago
Rituriya
Hi All,
I have UMI-extracted miRNA reads that I want to analyse with miRDeep2 for known and novel miRNA discovery. But when I give these as input, mapper.pl collapses the reads drastically: 7 million reads come down to 1,200 reads, each with a count tag.
My questions:
1) Is there a way to retain the UMI information in the FASTQ header so that deduplication can be done later, after alignment by miRDeep2?
2) Has anyone successfully analyzed UMI-tagged miRNA data using miRDeep2?
Thank you, Pratibha.
You might need to align to a genome with a regular aligner, run something like UMI-tools on that alignment to deduplicate, then put the remaining reads through miRDeep2.
That was exactly my initial thought process, swbarnes2. But if I want to do that, miRDeep2.pl requires both a reads_collapsed.fa and an .arf file. Should I convert the bowtie output to ARF format using the command:
Let me try and see if it works downstream. Do let me know if you think otherwise.
Thanks, Rituriya.
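For the reads_collapsed.fa requirement, one possible workaround is to collapse the reads yourself so that each count reflects distinct UMIs rather than raw reads. The sketch below is only an illustration under assumptions: it assumes the UMI was appended to the read name after a final underscore (the convention used by `umi_tools extract`), and that collapsed IDs follow the `>seq_N_xCOUNT` pattern that mapper.pl appears to produce. The function names `read_fastq` and `collapse_by_umi` are made up for this example, and it does no error correction of UMIs.

```python
import io
from collections import defaultdict


def read_fastq(handle):
    """Yield (read_id, sequence) from a strict 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line (not needed here)
        yield header.strip()[1:].split()[0], seq


def collapse_by_umi(handle, prefix="seq"):
    """Return collapsed FASTA text; each count is the number of distinct UMIs.

    Assumption: the read ID ends in '_<UMI>'. If there is no underscore,
    the whole ID is treated as the UMI, so nothing gets deduplicated.
    """
    umis_per_seq = defaultdict(set)
    for read_id, seq in read_fastq(handle):
        umi = read_id.rsplit("_", 1)[-1]
        umis_per_seq[seq].add(umi)

    # Most abundant sequences first; ties broken by sequence for stable output.
    ordered = sorted(umis_per_seq.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    lines = []
    for index, (seq, umis) in enumerate(ordered):
        lines.append(f">{prefix}_{index}_x{len(umis)}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


# Toy input: r1 and r2 share sequence and UMI, so they count as one molecule.
example = io.StringIO(
    "@r1_AAAA\nTGAGGTAGTAGGTTGTATAGTT\n+\nIIIIIIIIIIIIIIIIIIIIII\n"
    "@r2_AAAA\nTGAGGTAGTAGGTTGTATAGTT\n+\nIIIIIIIIIIIIIIIIIIIIII\n"
    "@r3_CCCC\nTGAGGTAGTAGGTTGTATAGTT\n+\nIIIIIIIIIIIIIIIIIIIIII\n"
    "@r4_AAAA\nTAGCTTATCAGACTGATGTTGA\n+\nIIIIIIIIIIIIIIIIIIIIII\n"
)
collapsed = collapse_by_umi(example)
print(collapsed, end="")
```

With the toy input, the first sequence is reported with a count of 2 (two distinct UMIs) and the second with a count of 1. Whether miRDeep2 accepts a file built this way as its collapsed-reads input is something to verify on a small test set first.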
I found bwa_sam_converter.pl to be exactly what I need, but it does not retain the UMI tag in the header (so I am back to square one, where I will have to add it myself with a script). I am also unable to run the Perl script:
It always throws an error: Sam file not found. I checked, and the file is definitely there. Further, I learnt from this link that not all SAM files will work, but I need to know which SAM fields need to be present or absent. Any idea?
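On the UMI side of this, the tag can usually be recovered from the SAM read name itself, since the aligner copies the FASTQ read ID into the QNAME column. The following is a rough sketch rather than a replacement for UMI-tools: it assumes the UMI sits after the last underscore of QNAME, checks only that the 11 mandatory SAM columns are present, and groups reads by the leftmost mapping position (a proper deduplicator would use the 5' end for reverse-strand reads). The function name `count_molecules` is hypothetical, and the sketch says nothing about which fields bwa_sam_converter.pl itself expects.

```python
from collections import defaultdict


def count_molecules(sam_lines):
    """Count distinct UMIs per (reference, position, strand) from SAM text lines."""
    umis = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            raise ValueError(f"expected at least 11 SAM columns, got {len(fields)}")
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:
            continue  # read is unmapped
        strand = "-" if flag & 0x10 else "+"
        # Assumption: read names look like '<original_id>_<UMI>'.
        umi = qname.rsplit("_", 1)[-1]
        umis[(rname, pos, strand)].add(umi)
    return {key: len(value) for key, value in umis.items()}


# Toy alignments: r1 and r2 are PCR duplicates (same position, same UMI).
sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1_AAAA\t0\tchr1\t100\t255\t22M\t*\t0\t0\tTGAGGTAGTAGGTTGTATAGTT\tIIIIIIIIIIIIIIIIIIIIII",
    "r2_AAAA\t0\tchr1\t100\t255\t22M\t*\t0\t0\tTGAGGTAGTAGGTTGTATAGTT\tIIIIIIIIIIIIIIIIIIIIII",
    "r3_CCCC\t0\tchr1\t100\t255\t22M\t*\t0\t0\tTGAGGTAGTAGGTTGTATAGTT\tIIIIIIIIIIIIIIIIIIIIII",
    "r4_AAAA\t16\tchr2\t500\t255\t22M\t*\t0\t0\tTCAACATCAGTCTGATAAGCTA\tIIIIIIIIIIIIIIIIIIIIII",
    "r5_GGGG\t4\t*\t0\t0\t*\t*\t0\t0\tTGAGGTAGTAGGTTGTATAGTT\tIIIIIIIIIIIIIIIIIIIIII",
]
molecule_counts = count_molecules(sam)
print(molecule_counts)
```

Here the three reads at chr1:100 collapse to two molecules, the reverse-strand read on chr2 counts as one, and the unmapped read is ignored.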