How to remove miRNA duplicates and obtain read count
2.4 years ago
khq5801 ▴ 10

Hi

I would like to remove duplicate reads from my next-generation miRNA sequencing data and obtain read counts. I used the cd-hit-dup command to remove the duplicates, but I keep getting this error: cd-hit-dup: cdhit-dup.cxx:193: int HashingDepth(int, int): Assertion 'len >= min' failed

I would really appreciate any suggestions.

miRNA CD-HIT-DUP NGS CD-HIT • 784 views

What kind of input are you using? FASTA? Are you simply interested in removing duplicates? Your reads are only 16 to 40 bp long, which is most likely why CD-HIT is generating that error.

If you only want to deduplicate the data, try dedupe.sh from the BBMap suite. If you also want counts (and other things), use clumpify.sh; see "Introducing Clumpify: Create 30% Smaller, Faster Gzipped Fastq Files. And remove duplicates."
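If you would rather check the result independently of any tool, exact-duplicate collapsing with counts is simple enough to sketch in plain Python. This is only a minimal illustration, not what dedupe.sh or clumpify.sh do internally: it assumes a FASTA input, treats two reads as duplicates only when their sequences match exactly (ignoring case), and does not consider reverse complements or UMIs. The function names and the `seqN_xCOUNT` header format are made up for this example.

```python
from collections import Counter
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def collapse_reads(handle):
    """Return a Counter mapping each unique sequence to its read count."""
    counts = Counter()
    for _, seq in read_fasta(handle):
        counts[seq.upper()] += 1  # exact match only, case-insensitive
    return counts


def write_collapsed(counts, out):
    """Write one FASTA record per unique sequence, most abundant first.

    The header format (seqN_xCOUNT) is an arbitrary choice for this sketch.
    """
    for i, (seq, n) in enumerate(counts.most_common(), start=1):
        out.write(f">seq{i}_x{n}\n{seq}\n")


if __name__ == "__main__":
    # Tiny in-memory example; replace with open("reads.fa") for a real file.
    example = io.StringIO(
        ">r1\nTGAGGTAGTAGGTTGTATAGTT\n"
        ">r2\nTGAGGTAGTAGGTTGTATAGTT\n"
        ">r3\nTAGCTTATCAGACTGATGTTGA\n"
    )
    collapsed = collapse_reads(example)
    result = io.StringIO()
    write_collapsed(collapsed, result)
    print(result.getvalue(), end="")
```

Because everything is keyed on the full sequence string, short reads (16 to 40 bp) are not a problem here; memory use grows with the number of distinct sequences, which is usually modest for miRNA libraries.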


The file is in FASTA format, and I have already trimmed the sequences to 16-40 bp.
