7.7 years ago
Neuls
I'm planning to use cd-hit to remove redundancy from a 16S rRNA dataset in order to build a phylogenetic tree with Phylip later on. Maybe it is a newbie question, but I wonder whether I have to remove redundancy before or after doing the MSA with MAFFT.
I also wonder whether the output from cd-hit can be in Phylip format.
Thank you
You may want to check out dedupe.sh or clumpify.sh from the BBMap suite for this purpose. If you are looking to remove perfectly identical reads from an NGS dataset, doing it before the alignment would be best.
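
To illustrate what removing perfectly identical sequences before alignment means, here is a minimal Python sketch. It is only an illustration of exact-duplicate removal and does not reproduce what dedupe.sh, clumpify.sh or cd-hit do internally. The function names `read_fasta` and `drop_exact_duplicates` and the example sequences are made up for this sketch, and it assumes that sequences differing only in letter case count as identical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_exact_duplicates(records):
    """Keep the first record seen for each distinct sequence.

    Assumption for this sketch: comparison is case-insensitive and
    reverse complements are NOT treated as duplicates.
    """
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Hypothetical toy input: seq2 is an exact copy of seq1 in lower case.
example = """>seq1
ACGTACGT
>seq2
acgtacgt
>seq3
ACGTTCGT
""".splitlines()

unique = drop_exact_duplicates(read_fasta(example))
for header, seq in unique:
    print(f">{header}\n{seq}")
```

The deduplicated FASTA written this way would then be the input to MAFFT, so the alignment and the tree are built from unique sequences only.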