7.2 years ago
kgbenn123
Does anyone have a recommendation for a clustering program that uses a neighbor-joining (NJ) phylogeny to cluster a dataset of protein sequences and outputs a representative from each cluster? I'm thinking of something that works like CD-HIT, but clusters on the NJ tree instead of on sequence identity.
Any suggestions?
PHYLIP doesn't appear to have the function I'm looking for. I'm working with about 6000 sequences and want a program that can cluster them by neighbor joining and output either a list of accessions representing each cluster or a file format from which I can extract them.
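
For illustration, here is a minimal pure-Python sketch of the kind of behaviour described above. It is not an existing tool, and several things in it are assumptions rather than anything stated in the question: pairwise distances are taken as already computed and supplied as a nested dict; clusters are defined by cutting every tree branch longer than a hypothetical `max_branch` threshold; and the representative is taken to be the cluster medoid. All function names are made up for this example. The naive NJ loop is cubic in the number of sequences, so as written it would likely be too slow for around 6000 sequences and is only meant to show the idea.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted NJ tree; return its edges as (node, node, length).

    Needs at least two names; dist[a][b] must exist for every a != b.
    """
    d = {a: dict(dist[a]) for a in names}  # working copy of the matrix
    nodes, edges = list(names), []
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums used by the NJ Q-criterion.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        u = f"_internal{len(edges)}"  # hypothetical label for the new node
        len_i = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        edges += [(i, u, len_i), (j, u, d[i][j] - len_i)]
        # Distances from the new internal node to every remaining node.
        d[u] = {}
        for k in nodes:
            if k not in (i, j):
                d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - d[i][j])
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    edges.append((nodes[0], nodes[1], d[nodes[0]][nodes[1]]))
    return edges


def cluster_leaves(names, edges, max_branch):
    """Cut branches longer than max_branch (assumed criterion); group leaves."""
    adj = {}
    for a, b, length in edges:
        adj.setdefault(a, [])
        adj.setdefault(b, [])
        if length <= max_branch:  # keep only short branches
            adj[a].append(b)
            adj[b].append(a)
    leaves, seen, clusters = set(names), set(), []
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:  # depth-first walk over the uncut branches
            x = stack.pop()
            if x in leaves:
                members.append(x)
            for y in adj.get(x, []):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        clusters.append(sorted(members))
    return clusters


def nj_representatives(names, dist, max_branch):
    """Map each cluster's medoid (assumed representative) to its members."""
    edges = neighbor_joining(names, dist)
    result = {}
    for members in cluster_leaves(names, edges, max_branch):
        # Medoid: smallest summed distance to the rest of the cluster.
        # Ties go to the first member in sorted order.
        rep = min(members, key=lambda m: sum(dist[m][o] for o in members if o != m))
        result[rep] = members
    return result


# Toy input: two tight pairs (A,B) and (C,D) separated by a long branch.
names = ["A", "B", "C", "D"]
dist = {
    "A": {"B": 2, "C": 7, "D": 7},
    "B": {"A": 2, "C": 7, "D": 7},
    "C": {"A": 7, "B": 7, "D": 2},
    "D": {"A": 7, "B": 7, "C": 2},
}
for rep, members in nj_representatives(names, dist, max_branch=2.0).items():
    print(rep, members)
```

The keys of the returned dict are the accessions one would keep, which corresponds to the "list of accessions" output asked for. A real run would also need a way to get the distance matrix from the protein alignment, which this sketch leaves out.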