Question

Nucleotide Sequence-Clustering Tools

13.2 years ago

Raghul ▴ 200

Hi all, I have a transcriptome from which I extracted the CDS of every sequence, both complete and partial. The amino acid usage is biased towards particular amino acids: a few are much more frequent than expected, which indicates that certain sequences or sequence families are over-represented. Are there any tools that cluster sequences by similarity (not just exact duplicates) so that I can reduce redundancy? I have registered for a tool called USEARCH and am waiting for a reply, but I still have no idea whether it will be useful.

I would also like to know whether "sequence clustering" is the appropriate term to use here, because it has different meanings in bioinformatics analysis.

Thank you, Raghul

dna sequence clustering • 4.9k views

updated 13.1 years ago by Frédéric Mahé ★ 3.2k • written 13.2 years ago by Raghul ▴ 200

Ram · Answer 1 · 2012-02-15

13.2 years ago

Frédéric Mahé ★ 3.2k

As indicated in "How to cluster 454 reads?", you can use CD-HIT. See also this question: "What softwares can be used for clustering nucleic acid fragments??"
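
To make the idea of similarity-based clustering concrete, here is a minimal Python sketch of greedy incremental clustering, the general strategy that tools such as CD-HIT build on: sequences are processed longest first, and each one either joins the first representative it resembles or becomes a new representative. This is only an illustration, not CD-HIT's or USEARCH's implementation. The function names, the default `threshold` and `k`, and the use of a shared k-mer fraction as a stand-in for alignment-based identity are all assumptions made for the example.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct k-length substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query: str, rep: str, k: int) -> float:
    """Fraction of the query's distinct k-mers that also occur in rep.

    Assumption: a crude proxy for sequence identity, used here only
    to keep the sketch free of alignment code.
    """
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0  # query shorter than k: nothing to compare
    return len(query_kmers & kmers(rep, k)) / len(query_kmers)


def greedy_cluster(
    seqs: Dict[str, str], threshold: float = 0.9, k: int = 8
) -> Dict[str, List[str]]:
    """Greedy incremental clustering (hypothetical helper).

    Returns a mapping: representative ID -> member IDs, with the
    representative listed first.
    """
    clusters: Dict[str, List[str]] = {}
    # Longest first, so each representative is the longest member
    # of its cluster; ties are broken by ID for reproducibility.
    order = sorted(seqs, key=lambda sid: (-len(seqs[sid]), sid))
    for sid in order:
        for rep_id in clusters:
            if shared_kmer_fraction(seqs[sid], seqs[rep_id], k) >= threshold:
                clusters[rep_id].append(sid)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[sid] = [sid]
    return clusters
```

Keeping only the dictionary keys (the representatives) gives a reduced, non-redundant set. For a real transcriptome a dedicated tool is the better choice, since this sketch compares every sequence against every representative and does no alignment.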

updated 5.6 years ago by Ram 45k • written 13.2 years ago by Frédéric Mahé ★ 3.2k