Hello,
I have assembled an RNA-seq library for a non-model plant using a multi-k-mer approach (SOAPdenovo-Trans). Now I want to merge all the k-mer assemblies and remove the redundancy. Could anyone please tell me which tool would be more suitable for this: CAP3 or CD-HIT-EST?
Only use these programs if you know what you are doing. I usually run CD-HIT-EST first, straight after assembly, with a 95% similarity threshold to remove highly similar sequences, and then follow up with CAP3. You should read the literature to get a feel for the settings people use and try to understand why they chose them. You can also filter by FPKM thresholds, but have a good reason for doing so; I think Trinity has accompanying scripts for that. Usually the contigs with the lowest read support are the ones that are fine to remove, but depending on the aim of your study, you may not want to remove contigs with low counts.
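If you do go the FPKM route, the filtering itself is simple. Below is a rough Python sketch, not the Trinity script. It assumes you already have a per-contig FPKM value from whatever quantification tool you used. The function names, the `min_fpkm` parameter and the cutoff of 1.0 are only placeholders for illustration, not recommended settings.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Use the first whitespace-delimited token as the contig ID.
            header, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_fpkm(
    records: Iterable[Tuple[str, str]],
    fpkm: Dict[str, float],
    min_fpkm: float = 1.0,  # placeholder cutoff, choose your own and justify it
) -> List[Tuple[str, str]]:
    """Keep contigs whose FPKM is at least min_fpkm.

    Contigs missing from the FPKM table are treated as 0.0 and dropped.
    """
    return [(cid, seq) for cid, seq in records if fpkm.get(cid, 0.0) >= min_fpkm]


if __name__ == "__main__":
    # Toy data standing in for a real assembly and expression table.
    fasta = [">c1 len=4", "ACGT", ">c2", "GG", "CC", ">c3", "TTTT"]
    fpkm = {"c1": 5.2, "c2": 0.3}
    for cid, seq in filter_by_fpkm(read_fasta(fasta), fpkm):
        print(f">{cid}\n{seq}")
```

Note that anything without an FPKM entry gets dropped here, which may not be what you want if your study cares about rare transcripts.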