I just discovered USEARCH and UCLUST.
They say that USEARCH "is an algorithm designed to enable high-throughput, sensitive search of very large sequence databases".
Have you tried this tool?
Do you think it can replace BLAST for large data sets?
Hi @bilouweb. Please consider expanding your question. What are these packages for: R, python...? Maybe you could put links to their websites. It would also be a good idea to describe what data you have and what exactly you are trying to do, plus the expected result. This will give you a chance to get much better answers that in turn will help the other users even more. Cheers!
I am the author of USEARCH and UCLUST. I will be glad to answer this question if you email me directly; you can find my contact information via the USEARCH home page at www.drive5.com/usearch.

It really depends on the details of what you're trying to do. USEARCH and BLAST have very different strengths and weaknesses, so they're not directly comparable. USEARCH has pretty good sensitivity down to about 35/45% identity for proteins; for more distant relationships, BLASTP is more sensitive. For proteins, global alignment (USEARCH) tends to be a better predictor of functional similarity than local alignment (BLAST), because local alignment sometimes finds a single domain embedded in proteins that otherwise have different domain organization and hence quite different function.
I will take a shot. According to the USEARCH website, it uses a global alignment approach, whereas BLAST normally uses local alignment. The practical difference is that the program won't find many dissimilar sequences, because it tries to align every single nucleotide/amino acid in the sequence. It won't be able to completely replace BLAST, but it might do the trick in some specific situations, such as when you are trying to find really closely related sequences.
One extra note: this is from the same developer as MUSCLE, so I can see it being very good and fast software.
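To make the global versus local distinction concrete, here is a minimal Python sketch using textbook dynamic programming (Needleman-Wunsch for global, Smith-Waterman for local). This is not how USEARCH or BLAST work internally, since both rely on fast heuristics; the scoring values and the toy sequences are made up for illustration only. The two sequences share one "domain" but have unrelated flanks.

```python
# Toy comparison of global vs local alignment scores.
# Assumptions (illustrative only): match=+1, mismatch=-1, linear gap=-1,
# and made-up sequences. Neither USEARCH nor BLAST is implemented this way.

def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch: best score aligning ALL of a against ALL of b."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman: best score over any pair of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            v = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            cur.append(v)
            best = max(best, v)
        prev = cur
    return best

domain = "HEAGAWGHEE"                    # shared region (hypothetical)
seq_a = "A" * 8 + domain + "C" * 8       # flanks unrelated to seq_b's flanks
seq_b = "D" * 8 + domain + "F" * 8

print("global:", global_score(seq_a, seq_b))  # -6: flanks drag the score down
print("local: ", local_score(seq_a, seq_b))   # 10: only the shared domain counts
```

The local score sees only the shared domain and reports a strong hit, while the global score is penalized by the unrelated flanks. That is the situation described above: a local hit can reflect a single shared domain in proteins that are otherwise organized differently.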
Haven't heard of this software, but it sounds interesting. I'll run some tests.