Hi,
I have short sequences that use degenerate nucleotide codes (such as R, Y, B, N, etc.). I want to align these sequences to another, longer sequence that uses only the standard nucleotides (A, T, C, and G). For example, the tool should align N to any of A, T, C, or G, rather than treating N as a different nucleotide from them. Do you know of any alignment tool that takes degenerate nucleotide codes into account?
Thanks!
I think you're best off writing something yourself to handle this.
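For what it's worth, here is a minimal sketch of what such a script could look like in Python. It assumes an ungapped, forward-strand-only search, and the names `IUPAC` and `find_primer_sites` are made up for illustration, not taken from any library. Each degenerate code in the short sequence counts as a match if the reference base is one of the bases it stands for.

```python
# IUPAC nucleotide codes mapped to the standard bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_primer_sites(primer, reference, max_mismatches=0):
    """Return (start, mismatches) for every ungapped placement of `primer`
    on `reference` with at most `max_mismatches` mismatches.

    Hypothetical helper: forward strand only, no gaps. A character in
    `primer` that is not an IUPAC code raises KeyError.
    """
    primer = primer.upper()
    reference = reference.upper()
    hits = []
    for start in range(len(reference) - len(primer) + 1):
        mismatches = 0
        for offset, code in enumerate(primer):
            # A degenerate code matches if the reference base is in its set.
            if reference[start + offset] not in IUPAC[code]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # Loop finished without exceeding the mismatch budget.
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    print(find_primer_sites("ACRTNG", "TTACGTAGGCATGA"))
```

This is a brute-force scan, so it is only reasonable for short queries against a modest reference. To cover the reverse strand you would also search with the reverse complement of the short sequence.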
It is odd that your short reads have ambiguities. Why is that? Are you trying to salvage a poor-quality sequencing run?
There are programs that can handle degenerate bases in the reference. Would that help you?
Can you please explain what you mean by handling degenerate bases in the reference? Can such a program always treat these degenerate bases as matches in the alignment, rather than as mismatches?
Yes, BWBBLE does exactly that. And HISAT2 can index a reference genome + SNPs.
However, you didn't answer why your reads have ambiguities. I get the feeling that if you tell us exactly what you want to do, you will get better answers.
They are primer sequences, and I need to align them to a database. Thank you for your suggestions; I will look at them closely and see whether they work for my case.