BLASTn results: do not count ambiguous IUPAC characters as mismatches
7.1 years ago
andzsci ▴ 40

Hello, I have a really annoying problem. I need to sort my BLAST results by percentage identity (%ident). Some of my sequences contain an ambiguous IUPAC base. I want these sequences to show up as 100% identical to sequences that are otherwise identical and carry one of the resolved bases at the ambiguous position.

Example:

ATGC and ATGS = 100% ident
ATGG and ATGS = 100% ident

Because otherwise(?):

ATGC and ATGS = 75% ident
TTCG and ATGC = 75% ident

I know for sure that treating these as matches is correct, because my sequencing run has multiple gene copies that are almost (99.9%) identical. Sometimes I have two copies in one sequencing reaction, and the basecaller calls an S, for example, if one paralogue has a C and the other one a G. This step is beyond my control. I have >100,000 sequences.

Thanks!

BLAST • 2.0k views

I don't think that is possible unless you post-process your data. IUPAC codes are treated as mismatches in nucleotide alignment. See the help page here.
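One way the post-processing could look, as a rough sketch rather than a tested pipeline: recompute the identity yourself from the aligned query and subject strings, counting a column as a match whenever the two IUPAC codes share at least one base. This assumes you have the aligned sequences available (for example by requesting the `qseq` and `sseq` columns in BLAST+ tabular output). The names `IUPAC`, `compatible` and `iupac_identity` are made up for this example and are not part of BLAST or any library.

```python
# Bases represented by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(a, b):
    """Return True if two IUPAC codes share at least one base."""
    bases_a = set(IUPAC.get(a.upper(), ""))
    bases_b = set(IUPAC.get(b.upper(), ""))
    # Gaps and unknown characters map to the empty set and never match.
    return bool(bases_a & bases_b)


def iupac_identity(qseq, sseq):
    """Percent identity over the alignment length, ambiguity-aware.

    qseq and sseq are the aligned strings (same length, '-' for gaps).
    Gap columns count as mismatches, as they do in BLAST's pident.
    """
    if len(qseq) != len(sseq):
        raise ValueError("aligned sequences must have the same length")
    if not qseq:
        raise ValueError("empty alignment")
    matches = sum(1 for a, b in zip(qseq, sseq) if compatible(a, b))
    return 100.0 * matches / len(qseq)


if __name__ == "__main__":
    print(iupac_identity("ATGC", "ATGS"))  # S covers C or G
    print(iupac_identity("ATGG", "ATGS"))
    print(iupac_identity("ATGA", "ATGS"))  # A is not covered by S
```

Be aware that with this rule an N matches everything, so reads with many Ns will look better than they are; you may want to cap the number of ambiguous columns per hit before sorting on the recomputed value.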

7.0 years ago
andzsci ▴ 40

I switched to USEARCH.
