blastn a query to a masked subject
2.8 years ago
LDT ▴ 340

Dear all,

I have a reference/subject that looks like this:

CGAAGCTCTCCTACGNNNNNNNNNNNNNNNNNNNNNNNNNCAGTCCAGCGCC

and I want to BLAST multiple queries against it that look like this:

>M01755:672:000000000-K43MP:1:1101:15627:1757 1:N:0:1
GGTCGAGGTCGGTGTAGCGTCGTAAGCTAATACGAAAATTAAAAATGACAAAATAGTTTGGAACTAGATTTCACTTATCTGTTTGTCGCTGGACTGACTGCACTGTTGTTTTTCATGAGAACGTAGGAGAGCTTCTTGGCCATCGGCCCAA

My intention is to use the queries to reveal the masked area (the run of Ns) in the reference. It seems that BLAST does not like the Ns. Any ideas?

refrence masked reference align blastn • 512 views

You could consider running BLAST with a custom-made scoring matrix. This requires digging into the BLAST code a bit, but it is doable (I've done it myself in the past, as we used a custom matrix for annotation purposes).

The matrix would be customized to include N and give it a match score against any other nucleotide.
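
To make the scoring idea concrete, here is a minimal Python sketch of it. This is not BLAST and not a patched BLAST matrix: it only slides the reference along a read without gaps and scores an N in the reference as a match to any base. The function names and the match/mismatch values are assumptions chosen for illustration.

```python
def score_pair(ref_base, read_base, match=1, mismatch=-2):
    """Score one column; an N in the reference counts as a match to anything."""
    if ref_base == "N" or ref_base == read_base:
        return match
    return mismatch


def best_ungapped_hit(reference, read):
    """Slide the reference along the read (no gaps) and keep the best window.

    Returns (score, offset, window) or None if the read is shorter than
    the reference. Names and scores here are illustrative assumptions.
    """
    reference = reference.upper()
    read = read.upper()
    n = len(reference)
    best = None
    for offset in range(len(read) - n + 1):
        window = read[offset:offset + n]
        score = sum(score_pair(a, b) for a, b in zip(reference, window))
        if best is None or score > best[0]:
            best = (score, offset, window)
    return best
```

The bases of the returned window that sit under the Ns are the candidate sequence for the masked region. A read from the opposite strand would have to be reverse-complemented before scoring.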

On the other hand, I have the feeling there must be more appropriate tools than BLAST for this (some regex matching, perhaps?).
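
A rough sketch of the regex route in Python, under a few assumptions: the reference contains a single run of Ns, the masked region has exactly as many bases as there are Ns, and a short stretch of flanking sequence on each side (10 bases here, an arbitrary choice) matches the read exactly. The helper names `build_pattern` and `unmask` are made up for this example.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_pattern(reference, anchor=10):
    """Turn a reference with one N run into a regex with a capture group.

    Only the `anchor` bases next to the N run are required to match, so a
    mismatch further out in the flanks does not prevent a hit.
    """
    reference = reference.upper()
    n_run = re.search(r"N+", reference)
    if n_run is None:
        raise ValueError("reference contains no run of Ns")
    left = reference[:n_run.start()][-anchor:]
    right = reference[n_run.end():][:anchor]
    gap = n_run.end() - n_run.start()
    return re.compile(left + "([ACGT]{%d})" % gap + right)


def unmask(reference, read, anchor=10):
    """Search the read on both strands; return (strand, bases) or None."""
    pattern = build_pattern(reference, anchor)
    read = read.upper()
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        hit = pattern.search(seq)
        if hit:
            return strand, hit.group(1)
    return None
```

Calling `unmask(reference, read)` on each read gives the strand it matched on and the bases that fall under the Ns, in reference orientation. Exact flank matching is the main limitation of this sketch: reads with errors inside the anchors are missed, in which case an approximate matcher or an aligner would be the better fit.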
