Formatdb And Non Coding Characters
14.2 years ago
Yahan ▴ 400

Dear BioStar

I want to use a non-coding character, e.g. X, to mark SNP positions in my sequences. The X helps me find the SNP position in BLAST hits. However, when I format the BLAST database from a FASTA file containing this character, it gets replaced by N. Is there a workaround, so that the X is reported in the BLAST result?

Thanks for your suggestions.

Jack

blast snp • 2.2k views
14.2 years ago

I don't think you can do this for a DNA sequence.

The only way is to parse your BLAST output and re-map your SNPs onto the hits.
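
To illustrate the re-mapping idea, here is a minimal Python sketch. It assumes the BLAST results were saved in the standard 12-column tabular format and that the SNP positions are known as 1-based coordinates on the database (subject) sequences. The names `Hit`, `parse_tabular_line`, `snps_covered`, and the example data are hypothetical and only serve the illustration.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    s_start: int  # subject start, 1-based
    s_end: int    # subject end, 1-based (smaller than s_start on the minus strand)


def parse_tabular_line(line: str) -> Hit:
    """Parse one line of 12-column tab-separated BLAST output."""
    fields = line.rstrip("\n").split("\t")
    # Columns 9 and 10 (index 8 and 9) hold the subject start and end.
    return Hit(fields[0], fields[1], int(fields[8]), int(fields[9]))


def snps_covered(hit: Hit, snps: Dict[str, List[int]]) -> List[int]:
    """Return the known SNP positions on the subject that fall inside the hit."""
    lo, hi = sorted((hit.s_start, hit.s_end))  # handles minus-strand hits
    return [pos for pos in snps.get(hit.subject, []) if lo <= pos <= hi]


# Made-up example: one minus-strand hit on "chr1" and two known SNPs.
example_line = "read1\tchr1\t98.00\t50\t1\t0\t1\t50\t130\t81\t1e-20\t90.0"
known_snps = {"chr1": [100, 200]}

hit = parse_tabular_line(example_line)
print(hit.subject, snps_covered(hit, known_snps))
```

This only reports which SNPs lie within the aligned region of the subject. Converting them to query coordinates would additionally require walking the alignment to account for gaps.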


As frustrating as it is to evaluate absolute positions from relative positions, I think Pierre is correct.

14.2 years ago
Paulo Nuin ★ 3.7k

You can try using the newest version of BLAST and makeblastdb instead of formatdb. There's an option, -mask_data, to supply an input file with the regions to be masked. I cannot guarantee that the masked positions will be shown as X rather than N, but it's worth a shot.

EDIT: Check the BLAST manual here in order to find out how to create a masked database. Another idea would be to mark your SNPs with lowercase letters and then search for these patterns in the BLAST output.
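
A minimal Python sketch of the lowercase idea follows. It assumes 1-based SNP positions and that the aligned sequence text in your BLAST output keeps the original letter case, which depends on the BLAST program and options and is not verified here. The function names are hypothetical.

```python
from typing import Iterable, List


def mark_snps_lowercase(seq: str, snp_positions: Iterable[int]) -> str:
    """Uppercase the sequence, then lowercase the bases at the 1-based SNP positions."""
    chars = list(seq.upper())
    for pos in snp_positions:
        chars[pos - 1] = chars[pos - 1].lower()
    return "".join(chars)


def lowercase_positions(aligned: str) -> List[int]:
    """Return the 1-based positions of lowercase letters in a sequence string."""
    return [i for i, base in enumerate(aligned, start=1) if base.islower()]


marked = mark_snps_lowercase("ACGTACGTAC", [3, 8])
print(marked)                       # ACgTACGtAC
print(lowercase_positions(marked))  # [3, 8]
```

Positions returned by `lowercase_positions` are relative to the string passed in, so for an aligned segment that contains gaps they are alignment columns, not sequence coordinates.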
