Alignment as BLAST-able database/other alignment avenues
7.2 years ago
jnf3769 ▴ 40

Hey all,

I was wondering whether there is a way to use an alignment as a local BLAST database. My problem is as follows: I have an old alignment of concatenated protein sequences, with CHARSET definitions at the bottom of the NEXUS file marking where each protein begins and ends. Some folks doing method development created a method that produces a different sort of alignment, and they also removed some of the sequences, which changed the overall length of the alignment. Thus, the CHARSET definitions no longer delimit the genes, but I need them to for a downstream step. I have all the protein sequences, both as they were in the old alignment and as plain unaligned sequences. My natural thought was to make a BLAST database out of the alignment and query the individual sequences against it to recover the boundaries. But I don't seem to be able to make such a local database the old fashioned way. Is there a way with BLAST?

Barring that, I think MAFFT has experimental 'addsequences' and 'addfragments' functionality, but they are terribly slow and I don't know whether they're appropriate for my goal. Does anybody have any insight? Is MAFFT a reasonable tool for this? Is there a BLAST method? Or would a more traditional computer-science approach, such as string-based distance minimization, be a better fit? I really appreciate any help or insight you all can offer. Ideally, I'd do the alignment traditionally, but, like I said, this is downstream of some folks working on algorithmic method development, and they haven't implemented a way to keep track of this stuff quite yet.

Best!

BLAST alignment MAFTT NEXUS • 1.5k views

But I don't seem to be able to make such a local database the old fashioned way.

makeblastdb is the current way of making local BLAST databases (run the command with the -help flag to get inline help). If you are familiar with legacy BLAST, you should be able to pick up the differences and get going with BLAST+ easily.


Well, that is what I consider the old fashioned way. But in any case, you've completely missed the spirit of my question. You can't make a BLAST database from an alignment that way; at least, the help output doesn't describe such a method.

https://ibb.co/cSLNKF https://ibb.co/kmcf6v


If you want to make a BLAST database out of an alignment, you will need to take the gaps out of that FASTA file. You could easily use sed to replace each - with nothing. If you wish to preserve the alignment, then you will have to use the add sequence/add fragment method with a multiple sequence alignment program, as you noted above.
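As a rough illustration of the gap-removal step, here is a minimal Python sketch. It assumes the alignment is in plain FASTA format with - as the only gap character; the function names strip_gaps and residue_columns and the toy sequences are hypothetical, not part of BLAST or any library. Unlike a blanket sed substitution, it leaves header lines alone, so a - inside a sequence name survives. The second helper records which alignment column each residue came from, which is one possible way to translate coordinates on the ungapped sequence back to the alignment.

```python
def strip_gaps(fasta_text, gap_chars="-"):
    """Remove gap characters from sequence lines, leaving '>' header lines untouched."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)  # header: keep as-is, even if it contains '-'
        else:
            out.append("".join(ch for ch in line if ch not in gap_chars))
    return "\n".join(out) + "\n"


def residue_columns(aligned_seq, gap_chars="-"):
    """For each residue of the ungapped sequence, return its 0-based alignment column."""
    return [col for col, ch in enumerate(aligned_seq) if ch not in gap_chars]


# Toy alignment (made-up sequences, for illustration only).
aligned = ">taxon-A\nMK--TL\n>taxon-B\nM-QATL\n"
print(strip_gaps(aligned), end="")
print(residue_columns("MK--TL"))
```

The ungapped output could then be passed to makeblastdb as an ordinary protein FASTA file, and the column list kept alongside it if hit positions need to be mapped back onto the alignment.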


While I thank you for your response, you are again missing the spirit of the question. The gaps are necessary. It would not be an alignment if I removed the gaps, nor would it tell me where the protein sequences are delineated in the alignment.


See the modified comment above. Someone else may be along with a new comment/answer.
