Can I align two DNA sequences using BLAST from command line?
5.7 years ago
sj93 ▴ 10

I've created two custom nucleotide databases with BLAST+ from the command line in Ubuntu. Each database is built from a different genomic sequence of the same species, and each consists of 3 files: .nhr, .nin and .nsq. The NCBI BLAST help manual says the command for running blast is

blastn -db nt -query nt.fsa -out results.out

I've tried researching this and playing around with the above command by using the file names, locations, etc. that apply to the files I'm trying to blast, but I cannot figure out what the appropriate version of this command is for my task. Any tips? Thanks in advance. I'm an undergrad studying biology.

ncbi blast genome sequence SNP • 4.8k views

Previous thread on this: Aligning two DNA sequences with nBLAST

5.7 years ago
nsmi8446 ▴ 170

I think you want to create one database from one of your genomic sequences and BLAST the other genomic sequence against it?

If this is the case, you would create one database like you already have (let's say your two sequence files are called genomic.sequences.1.fasta and genomic.sequences.2.fasta). We will use the genomic.sequences.1.fasta to build the database:

makeblastdb -in genomic.sequences.1.fasta -parse_seqids -dbtype nucl

Next, you would compare the other genomic sequence file (genomic.sequences.2.fasta) to the database you created above with the following:

blastn -query genomic.sequences.2.fasta -db genomic.sequences.1.fasta -task blastn -outfmt 7 -max_target_seqs 10 -evalue 0.5 -perc_identity 95 > blast.out
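To pull the hits out of blast.out afterwards, something like the following might help. It is only a sketch: summarise_hits is a made-up helper name, and it assumes the report uses the default tabular columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), which is what -outfmt 7 should give when no custom field list is requested.

```shell
#!/usr/bin/env bash
# Sketch: summarise a BLAST tabular report (-outfmt 7 or -outfmt 6).
# summarise_hits is a hypothetical helper, not part of BLAST+.
# Assumes the 12 default columns, with the bit score in column 12.
summarise_hits() {
    local report="$1"
    local tab
    tab=$(printf '\t')
    # Drop the '#' comment lines that -outfmt 7 adds, sort hits by bit score
    # (highest first; -g also copes with scores in scientific notation),
    # then print query, subject, % identity, alignment length and bit score.
    grep -v '^#' "$report" \
        | sort -t "$tab" -k12,12gr \
        | awk -F'\t' '{ printf "%s\t%s\t%s\t%s\t%s\n", $1, $2, $3, $4, $12 }'
}
```

You would call it as summarise_hits blast.out once the search above has finished.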

Hope this is what you are after.

This sounds like what I was going for. I'm going to try it out. Thank you!

The original files I had were in fasta format, but when I created databases from them the results were .nhr, .nin and .nsq file types. So which file type should I use when running the code you provided above? I've tried a few different options and keep getting errors.

You will need to use the original FASTA files for both steps: a) to create the database, and b) as the query in the actual search. In the search, -db takes the index name of the BLAST database created in the first step, not one of the .nhr, .nin or .nsq files.

Sorry, I'm a newb, but what is the "index name"? Is that the name of the file location containing the database files? Where do I include the other sequence that I did not make a database from?

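The index name is the database name that you pass to -db. It is set with the -out option of makeblastdb; if you omit -out, it defaults to the name of the file you built the database from. The -title option only sets a descriptive label for the database and has the same default: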

 -title <String>
   Title for BLAST database
   Default = input file name provided to -in argument

"Where do I include the other sequence that I did not make a database from?"

In the -query option of your blastn command line.
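Putting the two steps together, a minimal sketch could look like this. The function name build_and_search and the database name you pass in are placeholders of my own, and it assumes makeblastdb and blastn from BLAST+ are on your PATH. Giving the database an explicit -out name makes it clearer what -db expects.

```shell
#!/usr/bin/env bash
# Sketch: build a nucleotide database from one FASTA file, then search it
# with the other. build_and_search is a hypothetical wrapper, not a BLAST+ tool.
build_and_search() {
    local db_fasta="$1"      # FASTA file to turn into a database
    local query_fasta="$2"   # the other FASTA file, used as the query
    local db_name="$3"       # database name of your choosing
    local report="$4"        # where to write the results

    # Creates db_name.nhr, db_name.nin and db_name.nsq
    # (newer BLAST+ releases may write a few additional files).
    makeblastdb -in "$db_fasta" -dbtype nucl -out "$db_name" || return 1

    # -db takes the database name without any .nhr/.nin/.nsq extension;
    # -query takes the FASTA file that was not used to build the database.
    blastn -db "$db_name" -query "$query_fasta" -outfmt 7 -out "$report"
}
```

For example: build_and_search genomic.sequences.1.fasta genomic.sequences.2.fasta genome1_db blast.out, where genome1_db is just an illustrative name.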

19 months ago
doug ▴ 20

It's easier than this. You don't have to make a blast database.

 blastn -query seq1.fasta -subject seq2.fasta

This works with multiple sequences in either or both files. Of course, if you are going to search the same subject data over multiple runs, it will be faster to build a BLAST database from it, but if you just want to align two sequences, this is the way.
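If you want the result in a file you can parse, a small wrapper along these lines may be handy. This is a sketch, and align_pair is a name I made up. It uses -subject in place of -db (the two cannot be combined) and asks for tabular output with -outfmt 6.

```shell
#!/usr/bin/env bash
# Sketch: align two FASTA files directly, without building a database.
# align_pair is a hypothetical helper; it assumes blastn is on PATH.
align_pair() {
    local query_fasta="$1"
    local subject_fasta="$2"
    local report="$3"
    # -subject replaces -db, so no makeblastdb step is needed.
    # -outfmt 6 writes one tab-separated line per alignment to the report file.
    blastn -query "$query_fasta" -subject "$subject_fasta" -outfmt 6 -out "$report"
}
```

For example: align_pair seq1.fasta seq2.fasta pair.tsv, where pair.tsv is an arbitrary output file name.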
