Question

Can I align two DNA sequences using BLAST from command line?

0

5.7 years ago

sj93 ▴ 10

I've created two custom databases using blastn from the command line in Ubuntu. Each database is a different genomic sequence of the same species. The output for these custom databases each contained 3 files, .nhr, .nin and .nsq. The NCBI BLAST help manual says the command for running blast is

blastn -db nt -query nt.fsa -out results.out

I've tried researching this and playing around with the above command by using the file names, locations, etc. that apply to the files I'm trying to blast, but I cannot figure out what the appropriate version of this command is for my task. Any tips? Thanks in advance. I'm an undergrad studying biology.

ncbi blast genome sequence SNP • 4.8k views

Updated 19 months ago by doug ▴ 20 • written 5.7 years ago by sj93 ▴ 10

1

Previous thread on this: Aligning two DNA sequences with nBLAST

5.7 years ago by GenoMax 147k

h.mon · Answer 1 · 2019-03-06

2

5.7 years ago

nsmi8446 ▴ 170

I think you want to create one database from one of your genomic sequences and BLAST the other genomic sequence against it?

If so, create one database as you already have. Say your two sequence files are called genomic.sequences.1.fasta and genomic.sequences.2.fasta; we will use genomic.sequences.1.fasta to build the database:

makeblastdb -in genomic.sequences.1.fasta -parse_seqids -dbtype nucl

Next, you would compare the other genomic sequence file (genomic.sequences.2.fasta) to the database you created above with the following:

blastn -query genomic.sequences.2.fasta -db genomic.sequences.1.fasta -task blastn -outfmt 7 -max_target_seqs 10 -evalue 0.5 -perc_identity 95 > blast.out
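
A note on reading the result: with -outfmt 7 the hits are tab-separated lines, and lines starting with # are comments. Here is a minimal sketch, assuming the default column order (alignment length in column 4), for keeping only the longer alignments. The helper name filter_hits and the 500 bp cutoff are made up for illustration and are not part of BLAST.

```shell
#!/bin/sh
# Hypothetical helper: print hit lines from BLAST tabular output
# (-outfmt 6 or 7) whose alignment length is at least min_len.
# Assumes the default 12-column layout, where column 4 is the
# alignment length. Lines starting with '#' are the comment lines
# that -outfmt 7 adds, so they are skipped.
filter_hits() {
    # $1 = BLAST output file, $2 = minimum alignment length
    awk -F '\t' -v min_len="$2" '!/^#/ && NF >= 12 && $4 + 0 >= min_len' "$1"
}

# Example call, guarded so nothing happens if blast.out is absent.
if [ -f blast.out ]; then
    filter_hits blast.out 500   # 500 is an arbitrary example cutoff
fi
```

Adjust the cutoff (or the column number) to whatever makes sense for your sequences.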

Hope this is what you are after.

Updated 5.7 years ago by h.mon 35k • written 5.7 years ago by nsmi8446 ▴ 170

0

This sounds like what I was going for. I'm going to try it out. Thank you!

5.7 years ago by sj93 ▴ 10

0

The original files I had were in fasta format, but when I created databases from them the results were .nhr, .nin and .nsq file types. So which file type should I use when running the code you provided above? I've tried a few different options and keep getting errors.

5.7 years ago by sj93 ▴ 10

0

You need the original FASTA files for both steps: a) one of them to create the database, and b) the other as the query in the actual search, together with the index name of the BLAST database created in the first step.
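
To make the file question concrete, here is a rough sketch, assuming a single-volume database like the one described above (the .nhr, .nin and .nsq files sitting next to the FASTA file). The has_nucl_db helper is hypothetical; the point is that -db gets the base name, never one of the three index files.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a nucleotide BLAST database with the
# given base name exists, i.e. its .nhr, .nin and .nsq files are present.
has_nucl_db() {
    for ext in nhr nin nsq; do
        [ -f "$1.$ext" ] || return 1
    done
    return 0
}

db=genomic.sequences.1.fasta      # base name, NOT one of the .n* files
query=genomic.sequences.2.fasta   # the FASTA file that was not indexed

# Only run the search if BLAST+ and all input files are actually there.
if command -v blastn >/dev/null 2>&1 && has_nucl_db "$db" && [ -f "$query" ]; then
    blastn -query "$query" -db "$db" -outfmt 7 > blast.out
else
    echo "blastn, the database files for $db, or $query not found" >&2
fi
```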

5.7 years ago by GenoMax 147k

0

Sorry, I'm a newb, but what is the "index name"? Is that the name of the file location containing the database files? Where do I include the other sequence that I did not make a database from?

5.7 years ago by sj93 ▴ 10

0

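The index name is the base name of the database files. You can set it with the -out option of makeblastdb; otherwise it defaults to the name of the file you made the index from. The -title option only sets a descriptive title, which has the same default: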

 -title <String>
   Title for BLAST database
   Default = input file name provided to -in argument
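
As a small sketch of naming the index explicitly with makeblastdb's -out option (which, like -title, defaults to the input file name). The wrapper make_named_db and the name genome1_db are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: build a nucleotide database with an explicit name.
# $1 = input FASTA file, $2 = database name to pass later to blastn -db
make_named_db() {
    makeblastdb -in "$1" -dbtype nucl -out "$2" -title "$2"
}

# Guarded example so nothing runs unless BLAST+ and the input are present.
if command -v makeblastdb >/dev/null 2>&1 && [ -f genomic.sequences.1.fasta ]; then
    make_named_db genomic.sequences.1.fasta genome1_db   # genome1_db is a made-up name
    # The search then refers to the same name:
    # blastn -query genomic.sequences.2.fasta -db genome1_db -outfmt 7 > blast.out
fi
```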

"Where do I include the other sequence that I did not make a database from?"

In the -query option of your blastn command line.

5.7 years ago by GenoMax 147k

score 2 · Answer 2 · 2023-04-27

It's easier than this. You don't have to make a blast database.

 blastn -query seq1.fasta -subject seq2.fasta

This works with multiple sequences in either or both files. Of course, if you are going to search the same subject data over multiple runs, it will be faster to build a BLAST database from it, but if you just want to align two sequences, this is the way.
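
A slightly fuller sketch of the same idea, assuming you want tab-separated output (-outfmt 6) saved to a file. count_seqs and pair.out are hypothetical names; the count is only a sanity check on how many sequences each FASTA file holds.

```shell
#!/bin/sh
# Hypothetical helper: count FASTA records (header lines start with '>').
count_seqs() {
    grep -c '^>' "$1"
}

q=seq1.fasta
s=seq2.fasta

# Only run if BLAST+ is installed and both FASTA files exist.
if command -v blastn >/dev/null 2>&1 && [ -f "$q" ] && [ -f "$s" ]; then
    echo "query: $(count_seqs "$q") sequence(s), subject: $(count_seqs "$s") sequence(s)" >&2
    # No database needed: the subject FASTA is searched directly.
    blastn -query "$q" -subject "$s" -outfmt 6 > pair.out
fi
```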