Standalone BLAST: display header in output file (customizing -outfmt 6)
8.2 years ago
BCArg ▴ 90

Dear all,

I am using standalone BLAST to search a few hundred proteins against a custom database that I created (leido). I want tabular output showing only the best hit, and -outfmt 6 seems to be the best fit. I run it with the following command, piping the output to gedit:

blastp -query subset.fasta -db leido -outfmt 6 -max_target_seqs 1  |gedit -

The output I get follows (for this example I am submitting only 3 proteins, in subset.fasta, as a test). Some columns are missing here, probably because of pasting issues in this text box; all columns do appear in my gedit output:

LmjF.36.6910    LdBPK_367240.1  98.88   537 6   0   1   537 1   537 0.0 1089
LmjF.36.6650    LdBPK_366960.1  96.75   553 18  0   1   553 1   553 0.0 1082
LmjF.36.5620    LdBPK_365870.1  96.77   1082    35  0   18  1099    19  1100

Is there a way to pass any flag to display the headers of the columns displayed in my output?

I already tried specifying the columns I want (although I do not know whether this option should display the headers):

blastp -query subset.fasta -db leido -outfmt 6 "qseqid sseqid pident qlen length mismatch gapope evalue bitscore" -max_target_seqs 1 |gedit -

and I get the following error:

Error: Too many positional arguments (1), the offending value: qseqid sseqid pident qlen length mismatch gapope evalue bitscore.

I can easily add the column names in R, though I would like to explore all of the standalone BLAST functionality.

Thanks

8.2 years ago
blastp -query subset.fasta -db leido -outfmt "7 qseqid sseqid pident qlen length mismatch gapopen evalue bitscore" -max_target_seqs 1

This will give you:

# BLASTP 2.2.28+
# Query: AK364823
# Database: leido
# Fields: query id, subject id, % identity, query length, alignment length, mismatches, gap opens, evalue, bit score
# 1 hits found
AK364823    AT1G06390.2 91.23   285 285 25  0   0.0  563
# BLASTP 2.2.28+
# Query: AT1G09840.1
# Database: leido
# Fields: query id, subject id, % identity, query length, alignment length, mismatches, gap opens, evalue, bit score
# 1 hits found
AT1G09840.1 AT1G09840.6 100.00  285 285 0   0   0.0  600
# BLAST processed 2 queries

I'm actually interested in the -outfmt 6 format. But yes, I do have to include the 6 inside the quotes. blastp did not complain this time, though no header was displayed.
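The quoting point can be checked without BLAST at all. The sketch below uses a made-up helper, `count_args`, that only reports how many arguments the shell hands over. With the 6 outside the quotes, the field list arrives as an extra, separate argument, which is consistent with the "Too many positional arguments (1)" error above. How blastp parses its arguments internally is an assumption here; only the shell side is shown.

```shell
#!/bin/sh
# Hypothetical helper (not part of BLAST): prints the number of arguments it received.
count_args() { echo "$#"; }

# Format number outside the quotes: the shell passes -outfmt, 6, and the field list
# as three separate arguments, so the field list is left over as a stray word.
outside=$(count_args -outfmt 6 "qseqid sseqid pident")

# Format number inside the quotes: -outfmt gets a single value, "6 qseqid sseqid pident".
inside=$(count_args -outfmt "6 qseqid sseqid pident")

echo "6 outside quotes: $outside arguments; 6 inside quotes: $inside arguments"
```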


Output format 7 (-outfmt 7) includes the header. Format 6 is the same, but without the header.


Output format 6 is plain tabular, while output format 7 is tabular with comment lines, so the difference is not only the header: format 7 also prints the program, query, database, and hit-count lines for every query.
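Because format 7 repeats its comment lines for every query, one possible way to get a format-6-style table with a single header is to keep the first `# Fields:` line and drop every other comment line. This is a minimal sketch, assuming the format 7 output was saved to a file. The file names are made up, and the sample is rebuilt from the output posted in the answer above, with tabs between the columns.

```shell
#!/bin/sh
# Rebuild a small -outfmt 7 sample; file names here are hypothetical.
{
  printf '# BLASTP 2.2.28+\n'
  printf '# Query: AK364823\n'
  printf '# Database: leido\n'
  printf '# Fields: query id, subject id, %% identity, query length, alignment length, mismatches, gap opens, evalue, bit score\n'
  printf '# 1 hits found\n'
  printf 'AK364823\tAT1G06390.2\t91.23\t285\t285\t25\t0\t0.0\t563\n'
  printf '# BLASTP 2.2.28+\n'
  printf '# Query: AT1G09840.1\n'
  printf '# Database: leido\n'
  printf '# Fields: query id, subject id, %% identity, query length, alignment length, mismatches, gap opens, evalue, bit score\n'
  printf '# 1 hits found\n'
  printf 'AT1G09840.1\tAT1G09840.6\t100.00\t285\t285\t0\t0\t0.0\t600\n'
  printf '# BLAST processed 2 queries\n'
} > fmt7_sample.txt

awk '
  /^# Fields: / && !seen {   # first field list: turn it into a tab-separated header
    sub(/^# Fields: /, "")
    gsub(/, /, "\t")
    print
    seen = 1
    next
  }
  /^#/ { next }              # drop every other comment line
  { print }                  # keep hit rows unchanged
' fmt7_sample.txt > fmt6_with_header.tsv

cat fmt6_with_header.tsv
```

The header then carries the long field descriptions ("query id", "% identity", and so on) rather than the short specifier names such as qseqid.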

8.2 years ago
blastp -query subset.fasta -db leido -outfmt "7 qseqid sseqid pident qlen length mismatch gapopen evalue bitscore"

2.2 years ago
al-ash ▴ 210

I think there is no option to add a header to -outfmt 6. An ugly workaround:

cat <(printf "qseqid\tsseqid\tpident\tqlen\tlength\tmismatch\tgapopen\tevalue\tbitscore\n") blast_output_file > blast_output_file_header
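A variation on the same workaround keeps the field list in one shell variable, so the header and the `-outfmt` string cannot drift apart. It also avoids process substitution, so it runs in plain sh. This is only a sketch: the file names are made up, the blastp call is shown as a comment, and two invented rows stand in for real hits so the snippet runs without BLAST installed.

```shell
#!/bin/sh
# Field list shared by the blastp call and the header line.
fields="qseqid sseqid pident qlen length mismatch gapopen evalue bitscore"

# With BLAST+ installed, the hits file would come from a call such as:
#   blastp -query subset.fasta -db leido -outfmt "6 $fields" -max_target_seqs 1 > hits.tsv
# Two invented rows stand in for that output here.
printf 'q1\ts1\t98.88\t537\t537\t6\t0\t0.0\t1089\n'  > hits.tsv
printf 'q2\ts2\t96.75\t553\t553\t18\t0\t0.0\t1082\n' >> hits.tsv

# Convert the space-separated field list to a tab-separated header, then append the hits.
{ printf '%s\n' "$fields" | tr ' ' '\t'; cat hits.tsv; } > hits_with_header.tsv

cat hits_with_header.tsv
```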