Command Line BLAST Question
3.6 years ago
Randal • 0

I am following a YouTube video to learn how to use command-line BLAST to align my sequences. I am trying to use "blastp" rather than the "blastn" example in the video, and I am using my own protein sequences because the files mentioned in the video are no longer available. When I try to BLAST my sequences the way the video does, I get "Error: Too many positional arguments (1), the offending value:". What are your tips for getting a successful result like the one in the video?

Thank you in advance

phylogenetic line BLAST command • 1.3k views

Please include the actual command that you used so that we can point out the mistake.


Here is the command:

randalburks@Randals-MacBook-Air ~ % blastp
BLAST query/options error: Either a BLAST database or subject sequence(s) must be specified
Please refer to the BLAST+ user manual.
randalburks@Randals-MacBook-Air ~ % cd Desktop/Chicken/
randalburks@Randals-MacBook-Air Chicken % ls -F
ARG_06_01.fasta     HK_gene_06_01.fasta
randalburks@Randals-MacBook-Air Chicken % head ARG_06_01.fasta
>gi|34101385|gb|AAQ57754.1| DNA topoisomerase IV subunit A [Chromobacterium violaceum ATCC 12472]
MSDHDLFDSPAPAGDLTPPPPAAPEAASGDWIPLDLYAERAYLEYAMSVVKGRALPEVADGQKPVQRRIL
YAMRDMGLVHGAKPVKSARVVGEILGKYHPHGDSSAYEALVRMAQDFTLRYPLIDGQGNFGSRDGDGAAA
MRYTEARLTPIAELLLSEIDMGTVDFVPNYDGAFEEPSLLPARLPMVLLNGASGIAVGLATEIPPHNLTE
VAAACVALLDDPELDTAALMQYVPGPDFPGGGQIITPVADILSAYETGKGSVRVRAKWEVEKLARGQWRA
IVTELPPGSSAQKVLAEIEEATNPKLKAGKKQLSQDQQNLKKLMLDLLDRVRDESDSESPVRLVFEPKSS
RQDPDEFMNILLAQTSLEGNASLNLVMIGLDGRPGQKGLKPILTEWIDFRRTTVTRRLAHRLAQVDKRIH
ILEGRMIAFIHIDEVIRVIRESDEPKPDLIKAFGLTEIQAEDILEIRLRQLARLEGFKLEKELSELREER
EGLRHVLDTPDALTRLIRDEIQADAAKYGDKRRTEIKAAERAVLTQTTADEPVTLILSQKGWIRARVGHN
VELETLSFKDGDGLAAVVETRTVWNAIVLDHNGRAYTIDPATVPTGRGDGVPVASLVDLQDGAKPAQLIS
randalburks@Randals-MacBook-Air Chicken % tail ARG_06_01.fasta
ITLIPTPPVRGEIVDVNGVPLAKNYPVFSLEVIPSRIEGKMEDVIEALRKYVDITPTDLKRFHKYRESYR
KFENIPLKLRLTDEEAAKLSVHLNEFKGVEVNSRTFREYPYGKLTSHFLGYIGRISDKDQEILEEEGLTA
LYRGSTHIGKSGLEKFYEPQLHGAPGYQEVEKDAYGNIVRVLKNVPSRMGQTLRLGMDIRMQQEADKILG
DRRGAIVAINPQDGTVLAFVSKPSFDPNLFIDGIDSETWKALNDDWKKPLVNRVTQGLYPPGSTFKPFMG
MALLESGKITQTTVVPAPGAWSIPGSRHVFRDSVRSGHGSANLSKAIQVSSDTFFYRLGYEMGIDKASPY
LAQFGFGKKTGIDLPSEYTGVLPSREWKAKRFAKSSDPTAKEWRAGEMVSVSIGQGYNAYTPLQMAHATA
SLANNGVVYQPHLVKELLDFDKRKITRINPNPEYKIPFKQDNFEYVKQAMEKVLKPGGTAHRIGGGLAYS
MGGKTGTAQVVQIKQGGRYNAAALREQHRDHAWFISFAPLEKPEIAIAVILENGGWGAYAAPLARSLTDF
YMLNVKPQQFSDDPDAGGMPEPQQHRTQPQPPTIFQKAYGLKQEEANHE

randalburks@Randals-MacBook-Air Chicken % awk 'NR==1, NR==20' ARG_06_01.fastsa
awk: can't open file ARG_06_01.fastsa
 source line number 1
randalburks@Randals-MacBook-Air Chicken % awk 'NR==1, NR==20' ARG_06_01.fasta 
>gi|34101385|gb|AAQ57754.1| DNA topoisomerase IV subunit A [Chromobacterium violaceum ATCC 12472]
MSDHDLFDSPAPAGDLTPPPPAAPEAASGDWIPLDLYAERAYLEYAMSVVKGRALPEVADGQKPVQRRIL
YAMRDMGLVHGAKPVKSARVVGEILGKYHPHGDSSAYEALVRMAQDFTLRYPLIDGQGNFGSRDGDGAAA
MRYTEARLTPIAELLLSEIDMGTVDFVPNYDGAFEEPSLLPARLPMVLLNGASGIAVGLATEIPPHNLTE
VAAACVALLDDPELDTAALMQYVPGPDFPGGGQIITPVADILSAYETGKGSVRVRAKWEVEKLARGQWRA
IVTELPPGSSAQKVLAEIEEATNPKLKAGKKQLSQDQQNLKKLMLDLLDRVRDESDSESPVRLVFEPKSS
RQDPDEFMNILLAQTSLEGNASLNLVMIGLDGRPGQKGLKPILTEWIDFRRTTVTRRLAHRLAQVDKRIH
ILEGRMIAFIHIDEVIRVIRESDEPKPDLIKAFGLTEIQAEDILEIRLRQLARLEGFKLEKELSELREER
EGLRHVLDTPDALTRLIRDEIQADAAKYGDKRRTEIKAAERAVLTQTTADEPVTLILSQKGWIRARVGHN
VELETLSFKDGDGLAAVVETRTVWNAIVLDHNGRAYTIDPATVPTGRGDGVPVASLVDLQDGAKPAQLIS
GRDEDRFVVAGSGGYGFIAKIGDMAGRVKAGKAFITLDASETVLEPVKLPAAPLEQLQLVAASDSDRLLA
FPAAELKELAKGRGLMLMALDDGAALTAIGLVAGGKALLSTTSVRGKEAEEKLALEEFAGKRAKKGKLMP
KKWRVSRIRELPM

>gi|34103609|gb|AAQ59970.1| DNA gyrase, subunit A [Chromobacterium violaceum ATCC 12472]
MTDNLFAKETLPISLEEEMRRSYLDYAMSVIVGRALPDVRDGLKPVHRRVLYAMHELSNDWNRAYKKSAR
IVGDVIGKYHPHGDTAVYDTIVRLAQDFSLRYPLVDGQGNFGSVDGDNAAAMRYTEIRMARIAHELLADI
EKETVDFGPNYDGSEHEPLILPAKIPNLLVNGSSGIAVGMATNIPPHNLNEVVDACLALLANSDLTIDEL
IDIIPAPDFPTAGIIYGTAGVKEGYRTGRGRVIMRARTHTEPIGKGDREAIIVDEIPYQVNKARLLERIS
ELVRDKQIEGISDLRDESDKSGMRVVIELKRGEMPEVVLNHLFKMTQLQDSFGMNMVALVDGQPRLLNLK
randalburks@Randals-MacBook-Air Chicken % awk 'NR==1, NR==20' ARG_06_01.fasta > my_seqs_f
randalburks@Randals-MacBook-Air Chicken % ls -F
ARG_06_01.fasta     HK_gene_06_01.fasta my_seqs_f
randalburks@Randals-MacBook-Air Chicken % blastp -query my_seqs_f -db Chromobacterium violaceum ATCC 12472 -out my_seqs_blast.txt 
USAGE
  blastp [-h] [-help] [-import_search_strategy filename]
    [-export_search_strategy filename] [-task task_name] [-db database_name]
    [-dbsize num_letters] [-gilist filename] [-seqidlist filename]
    [-negative_gilist filename] [-negative_seqidlist filename]
    [-taxids taxids] [-negative_taxids taxids] [-taxidlist filename]
    [-negative_taxidlist filename] [-ipglist filename]
    [-negative_ipglist filename] [-entrez_query entrez_query]
    [-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
    [-subject subject_input_file] [-subject_loc range] [-query input_file]
    [-out output_file] [-evalue evalue] [-word_size int_value]
    [-gapopen open_penalty] [-gapextend extend_penalty]
    [-qcov_hsp_perc float_value] [-max_hsps int_value]
    [-xdrop_ungap float_value] [-xdrop_gap float_value]
    [-xdrop_gap_final float_value] [-searchsp int_value] [-seg SEG_options]
    [-soft_masking soft_masking] [-matrix matrix_name]
    [-threshold float_value] [-culling_limit int_value]
    [-best_hit_overhang float_value] [-best_hit_score_edge float_value]
    [-subject_besthit] [-window_size int_value] [-lcase_masking]
    [-query_loc range] [-parse_deflines] [-outfmt format] [-show_gis]
    [-num_descriptions int_value] [-num_alignments int_value]
    [-line_length line_length] [-html] [-sorthits sort_hits]
    [-sorthsps sort_hsps] [-max_target_seqs num_sequences]
    [-num_threads int_value] [-ungapped] [-remote] [-comp_based_stats compo]
    [-use_sw_tback] [-version]

DESCRIPTION
   Protein-Protein BLAST 2.11.0+

Use '-help' to print detailed descriptions of command line arguments
========================================================================

Error: Too many positional arguments (1), the offending value: violaceum
Error:  (CArgException::eSynopsis) Too many positional arguments (1), the offending value: violaceum
randalburks@Randals-MacBook-Air Chicken % 
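
The problem is in this command:

blastp -query my_seqs_f -db Chromobacterium violaceum ATCC 12472 -out my_seqs_blast.txt

-db must be followed by a single valid database name. Because the name contains unquoted spaces, the shell splits it into separate arguments: blastp takes "Chromobacterium" as the database name and treats the next word as an unexpected positional argument. That is exactly what the error report says: the offending value: violaceum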

Your ls output also shows that there is no database with that name, at least not in the directory you are in. Have you set up a BLAST database directory, and is BLAST aware of it? I have a feeling that you haven't. If you have, and that really is a valid database name, then wrap it in single quotes:

-db 'Chromobacterium violaceum ATCC 12472'
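
To see why the quotes matter, here is a small sketch that prints the arguments a program receives from the shell. show_args is a hypothetical helper written for this illustration, not part of BLAST.

```shell
# show_args is a made-up helper: it prints how many arguments it
# received and lists each one in brackets.
show_args() {
    echo "$# arguments:"
    for arg in "$@"; do
        echo "  [$arg]"
    done
}

# Unquoted: the shell splits the name on spaces, so -db is followed by
# four separate words.
show_args -db Chromobacterium violaceum ATCC 12472

# Quoted: the whole name is passed as one argument after -db.
show_args -db 'Chromobacterium violaceum ATCC 12472'
```

The first call reports 5 arguments and the second reports 2. Quoting only fixes the argument parsing; blastp will still fail if no database with that name exists.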

Generally speaking, there is little to be gained by creating a BLAST database from just one bacterial genome. Instead of -db, you can use -subject whateverFile.fasta to search directly against a FASTA file.
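
As a rough sketch of the -subject approach, using the file names from your session: choosing ARG_06_01.fasta as the subject is my assumption, and since my_seqs_f was cut from that same file you should expect each query to hit itself. The guard around the command is only there so the snippet does nothing when blastp or the files are missing.

```shell
# File names come from the terminal session above.
query="my_seqs_f"
subject="ARG_06_01.fasta"   # assumption: any protein FASTA file can go here
out="my_seqs_blast.txt"

# Run only if blastp is installed and both input files exist.
if command -v blastp >/dev/null 2>&1 && [ -f "$query" ] && [ -f "$subject" ]; then
    # No database needed: -subject takes a plain FASTA file.
    blastp -query "$query" -subject "$subject" -out "$out"
    status="ran"
else
    status="skipped"
    echo "blastp or input files not found; nothing was run" >&2
fi
```

Note that -subject and -db are alternatives, so use one or the other.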


Thank you for your input; I will definitely take that into account. My other question is: what should the database file consist of?


You make BLAST databases from FASTA files with the makeblastdb tool.
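
A minimal sketch of that, assuming the protein FASTA file from the session above is what you want to search against. The database name ARG_06_01_db and the output file name are placeholders I made up; pick any name without spaces.

```shell
db_fasta="ARG_06_01.fasta"   # protein FASTA file from the session above
db_name="ARG_06_01_db"       # hypothetical database name, no spaces
query="my_seqs_f"

# Run only if the BLAST+ tools are installed and the input files exist.
if command -v makeblastdb >/dev/null 2>&1 && command -v blastp >/dev/null 2>&1 \
    && [ -f "$db_fasta" ] && [ -f "$query" ]; then
    # Build a protein database; this writes several index files
    # whose names start with $db_name.
    makeblastdb -in "$db_fasta" -dbtype prot -out "$db_name"

    # Search the new database; -outfmt 6 requests tab-separated output.
    blastp -query "$query" -db "$db_name" -outfmt 6 -out my_seqs_blast.tsv
    status="ran"
else
    status="skipped"
    echo "BLAST+ tools or input files not found; nothing was run" >&2
fi
```

For nucleotide sequences you would use -dbtype nucl and search with blastn instead.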
