Hello group,
I am trying to run tblastn (version 2.2.26+) to map peptides onto a eukaryotic genome. I am running the command
tblastn -query sequence.fa -out output.txt -subject ref_protein.fa -evalue 1e-5 -outfmt 7 -max_target_seqs 5 -best_hit_score_edge 0.05 -best_hit_overhang 0.25
but I get the following warning
Warning: lcl|Query_20167 contig20167: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options.
I also used dustmasker to filter and mask the genome and then ran tblastn against the masked genome, but I still got the same warning. I have also tried running with filtering turned off, using the following command
tblastn -query sequence.fa -out output.txt -F T -subject ref_protein.fa -evalue 1e-5 -outfmt 7 -max_target_seqs 5 -best_hit_score_edge 0.05 -best_hit_overhang 0.25
but that produced the following error
USAGE
tblastn [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-entrez_query entrez_query]
[-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
[-subject subject_input_file] [-subject_loc range] [-query input_file]
[-out output_file] [-evalue evalue] [-word_size int_value]
[-gapopen open_penalty] [-gapextend extend_penalty]
[-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value]
[-max_hsps_per_subject int_value] [-db_gencode int_value]
[-frame_shift_penalty frameshift] [-ungapped] [-max_intron_length length]
[-seg SEG_options] [-soft_masking soft_masking] [-matrix matrix_name]
[-threshold float_value] [-culling_limit int_value]
[-best_hit_overhang float_value] [-best_hit_score_edge float_value]
[-window_size int_value] [-lcase_masking] [-query_loc range]
[-parse_deflines] [-outfmt format] [-show_gis]
[-num_descriptions int_value] [-num_alignments int_value] [-html]
[-max_target_seqs num_sequences] [-num_threads int_value] [-remote]
[-comp_based_stats compo] [-use_sw_tback] [-in_pssm psi_chkpt_file]
[-version]
DESCRIPTION
Protein Query-Translated Subject BLAST 2.2.26+
Use '-help' to print detailed descriptions of command line arguments
========================================================================
Error: (CArgException::eInvalidArg) Unknown argument: "F"
Error: (CArgException::eInvalidArg) Application's initialization failed
Can anyone give advice on this?
Thanks,
As you can see from the usage message, there is no '-F' option in tblastn 2.2.26+.
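The '-F' flag comes from the legacy blastall program. In BLAST+, the usage message above lists '-seg' for query filtering, and as far as I know '-seg no' is the way to switch it off. A minimal sketch, keeping the file names and options from the question and only adding '-seg no' (the variable name blast_cmd is just for illustration). It prints the command and only runs it if tblastn and both input files are present. Note that this only addresses the "Unknown argument" error, not the Karlin-Altschul warning.

```shell
#!/bin/sh
# Sketch: the search from the question with SEG query filtering disabled
# through "-seg no" instead of the legacy "-F" flag.
# File names and the other options are copied from the question.
blast_cmd="tblastn -query sequence.fa -subject ref_protein.fa -out output.txt -seg no -evalue 1e-5 -outfmt 7 -max_target_seqs 5 -best_hit_score_edge 0.05 -best_hit_overhang 0.25"

# Show the command line that would be run.
echo "$blast_cmd"

# Run it only when tblastn is on the PATH and both input files exist.
# $blast_cmd is left unquoted on purpose so the shell splits it into
# arguments; this is safe here because no argument contains spaces.
if command -v tblastn >/dev/null 2>&1 && [ -f sequence.fa ] && [ -f ref_protein.fa ]; then
    $blast_cmd
fi
```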
Guessing from your file names, the query is nucleotide and the subject is protein. tblastn expects a protein query and a nucleotide subject, so if that is the case you might want to try swapping the query and subject, or use BLASTX, or explore other options.
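If you would rather check than guess from the file names, here is a rough sketch that looks at the residues in a FASTA file. The function names guess_fasta_type and suggest_program and the 90% threshold are my own assumptions, not part of BLAST, and the heuristic can misjudge very short or unusual sequences.

```shell
#!/bin/sh
# Guess whether a FASTA file holds nucleotide or protein sequences.
# Heuristic (an assumption, not a BLAST feature): if at least 90% of the
# residues are A, C, G, T, U or N, call the file nucleotide.
guess_fasta_type() {
    awk '
        /^>/ { next }                          # skip header lines
        {
            seq = toupper($0)
            gsub(/[ \t\r]/, "", seq)           # drop whitespace
            total += length(seq)
            nuc += gsub(/[ACGTUN]/, "", seq)   # count nucleotide-like letters
        }
        END {
            if (total == 0)              print "empty"
            else if (nuc / total >= 0.9) print "nucleotide"
            else                         print "protein"
        }
    ' "$1"
}

# Map a (query type, subject type) pair to the BLAST program that fits it.
suggest_program() {
    case "$1:$2" in
        protein:nucleotide)    echo "tblastn" ;;
        nucleotide:protein)    echo "blastx" ;;
        protein:protein)       echo "blastp" ;;
        nucleotide:nucleotide) echo "blastn" ;;
        *)                     echo "unknown" ;;
    esac
}
```

For example, suggest_program "$(guess_fasta_type sequence.fa)" "$(guess_fasta_type ref_protein.fa)" would print blastx if sequence.fa really is nucleotide and ref_protein.fa really is protein. In that case, either run blastx with the files as they are, or keep tblastn and swap them (-query ref_protein.fa -subject sequence.fa).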