7.3 years ago
bio90029
Hi, I ran a BLAST search with the code below:
blastn_cline=NcbiblastnCommandline(query='CDS_extracted_file.fa',db=output + '/temporary_db',evalue=0.001, gapopen=0,gapextend=2,outfmt=5,out=output +'/blast_file.xml')
I have realized that some of the genes are repeated in the results, for example:
gene_690 ['phage transcriptional regulator, AlpA'] gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0 per_identity: 93.6329588015
gene_690 ['phage transcriptional regulator, AlpA'] gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0 per_identity: 91.7602996255
gene_690 ['phage transcriptional regulator, AlpA'] gnl|BL_ORD_ID|86 NODE_87_length_35482_cov_21.696409 number_gaps: 0 per_identity: 93.
How can I modify the script so that I only get the best hit for each gene? Thanks
"max_hsps" option may work if your wrapper has that option. https://www.ncbi.nlm.nih.gov/books/NBK279675/