How to remove repeated genes or get the best HSP matches
7.3 years ago
bio90029 ▴ 10

Hi, I ran a BLAST search with the following code:

blastn_cline=NcbiblastnCommandline(query='CDS_extracted_file.fa',db=output + '/temporary_db',evalue=0.001, gapopen=0,gapextend=2,outfmt=5,out=output +'/blast_file.xml')

I have realized that some of the genes are repeated, for example:

gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0  per_identity: 93.6329588015
gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|97 NODE_98_length_17475_cov_21.699114 number_gaps: 0  per_identity: 91.7602996255
gene_690 ['phage transcriptional regulator, AlpA']  gnl|BL_ORD_ID|86 NODE_87_length_35482_cov_21.696409 number_gaps: 0  per_identity: 93.

How can I modify the script so that I only get the best one? Thanks

blast gene • 1.3k views

The "max_hsps" option may work, if your wrapper supports it. See https://www.ncbi.nlm.nih.gov/books/NBK279675/
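
If the wrapper does not expose max_hsps, one alternative is to filter the parsed results afterwards. Note that max_hsps limits the number of HSPs kept per query-subject pair, so a gene that hits two different contigs (as gene_690 does in the question) would probably still be listed more than once; a post-filter covers both cases. Below is a minimal sketch, not the original script: the function name best_hit_per_query and the dict keys are hypothetical, and "best" is assumed to mean the highest per_identity (bit score or e-value could be used instead by passing a different key).

```python
def best_hit_per_query(hits, key=lambda hit: hit["per_identity"]):
    """Keep one hit per query: the one with the largest key value.

    `hits` is any iterable of dicts with at least a "query" entry.
    The dict layout is an assumption made for this sketch.
    """
    best = {}
    for hit in hits:
        query = hit["query"]
        # Store the first hit seen for a query, then replace it only
        # when a later hit scores strictly higher.
        if query not in best or key(hit) > key(best[query]):
            best[query] = hit
    # Dicts preserve insertion order, so queries come back in the
    # order in which they were first seen.
    return list(best.values())


# The first two rows are taken from the question; the last row is a
# hypothetical extra gene added only to show that other queries are kept.
hits = [
    {"query": "gene_690", "subject": "gnl|BL_ORD_ID|97", "per_identity": 93.6329588015},
    {"query": "gene_690", "subject": "gnl|BL_ORD_ID|97", "per_identity": 91.7602996255},
    {"query": "gene_example", "subject": "gnl|BL_ORD_ID|1", "per_identity": 80.0},
]

filtered = best_hit_per_query(hits)
for hit in filtered:
    print(hit["query"], hit["subject"], "per_identity:", hit["per_identity"])
```

In the original script, the same idea would be applied to whatever list of rows is built while looping over the parsed XML records, just before printing them.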
