Number of Gene Clusters From BLAST Output
11.1 years ago
armahmud3 • 0

After parsing a GenBank file and BLAST output (XML file) with the Biopython library, how can I calculate the number of linked gene clusters at different stringency levels (0 to 100, in increments of 10)? Example:

stringency level, number of gene clusters
0, 13456
10, 234
20, 234
30, 200
40, 190
50, 187
60, 187
70, 100
80, 95
90, 55
100, 45

blast • 2.4k views

BLAST is not a clustering tool "sensu stricto" but a pairwise aligner, so how do you define a cluster in your case? Something generated with OrthoMCL or BlastClust?

I define a cluster as a group of homologous genes.

11.1 years ago
Pierre ▴ 130

I'm not sure I understand your question correctly, but if you want to cluster or group your sequences based on pairwise sequence similarity, you can use blastclust (distributed with the legacy NCBI BLAST package rather than BLAST+) to do this.

Thank you, Pierre. I actually need a Python script.

In principle, blastclust would give you clusters of homologous sequences (presumably paralogs). You just need a FASTA file containing all the sequences that you'd like to cluster.
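
For the Python script requested above, here is a minimal sketch rather than a definitive method. It rests on assumptions the thread does not confirm: that "stringency level" means a minimum percent identity for a BLAST hit, and that a cluster is a set of genes connected through such hits (single linkage, implemented with union-find). The names `count_clusters`, `cluster_counts`, `hits_from_blast_xml` and the `min_size` parameter are hypothetical. The helper that reads the XML assumes Biopython's `Bio.Blast.NCBIXML` parser and that the query and hit descriptions match your gene names.

```python
from collections import Counter


def count_clusters(genes, hits, threshold, min_size=1):
    """Count single-linkage clusters of genes.

    genes: iterable of gene names (e.g. taken from the GenBank file).
    hits: iterable of (query, subject, percent_identity) tuples.
    threshold: minimum percent identity for a hit to link two genes (assumption).
    min_size: only clusters with at least this many genes are counted.
    """
    genes = list(genes)
    parent = {g: g for g in genes}

    def find(x):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity in hits:
        # Ignore weak hits and hits to sequences outside the gene set.
        if identity >= threshold and query in parent and subject in parent:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_q] = root_s

    sizes = Counter(find(g) for g in genes)
    return sum(1 for n in sizes.values() if n >= min_size)


def cluster_counts(genes, hits, levels=range(0, 101, 10), min_size=1):
    """Return {stringency level: number of clusters} for each level."""
    hits = list(hits)  # allow a generator to be reused for every level
    return {lvl: count_clusters(genes, hits, lvl, min_size) for lvl in levels}


def hits_from_blast_xml(path):
    """Yield (query, subject, percent_identity) from a BLAST XML file.

    Requires Biopython; percent identity is computed per HSP.
    """
    from Bio.Blast import NCBIXML

    with open(path) as handle:
        for record in NCBIXML.parse(handle):
            for alignment in record.alignments:
                for hsp in alignment.hsps:
                    identity = 100.0 * hsp.identities / hsp.align_length
                    yield record.query, alignment.hit_def, identity


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    genes = ["a", "b", "c", "d"]
    hits = [("a", "b", 95.0), ("b", "c", 55.0), ("a", "a", 100.0)]
    for level, n in cluster_counts(genes, hits).items():
        print(f"{level},{n}")
```

With `min_size=1`, genes without any qualifying hit count as single-gene clusters, so the count rises as the threshold gets stricter. If "linked" clusters should contain at least two genes, pass `min_size=2`. Other linking criteria (e-value, bit score, alignment coverage) could be swapped in by changing what is stored in the third element of each tuple.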
