I performed a BUSCO genome mode analysis using the insecta_odb10 dataset.
The results were as follows: C: 99.8% [S: 91.1%, D: 8.7%], F: 0.1%, M: 0.1%, n: 1367, E: 3.7%.
Can someone help me interpret what "E: 3.7%" stands for?
I would really appreciate any help.
BUSCO Docker image tags: ezlabgva/busco:v5.7.0_cv1 and ezlabgva/busco:v5.7.1_cv1
Code: nohup busco -i genome.fasta --auto-lineage-euk -o /path/to/outdir -m genome -c 80 &
According to ChatGPT:
In BUSCO, the "E" category represents "End" or "Endof" gene fragments. These are orthologous groups for which the gene model is fragmented at the end of the sequence, indicating that the sequence is incomplete at the 3' or 5' end.
Funnily enough, I tried the same thing (I've used BUSCO heaps of times but never saw the E: before! Even the manual lists 'C:89.0%[S:85.8%,D:3.2%],F:6.9%,M:4.1%,n:3023', without the E:), and ChatGPT waffled something about E: being the percentage of eukaryotic genes in the proteome. One of us is wrong, or both :)
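One thing that can be checked from the numbers alone: in the posted summary, S + D already equals C, and C + F + M already adds up to 100%. So whatever E: is, it does not look like a fourth exclusive category alongside complete, fragmented and missing; it seems to be a separate statistic reported on top of them. Here is a minimal Python sketch of that check. The function name parse_busco_summary is made up for illustration and is not part of BUSCO, and the regex just assumes the one-line "key:value" layout shown in the question.

```python
import re

def parse_busco_summary(line):
    """Parse a BUSCO one-line summary into a dict (hypothetical helper, not a BUSCO API)."""
    fields = {}
    # Each field looks like "C:99.8%" or "n:1367"; whitespace after the colon is tolerated.
    for key, value in re.findall(r"([A-Za-z]):\s*([\d.]+)", line):
        fields[key] = int(value) if key == "n" else float(value)
    return fields

# Summary line as posted in the question.
summary = "C: 99.8% [S: 91.1%, D: 8.7%], F: 0.1%, M: 0.1%, n: 1367, E: 3.7%"
s = parse_busco_summary(summary)

# Single-copy plus duplicated should give the complete fraction.
complete_matches = round(s["S"] + s["D"], 1) == s["C"]

# Complete, fragmented and missing should partition the whole dataset.
partition_total = round(s["C"] + s["F"] + s["M"], 1)

print("S + D == C:", complete_matches)
print("C + F + M =", partition_total)
print("E (not part of the sum above):", s["E"])
```

This does not say what E: actually means, only that it is not counted as part of the C/F/M breakdown. The release notes for the 5.7.x versions would be the place to look for the definition.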