Hello!
I am attempting to locate synteny between two recently released genomes for the parasitic nematode Haemonchus contortus. I expect pervasive synteny: although the genomes come from divergent strains (one is an inbred laboratory strain, while the other is an African field isolate), they are nevertheless the same species. To do this, I've tried to construct a syntenic dot plot.
One genome is 370 Mb, while the other is 320 Mb. My task is complicated by the fact that both genomes are of draft quality, meaning that I must compare the 26,000 contigs of one with the 14,400 contigs of the other, rather than the seven chromosomes of Haemonchus.
To construct the syntenic map, I've tried five programs. SyMAP, Mauve, and LAST's last-dotplot.py have all failed with one problem or another. SynMap (note it's distinct from SyMAP) and MUMmer both worked, but have produced rather peculiar output. While I have scant understanding of syntenic dotplots, the images produced (included at the end of this posting) seem to indicate terribly little synteny. I find this peculiar -- using the same tools, I've tried comparing C. briggsae and C. elegans, which demonstrate a great deal more synteny despite being (presumably) much more distant from each other than my two Haemonchus strains.
My questions are thus:
- What do the syntenic maps indicate? What exactly does the positioning of each dot in my dot plots demonstrate?
- Is it at all biologically plausible that there is really so little synteny preserved between my two samples? As part of my analysis, I've also tried comparing the strains' respective proteomes using InParanoid. Though the two genomes bear a comparable number of annotated genes (21,800 in one, 23,600 in the other), I again saw much more divergence than I expected -- only 35% of the genes in each had an orthologue in the other genome. The same analysis on C. briggsae and C. elegans found orthologues for 65% of the genes in each genome.
- Are my problems perhaps a result of my comparing tens of thousands of scaffolds in each genome against each other, rather than a small number of chromosomes? I've considered comparing only the 100 (or 1000) largest scaffolds from each genome to reduce the demands I'm making on my tools, but this would likely destroy whatever hope I have of making a valid comparison, given that these would encompass substantially different portions of the respective assemblies. The largest 100 scaffolds make up only 10% and 7% of the respective genomes, while the largest 1000 scaffolds make up only 46% and 36%.
I will much appreciate any help. Thanks!
SynMap yielded this:
MUMmer yielded this:
I think you need to filter so that you're only testing the largest contigs. If I understand correctly, each square in that figure is a pair of sequences (chromosomes in a finished assembly, contigs here), so with tens of thousands of contigs the squares are too small to see any synteny.
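A rough sketch of that filtering step, in case it helps. It assumes plain, uncompressed FASTA input and uses only the Python standard library. The helper names (`read_fasta`, `largest_contigs`, `write_fasta`) are made up for illustration and do not come from SynMap or MUMmer. It keeps the N longest contigs and reports what fraction of the assembly they cover, so you can see how much of each genome you are actually comparing.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the contig name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def largest_contigs(records, n):
    """Return the n longest records and the fraction of all bases they cover."""
    records = list(records)
    total = sum(len(seq) for _, seq in records)
    top = sorted(records, key=lambda rec: len(rec[1]), reverse=True)[:n]
    covered = sum(len(seq) for _, seq in top)
    return top, (covered / total if total else 0.0)


def write_fasta(records, handle, width=60):
    """Write (name, sequence) pairs as FASTA, wrapping sequence lines."""
    for name, seq in records:
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


# Toy demonstration on an in-memory FASTA with three contigs.
demo = io.StringIO(">c1 first\nACGTACGTAC\n>c2\nACG\n>c3\nACGTACG\n")
top, fraction = largest_contigs(read_fasta(demo), 2)
out = io.StringIO()
write_fasta(top, out)
print([name for name, _ in top], round(fraction, 2))
```

On real data you would open each assembly with `open(path)` instead of `io.StringIO`, write the filtered records to a new file, and run the dot plot on the two filtered files. Note that this reads the whole assembly into memory, which should be fine at a few hundred Mb.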