3.9 years ago
rthapa
Hi, I am using MUMmer plot to compare a de novo assembly with a reference genome. The percent identity between the two genomes is more than 99%, but when I plot them with MUMmer plot, the plot (https://ibb.co/q08xsZw) doesn't reflect that. Does anyone have any idea why? Thanks
Have you tried reverse complementing one of the sequences and running this again? I don't recall whether MUMmer tries that.
No, I didn't try the reverse complement. Can you suggest a tool for getting the reverse complement? With Mauve, the alignment looks like https://ibb.co/cL6ydYr.
You can use reformat.sh from the BBMap suite to do the reverse complement.
The reverse complement does look better to visualize. It seems there are two inversions in the assembled genome: https://ibb.co/k5MqYqZ.
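If installing BBMap is not convenient, the reverse-complement step can also be sketched in plain Python. This is only a minimal illustration, not what reformat.sh does internally: the function names, the "_rc" suffix added to each header, and the 60-column line width are all assumptions made for this example.

```python
# Complement table for A/C/G/T plus IUPAC ambiguity codes, in both cases.
# S and W are their own complements, so they are left untouched.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMBDHVNacgtrykmbdhvn",
    "TGCAYRMKVHDBNtgcayrmkvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement_fasta(in_path, out_path, width=60):
    """Write the reverse complement of every record in in_path to out_path.

    The "_rc" header suffix and the line width are hypothetical choices.
    """
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq in read_fasta(src):
            rc = reverse_complement(seq)
            dst.write(f">{header}_rc\n")
            for i in range(0, len(rc), width):
                dst.write(rc[i:i + width] + "\n")
```

Calling reverse_complement_fasta("assembly.fasta", "assembly_rc.fasta") (both file names are placeholders) would give a file that can be plotted against the reference again.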
Is the red bar in the right corner a duplicated region in the reference genome? Also, do I need to reverse complement the assembly to call structural variants with MUMmer, or can I just use the de novo assembly as is?
Those blue segments represent inversions and translocations (which are apparent in your Mauve plots too).
Thank you. But when I check the MUMmer results for structural variation, there are four inversions in the assembly compared to the reference genome. I wonder why the MUMmer plot shows only two inversions.
Hi, I just saw your post, and I know it has been quite a while since you started the discussion, but MUMmer does actually show all four inversions in your plot. For example, in the Mauve output, it seems the reference sequence wasn't properly assembled. I'd guess that the red regions at the beginning and at the end are overlaps. Is this genome circular? If so, you can reverse complement that small region from either end (5' or 3') and align it to the other end. If they match, you can crop that region from the genome and then redo your analysis with your contigs.
I've seen cases like that before, and this approach worked for me. The tool reports four inversions because that region is probably repeated in the reference sequence.
Best,
Carlos Costa
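As a rough way to test the overlap idea above, here is a minimal Python sketch that looks for an exact repeat shared by the start and the end of a sequence and crops one copy. It rests on some loud assumptions: it only finds identical, same-strand overlaps (a real assembly with sequencing errors, or an overlap on the opposite strand, would need a proper aligner), and the function names and the min_len and max_len defaults are invented for this example.

```python
def longest_terminal_overlap(seq, min_len=20, max_len=10000):
    """Return the length of the longest prefix that is also a suffix.

    Only exact matches between min_len and max_len bases are considered;
    0 means no such overlap was found. Both defaults are arbitrary.
    """
    seq = seq.upper()
    # A prefix and suffix longer than half the sequence would overlap each other.
    max_len = min(max_len, len(seq) // 2)
    for k in range(max_len, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def trim_terminal_overlap(seq, min_len=20, max_len=10000):
    """Remove the duplicated end of a sequence if an exact overlap exists."""
    k = longest_terminal_overlap(seq, min_len, max_len)
    return seq[:-k] if k else seq


if __name__ == "__main__":
    # Toy example: the first 10 bases are repeated at the end.
    toy = "ACGTACGTTT" + "GGGCCC" + "ACGTACGTTT"
    print(longest_terminal_overlap(toy, min_len=5))
    print(trim_terminal_overlap(toy, min_len=5))
```

If this reports an overlap on the reference or the assembly, cropping it before rerunning the comparison is one way to check whether the extra inversions come from that repeated region.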