Investigating inversions when comparing an assembled genome with a reference genome
15 months ago
rj.rezwan ▴ 10

Hi, I made scaffolds with RagTag, using assembled_genome.fasta as the query and a previously published genome as the reference. I then made a dot plot to compare the two genomes. The reference genome has a lower BUSCO score than our assembled genome. I have attached the dot plot here. Please let me know how to check whether these inversions are real or an artifact of RagTag scaffolding.

genome assembly scaffold ragtag dotplot • 1.1k views

RagTag is unlikely to scaffold a region in the opposite orientation by mistake, so the assembly most likely does contain an inversion. However, if your new scaffolded assembly was made from short reads, I would be careful about accepting the structure as truth. A quick check is to break the scaffolds back into contigs while maintaining their order, then regenerate the dot plot so you can easily see the contig edges and the inverted region.
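As a rough illustration of breaking the scaffolds up while keeping their order, here is a minimal Python sketch. It assumes the scaffolder joined contigs with runs of N, and it cuts each scaffold wherever it finds at least `min_gap` consecutive Ns. The function names, the `min_gap` threshold of 10 and the `_partNNN` naming are my own choices for illustration, not anything RagTag produces.

```python
import re
import sys


def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name = line[1:].split()[0]
                chunks = []
            else:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_at_gaps(name, seq, min_gap=10):
    """Cut a scaffold at every run of >= min_gap Ns.

    Pieces are returned in their original left-to-right order and are
    numbered so that sorting by name preserves that order.
    """
    gap = re.compile(r"[Nn]{%d,}" % min_gap)
    pieces = []
    start = 0
    index = 1
    for match in gap.finditer(seq):
        if match.start() > start:
            pieces.append((f"{name}_part{index:03d}", seq[start:match.start()]))
            index += 1
        start = match.end()
    if start < len(seq):
        pieces.append((f"{name}_part{index:03d}", seq[start:]))
    return pieces


def main(in_path, out_path, min_gap=10):
    """Write the split pieces of every scaffold to a new FASTA file."""
    with open(out_path, "w") as out:
        for name, seq in read_fasta(in_path):
            for part_name, part_seq in split_at_gaps(name, seq, min_gap):
                out.write(f">{part_name}\n")
                for i in range(0, len(part_seq), 60):
                    out.write(part_seq[i:i + 60] + "\n")


if __name__ == "__main__":
    # Placeholder usage: python split_scaffolds.py scaffolds.fasta split.fasta
    main(sys.argv[1], sys.argv[2])
```

Aligning the split FASTA to the reference and redrawing the dot plot should then show each contig as its own segment, so you can see whether an inverted block sits inside a single contig or coincides with a scaffold join.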


The contigs were assembled from HiFi long reads. Is there a chance that repeat regions in the genome cause RagTag to introduce the missing regions or inversions in that portion? As I mentioned, the BUSCO score of the reference is lower (93%) than that of our assembled genome (97%). So the comparison itself may be complicated, which could cause the missing regions and inversions: the majority of the inversions occur after the scaffold breaks, which would make sense if this genome is not appropriate to use as a reference here.


In my experience, a lower BUSCO score is not generally caused by repeat regions, so I don't think that is related. For long-read assemblies it is usually due to base accuracy (which requires polishing) or to contigs having been removed after assembly.

I am not sure I understand "the majority of the inversions are after the scaffold breaks". However, as I suggested before, I think you should check whether any assembled contigs align to the reference in both orientations, and therefore capture at least one edge of the inversion. You can also check whether this is supported by the reads.
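One way to look for contigs that align in both orientations is sketched below in Python. It assumes you have aligned the unscaffolded contigs to the reference and saved the result in PAF format (for example from minimap2, although the aligner is not specified in this thread). The function names and the `min_block` and `min_fraction` thresholds are illustrative guesses that would need tuning for your genome.

```python
import sys
from collections import defaultdict


def strand_totals(paf_lines, min_block=10000):
    """Sum aligned query bases per strand for each (contig, reference) pair.

    PAF columns used: 1 query name, 3 query start, 4 query end,
    5 strand, 6 target name. Blocks shorter than min_block are ignored
    to reduce noise from short repeat hits.
    """
    totals = defaultdict(lambda: {"+": 0, "-": 0})
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue
        query, strand, target = fields[0], fields[4], fields[5]
        span = int(fields[3]) - int(fields[2])
        if span >= min_block and strand in ("+", "-"):
            totals[(query, target)][strand] += span
    return totals


def mixed_strand_contigs(totals, min_fraction=0.05):
    """Return contigs whose minority strand covers >= min_fraction of the aligned bases."""
    hits = []
    for (query, target), t in sorted(totals.items()):
        total = t["+"] + t["-"]
        if total and min(t["+"], t["-"]) / total >= min_fraction:
            hits.append((query, target, t["+"], t["-"]))
    return hits


if __name__ == "__main__":
    # Placeholder usage: python mixed_strands.py contigs_vs_ref.paf
    with open(sys.argv[1]) as handle:
        for query, target, plus, minus in mixed_strand_contigs(strand_totals(handle)):
            print(f"{query}\t{target}\tplus={plus}\tminus={minus}")
```

A contig reported here spans a change of orientation relative to the reference within a single HiFi-assembled sequence, which is evidence that the breakpoint comes from the assembly rather than from how the scaffolder oriented whole contigs. It is still worth confirming with read alignments across the breakpoint.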
