Hello all,
I have merged two large (~2.6 Gb) genome assemblies from the same organism into a single assembly using the GAA tool. I am looking for a way to assess how correct the merged assembly is and to identify mis-assembled contigs/scaffolds. I now have the contigs/scaffolds for all three assemblies, as well as the raw reads from one of the original assemblies.
One idea is to map the raw reads back to the merged assembly and see how well they map. However, I am not sure what mis-assembled regions would look like: would they show low coverage or high coverage?
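A rough sketch of how the coverage check could be scripted, in case it helps frame the question. It assumes per-base depth has already been written out as tab-separated `contig`, `position`, `depth` lines (the format `samtools depth -a` produces, with `-a` so that zero-depth positions are included). Since it is unclear which direction matters, it flags windows that deviate either way from the median window depth. The function names, window size, and the 0.5x / 2x thresholds are placeholders for illustration, not recommended values.

```python
from collections import defaultdict
from statistics import median


def window_means(depth_lines, window=1000):
    """Mean depth per (contig, window index) from 'contig<TAB>pos<TAB>depth' lines."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        contig, pos, depth = line.split("\t")
        # Positions are 1-based in the input; window indices are 0-based.
        key = (contig, (int(pos) - 1) // window)
        totals[key] += int(depth)
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


def flag_windows(means, low=0.5, high=2.0):
    """Return (contig, window index, mean depth) for windows far from the median.

    low/high are hypothetical ratio thresholds relative to the median window mean.
    """
    typical = median(means.values())
    flagged = []
    for key in sorted(means):
        ratio = means[key] / typical if typical else 0.0
        if ratio < low or ratio > high:
            flagged.append((key[0], key[1], means[key]))
    return flagged


# Example usage (file name is hypothetical):
# with open("merged.depth.tsv") as handle:
#     for contig, idx, mean in flag_windows(window_means(handle)):
#         print(contig, idx * 1000 + 1, mean, sep="\t")
```

Flagged windows are only candidates for a closer look; a deviation in depth alone does not establish a mis-assembly.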
Another idea is to align the contigs/scaffolds from the two original assemblies to the merged assembly. I am currently using nucmer to do this and can visualize the alignments with mummerplot and in ACT. The two issues I am having, though, are which alignment parameters to use and how to automate the process.
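On the automation side, a minimal sketch of a wrapper that runs the same MUMmer steps for each original assembly against the merged one. It assumes `nucmer`, `delta-filter`, and `show-coords` are on the PATH. The file names, output directory, and helper names are hypothetical, and the options shown (default nucmer settings, `delta-filter -q`, `show-coords -r -c -l -T`) are placeholders rather than an answer to the parameter question.

```python
import subprocess
from pathlib import Path


def build_commands(merged, original, prefix):
    """Return (argv, stdout_path) pairs for one original-vs-merged comparison.

    stdout_path is None when the tool writes its own output file.
    """
    delta = f"{prefix}.delta"
    filtered = f"{prefix}.filtered.delta"
    coords = f"{prefix}.coords"
    return [
        # Merged assembly as reference, original assembly as query.
        (["nucmer", "-p", prefix, merged, original], None),
        # Keep the best hit in the reference for each query position.
        (["delta-filter", "-q", delta], filtered),
        # Tab-delimited coordinates with coverage and sequence lengths.
        (["show-coords", "-r", "-c", "-l", "-T", filtered], coords),
    ]


def run_pipeline(merged, originals, outdir="nucmer_out"):
    """Run the three steps for every original assembly, stopping on any error."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    for original in originals:
        prefix = str(Path(outdir) / Path(original).stem)
        for argv, stdout_path in build_commands(merged, original, prefix):
            if stdout_path is None:
                subprocess.run(argv, check=True)
            else:
                with open(stdout_path, "w") as out:
                    subprocess.run(argv, check=True, stdout=out)


# Example usage (file names are hypothetical):
# run_pipeline("merged.fa", ["assembly_A.fa", "assembly_B.fa"])
```

Keeping command construction separate from execution makes it easy to print the commands first, or to swap in different nucmer options once suitable parameters are known.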
Any help would be appreciated, thank you!
That is likely not a last check. Checking if the cDNAs are present in the draft assembly is one of the first QC steps.