6.4 years ago
prasundutta87
Hi,
What is the best way to sanity-check whether the genomic coordinates have changed in a new genome assembly? I have tried two methods:
1) Genome-to-genome dot plots (we should get a straight diagonal line)
2) Comparing chromosome lengths from the corresponding .fai index files (they should be the same)
Is it that simple, or are there pitfalls I should look into?
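For reference, the length comparison in method 2 could be sketched roughly as below. This assumes the standard `samtools faidx` layout, where the first two tab-separated columns of a `.fai` file are the sequence name and its length. The function names and the `old.fa.fai` / `new.fa.fai` file names are made up for illustration. Because sequence names may differ between the two assemblies, the sketch compares the lengths themselves rather than matching by name. Equal lengths are a necessary condition for unchanged coordinates, not a sufficient one.

```python
from collections import Counter


def parse_fai_lengths(lines):
    """Return {sequence_name: length} from the lines of a .fai index."""
    lengths = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        lengths[fields[0]] = int(fields[1])
    return lengths


def compare_length_multisets(old_lengths, new_lengths):
    """Compare sequence lengths while ignoring sequence names.

    Returns two sorted lists: lengths present only in the old assembly
    and lengths present only in the new one.
    """
    old_counts = Counter(old_lengths.values())
    new_counts = Counter(new_lengths.values())
    only_old = sorted((old_counts - new_counts).elements())
    only_new = sorted((new_counts - old_counts).elements())
    return only_old, only_new


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python compare_fai.py old.fa.fai new.fa.fai
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as old_handle, open(sys.argv[2]) as new_handle:
            only_old, only_new = compare_length_multisets(
                parse_fai_lengths(old_handle), parse_fai_lengths(new_handle)
            )
        print("lengths only in old assembly:", only_old)
        print("lengths only in new assembly:", only_new)
```

If the second list is empty, every sequence in the new assembly has a same-length counterpart in the old one, and the first list then holds the lengths of sequences that were dropped.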
Unless you are talking about a "new" assembly with the same dataset and the same program as the old assembly, I think it is safe to assume the coordinates will change in a new assembly. The question is how to map between them.
The thing is, the chromosome IDs were changed and some unplaced scaffolds were removed during the annotation process of a newly submitted genome. I had been using the genome before it was annotated for my downstream analysis, such as alignment and WGS variant calling. I wanted to find out whether there were any coordinate changes that would affect my downstream analysis, or whether changing only the chromosome IDs would solve the problem.
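One way to check this directly is to compare sequence content rather than lengths: if a renamed chromosome has exactly the same sequence as before, its coordinates cannot have shifted. The sketch below is only an illustration under some assumptions. It computes an MD5 digest per sequence, uppercasing first so that soft-masking differences are ignored, and then pairs old and new IDs by digest. The function names are hypothetical, and if two sequences in the new assembly are identical, only one of them is kept in the lookup.

```python
import hashlib


def fasta_digests(lines):
    """Return {sequence_id: MD5 hex digest of the uppercased sequence}.

    `lines` is any iterable of FASTA lines, e.g. an open file handle.
    The ID is the first whitespace-delimited token after '>'.
    """
    digests = {}
    name, md5 = None, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                digests[name] = md5.hexdigest()
            parts = line[1:].split()
            name = parts[0] if parts else ""
            md5 = hashlib.md5()
        elif name is not None:
            # Uppercase so soft-masked (lowercase) bases compare as equal.
            md5.update(line.upper().encode("ascii"))
    if name is not None:
        digests[name] = md5.hexdigest()
    return digests


def match_by_digest(old_digests, new_digests):
    """Pair old and new sequence IDs whose sequences are identical.

    Returns (renamed, unmatched): a dict old_id -> new_id, and a list of
    old IDs with no identical sequence in the new assembly (removed or
    altered sequences).
    """
    new_by_digest = {digest: name for name, digest in new_digests.items()}
    renamed, unmatched = {}, []
    for old_id, digest in old_digests.items():
        if digest in new_by_digest:
            renamed[old_id] = new_by_digest[digest]
        else:
            unmatched.append(old_id)
    return renamed, unmatched
```

If `unmatched` contains only the unplaced scaffolds that were dropped, renaming the chromosome IDs with the `renamed` mapping should be enough. Any chromosome that appears in `unmatched` has a changed sequence, and its coordinates would need to be lifted over rather than renamed.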
Is the genome from NCBI? Which genome is it? In general, NCBI provides a pretty detailed description of the changes between assemblies.
Yes, it's the water buffalo (Bubalus bubalis) genome assembly on the RefSeq genome FTP. It was annotated very recently.