6.0 years ago
Hello,
I'm working on a final project for my genomics course, and I'd like to analyze the genomes of a few species of Borrelia. However, the assemblies seem to be very inconsistent. For example, for Borrelia afzelii, one assembly has 6 chromosomes and plasmids, another has 10, and a third has only one chromosome. Their N50 values are also about the same, so I have no way of knowing which one to choose.
In general, what rules of thumb should I be following when deciding which assembly to use from the NCBI RefSeq database?
Thank you for the help.
You can filter for the latest RefSeq assemblies and restrict the results to the representative genome or reference genome category. For example, Borrelia does not have a reference genome, but it does have a representative genome, which you can find in NCBI Assembly with the following query:
"Borrelia afzelii"[Organism] AND latest_refseq[filter] AND representative_genome[filter]
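If you would rather run the same search from a script, here is a minimal sketch that assembles the query string above and wraps it in an NCBI E-utilities `esearch` URL against the `assembly` database. The function names `build_assembly_query` and `build_esearch_url` and the `retmax` default are my own choices for illustration, not part of any NCBI tool, and the sketch only builds the URL; it does not send the request.

```python
from urllib.parse import urlencode

# Public E-utilities esearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_assembly_query(organism, representative=True):
    """Build an NCBI Assembly search term for the latest RefSeq assemblies.

    Hypothetical helper: it just joins the filters from the query above.
    """
    terms = [f'"{organism}"[Organism]', "latest_refseq[filter]"]
    if representative:
        # Keep only the assembly flagged as the representative genome.
        terms.append("representative_genome[filter]")
    return " AND ".join(terms)


def build_esearch_url(query, retmax=20):
    """Wrap a search term in an esearch URL (retmax=20 is an arbitrary choice)."""
    params = {"db": "assembly", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    query = build_assembly_query("Borrelia afzelii")
    print(query)
    print(build_esearch_url(query))
```

Calling `build_assembly_query("Borrelia afzelii", representative=False)` drops the last filter, which should list every latest RefSeq assembly for the species rather than only the representative one. That may be useful if you want to compare how many replicons each assembly reports.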
Thank you. But is it safe to assume that the newest assembly is the most accurate one? I'd actually like to explore what sort of genes are on the plasmids, but some assemblies show only one chromosome. Should I go for another assembly in that case?