This might sound like a trivial question: is it possible to calculate the average genome size in a mixed dataset composed of complete (closed) genomes and assemblies? I have read that for assemblies, the only thing one can calculate is the assembly size, which is just an approximation of the real genome size. I've seen papers that report average genome sizes of complete and draft genomes, but I can't quite figure out how they do it (or whether it is correct). Is there a particular definition of average genome size?
Hope this is the right place to ask this type of question.
Hi Vijay,
Thanks for your answer. I am talking about genomes retrieved from different databases such as NCBI, EBI, or private databases at my research institution. This is a comparative-genomics question, i.e., I want to compare the genome sizes of different ecotypes (in bacteria, which I forgot to mention). Your definition fits well.
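To make the comparison concrete, here is a minimal sketch of what I have in mind, assuming each genome is reduced to its total assembly length in bp. The ecotype names, the `complete`/`draft` labels, and the lengths are all hypothetical placeholders, not real data. Keeping the two assembly levels in separate groups is only one possible choice; it lets you see whether draft assemblies pull the average down before pooling them.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (ecotype, assembly level, total assembly length in bp).
# These values are placeholders for illustration, not measurements.
records = [
    ("ecotype_A", "complete", 4_600_000),
    ("ecotype_A", "draft", 4_400_000),
    ("ecotype_A", "draft", 4_500_000),
    ("ecotype_B", "complete", 5_200_000),
    ("ecotype_B", "draft", 5_000_000),
]


def mean_size_by_group(records):
    """Return the mean assembly length for each (ecotype, level) pair."""
    groups = defaultdict(list)
    for ecotype, level, length_bp in records:
        groups[(ecotype, level)].append(length_bp)
    return {key: mean(sizes) for key, sizes in groups.items()}


summary = mean_size_by_group(records)
for (ecotype, level), avg in sorted(summary.items()):
    print(f"{ecotype}\t{level}\t{avg / 1e6:.2f} Mb")
```

For the draft groups this is the mean assembly size rather than the true genome size, so the two levels are not strictly equivalent.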
Thanks for the explanation and the paper! I will explore the subject a bit.