Hi, community!
I need to work with a mock metagenome community to optimize the alignment parameters of a tool.
I've found this community: https://www.nature.com/articles/sdata201681, which is composed of 26 bacterial and archaeal genomes.
For my work I need to know the abundance of some genes in the sample. For example, if the E. coli genome contains GENE_A once, I need to know how many copies of the E. coli genome there are in the sample.
The table in the paper (https://media.nature.com/original/nature-assets/sdata/2016/sdata201681/extref/sdata201681-s1.pdf) says something like this:
Genome     Molarity    Genome copies per µl
E. coli    xxx         150
Does this mean that the E. coli genome is present 150 times in the metagenome sample? Or is there some other way to find out the number of genome copies in the metagenome?
Thank you so much!
Not a real answer, but here is a different dataset: http://mockrobiota.caporasolab.us/
I think taxonomic classification will be the best way to find out the abundance of an organism in metagenomic samples.
You can have a look at Kaiju in this case.
Thanks for your answer, Nitin. The problem is that I don't want to find out the abundance of an organism; I want to count the abundance of some genes of interest. For example, to know how many times the gene "nifH" is present in the sample, I need to know which organisms carry this gene, how many times the gene occurs in each genome, and the number of genome copies of each organism present in the sample.
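To make the calculation I have in mind concrete, here is a minimal Python sketch. It assumes the "Genome copies per ul" column can be read as a per-microliter concentration of each genome in the mock mix, so the result is an expected gene concentration rather than an absolute count. The function names, the organism "Organism_B" and its values, and the per-genome gene counts are all hypothetical placeholders; only the E. coli value of 150 comes from my example above.

```python
from typing import Dict


def expected_gene_abundance(
    gene_copies_per_genome: Dict[str, int],
    genome_copies_per_ul: Dict[str, float],
) -> float:
    """Expected gene copies per ul: sum over organisms of
    (gene copies in one genome) x (genome copies per ul)."""
    total = 0.0
    for organism, n_gene in gene_copies_per_genome.items():
        if organism not in genome_copies_per_ul:
            raise KeyError(f"No genome copy number listed for {organism!r}")
        total += n_gene * genome_copies_per_ul[organism]
    return total


def relative_genome_abundance(
    genome_copies_per_ul: Dict[str, float],
) -> Dict[str, float]:
    """Fraction of all genome copies contributed by each organism."""
    total = sum(genome_copies_per_ul.values())
    return {org: n / total for org, n in genome_copies_per_ul.items()}


# Hypothetical input: only the E. coli value mirrors the example in my post.
genome_copies = {"E. coli": 150.0, "Organism_B": 300.0}

# Hypothetical annotation result: copies of GENE_A found in each genome.
gene_a_copies = {"E. coli": 1, "Organism_B": 2}

print(expected_gene_abundance(gene_a_copies, genome_copies))  # 750.0
print(relative_genome_abundance(genome_copies))
```

Because the table gives concentrations, the relative fractions are probably the more useful numbers to compare against read counts from the tool; the absolute values depend on how much of the mix was sequenced.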