Need advice for comparing fragment counts of genes across multiple WGS metagenomes
3.2 years ago
sapuizait ▴ 10

Hi everyone

I have a couple of hundred metagenomes from multiple environments, and I want to compare the fragment counts of a single (bacterial) gene to see which metagenomes have a higher or lower abundance of it. I do not care about comparing the fragment counts of different genes, whether within the same sample or between samples. My comparisons will be one gene at a time.

That being said, I am trying to figure out the best way to normalize the metagenomes. Since I will always compare one gene at a time, I feel that normalizing by gene length (e.g. FPKM) is irrelevant. Instead, I would imagine that my main focus should be (fragment counts of gene of interest)/(total bacterial counts) or something similar.
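To make the ratio concrete, here is a minimal sketch of what I have in mind. The sample names, counts, and gene length are all made up for illustration, and "bacterial fragments" stands for whatever estimate of total bacterial reads ends up being used. It also computes FPKM with the same denominator, to check that dividing by a constant gene length does not change the ordering of samples.

```python
def per_million(gene_fragments, bacterial_fragments):
    """Fragments on the gene per million bacterial fragments."""
    if bacterial_fragments <= 0:
        raise ValueError("bacterial_fragments must be positive")
    return gene_fragments * 1e6 / bacterial_fragments


def fpkm(gene_fragments, gene_length_bp, bacterial_fragments):
    """Fragments per kilobase of gene per million bacterial fragments."""
    return gene_fragments * 1e9 / (gene_length_bp * bacterial_fragments)


# Hypothetical data: sample -> (fragments on gene, total bacterial fragments)
samples = {
    "soil_A": (120, 8_000_000),
    "gut_B": (45, 1_500_000),
    "marine_C": (300, 40_000_000),
}
GENE_LENGTH_BP = 1500  # hypothetical length of the gene of interest

cpm = {name: per_million(g, t) for name, (g, t) in samples.items()}
fpkm_values = {name: fpkm(g, GENE_LENGTH_BP, t) for name, (g, t) in samples.items()}

# Rank samples from highest to lowest normalized abundance.
rank_by_cpm = sorted(samples, key=cpm.get, reverse=True)
rank_by_fpkm = sorted(samples, key=fpkm_values.get, reverse=True)

print(rank_by_cpm)
print(rank_by_fpkm)
```

With one gene, FPKM is just the per-million value multiplied by the constant 1000 / gene length, so both rankings come out identical. The length term would only matter when comparing different genes.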

I have seen in some papers that people use the 16S counts as the total bacterial counts, but I feel this is wrong: the number of 16S copies varies between species, so the bacterial composition of each sample can significantly affect this value. As an alternative, I was thinking of something like mapping the FASTQ reads to NR in order to estimate the number of bacterial reads in each sample.
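A toy calculation of the 16S concern, under loud assumptions: two invented taxa with 1 and 7 16S copies per genome, two invented samples with the same number of cells and the same number of copies of the gene of interest, and read counts taken to be proportional to gene copies (ignoring gene length and coverage effects).

```python
# Hypothetical 16S copies per genome for two made-up taxa.
COPIES_16S = {"taxon_low": 1, "taxon_high": 7}


def total_16s_copies(cells_per_taxon):
    """Total 16S copies in a sample, given cell counts per taxon."""
    return sum(cells * COPIES_16S[taxon] for taxon, cells in cells_per_taxon.items())


# Same total cell count (1000), opposite composition. Values are invented.
sample_1 = {"taxon_low": 900, "taxon_high": 100}
sample_2 = {"taxon_low": 100, "taxon_high": 900}
GENE_COPIES = 50  # copies of the gene of interest, identical in both samples

# Normalizing per cell gives the same abundance in both samples.
per_cell_1 = GENE_COPIES / sum(sample_1.values())
per_cell_2 = GENE_COPIES / sum(sample_2.values())

# Normalizing per 16S copy does not, because the denominators differ.
per_16s_1 = GENE_COPIES / total_16s_copies(sample_1)
per_16s_2 = GENE_COPIES / total_16s_copies(sample_2)

print(per_cell_1, per_cell_2)
print(per_16s_1, per_16s_2)
```

In this made-up case the per-cell abundance is identical, yet the 16S-normalized value differs fourfold between the samples purely because of composition.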

I have never done that before, and any advice would be very welcome. I apologize for any ignorant questions or misconceptions; I am just trying to figure out the best strategy. I also see in many forums that people are very angry at FPKM... which I think could work just fine for me, especially if I cared about length...

Thanks in advance

WGS normalization KMA • 481 views