Question

Comparison of species profiling using only 16S rRNA reads from 16S rRNA amplicon sequencing and shotgun metagenome data

9.3 years ago

concer.guo ▴ 10

Dear all,

I am new to metagenomics, so my question might seem naive. As the title implies: with current technology, how different are species abundances estimated from 16S rRNA reads alone in 16S rRNA amplicon sequencing data versus shotgun metagenome data (by selecting the 16S rRNA reads)?

I came across a paper titled "Comparing bacterial communities inferred from 16S rRNA gene sequencing and shotgun metagenomics". It showed significantly different results from the two data types and suggested that the experimental method might be the reason for the difference. That paper was published several years ago. I know there are many tools for profiling species from different data types, such as QIIME, mothur, mOTU, and MetaPhlAn. I want to know whether current technology can provide comparable species abundance profiles using only the 16S rRNA reads from these two data types. If not, what are the possible reasons, and are there other ways to supplement the analysis?

Much appreciated.

metagenomics species-profiling • 6.3k views

ADD COMMENT • link updated 2.1 years ago by Ram 44k • written 9.3 years ago by concer.guo ▴ 10

Thank you for your explanation. Indeed, I didn't understand the problem very well. I didn't consider all those other factors.

For the capital S for 16S, thanks for this. I used lower case just for convenience, I'll change that.

Yes, different primers might produce different taxa compositions. mOTU uses 10 marker genes to calculate the taxa composition from metagenomes. So I was wondering whether 16S rRNA reads could be used to supplement or correct the species abundance computed from other marker genes in metagenomes (as in the case of mOTU). Do you think this might be plausible? Or is this just a naive extension of my first question?

ADD REPLY • link updated 2.1 years ago by Ram 44k • written 9.3 years ago by concer.guo ▴ 10

No problem, it's my pleasure. Just letting you know about S as being a Svedberg unit -- using the correct terminology will show that you understand concepts to other people.

I'm unsure what you mean by "mOTU" here. It typically means "molecular operational taxonomic unit", which is arbitrarily designated based on molecular data. There are ways to use different marker regions, but how do you definitively know they come from the same individual "species"? (Whatever "species" means in a microbial sense is another big question!)

"Do you think this might be plausible? or this is just naive extension of my first question?"

It is possible to pull phylogenetically informative markers (16S, rpoB, etc.) from metagenomic data, but how do you connect one marker to another marker and show that they are from the same organism? People are working on how best to do this, but it's not a trivial problem.

ADD REPLY • link updated 5.1 years ago by Ram 44k • written 9.3 years ago by Josh Herr 5.8k

Sorry, I didn't make it clear. There is a species profiling tool called mOTU (http://www.nature.com/nmeth/journal/v10/n12/full/nmeth.2693.html) from Peer Bork's group at EMBL. It uses 10 universal marker genes to compute the taxa composition. Accurately defining species for microbes is indeed a challenge.

It's not easy to connect one marker to another; I didn't realize that. I just thought this might be a possible way to complement species profiling with the traditional 16S rRNA method.

You mentioned other informative markers, such as 16S and rpoB. I read a recent paper that used the chaperone Cpn60 for species profiling, and their results seem to be better than those of 16S rRNA methods. Cpn60 is mostly universal in bacteria and archaea, and a tool called mPUMA (http://www.microbiomejournal.com/content/1/1/23) was developed to calculate species diversity from cpn60-based amplicon sequencing data. I want to work on this topic to find ways to produce a more accurate picture of microbial taxa structure, which is why I asked the question in the first place, but I didn't understand the situation well.

ADD REPLY • link updated 2.1 years ago by Ram 44k • written 9.3 years ago by concer.guo ▴ 10

Thanks for the information -- would you mind posting the references (links?) about the mOTU tool and the use of the chaperone Cpn60 for species profiling? I'm not familiar with either of these. This may help others who have the same question you do with regards to this.

Thanks again for your questions!

ADD REPLY • link updated 2.1 years ago by Ram 44k • written 9.3 years ago by Josh Herr 5.8k

Yes, I just added the links. Thanks also for your reply!

ADD REPLY • link 9.3 years ago by concer.guo ▴ 10

Thank you! I'm a little clearer on your question now. I was unaware of these tools.

ADD REPLY • link updated 2.1 years ago by Ram 44k • written 9.3 years ago by Josh Herr 5.8k

My bad, I didn't make it clear, partly for lack of knowledge. Anyway, thank you for your helpful explanation.

ADD REPLY • link 9.3 years ago by concer.guo ▴ 10

Hi, I am also thinking about this problem. My shotgun result is very different from my 16S result, and I am confused. May I ask how much your profiling results differ between shotgun and 16S? Thanks.

ADD REPLY • link 6.6 years ago by luyang1005 ▴ 20

Ram · Accepted Answer · 2015-07-28

I'm not quite sure I fully understand your question -- maybe this is not obvious to you, but selective sequencing with markers and random whole-genome shotgun sequencing will not inherently capture the same sequences (or sequence diversity). One is specific (16S) and the other is random (metagenomic sequencing) -- they might have some overlap of signal with regard to 16S, but in my experience this only comes with great metagenomic sequencing depth AND accurate, strain-unbiased assembly.

"how different are the species abundance simply using 16s rrna data in 16s rrna amplicon sequencing data and shotgun metagenome data (by selecting 16s rrna reads) based on current technology?"

They are different -- how different depends on a lot of factors: how complex the system you are studying is, how deeply you sequence your environmental sample, etc.

"I want to know if current technology could provide comparable result in species abundance profiling simply using 16s rrna reads from these two data types?"

It's possible to compare amplicons to metagenomic data, but as mentioned above this depends on the complexity of the environmental sample and the depth to which you sequence. How you interpret your data analysis is also a factor here. Because the two types of sequencing sample differently, there will not be a 1-to-1 ratio of your 16S amplicon reads to the corresponding metagenomic 16S reads, so the answer is that they are not directly comparable, but observing both might provide you with information regarding your taxonomy. 16S primers are not universal, so by looking at metagenomic data you might observe taxa not amplified by 16S primers -- does this make sense?
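To make "not directly comparable, but worth observing both" a bit more concrete, here is a minimal sketch of one common way to put the two profiles side by side: convert each set of read counts to relative abundances, compute a Bray-Curtis dissimilarity, and list the taxa that show up only in the shotgun-derived 16S reads. The taxon names and counts are invented purely for illustration, and the function names are hypothetical, not part of any of the tools mentioned in this thread.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("profile contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical, 1 = no shared taxa)."""
    taxa = set(p) | set(q)
    shared = sum(min(p.get(t, 0.0), q.get(t, 0.0)) for t in taxa)
    total = sum(p.values()) + sum(q.values())
    return 1.0 - 2.0 * shared / total


# Invented example counts (assumption): reads assigned per genus in each data type.
amplicon_counts = {"genus_A": 600, "genus_B": 300, "genus_C": 100}
shotgun_16s_counts = {"genus_A": 40, "genus_B": 30, "genus_C": 10, "genus_D": 20}

amplicon = relative_abundance(amplicon_counts)
shotgun = relative_abundance(shotgun_16s_counts)

dissimilarity = bray_curtis(amplicon, shotgun)

# Taxa seen only in the shotgun 16S reads, e.g. ones the amplicon primers may have missed.
only_in_shotgun = sorted(set(shotgun) - set(amplicon))

print(f"Bray-Curtis dissimilarity: {dissimilarity:.3f}")
print("Only in shotgun 16S reads:", only_in_shotgun)
```

A dissimilarity near 0 would suggest the two data types agree; note that with only a handful of 16S reads in the shotgun data, the value will be noisy regardless of the biology.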

"If not, what are the possible reason? and is there other ways to supplement?"

I don't understand the "possible reason" part -- you're comparing a PCR-amplified marker sequence with a random sample from a genome -- yes, you might have an overlap, but it's unlikely, especially with a metagenomic sample. This is why it's hard to compare amplicon sequencing with metagenomic sequencing -- more depth increases the chances you will have overlapping information. You can supplement by choosing marker regions other than 16S, but they will all have these inherent problems.
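As a rough back-of-the-envelope illustration of why depth matters so much, the sketch below estimates how many shotgun reads you would expect to land in a 16S gene at all. Every number in it is an assumption chosen for illustration (a 16S gene of about 1,500 bp, 4 copies per genome, a 5 Mb genome, uniform random sampling, all genomes the same size), and the function name is hypothetical; real communities will deviate from this.

```python
def expected_16s_reads(total_reads, taxon_fraction=1.0, gene_length_bp=1500,
                       copies_per_genome=4, genome_size_bp=5_000_000):
    """Expected number of shotgun reads originating from 16S genes.

    Assumes reads are sampled uniformly at random along the genome and that
    taxon_fraction is the share of the sequenced DNA coming from the taxon of
    interest. The default gene length, copy number, and genome size are
    illustrative assumptions, not measured values.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    fraction_16s = gene_length_bp * copies_per_genome / genome_size_bp
    return total_reads * taxon_fraction * fraction_16s


# Whole community, 10 million shotgun reads.
community_total = expected_16s_reads(10_000_000)

# A taxon contributing 0.1% of the sequenced DNA in the same run.
rare_taxon = expected_16s_reads(10_000_000, taxon_fraction=0.001)

print(f"Expected 16S reads, whole community: {community_total:.0f}")
print(f"Expected 16S reads, rare taxon:      {rare_taxon:.0f}")
```

Under these assumptions only about 0.1% of shotgun reads come from 16S genes, so a rare taxon ends up with a handful of 16S reads, compared with the thousands an amplicon run might give it.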

Also, by the way, this is a pet peeve of mine, it's 16S, the 'S' is capital and stands for Svedberg.