Greengenes: what is the "99" in the OTU description?
7.6 years ago
sacha ★ 2.4k

Hi,

In the Greengenes database, I don't understand the meaning of the "99" in the OTU description. For example, the stats file contains the following lines:

Total number of OTUs:
$ grep -c "^>" *.fasta
61_otus.fasta:22
64_otus.fasta:33
67_otus.fasta:53
70_otus.fasta:125
73_otus.fasta:267
76_otus.fasta:554
79_otus.fasta:1165
82_otus.fasta:2496
85_otus.fasta:5088
88_otus.fasta:10544
91_otus.fasta:22090
94_otus.fasta:46256
97_otus.fasta:99322
99_otus.fasta:203452

The same number appears in the file name

  gg_13_5_otus_99_annotated.tree.gz

So, what does the number 99 mean? Thanks!

greengenge otu taxonomy • 4.6k views
7.6 years ago

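My guess is that it means 99% sequence identity. As far as I know, 97% and 99% are two fairly common sequence identity cutoffs for OTU clustering based on 16S rRNA. That would also fit the counts in your listing: the stricter the cutoff, the more clusters (OTUs) you end up with.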

7.5 years ago
timodonnell ▴ 80

I also ran into this question, and it looks like the answer by Lars above is essentially correct. In particular, from this thread on the QIIME forum:

To clarify, the 97_otus.fasta rep_set was created by clustering all the sequences in the Greengenes database into 97% identity clusters, and then a representative sequence was chosen from each of those clusters to be used to create the 97_tree and 97_taxonomy. Therefore, each OTU ID in the 97_otus.fasta file is that representative sequence.
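
To make the idea in that quote concrete, here is a toy Python sketch of identity-threshold clustering with one representative per cluster. It is only an illustration and not the procedure actually used to build the Greengenes files: the sequences are made up, they are assumed to be pre-aligned and of equal length, and `identity` and `greedy_cluster` are hypothetical helper names.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: Dict[str, str], threshold: float) -> Dict[str, List[str]]:
    """Map each representative ID to the IDs of the sequences in its cluster.

    A sequence joins the first existing representative it matches at
    >= threshold; otherwise it becomes the representative of a new cluster.
    """
    clusters: Dict[str, List[str]] = {}
    for seq_id, seq in seqs.items():
        for rep_id in clusters:
            if identity(seq, seqs[rep_id]) >= threshold:
                clusters[rep_id].append(seq_id)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[seq_id] = [seq_id]
    return clusters


# Made-up 20-base toy sequences (NOT real 16S rRNA data).
toy = {
    "s1": "ACGTACGTACGTACGTACGT",
    "s2": "ACGTACGTACGTACGTACGA",  # 1 mismatch vs s1 (95% identity)
    "s3": "ACGTACGTACGTACGTTTTT",  # 3 mismatches vs s1 (85% identity)
    "s4": "TTTTTTTTTTTTTTTTTTTT",  # very different from all the others
}

for t in (0.80, 0.90, 0.99):
    reps = greedy_cluster(toy, t)
    print(f"{t:.0%} identity: {len(reps)} clusters, representatives {list(reps)}")
```

On this toy input the number of clusters goes from 2 at 80% to 3 at 90% to 4 at 99%, which is the same trend as the OTU counts in the question: a stricter identity cutoff splits the sequences into more, smaller clusters, each contributing one representative sequence.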
