Hi,
I have an OTU biom file (obtained from closed-reference OTU picking in QIIME v1.8.0) containing 65 samples, and I am trying to do a PAN/CORE genome analysis.
I have filtered the taxonomy out of the abundance file (with a particular threshold, let's say 60%), so I now have a file with only the taxonomy column from all 65 samples (at the 60% threshold). Is there a way I can do functional annotation for it?
Is there any server or software that can do that, or that can do pan (complete) / core (shared) analysis?
Any suggestions?
Best!
Shashank
What you mean by PAN and CORE genome analysis is that you want to find the complete set of genes/proteins and the shared genes/proteins among your OTUs, right? I'm not familiar with the biom file format, but does it contain the genome sequence(s) of the organisms that you're analyzing? You will need the gene sequences of the whole genomes (or the protein sequences of the whole proteomes) to get the pan-genome and core-genome.
Yes, complete genes and shared genes.
The biom file doesn't contain the genomic sequences. It looks like-
Where the number represents the OTU ID, followed by the taxonomy. OTU ID represents the particular sequence associated with the particular taxonomy.
If I retrieve the gene sequence for each OTU ID corresponding to the taxonomy, I will have a gene sequence file. How can I then use it for further analysis?
Cheers!
"OTU ID represents the particular sequence associated with the particular taxonomy." What particular sequence is it? From one gene only? You can't do pan- and core-genome analysis using only one gene from each species/OTU. You need the genome (or better, the proteome) from each OTU, then find the orthologous genes/proteins among the OTUs (I used OrthoMCL for my bacteria). The proteins with an ortholog in every OTU are your core-proteome. The pan-proteome would be the core plus any other proteins of each OTU.
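Once you have ortholog groups, the core/pan step itself is just set logic. Here is a minimal Python sketch, not a definitive implementation. It assumes each group is written as one line of the form `group_id: taxon|protein taxon|protein ...` (check this against what your clustering tool actually writes), and the names `parse_groups`, `core_and_pan`, `otuA`, `grp1` and so on are made up for illustration.

```python
def parse_groups(lines):
    """Map each group ID to the set of taxa that have a member in it.

    Assumes lines of the form 'group_id: taxonA|prot1 taxonB|prot2 ...'.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        # Keep only the taxon label in front of the '|' for each member.
        taxa = {m.split("|", 1)[0] for m in members.split()}
        groups[group_id.strip()] = taxa
    return groups


def core_and_pan(groups, all_taxa):
    """Core = groups with a member in every taxon; pan = all groups."""
    all_taxa = set(all_taxa)
    core = {g for g, taxa in groups.items() if all_taxa <= taxa}
    pan = set(groups)
    return core, pan


# Toy input with three hypothetical OTUs.
example = [
    "grp1: otuA|p1 otuB|p7 otuC|p3",
    "grp2: otuA|p2 otuB|p9",
    "grp3: otuC|p5",
]
groups = parse_groups(example)
core, pan = core_and_pan(groups, ["otuA", "otuB", "otuC"])
print(sorted(core))  # ['grp1']
print(sorted(pan))   # ['grp1', 'grp2', 'grp3']
```

If your clustering tool leaves proteins that cluster with nothing out of its groups output, you would have to add those to the pan set separately.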