Question

Gene abundance profile from metagenomics data


7.6 years ago

932477002 • 0

Hi all,

I am working with a metagenome dataset.

After gene prediction, I built a catalogue of all genes found in the dataset, and now I want to extract gene profiles (i.e. a list of genes with their relative abundances).

Do you know what tools can be used to profile the gene abundance list?

I would appreciate your kind help!

gene • 3.0k views

updated 6.1 years ago by Biostar 20 • written 7.6 years ago by 932477002 • 0

score 0 · Answer 1 · 2017-06-01

I'm not aware of such a tool. You can try to classify your genes using InterPro domains, MetaCyc/KEGG orthology groups, etc. Depending on your sequencing depth and sample complexity, I would consider reference-based counting (i.e. mapping the reads to a reference protein database using blastx) instead of the assembly-based approach you took.
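
To illustrate the reference-based counting idea, here is a minimal sketch that counts reads per reference protein from blastx tabular output (`-outfmt 6`), keeping only the best-scoring hit for each read. The function name `count_best_hits`, the e-value cutoff of 1e-5, and the example hit lines are assumptions for illustration, not part of any tool.

```python
from collections import Counter


def count_best_hits(tabular_lines, max_evalue=1e-5):
    """Count reads per reference, using each read's highest-bitscore hit.

    Expects the 12 default columns of BLAST tabular output (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore. The e-value cutoff is an arbitrary illustrative choice.
    """
    best = {}  # read id -> (bitscore, reference id)
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        read_id, ref_id = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # discard weak hits
        if read_id not in best or bitscore > best[read_id][0]:
            best[read_id] = (bitscore, ref_id)
    return Counter(ref_id for _, ref_id in best.values())


# Made-up hits: read1 matches two proteins, read3 only has a weak hit.
example = [
    "read1\tprotA\t95.0\t50\t2\t0\t1\t150\t10\t59\t1e-20\t100.0",
    "read1\tprotB\t80.0\t50\t10\t0\t1\t150\t5\t54\t1e-10\t60.0",
    "read2\tprotA\t90.0\t48\t4\t0\t4\t147\t20\t67\t1e-15\t85.0",
    "read3\tprotB\t40.0\t30\t18\t0\t1\t90\t1\t30\t0.5\t25.0",
]
counts = count_best_hits(example)
print(counts)
```

In practice you would pass an open file handle instead of the `example` list, and you may want to normalise the counts by reference length before comparing genes.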

score 0 · Answer 2 · 2018-07-31

There are two approaches.

First, I assume you only want gene abundances, irrespective of organism. In that case you can cluster all genes with CD-HIT or another tool, then take the representative of each cluster and look up its KO or COG ID/description.
Second, use UProC to classify your genes and get KO IDs. Then you can simply consolidate your results (counts per KO).
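
The consolidation step could look roughly like the sketch below. It assumes you already have a gene-to-KO mapping (from whichever classifier you used) and a per-gene count; both dictionaries, the function name `ko_profile`, and the KO IDs are made-up illustration values. If you only want the number of genes per KO, give every gene a count of 1.

```python
from collections import defaultdict


def ko_profile(gene_to_ko, gene_counts):
    """Sum per-gene counts into per-KO totals and scale them to sum to 1."""
    totals = defaultdict(float)
    for gene, count in gene_counts.items():
        ko = gene_to_ko.get(gene)
        if ko is None:
            continue  # genes without a KO assignment are skipped
        totals[ko] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {ko: value / grand_total for ko, value in totals.items()}


# Hypothetical inputs: g4 has no KO assignment and is ignored.
gene_to_ko = {"g1": "K00001", "g2": "K00001", "g3": "K00002"}
gene_counts = {"g1": 30, "g2": 10, "g3": 60, "g4": 5}
profile = ko_profile(gene_to_ko, gene_counts)
print(profile)
```

Note that unassigned genes are dropped before scaling, so the fractions are relative to the annotated genes only.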

Besides these approaches, you can assemble your metagenome and map the query genes to the resulting contigs. Then multiply the number of hits by the depth of sequencing. This is not an accurate method; it gives relative abundances only.
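
A rough sketch of that last calculation, reading "depth of sequencing" as the mean read coverage of each contig (that reading is my assumption). The function name `relative_abundance`, the gene and contig names, and the depth values are all invented for illustration.

```python
def relative_abundance(hits_per_contig, contig_depth):
    """Weight each gene's hits by contig depth, then scale to sum to 1.

    hits_per_contig: {(gene, contig): number of hits of that gene on the contig}
    contig_depth:    {contig: mean coverage depth of the contig}
    """
    weighted = {}
    for (gene, contig), hits in hits_per_contig.items():
        weighted[gene] = weighted.get(gene, 0.0) + hits * contig_depth[contig]
    total = sum(weighted.values())
    if total == 0:
        return {}
    return {gene: value / total for gene, value in weighted.items()}


# Hypothetical example: geneA is found on two contigs, geneB on one.
hits = {("geneA", "c1"): 1, ("geneA", "c2"): 1, ("geneB", "c2"): 1}
depth = {"c1": 10.0, "c2": 30.0}
abundance = relative_abundance(hits, depth)
print(abundance)
```

The values are only meaningful relative to each other within one sample, which matches the caveat above.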