Help with understanding the identifier mapping tables in the backend of HUMAnN
8 months ago
O.rka ▴ 740

I’m looking for the identifier mapping tables used in the backend for HUMAnN.

More specifically, the following if they are available:

UniRef50 -> EC
UniRef50 (or EC) -> KEGG KO
EC -> MetaCyc pathway
UniRef50 (EC or KO) -> KEGG pathway

I found some files here: site-packages/humann/data/pathways/

I have a few questions:

  • Is it expected for one UniRef50 ID to map to more than one EC in some cases?
UniRef50_G6EMD2        {5.4.2.6, 2.7.1.41}
UniRef50_Q1J7L4        {5.4.2.6, 2.7.1.41}
UniRef50_T0UKK6        {5.4.2.6, 2.7.1.41}
UniRef50_X5NX36        {5.4.2.6, 2.7.1.41}

How is this handled in the backend? Does a UniRef50 hit for these count towards both or only one?
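For anyone who wants to check how common this is in their own copy of the mapping, here is a minimal sketch. It assumes the mapping has already been loaded as (UniRef50 ID, EC) pairs; the loading step is left out because it depends on the file layout, and `UniRef50_EXAMPLE` is a made-up entry added only for contrast.

```python
from collections import defaultdict

# (UniRef50 ID, EC) pairs. The first four IDs are the ones listed above;
# "UniRef50_EXAMPLE" is a hypothetical single-EC entry, not real data.
pairs = [
    ("UniRef50_G6EMD2", "5.4.2.6"), ("UniRef50_G6EMD2", "2.7.1.41"),
    ("UniRef50_Q1J7L4", "5.4.2.6"), ("UniRef50_Q1J7L4", "2.7.1.41"),
    ("UniRef50_T0UKK6", "5.4.2.6"), ("UniRef50_T0UKK6", "2.7.1.41"),
    ("UniRef50_X5NX36", "5.4.2.6"), ("UniRef50_X5NX36", "2.7.1.41"),
    ("UniRef50_EXAMPLE", "1.1.1.1"),
]


def group_ecs(pairs):
    """Collect the set of ECs seen for each UniRef50 ID."""
    uniref_to_ecs = defaultdict(set)
    for uniref_id, ec in pairs:
        uniref_to_ecs[uniref_id].add(ec)
    return dict(uniref_to_ecs)


def multi_ec_ids(uniref_to_ecs):
    """Keep only the UniRef50 IDs that map to more than one EC."""
    return {u: ecs for u, ecs in uniref_to_ecs.items() if len(ecs) > 1}


uniref_to_ecs = group_ecs(pairs)
multi = multi_ec_ids(uniref_to_ecs)
for uniref_id in sorted(multi):
    print(uniref_id, sorted(multi[uniref_id]))
```

This only counts the multi-EC cases; it says nothing about how HUMAnN itself distributes a hit across the ECs.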

  • I got ID mappings between pathways and rxns from data/pathways/metacyc_pathways. Many of these rxns do not have ECs and are not present in data/pathways/metacyc_reactions_level4ec_only.uniref.bz2. For example, the following:
list(pwy_to_rxns["PWY-2681"])
# ['RXN-4308',
#  'RXN-4305',
#  'RXN-4317',
#  'RXN-4310',
#  'RXN-4306',
#  'RXN-4314',
#  'RXN-4303',
#  'RXN-4307',
#  'RXN-4313',
#  'RXN-4312',
#  'RXN-4304']

pd.Series(rxn_to_ec)[list(pwy_to_rxns["PWY-2681"])]
# RXN-4308             {}
# RXN-4305    {2.5.1.112}
# RXN-4317             {}
# RXN-4310             {}
# RXN-4306             {}
# RXN-4314             {}
# RXN-4303    {2.5.1.112}
# RXN-4307     {2.5.1.27}
# RXN-4313             {}
# RXN-4312             {}
# RXN-4304             {}

I ran the following to confirm:

grep "RXN-4308" metacyc_reactions_level4ec_only.uniref

Are there supposed to be ECs associated with some of these rxns, since they're in the pathway?
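To put a number on the gap for a given pathway, something like the sketch below may help. It reuses the `pwy_to_rxns` and `rxn_to_ec` names from above, with the PWY-2681 values hard-coded from the output shown; `split_by_ec` is a helper name I made up, not part of HUMAnN.

```python
# Values copied from the PWY-2681 output above; an empty set means no EC was found.
pwy_to_rxns = {
    "PWY-2681": [
        "RXN-4308", "RXN-4305", "RXN-4317", "RXN-4310", "RXN-4306", "RXN-4314",
        "RXN-4303", "RXN-4307", "RXN-4313", "RXN-4312", "RXN-4304",
    ],
}
rxn_to_ec = {
    "RXN-4308": set(), "RXN-4305": {"2.5.1.112"}, "RXN-4317": set(),
    "RXN-4310": set(), "RXN-4306": set(), "RXN-4314": set(),
    "RXN-4303": {"2.5.1.112"}, "RXN-4307": {"2.5.1.27"}, "RXN-4313": set(),
    "RXN-4312": set(), "RXN-4304": set(),
}


def split_by_ec(pathway, pwy_to_rxns, rxn_to_ec):
    """Split a pathway's reactions into those with and without an EC (hypothetical helper)."""
    rxns = pwy_to_rxns[pathway]
    with_ec = [r for r in rxns if rxn_to_ec.get(r)]
    without_ec = [r for r in rxns if not rxn_to_ec.get(r)]
    return with_ec, without_ec


with_ec, without_ec = split_by_ec("PWY-2681", pwy_to_rxns, rxn_to_ec)
print(f"{len(with_ec)} of {len(with_ec) + len(without_ec)} rxns have an EC")
print("missing:", without_ec)
```

Running the same split over every pathway would show whether PWY-2681 is an outlier or whether missing ECs are common across the table.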

metagenomics metacyc enzyme biobakery humann • 512 views

Those two entries appear to be dismutases, and that word seems to refer to many reactions: https://en.wikipedia.org/wiki/Disproportionation


That would make a lot of sense. In that instance, I'm less concerned because, after a closer look, I realized that's only the case for 4 UniRef50 IDs, and it's the same 2 ECs for all 4. Do you have any insight on the rxns missing ECs, by any chance?
