

KEGG network provider


KEGG network provider allows you to extract metabolic networks from KEGG [15] that are specific to a set of organisms. In addition, you can exclude certain compounds or reactions from these networks.

A range of tools works with KGML files. Click on ``Manual -> Related tools'' to see a selection of them. KEGG network provider differs from these tools in that it also extracts RPAIR networks and supports filtering of compounds, reactions and RPAIR classes.

KEGG network provider itself has no network analysis or visualization functions. For these purposes, you can use a NeAT tool (a selection of them appears once network construction has finished) or any other graph analysis tool that reads the gml, VisML or dot format.
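
As a rough illustration of handing an extracted network to an external tool, the following Python sketch converts an edge list into dot format. It assumes that the tab-delimited output holds one edge per line, with the source and target node identifiers in the first two columns, and that comment lines start with `#` or `;`. This layout is an assumption and should be checked against the actual file. The function name `edges_to_dot` and the node names in the example are made up for this illustration.

```python
def edges_to_dot(lines, directed=True):
    """Convert edge lines (assumed layout: source<TAB>target) to dot text."""
    arrow = "->" if directed else "--"
    header = "digraph" if directed else "graph"
    out = [f"{header} network {{"]
    for line in lines:
        line = line.strip()
        # Skip blank lines and (assumed) comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not an edge line under the assumed layout
        out.append(f'  "{fields[0]}" {arrow} "{fields[1]}";')
    out.append("}")
    return "\n".join(out)


# Hypothetical edge lines, for illustration only.
example = ["compound_A\treaction_1", "reaction_1\tcompound_B"]
print(edges_to_dot(example))
```

The resulting text can be saved to a file and opened with any tool that reads dot.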

For visualization of KEGG networks, you can use iPATH [22], KGML-ED [17] or metaSHARK [12]. Yanasquare [27] and Pathway Hunter Tool [26] offer organism-specific KEGG network construction in combination with analysis functions. With [36], you can construct KEGG metabolic networks in R.

It should be noted that KEGG annotators omitted side compounds in the KGML files. Thus, certain molecules (such as CO2, ATP or ADP) might be absent from the metabolic networks extracted from these files.
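
To see whether this affects a given network, one could check which of the side compounds mentioned above actually occur as nodes. The sketch below does this for CO2 (C00011), ATP (C00002) and ADP (C00008). It assumes a tab-delimited edge list with node identifiers in the first two columns, which is an assumption about the file layout; the function name `missing_side_compounds` is hypothetical.

```python
# KEGG compound identifiers of the side compounds named in the text.
SIDE_COMPOUNDS = {"C00011": "CO2", "C00002": "ATP", "C00008": "ADP"}


def missing_side_compounds(lines):
    """Return the names of side compounds that never occur as a node."""
    nodes = set()
    for line in lines:
        fields = line.strip().split("\t")
        # Assumed layout: the first two columns are the edge's end nodes.
        if len(fields) >= 2:
            nodes.update(fields[:2])
    return sorted(name for cid, name in SIDE_COMPOUNDS.items()
                  if cid not in nodes)


# Hypothetical edge lines, for illustration only.
print(missing_side_compounds(["C00049\treaction_1", "reaction_1\tC00002"]))
```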

It is also worth noting that constructing metabolic networks from KGML files produces networks of much lower quality than those obtained by manual metabolic reconstruction. In manual reconstruction, several resources are taken into account, such as the biochemical literature, databases and genome annotations (e.g. [8]). Because this process draws on so many sources, the metabolism of only a few organisms has been manually reconstructed so far.
In automatically reconstructed networks, reactions might not be balanced and compounds might occur more than once with different identifiers (see e.g. [25] for annotation problems in KEGG). For the purpose of path finding, automatically reconstructed metabolic networks may still be of interest.

Construction of yeast and E. coli metabolic networks

Study case

Our study case consists of the construction of two metabolic networks: one for five yeast species and the other for Escherichia coli K-12 MG1655. We will compare the path finding results obtained for these two networks for a metabolic reference pathway (lysine biosynthesis).

Protocol for the web server

  1. In the NeAT menu, select the entry Download organism-specific networks from KEGG.

    In the right panel, you should now see a form entitled ``KEGG network provider''.

  2. Click on the button DEMO located at the bottom of the form.

    The KEGG network provider form has now loaded the organism identifiers of five yeast species. As explained in the form, the species concerned are: Saccharomyces bayanus, Saccharomyces mikatae, Saccharomyces paradoxus, Schizosaccharomyces pombe and Saccharomyces cerevisiae.

  3. Click the checkbox directed network to construct a directed metabolic network.

  4. Click on the button GO.

    The network extraction should take only a few seconds. Then, a link to the extracted network is displayed. In addition (for the tab-delimited and gml formats), the Next step panel should appear.

  5. Click on the button ``Find metabolic paths in this graph'' in the Next step panel. This button opens the Metabolic pathfinder with the yeast network pre-loaded.

  6. Enter C00049 (L-Aspartate) as source node and C00047 (L-Lysine) as target node.

  7. In section Path finding options, set the rank to 1. We are only interested in the first rank.

  8. In section Output, select Graph as output with ``paths unified into one graph''.

  9. Click GO. The seed node selection form appears, asking us to confirm our choice of seed nodes.

  10. Click GO. After no more than one minute of computation, the graph unifying first rank paths between L-aspartate and L-lysine should appear. You can store the graph image on your machine for later comparison.

Repeat the previous steps, but instead of selecting DEMO in the KEGG network provider form, enter eco in the organisms text input field. Make sure to select directed network in the KEGG network provider form, then follow steps 4 to 10 as described above.

Protocol for the command-line tools

The command-line version of this tutorial is restricted to the E. coli and S. cerevisiae metabolic networks. It is assumed that you have installed the required command-line tools.

  1. First we construct the directed metabolic network of E. coli.

    		java graphtools.util.MetabolicGraphProvider -i eco -d -o eco_metabolic_network_directed.txt

  2. Then, we search for the lightest paths in this network as follows:

    		java graphtools.algorithms.Pathfinder -g eco_metabolic_network_directed.txt -f tab -s C00049 \
    		-t C00047 -r 1 -d -y con -b -T pathsUnion -O gml -o lysinebiosyn_eco.gml

  3. To visualize the inferred pathway, you may open lysinebiosyn_eco.gml in Cytoscape or in yED.

  4. We proceed by constructing the metabolic network of S. cerevisiae:

    		java graphtools.util.MetabolicGraphProvider -i sce -d -o sce_metabolic_network_directed.txt

  5. Then, we enumerate paths between L-aspartate and L-lysine in it:

    		java graphtools.algorithms.Pathfinder -g sce_metabolic_network_directed.txt -f tab -s C00049 \
    		-t C00047 -d -r 1 -y con -b -T pathsUnion -O gml -o lysinebiosyn_sce.gml

  6. As before, we can visualize the lysinebiosyn_sce.gml file in a graph editor capable of reading gml files (such as yED or Cytoscape).
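
Since the E. coli and S. cerevisiae runs differ only in the organism identifier, the two command pairs above can be generated from a single template. The Python sketch below only assembles and prints the command lines, using exactly the options shown in the protocol. The helper name `build_commands` is made up for this example, and the sketch does not check that the Java tools are installed or on the classpath.

```python
import shlex


def build_commands(organism, source="C00049", target="C00047"):
    """Assemble the network construction and path finding commands."""
    network = f"{organism}_metabolic_network_directed.txt"
    pathway = f"lysinebiosyn_{organism}.gml"
    provider = ["java", "graphtools.util.MetabolicGraphProvider",
                "-i", organism, "-d", "-o", network]
    pathfinder = ["java", "graphtools.algorithms.Pathfinder",
                  "-g", network, "-f", "tab", "-s", source, "-t", target,
                  "-r", "1", "-d", "-y", "con", "-b",
                  "-T", "pathsUnion", "-O", "gml", "-o", pathway]
    return provider, pathfinder


for org in ("eco", "sce"):
    for cmd in build_commands(org):
        # Print each command as a single shell-ready line.
        print(shlex.join(cmd))
```

To actually run the commands rather than print them, each list could be passed to `subprocess.run(cmd, check=True)`, provided the tools are installed as assumed in this protocol.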

Interpretation of the results

After having executed the steps of this tutorial, you should have obtained two pathway images: one for the yeast network and one for the E. coli network. The two pathways differ substantially. If we compare each of them with the respective organism-specific pathway map in KEGG, we notice that the pathway inferred for the E. coli network reproduces the reference pathway correctly.
The yeast pathway deviates from the S. cerevisiae KEGG pathway map from L-aspartate to but-1-ene-1,2,4-tricarboxylate, but otherwise recovers the reference pathway correctly (ignoring the intermediate steps 5-adenyl-2-aminoadipate and alpha-aminoadipoyl-S-acyl enzyme associated with EC number
For comparison purposes, we have chosen the same start and end compound for both metabolic networks, but it should be noted that the reference lysine biosynthesis pathway in S. cerevisiae starts from 2-oxoglutarate.
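
Besides comparing the images by eye, the two inferred pathways can be compared by their node sets. The following Python sketch extracts node labels from two gml texts and reports which labels are shared and which are specific to one pathway. It assumes that each node is written as a `node [ ... ]` block without nested brackets and that the `label` attribute holds the compound or reaction identifier; both points are assumptions about the gml output and should be verified on the real files. The function names and the toy gml strings are hypothetical.

```python
import re

# Assumed gml layout: flat node blocks, each with a quoted label attribute.
NODE_BLOCK = re.compile(r"\bnode\s*\[(.*?)\]", re.DOTALL)
LABEL = re.compile(r'label\s+"([^"]*)"')


def node_labels(gml_text):
    """Collect the labels of all node blocks in a gml text."""
    labels = set()
    for block in NODE_BLOCK.findall(gml_text):
        match = LABEL.search(block)
        if match:
            labels.add(match.group(1))
    return labels


def compare_pathways(gml_a, gml_b):
    """Return shared and pathway-specific node labels as sorted lists."""
    a, b = node_labels(gml_a), node_labels(gml_b)
    return {"shared": sorted(a & b),
            "only_first": sorted(a - b),
            "only_second": sorted(b - a)}


# Toy gml texts, for illustration only.
first = 'graph [ node [ id 1 label "C00049" ] node [ id 2 label "X" ] ]'
second = 'graph [ node [ id 1 label "C00049" ] node [ id 2 label "Y" ] ]'
print(compare_pathways(first, second))
```

With the files from the protocol, the two arguments would be the contents of lysinebiosyn_eco.gml and lysinebiosyn_sce.gml.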

The lysine biosynthesis KEGG map for yeast is available at:

The one for E. coli is available at:


The study case demonstrated that different organisms may employ different metabolic pathways for the synthesis or degradation of a given compound. For this reason, it is useful to be able to construct metabolic networks that are specific to a selected set of organisms.


  1. An empty graph (with zero nodes and edges) is returned. In this case, make sure that the organism identifiers you entered are valid in KEGG. They should consist of three to four letters only. If in doubt, check the provided KEGG organism list.
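
The format rule stated above (three to four letters only) can be checked before submitting a request. The Python sketch below flags identifiers that break this rule. It does not verify that an identifier actually exists in KEGG, so the KEGG organism list remains the reference; the function name `implausible_ids` is made up for this example.

```python
import re

# Format rule from the text: three to four letters only.
ORGANISM_ID = re.compile(r"[A-Za-z]{3,4}")


def implausible_ids(identifiers):
    """Return the identifiers that do not consist of three to four letters."""
    return [i for i in identifiers
            if not ORGANISM_ID.fullmatch(i.strip())]


print(implausible_ids(["eco", "sce", "E. coli", "ecoli"]))
```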

RSAT 2009-09-04