In 2012, we set out a pipeline for predicting flux through a metabolic network using gene expression or proteomic data.
Two inputs are required:
- a genome-scale metabolic network containing gene-protein-reaction (GPR) associations, such as those available for human or yeast
- quantitative estimates of enzyme levels, from proteomic or absolute transcriptomic data
To predict metabolic flux, two steps are taken:
- enzyme-level data are combined with GPR associations, to create reaction-level data
- reaction-level data are constrained by mass-balancing, to create system-level flux data
A number of methods have been proposed to perform this second mapping, maximising the correlation between reaction data and system flux, including Daaaaave [1, developed in Manchester], GIMME [2] and iMAT [3]. I won’t discuss their relative merits here, but will instead focus on the first step.
The mapping isn’t quite as easy as it seems, due to the many-to-many relationship between enzymes and metabolic reactions. Suppose we have determined enzyme levels of A = 4, B = 3 and C = 2. How should we interpret their various combinations as GPRs?
| Reaction | GPR | Level |
|---|---|---|
| 2 | A or B | 4 + 3 = 7 |
| 3 | A and B | min(4, 3) = 3 |
| 4 | (A and B) or (A and C) | min(4, 3) + min(4, 2) = 5 |
| 5 | A and (B or C) | min(4, 3 + 2) = 4 |
Given the simple mapping in reaction 1, we may use the enzyme level directly. The “or” relationship in reaction 2 allows for alternative catalysts, so its total level is given by the sum of its components. The “and” relationship in reaction 3 means it is catalysed by a complex, whose maximum possible concentration is given by the minimum level of its components. These two min/plus rules may be combined for more complex GPRs such as in reactions 4 and 5.
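The sum and min rules just described can be sketched in a few lines of Python. This is only an illustration, not the code used in the pipeline: the nested-tuple representation of a GPR and the function name `gpr_level` are assumptions made for the example.

```python
# Hypothetical representation: a GPR is either an enzyme name ("A") or a
# tuple ("or", x, y, ...) / ("and", x, y, ...) whose arguments are GPRs.
def gpr_level(expr, levels):
    """Map enzyme levels onto a GPR using the sum ("or") and min ("and") rules."""
    if isinstance(expr, str):
        # A single enzyme: use its level directly.
        return levels[expr]
    op, *args = expr
    values = [gpr_level(arg, levels) for arg in args]
    if op == "or":
        # Alternative catalysts: the total level is the sum.
        return sum(values)
    if op == "and":
        # A complex: limited by its scarcest component.
        return min(values)
    raise ValueError(f"unknown operator: {op!r}")


levels = {"A": 4, "B": 3, "C": 2}
gprs = {
    2: ("or", "A", "B"),
    3: ("and", "A", "B"),
    4: ("or", ("and", "A", "B"), ("and", "A", "C")),
    5: ("and", "A", ("or", "B", "C")),
}
results = {reaction: gpr_level(gpr, levels) for reaction, gpr in gprs.items()}
print(results)  # {2: 7, 3: 3, 4: 5, 5: 4}
```

The recursion follows the bracketing of the GPR exactly as written, so the result depends on how the expression is nested.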
Here’s the rub: the Boolean logic used in reactions 4 and 5 is the same (just bracketed differently, since “and” distributes over “or”), yet the levels output by the mapping are different: 5 versus 4. The existing approach is ill-defined and we need to go back to the drawing board. One option, proposed by Brandon Barker, is to insist that GPRs are always written in a consistent manner. For example, we could use the disjunctive normal form (DNF), meaning that GPRs are expressed as an “or of ands”: a list of alternative complexes, as per reaction 4. I think that having consistency like this is an excellent idea. But I also think that our mapping should be robust to alternative bracketing; I’ll show you how, next time.
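To make the DNF option concrete, here is a minimal sketch that expands a GPR into its list of alternative complexes before applying the sum and min rules. It illustrates only the consistent-form idea, not the bracketing-robust mapping promised for next time. The nested-tuple representation and the names `to_dnf` and `dnf_level` are assumptions for the example, and no simplification of redundant complexes is attempted.

```python
from itertools import product


def to_dnf(expr):
    """Expand a GPR into a set of complexes, each a frozenset of enzyme names."""
    if isinstance(expr, str):
        return {frozenset([expr])}
    op, *args = expr
    parts = [to_dnf(arg) for arg in args]
    if op == "or":
        # Alternatives: pool the complexes from every branch.
        return set().union(*parts)
    if op == "and":
        # Distribute "and" over "or": one complex per combination of branches.
        return {frozenset().union(*combo) for combo in product(*parts)}
    raise ValueError(f"unknown operator: {op!r}")


def dnf_level(expr, levels):
    """Sum over alternative complexes of the minimum level within each complex."""
    return sum(min(levels[e] for e in cplx) for cplx in to_dnf(expr))


levels = {"A": 4, "B": 3, "C": 2}
reaction_4 = ("or", ("and", "A", "B"), ("and", "A", "C"))
reaction_5 = ("and", "A", ("or", "B", "C"))

print(to_dnf(reaction_4) == to_dnf(reaction_5))  # True
print(dnf_level(reaction_4, levels), dnf_level(reaction_5, levels))  # 5 5
```

Once both GPRs are reduced to the same list of complexes, they necessarily receive the same level, which is the consistency that the DNF convention buys.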
1. Lee D, Smallbone K, Dunn WB, Murabito E, Winder CL, Kell DB, Mendes P, Swainston N (2012) “Improving metabolic flux predictions using absolute gene expression data” BMC Systems Biology 6:73.
2. Becker SA, Palsson BØ (2008) “Context-specific metabolic networks are consistent with experiments” PLoS Computational Biology 4:e1000082.
3. Shlomi T, Cabili MN, Herrgård MJ, Palsson BØ, Ruppin E (2008) “Network-based prediction of human tissue-specific metabolism” Nature Biotechnology 26:1003–1010.