In 2012, we set out a pipeline [1] for predicting flux through a metabolic network using genetic or proteomic data.
Two inputs are required:
- a genome-scale metabolic network containing gene-protein-reaction (GPR) associations, such as those for human or yeast
- quantitative estimates of the network’s enzyme levels, from proteomics or absolute transcriptomic data
To predict metabolic flux, two steps are taken:
- enzyme-level data are combined with GPR associations, to create reaction-level data
- reaction-level data are constrained by mass-balancing, to create system-level flux data
A number of methods have been proposed to perform this second step, maximising the correlation between reaction data and system flux, including Daaaaave [1, developed in Manchester], Gimme [2] and iMat [3]. I won’t discuss their relative merits here, but will instead focus on the first step.
The mapping isn’t quite as easy as it seems, due to the many-to-many relationship between enzymes and metabolic reactions. Suppose we have determined enzyme levels of A = 4, B = 3 and C = 2. How should we interpret their various combinations as GPRs?
reaction | GPR | level |
1 | A | 4 |
2 | A or B | 4 + 3 = 7 |
3 | A and B | min(4,3) = 3 |
4 | (A and B) or (A and C) | min(4,3) + min(4,2) = 5 |
5 | A and (B or C) | min(4,3 + 2) = 4 |
Given the simple mapping in reaction 1, we may use the enzyme level directly. The “or” relationship in reaction 2 allows for alternative catalysts, so its total level is given by the sum of its components. The “and” relationship in reaction 3 means it is catalysed by a complex, whose maximum possible concentration is given by the minimum level of its components. These two min/plus rules may be combined for more complex GPRs such as in reactions 4 and 5.
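The min/plus rules can be sketched in a few lines of Python. This is only an illustration, not the code from the pipeline in [1]: the nested-tuple representation of a GPR and the function name `gpr_level` are assumptions made for this example.

```python
# Enzyme levels from the example above.
levels = {"A": 4, "B": 3, "C": 2}


def gpr_level(gpr, levels):
    """Map a GPR to a reaction level using the min/plus rules.

    A GPR is either a gene name (str) or a tuple whose first item is
    "and" or "or" and whose remaining items are GPRs (hypothetical format).
    """
    if isinstance(gpr, str):
        return levels[gpr]
    op, *operands = gpr
    values = [gpr_level(operand, levels) for operand in operands]
    if op == "and":
        return min(values)  # complex: limited by its scarcest component
    if op == "or":
        return sum(values)  # alternative catalysts: levels add up
    raise ValueError(f"unknown operator: {op!r}")


# The five reactions from the table.
reactions = {
    1: "A",
    2: ("or", "A", "B"),
    3: ("and", "A", "B"),
    4: ("or", ("and", "A", "B"), ("and", "A", "C")),
    5: ("and", "A", ("or", "B", "C")),
}

for number, gpr in reactions.items():
    print(number, gpr_level(gpr, levels))
```

Run as written, this reproduces the level column of the table: 4, 7, 3, 5 and 4.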
Here’s the rub: the Boolean logic used in 4 and 5 is the same (just bracketed differently, since “and” distributes over “or”), yet the levels output by the mapping are different. The existing approach is ill-defined and we need to go back to the drawing board. One option, proposed by Brandon Barker, is to insist that GPRs are always written in a consistent manner. For example, we could use the disjunctive normal form (DNF), meaning that GPRs are expressed as an “or of ands”: a list of alternative complexes, as per reaction 4. I think that having consistency like this is an excellent idea. But I also think that our mapping should be robust to alternative bracketing; I’ll show you how, next time.
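To make the DNF suggestion concrete, here is a minimal sketch that expands a GPR into a list of alternative complexes and then applies the min/plus rules to that list. The nested-tuple format and the names `to_dnf` and `dnf_level` are assumptions for illustration only. It does not simplify redundant complexes (for example, “A or (A and B)” is left as two complexes), and it is not the bracketing-robust mapping promised for next time.

```python
from itertools import product

# Enzyme levels from the example above.
levels = {"A": 4, "B": 3, "C": 2}


def to_dnf(gpr):
    """Expand a GPR into DNF: a list of complexes (frozensets of gene names).

    A GPR is either a gene name (str) or a tuple whose first item is
    "and" or "or" and whose remaining items are GPRs (hypothetical format).
    """
    if isinstance(gpr, str):
        return [frozenset([gpr])]
    op, *operands = gpr
    parts = [to_dnf(operand) for operand in operands]
    if op == "or":
        # Alternatives: pool the complexes of every operand.
        complexes = [c for part in parts for c in part]
    elif op == "and":
        # Distribute "and" over "or": one complex per combination.
        complexes = [frozenset().union(*combo) for combo in product(*parts)]
    else:
        raise ValueError(f"unknown operator: {op!r}")
    # Drop exact duplicates while preserving order.
    return list(dict.fromkeys(complexes))


def dnf_level(gpr, levels):
    """Sum over alternative complexes of the minimum level in each complex."""
    return sum(min(levels[g] for g in c) for c in to_dnf(gpr))


reaction_4 = ("or", ("and", "A", "B"), ("and", "A", "C"))
reaction_5 = ("and", "A", ("or", "B", "C"))

print(to_dnf(reaction_4) == to_dnf(reaction_5))
print(dnf_level(reaction_4, levels), dnf_level(reaction_5, levels))
```

Under this convention both bracketings reduce to the same two complexes, {A, B} and {A, C}, so reactions 4 and 5 receive the same level.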
References
- Lee D, Smallbone K, Dunn WB, Murabito E, Winder CL, Kell DB, Mendes P, Swainston N (2012) “Improving metabolic flux predictions using absolute gene expression data” BMC Systems Biology 6:73. doi:10.1186/1752-0509-6-73
- Becker SA, Palsson BØ (2008) “Context-specific metabolic networks are consistent with experiments” PLoS Comp Biol 4:e1000082. doi:10.1371/journal.pcbi.1000082
- Shlomi T, Cabili MN, Herrgård MJ, Palsson BØ, Ruppin E (2008) “Network-based prediction of human tissue-specific metabolism” Nature Biotechnology 26:1003–1010. doi:10.1038/nbt.1487