Last year, Daniel Machado and Marcus Herrgård published a great paper [1] evaluating different methods for mapping from transcript data to flux patterns. Flux predictions for eighteen methods were compared to three E. coli and yeast datasets.
The methods varied widely in their requirements: continuous vs. discrete levels (105/cell vs. high/low) and absolute vs. relative expression (105/cell vs. twice as much as a reference condition). I wouldn’t say that Daaaaave (rather boringly called “Lee-12” above) is the best algorithm out there; indeed, the paper finds that all the methods perform equally well (badly). But I do think that it lies in the right part of the graph, using continuous, absolute data.
One of the successes of Machado’s paper is making their code and datasets available via GitHub. I’ve now forked it into my own repository, allowing us to test new methods using their suite.
My first test was to attempt to address the problem that
Methods that do not make any assumptions regarding a biological objective (iMAT, Lee–12 and RELATCH*) … incorrectly predicted a zero growth rate in all cases
by enriching the reconstructions: associating the genes encoding the main DNA polymerases with growth. It didn’t make any difference, and I shouldn’t have been surprised. This gene association is one data point amongst thousands, and growth would require a major rerouting of flux, thereby moving many other data points away from their best fit.
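To make that argument concrete, here is a toy sketch in Python. It is not the Daaaaave formulation or anything from the Machado suite; it assumes a made-up model in which every unit of growth pushes each of `n_rerouted` reactions one unit away from its data-derived best fit, and in which misfit is measured by absolute deviation. The names `total_misfit`, `best_growth`, `n_rerouted` and `polymerase_target` are all hypothetical.

```python
def total_misfit(growth, n_rerouted, polymerase_target):
    """Absolute-deviation misfit of a toy model at a given growth flux.

    Assumption: each of the n_rerouted reactions sits exactly on its
    data point when growth is zero, and is displaced by `growth` units
    when the cell grows. The polymerase-associated growth reaction is
    the single data point that would like growth == polymerase_target.
    """
    rerouting_penalty = n_rerouted * abs(growth)
    polymerase_penalty = abs(growth - polymerase_target)
    return rerouting_penalty + polymerase_penalty


def best_growth(n_rerouted, polymerase_target, steps=1000):
    """Grid-search the growth flux in [0, polymerase_target] with least misfit."""
    candidates = [polymerase_target * i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda g: total_misfit(g, n_rerouted, polymerase_target),
    )


if __name__ == "__main__":
    # Compare the fitted growth when nothing else is disturbed with the
    # fitted growth when many other data points would be displaced.
    for n in (0, 10, 1000):
        print(n, best_growth(n, polymerase_target=1.0))
```

In this toy setting the single polymerase data point only wins when no other reaction is disturbed; as soon as growth displaces several other data points, the best fit sits at zero growth, which is the behaviour described above.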
Back to the drawing board.
References
[1] Daniel Machado and Marcus Herrgård (2014) “Systematic evaluation of methods for integration of transcriptomic data into constraint-based models of metabolism” PLoS Computational Biology 10:e1003580. doi:10.1371/journal.pcbi.1003580
Hey Kieran,
You may want to check out plant models and see what they are doing right. At least in this one: http://arxiv.org/abs/1502.07969. I asked Eli about it and he seemed to get growth rate naturally from his Daaaaave-like objective. I was also amazed.
Although I should mention that I don’t recall the exact formulation he used for biomass. I think it may have been an aggregate rather than the fixed ratio used in most biomass objectives (in which case the result is possibly less surprising).