Last year, Daniel Machado and Marcus Herrgård published a great paper [1] evaluating different methods for mapping from transcript data to flux patterns. Flux predictions for eighteen methods were compared to three E. coli and yeast datasets.
The methods varied widely in their requirements: continuous vs. discrete levels (10^5/cell vs. high/low) and absolute vs. relative expression (10^5/cell vs. twice as much as a reference condition). I wouldn’t say that Daaaaave (rather boringly called “Lee-12” above) is the best algorithm out there; indeed, the paper finds that all the methods perform equally well (badly). But I do think that it lies in the right part of the graph, using continuous, absolute data.
One of the successes of Machado and Herrgård’s paper is that they made their code and datasets available via GitHub. I’ve now forked it into my own repository, allowing us to test new methods using their suite.
My first test was to attempt to address the problem that
“Methods that do not make any assumptions regarding a biological objective (iMAT, Lee–12 and RELATCH*) … incorrectly predicted a zero growth rate in all cases”
by enriching the reconstructions: associating the genes encoding the main DNA polymerases with growth. It didn’t make any difference, and I shouldn’t have been surprised. This gene association is one data point amongst thousands, and for the cell to grow would require a major rerouting of flux, thereby moving many other data points away from their best fit.
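That argument can be sketched with a toy calculation. This is not Daaaaave or any method from the paper, just an illustration with made-up numbers: I assume the fit is scored as a sum of absolute deviations, that the polymerase data point is best fitted at a growth rate of 1, and that every other data point is pushed away from its best fit by a small, hypothetical amount (`shift_per_unit_growth`) for each unit of growth.

```python
def total_misfit(growth, n_other=1000, polymerase_target=1.0,
                 shift_per_unit_growth=0.01):
    """Toy sum-of-absolute-deviations score for a given growth rate.

    All parameter values are invented for illustration only.
    """
    # The single new data point: fitted perfectly when growth hits the target.
    polymerase_term = abs(growth - polymerase_target)
    # Every other data point drifts from its best fit as flux is rerouted.
    rerouting_term = n_other * abs(shift_per_unit_growth * growth)
    return polymerase_term + rerouting_term


def best_growth(n_other):
    """Grid-search growth rates in [0, 1] for the smallest total misfit."""
    candidates = [i / 100 for i in range(101)]
    return min(candidates, key=lambda g: total_misfit(g, n_other=n_other))


# With thousands of competing data points, not growing is the best fit.
print(best_growth(n_other=1000))
# Only when few other data points are disturbed does the new one win.
print(best_growth(n_other=50))
```

With a thousand competing data points the cheapest option is to leave the growth rate at zero and accept the misfit on the single polymerase point, which matches what I saw.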
Back to the drawing-board.
References
- [1] Daniel Machado and Marcus Herrgård (2014) “Systematic evaluation of methods for integration of transcriptomic data into constraint-based models of metabolism” PLoS Computational Biology 10:e1003580. doi:10.1371/journal.pcbi.1003580