1 INTRODUCTION We consider the problem of identifying both the structure and the parameters of an ordinary differential equation (ODE) system from time series data. In recent years, there has been a significant increase in the number of reported methods approaching this problem in the biological literature (see Table 1 and the references). Table 1. Benchmark problems for ODE system identification Problem name Original problem reference Source system/Model #var #exp #pts Noise (%) simpleLin1 Simple linear system 3 3 13 0 simpleLin2 8 13 10 simpleFb1 Simple feedback loop (McKinney et al., 2006) 3 4 7 0 simpleFb2 4 7 5 simpleFb3 1 7 0 simpleFb4 (McKinney et al., 2006) 1 7 ≈5a osc1 An oscillator (Karnaukhov et al., 2007) 3 1 41 0 osc2 (Karnaukhov et al., 2007) 1 41 3 metabol1 (Gennemark and Wedelin, 2007) A metabolic pathway (Arkin and Ross, 1995) 5 12 7 0 metabol2 (Gennemark and Wedelin, 2007) 12 21 10 metabol3 (Gennemark and Wedelin, 2007) 12 21 20 3genes1 A 3-step gene network (Moles et al., 2003) 8 16 21 0 ss_cascade1 (Tsai and Wang, 2005) A cascaded pathway (Voit, 2000) 3 8 41 0 ss_cascade2 4 41 0 ss_cascade3 8 41 5 ss_branch1 (Voit and Almeida, 2004) A branched pathway (Voit, 2000) 4 3 21 0 ss_branch2 (Marino and Voit, 2006) 6 51 0 ss_branch3 (Tucker and Moulton, 2006) 5 20 0 ss_branch4 (Kutalik et al., 2007) 4 20 0 ss_branch5 (Kutalik et al., 2007) 4 20 2.5 ss_branch6 (Gonzalez et al., 2007) 5 31 0 ss_5genes1 (Kikuchi et al., 2003) A genetic network (Hlavacek and Savageau, 1996) 5 10 11 0 ss_5genes2 (Gennemark and Wedelin, 2007) 10 9 20 ss_5genes3 (Gennemark and Wedelin, 2007) 10 3 0 ss_5genes4 (Kimura et al., 2005) 15 11 0 ss_5genes5 (Daisuke and Horton, 2006) 10 11 0 ss_5genes6 (Cho et al., 2006) 1 16 0 ss_5genes7 (Tucker and Moulton, 2006) 10 20 0 ss_5genes8 (Tsai and Wang, 2005) 8 41 0 (Liu and Wang, 2008) ss_15genes1 A genetic network (Maki et al., 2001) 15 10 11 0 ss_15genes2 20 11 10 ss_30genes1 A genetic network (Maki et al., 2001) 30 15 11 0 ss_30genes2 (Kimura et al., 2005) 20 11 10 ss_30genes3 (Liu and Wang, 2008) 8 41 0 cytokine1 (McKinney et al., 2006) Immunologic data (Rock et al., 2004) 4 1 7 10a cytokine2 1 7 10a ss_ethanolferm1 (Liu and Wang, 2008) Ethanol fermentation (Wang et al., 2001) 4 2 11–15 ≈30a ss_ethanolferm2 3 11–19 ≈30a ss_sosrepair1 (Cho et al., 2006) SOS repair system Escherichia coli (Ronen et al., 2002) 6 1 50 10a ss_sosrepair2 1 50 10a ss_cadBA1 (Gonzalez et al., 2007) cadBA network in E. coli (Kuper and Jung, 2005) 5 1 25 <20a ss_cadBA2 1 25 <20a ss_clock1 (Daisuke and Horton, 2006) Mice cell cycle (Barrett et al., 2005) 7 1 12 ≈10a ss_clock2 1 12 ≈10a aEstimate from or assumption about data. See web site for further information. #var, number of dependent variables; #exp, number of experimental conditions with different initial conditions and/or input functions; #pts, number of data-points per time series. Noise is added from a Gaussian distribution with SD given as a certain percentage (denoted Noise) of each experimental value. Problem names starting with ‘ss_’ correspond to S-systems. The last section lists problems with real data from biological systems. For systems of realistic size and with realistic amount and quality of data this is a hard problem and heuristic algorithms are therefore typically employed. Such algorithms cannot guarantee that they give the best answer to a given problem. An established way to evaluate the performance of heuristic algorithms is to try them on a sufficient number of realistic test cases. However, lack of well defined and commonly accepted benchmark problems makes it difficult to evaluate and compare different methods: Problems are often incompletely and/or ambiguously specified, and personal communication is required to reconstruct essential aspects of the problems. Very few test problems are usually considered. This serves the purpose of demonstrating feasibility, but says little about general performance. Even when problems originate from the same system, different papers typically consider slight modifications of the original problem, which make direct comparisons difficult. An example is a test problem from Kikuchi et al., 2003. Several recent publications (Cho et al., 2006; Daisuke and Horton, 2006; Gennemark and Wedelin, 2007; Kimura et al., 2005; Liu and Wang, 2008; Tsai and Wang, 2005; Tucker and Moulton, 2006) consider the same system, but usually with a more informative dataset. To a considerable degree, the lack of benchmark problems is caused by a more fundamental underlying difficulty in how to specify such identification problems mathematically, and how to conveniently represent them in files. This is much more complex when both structure and parameters are unknown, compared with the well known case of parameter estimation in a given structure. In particular, it is not sufficient to give only data from one or several experiments, or a source model from which such data can be simulated, see Figure 1A. In addition to the data, a model space must be defined, to specify the possible forms of the right-hand side of the ODEs and appropriate parameter ranges. Clearly, if we allow few possible reactions and narrow parameter ranges we obtain a simpler identification problem, and if we allow many possible reactions and wide parameter ranges we obtain a more difficult problem. Furthermore, known prior information about parts of the model may be given. Finally, we must define an objective function that also includes some notion of model complexity. This is needed since we are searching among different structures, and a simple maximum likelihood criterion is not sufficient and will lead to over-fitting. Together, this defines the identification problem mathematically as an optimization problem, the solution of which is a model. An illustration of an identification problem is shown in Figure 1B, and the solution model is shown in Figure 1C. Fig. 1. Identification of ODE systems. (A) An identification problem can be specified with real or simulated data from one or several experiments, a model space of allowed reactions occurring on the right-hand side of the ODEs, an initial model representing prior knowledge and an error function. (B) An example of an identification problem with a model space of three traditional chemical reaction types, an error function where L=likelihood function, λ K=structural complexity term (λ=constant and K=number of model parameters) and an initial model with no prior information. (C) An example of a solution model. Based on these considerations, we present a collection of more than 40 fully specified and publicly available benchmark problems for ODE system identification. The problems have mainly been collected and translated from recent publications, but we have also included new and carefully designed problems. The collection includes both simple problems suitable for initial testing, and more complex problems with realistic size and quality of data. Most problems are based on simulated data from a known source model, which is important for evaluation purposes. However, several problems are also based on real data. The collection consists of problems of two different kinds. The first kind are problems whose solutions are models based on traditional chemical rate equations like unimolecular mass action and Michaelis–Menten's kinetics. An example of such a system with three variables is shown in Figure 1C. Typically for systems of this kind, the rate equations are composed as linear combinations of known reaction types that can be non-linear in both the parameters and in the state variables. The other kind of problem is where the solutions are defined on a fixed form, like S-systems which is currently the most popular form for identification. The S-system formalism (Savageau, 1976; Voit, 2000) is based on approximating kinetic laws with multivariate power-law functions. A model consists of n non-linear ODEs and the generic form of equation i reads (1) where X is a vector (length n) of dependent variables, α and β are vectors (length n) of non-negative rate constants and g and h are matrices (n × n) of kinetic orders, that can be negative as well as positive. The structure of the system is defined by the non-zero elements of g and h, and for realistic models most of these parameters are zero. To our knowledge, this is the first article to address how to best represent this kind of identification problems as optimization problems and to define a suitable file format. As far as we know, this is also the first collection of benchmark problems for ODE system identification, where both structure and parameters are unknown. Another paper that recognizes the importance of systematic evaluation of identification algorithms, is Mendes et al. (2003), which presents a system for generating realistic artificial gene networks and simulated data, and a large number of problems generated by this system. In comparison, our benchmark problems are typically smaller and address the more specific need of evaluating algorithms identifying both structure and parameters, fully specifying problems as optimization problems. For parameter estimation only, several benchmark problems are publicly available, e.g. in the data base EASY-FIT (Schittkowski, 2002). The two kinds of systems we consider, chemical rate equations and S-systems, have previously been used for identification of ODE models as indicated by references in Table 1. In addition to these kinds of systems, other fixed forms have been considered in the literature. One example is Generalized Mass Action (GMA) models (Tucker et al., 2007; Voit, 2000), and another is linear dynamical models with non-linear transfer functions of sigmoid type (D'haeseleer, 2000; Nelander et al., 2008; Wahde and Hertz, 2000). However, at this stage, we have chosen to focus on the more frequently used S-systems as one representative fixed form system. We finally note that identification of ODEs where more basic operands and operators are used as structural components have been considered (Bongard and Lipson, 2007).