## Parameter Inference

The parameter inference problem is that of inferring the parameters of a descriptive model (e.g., a simulator) from observed data. When the likelihood function corresponding to the model is available, techniques such as maximum likelihood estimation (MLE) can be used. A more interesting case is likelihood-free parameter inference, where inference must proceed solely from observed data and the ability to run the descriptive model/simulator. Approximate Bayesian computation (ABC) is an established method for such problems. It involves sampling parameter values from a specified *prior* distribution; the sampled parameters are then simulated and compared to observed data using a distance function, often in terms of low-level features (*summary statistics*). If the simulated output is close enough to the observed data, i.e., within a *tolerance bound*, the sample is accepted. Once the desired number of accepted samples has been accumulated, they form the *posterior distribution* of inferred parameters. In practice, ABC parameter inference can be slow and may require a large number of rejection-sampling iterations. However, there has recently been substantial progress on improving various aspects of ABC, including the selection of priors, summary-statistic selection, and the use of adaptive tolerance bounds.
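The rejection-sampling loop described above can be sketched as follows. This is a minimal illustration using a toy Gaussian simulator with an unknown mean; the simulator, summary statistic, tolerance, and prior bounds are all illustrative choices, not those of any particular study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "simulator": draws n samples from a Gaussian with unknown mean theta.
def simulate(theta, n=100):
    return rng.normal(theta, 1.0, size=n)

# Summary statistic: here simply the sample mean.
def summary(x):
    return np.mean(x)

def abc_rejection(observed, prior_low, prior_high, tol, n_accept):
    s_obs = summary(observed)
    accepted = []
    while len(accepted) < n_accept:
        theta = rng.uniform(prior_low, prior_high)   # sample from the prior
        s_sim = summary(simulate(theta))             # simulate and summarize
        if abs(s_sim - s_obs) < tol:                 # distance within tolerance?
            accepted.append(theta)
    return np.array(accepted)                        # approximate posterior sample

observed = rng.normal(2.0, 1.0, size=100)            # data with true mean 2.0
posterior = abc_rejection(observed, prior_low=-5.0, prior_high=5.0,
                          tol=0.1, n_accept=200)
print(posterior.mean())                              # should lie near 2.0
```

The accepted samples approximate the posterior only up to the tolerance bound: shrinking `tol` improves accuracy but increases the number of rejected simulations, which is the core trade-off the following research items address.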

## Selected Research

**Construction of Priors**

The uniform prior is a popular and general choice of prior in ABC inference when problem-specific prior knowledge is not available. High-quality parameter inference becomes challenging in practice as problem dimensionality (the number of parameters) increases. With no prior knowledge, one usually assumes a large-enough uniform prior; however, the wider the prior, the slower parameter inference proceeds. A tight prior can therefore greatly improve both the speed and the quality of parameter inference.

As a pre-processing step, multi-objective optimization can be used to find a region of small distances between observed data and samples simulated from a wide prior [1]. Treating the distance in each summary statistic as an objective to be minimized, the resulting Pareto set (figure, bottom right) quickly provides an intuitive map of the region most likely to contain the correct parameters (the knee of the Pareto set). The input variables corresponding to the cluster of samples at the knee of the Pareto set can then be looked up to substantially tighten the prior.
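A minimal sketch of this idea, assuming a hypothetical two-parameter toy simulator with two summary statistics, a simple non-dominated filter, and the knee approximated as the Pareto points with the smallest total normalized distance (all illustrative stand-ins for the method of [1]):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy simulator: two parameters map to two summary statistics.
def distances(theta, s_obs):
    s_sim = np.array([theta[0] ** 2, theta[0] + theta[1]])
    return np.abs(s_sim - s_obs)             # one distance per summary statistic

def pareto_mask(costs):
    # True for non-dominated points (every objective is minimized).
    n = len(costs)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        dominates = np.all(costs <= costs[i], axis=1) & np.any(costs < costs[i], axis=1)
        mask[i] = not dominates.any()
    return mask

s_obs = np.array([4.0, 3.0])                 # "observed" summaries
wide_low, wide_high = -10.0, 10.0            # wide uniform prior bounds
thetas = rng.uniform(wide_low, wide_high, size=(500, 2))
costs = np.array([distances(t, s_obs) for t in thetas])

mask = pareto_mask(costs)
front_thetas, front_costs = thetas[mask], costs[mask]

# Approximate the knee as the Pareto points with smallest total normalized distance.
norm = front_costs / (front_costs.max(axis=0) + 1e-12)
knee = front_thetas[np.argsort(norm.sum(axis=1))[: max(1, len(front_thetas) // 4)]]

# Bounding box of the knee cluster becomes the tightened uniform prior.
tight_low, tight_high = knee.min(axis=0), knee.max(axis=0)
print(tight_low, tight_high)
```

The bounding box of the knee cluster is then used as the support of a new, tighter uniform prior for the subsequent ABC run.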

[1] Singh, P., & Hellander, A. (2018). Multi-objective optimization driven construction of uniform priors for likelihood-free parameter inference. In *ESM 2018, October 24–26, Ghent, Belgium* (pp. 22-27). EUROSIS.

**Optimizing ABC Hyperparameters**

Successful ABC parameter inference involves the appropriate selection of a number of quantities, including the prior function, the distance function, the summary statistics, and the tolerance bound. These quantities are typically selected by the practitioner, relying on domain knowledge, past experience, intuition, and problem characteristics.

The ABC configuration affects not only parameter inference quality, but also computation time. Bayesian optimization of the ABC configuration [1] allows the specification of an objective function in terms of parameter inference quality, parameter inference time, or a combination of both. Optimizing this objective yields an optimal selection of the prior function *p*, distance function *d*, summary statistics *s*, and tolerance bound *τ* (figure on the right).
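The configuration search can be illustrated as follows, with plain random search standing in for Bayesian optimization. The toy simulator, the candidate summary statistics, the known true parameter, and the quality/cost weighting `lam` are all illustrative assumptions, not details from [1]:

```python
import numpy as np

rng = np.random.default_rng(2)
TRUE_THETA = 3.0                               # known truth (toy problem only)
BUDGET = 2000                                  # simulator calls per configuration

def simulate(theta, n=50):
    return rng.normal(theta, 1.0, size=n)

SUMMARIES = {"mean": np.mean, "median": np.median}   # candidate summary statistics

def abc(observed, summary, tol):
    s_obs = summary(observed)
    accepted = []
    for _ in range(BUDGET):
        theta = rng.uniform(-10.0, 10.0)
        if abs(summary(simulate(theta)) - s_obs) < tol:
            accepted.append(theta)
    return np.array(accepted)

def objective(cfg, observed, lam=0.01):
    # Combine inference quality with a proxy for computation time.
    post = abc(observed, SUMMARIES[cfg["s"]], cfg["tol"])
    if len(post) < 10:                         # too few acceptances to trust
        return np.inf
    error = abs(post.mean() - TRUE_THETA)      # quality term (truth known here)
    cost = BUDGET / len(post)                  # simulations per accepted sample
    return error + lam * cost

observed = simulate(TRUE_THETA, n=200)
best = None
for _ in range(20):                            # random search over configurations
    cfg = {"s": str(rng.choice(list(SUMMARIES))), "tol": rng.uniform(0.05, 1.0)}
    score = objective(cfg, observed)
    if best is None or score < best[0]:
        best = (score, cfg)
print(best)
```

In [1] the search itself is driven by Bayesian optimization rather than random sampling, which matters because each configuration evaluation requires a full (and expensive) ABC run.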

[1] Singh, P., & Hellander, A. (2018, December). Hyperparameter optimization for approximate Bayesian computation. In *Proceedings of the 2018 Winter Simulation Conference* (pp. 1718-1729). IEEE Press.

**Surrogate Models of Summary Statistics**

Parameter inference problems sometimes involve highly complex simulators. For example, stochastic biochemical reaction networks in computational biology often comprise tens of reactions among several interacting proteins, and the simulator may have tens of control parameters. Using such a complex simulation model often incurs substantial computational cost during inference.

An efficient approach is to train a surrogate model [1] of only the species of interest that take part in the parameter inference process. The surrogate model learns the mapping from the parameter space (the *θ* values) to the summary-statistic space (figure, top right). Prediction times for ~15,000 samples on a test problem with 6 parameters are shown in the table (bottom right). The surrogate model delivers a speed-up of several orders of magnitude. Interestingly, the surrogate model can also be optimized to obtain a point estimate of the inferred parameters.
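A minimal sketch of the surrogate idea, with a cheap polynomial fit standing in for the machine-learning surrogates of [1] and a hypothetical one-parameter simulator. It also shows the point-estimate idea: optimizing the surrogate directly instead of running the simulator:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical expensive simulator: returns a noisy summary statistic.
def simulate_summary(theta, n=200):
    return np.mean(rng.normal(np.sin(theta) + theta, 1.0, size=n))

# Training data: run the simulator at a modest number of parameter points.
thetas = np.linspace(-3.0, 3.0, 40)
stats = np.array([simulate_summary(t) for t in thetas])

# Surrogate: a degree-5 polynomial fit from parameter to summary statistic.
surrogate = np.poly1d(np.polyfit(thetas, stats, deg=5))

# "Observed" data generated at a true parameter of 1.5.
s_obs = simulate_summary(1.5)

# Point estimate: minimize the surrogate-predicted distance over a dense grid.
# Each evaluation is a polynomial call, not a simulator run.
grid = np.linspace(-3.0, 3.0, 10001)
point_estimate = grid[np.argmin(np.abs(surrogate(grid) - s_obs))]
print(point_estimate)                          # should lie near 1.5
```

Once trained, the surrogate replaces simulator calls inside the ABC loop, which is where the reported orders-of-magnitude speed-up comes from: predictions cost a function evaluation rather than a full stochastic simulation.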

[1] Singh, P., & Hellander, A. (2017, December). Surrogate assisted model reduction for stochastic biochemical reaction networks. In *Proceedings of the 2017 Winter Simulation Conference* (p. 138). IEEE Press.