Uncertainty regarding a model’s projections can arise from a variety of sources (Bilcke et al., 2011; Creedy et al., 2007). In particular, a distinction is generally made between uncertainty in (i) input data, for instance due to sampling errors in the initial population; (ii) model structure, that is, the validity of the general modelling approach used (also called "methodological uncertainty"); (iii) model specification, which concerns the choice of the covariates and the functional forms used, and in particular the crucial assumption that any regularity observed in the data will not break down in the future; (iv) model parameters, pointing to the imprecision of the estimates and/or externally provided parameters; and finally (v) Monte Carlo variation of the model output, which originates from the fact that the simulated aggregate quantities are themselves imprecise estimates of the theoretical aggregate quantities that the model implicitly defines. None of the above sources of uncertainty is generally considered in microsimulation studies, although this omission is recognised and criticised (see for instance Goedemé et al., 2013). However, "[t]he calculation of confidence intervals around model results that account for all sources of error remains a major challenge" (Mitton et al., 2000).
Generally speaking, source (i) should be limited, thanks to the use of appropriate input data and sampling weights. Sources (ii)-(iii) are often left unexplored, under the common assumption that the model is well specified (measures of fit should be reported for each estimated equation to corroborate this hypothesis). Monte Carlo variation of the model outcome (source v) can be made negligible by appropriately scaling up the simulated population size. The remaining source of uncertainty that needs to be addressed is therefore parameter uncertainty, stemming from sampling errors in estimation (source iv).
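The effect of population size on source (v) can be illustrated with a minimal sketch, written in plain Python rather than with JAS-mine. The toy model (a single binary event with a fixed probability) and all names, such as `simulated_share` and `monte_carlo_sd`, are hypothetical and serve only to show that the run-to-run spread of a simulated aggregate shrinks as the population grows.

```python
import random
import statistics


def simulated_share(n_agents, p, rng):
    """Share of agents for whom a binary event with probability p occurs."""
    return sum(rng.random() < p for _ in range(n_agents)) / n_agents


def monte_carlo_sd(n_agents, p=0.3, n_runs=200, seed=0):
    """Standard deviation of the simulated share across independent runs."""
    rng = random.Random(seed)
    shares = [simulated_share(n_agents, p, rng) for _ in range(n_runs)]
    return statistics.stdev(shares)


# Same toy model, two population sizes (illustrative values only).
sd_small = monte_carlo_sd(100)
sd_large = monte_carlo_sd(10_000)
print(f"sd with 100 agents:    {sd_small:.4f}")
print(f"sd with 10,000 agents: {sd_large:.4f}")
```

For a simple share like this one, the spread falls roughly with the square root of the population size, so a hundredfold increase in agents reduces it by about a factor of ten.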
There are two approaches that can be used to deal with this uncertainty (Creedy et al., 2007). The first, which we might label "brute force", prescribes bootstrapping the coefficients of the estimated equations, that is, drawing them from their estimated distribution (e.g. multivariate normal in the case of multinomial probit regressions), with mean equal to the point estimate and variance-covariance matrix equal to the estimated variance-covariance matrix. Bootstrapping needs to be performed only once, at the beginning of the simulation: the entire simulation is then run with the bootstrapped values of the coefficients. The second approach provides an approximation by assuming from the outset a normal distribution for the resulting confidence intervals, and thus requires many fewer draws from the parameter distribution.
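The coefficient draw used in the "brute force" approach can be sketched as follows. This is an illustration in plain Python, not the JAS-mine implementation: the point estimates and the variance-covariance matrix are made-up numbers, and the helper names `cholesky` and `draw_coefficients` are hypothetical. The draw relies on the standard result that, if z is a vector of independent standard normal variates and L is the Cholesky factor of the covariance matrix, then mean + Lz is multivariate normal with the desired mean and covariance.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L such that L times its transpose equals matrix.

    The input must be symmetric and positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_coefficients(point_estimates, covariance, rng):
    """One draw from a multivariate normal centred on the point estimates."""
    lower = cholesky(covariance)
    z = [rng.gauss(0.0, 1.0) for _ in point_estimates]
    return [
        mean + sum(lower[i][k] * z[k] for k in range(i + 1))
        for i, mean in enumerate(point_estimates)
    ]


# Hypothetical estimation output for one equation with two coefficients.
beta_hat = [0.5, -1.2]
vcov = [[0.04, 0.01],
        [0.01, 0.09]]

rng = random.Random(42)
# Drawn once, before the simulation starts, and then held fixed for the run.
beta_draw = draw_coefficients(beta_hat, vcov, rng)
print(beta_draw)
```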
JAS-mine allows for a simple implementation of the "brute-force" approach, by exploiting the bootstrapping feature of its Regression library within a multi-run implementation: the simulation is run many times (e.g. 1,000 times), each using a different set of coefficients. The result is a distribution of model outcomes, around the central projections obtained with the estimated coefficients.
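The logic of such a multi-run experiment can be sketched as follows. The sketch is plain Python and does not use the JAS-mine API: the logit equation, the coefficient values, the standard errors and the function names `run_simulation` and `multi_run` are all hypothetical. For brevity it also assumes that the coefficients are uncorrelated, so each one is drawn independently from a normal distribution; a full implementation would draw them jointly from the estimated variance-covariance matrix.

```python
import math
import random
import statistics


def run_simulation(coefs, ages, rng):
    """Toy model: share of agents experiencing an event under a logit equation."""
    b0, b1 = coefs
    events = 0
    for age in ages:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * age)))
        events += rng.random() < p
    return events / len(ages)


def multi_run(point_estimates, std_errors, ages, n_runs, seed=0):
    """Run the toy model n_runs times, each with a fresh coefficient draw."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_runs):
        # Assumption: independent coefficients (diagonal covariance matrix).
        coefs = [rng.gauss(m, s) for m, s in zip(point_estimates, std_errors)]
        outcomes.append(run_simulation(coefs, ages, rng))
    return outcomes


# Hypothetical estimates and a synthetic initial population.
beta_hat = [-2.0, 0.04]
std_err = [0.2, 0.004]
population_rng = random.Random(1)
ages = [population_rng.uniform(20, 65) for _ in range(2000)]

# Central projection: a single run with the point estimates.
central = run_simulation(beta_hat, ages, random.Random(2))

# Distribution of outcomes across runs with bootstrapped coefficients.
outcomes = multi_run(beta_hat, std_err, ages, n_runs=200)
cuts = statistics.quantiles(outcomes, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
lower, upper = cuts[0], cuts[-1]
print(f"central: {central:.3f}, 95% band: [{lower:.3f}, {upper:.3f}]")
```

The band is taken directly from the empirical percentiles of the simulated outcomes, so no distributional assumption is imposed on the results; the price is the large number of runs.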