

VALIDATION OF EXPLORATORY MODELING
Steve Bankes and James Gillogly
ABSTRACT Exploratory modeling uses computational experiments with computer models to inform questions of interest. Exploratory modeling is primarily useful for situations where insufficient information exists to build a veridical model of the system of interest. The problem of how to cleverly select the finite sample of models and cases to examine from the infinite set of possibilities is the central problem of exploratory modeling methodology. Thus, in exploratory modeling, rather than validate models, one must validate research strategies. This validation centers on three aspects of the analysis: the specification of an ensemble of models that will be the basis for exploration, the strategy for sampling from this ensemble, and the logic used to connect experimental results to study conclusions. The ability of researchers to achieve useful results in this paradigm can be greatly enhanced by the syntactic definition of compound computational experiments, which results in the automatic generation of large numbers of individual experiments, and by tools supporting the visualization of these results.
EXPLORATORY MODELING Exploratory modeling uses computational experiments with computer models to inform questions of interest. Computational experiments produce new information just as experiments in chemistry, physics, or biology do. Unlike physical experiments, however, the results of computational experiments do not yield information about the external world, but rather information about the hidden world of mathematics. The abstract question "what will happen if algorithm A is executed on dataset B?" could in principle be addressed through deductive proof. Often, however, it is more practical simply to perform the calculation and see. As it is quite easy to create computer programs whose behavior is unexpected, computational experiments are a rich source of new information. For example, if we wish to know whether the trajectory of a system of differential equations is chaotic, simulating this trajectory numerically may be the first step. As the power of computers increases, it is becoming feasible to calculate mathematical facts that the unaided human mind might deduce only with great difficulty, if at all. Thus, one may now speak without hyperbole of experimental mathematics.
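As an illustration of such a computational experiment, one can numerically test for sensitive dependence on initial conditions, a hallmark of chaos, rather than attempt a deductive proof. The specific map, parameter, perturbation, and threshold below are our illustrative choices, not drawn from any particular study:

```python
# Probe sensitive dependence on initial conditions (a hallmark of chaos)
# for the logistic map x -> r*x*(1-x). The map, parameter r, perturbation
# size, and divergence threshold are all illustrative choices.

def logistic_trajectory(x0, r=4.0, steps=50):
    """Iterate the logistic map from x0 and return the full trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def diverges(x0, eps=1e-8, threshold=0.5, steps=50):
    """Do two trajectories started eps apart eventually separate widely?"""
    a = logistic_trajectory(x0, steps=steps)
    b = logistic_trajectory(x0 + eps, steps=steps)
    return max(abs(u - v) for u, v in zip(a, b)) > threshold

print(diverges(0.2))   # at r = 4 the logistic map is chaotic: prints True
```

Here the perturbation of one part in a hundred million is amplified at every iteration until the two trajectories become completely uncorrelated, a fact far easier to compute than to prove.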
A model that has been experimentally validated to predict system behavior to within some bounded error can be used as a surrogate for the system of interest. The methodology of building such a model is to consolidate all relevant data and theory into a single model that will be used to reason about the system once it has been validated by comparison with physical experiments. The uncertainty of the resulting model outputs is bounded, and can be estimated by sensitivity analysis.
Unfortunately, for many systems of interest, the construction of models that may be used as surrogates for the system of interest is simply not a possibility. This may be due to a variety of factors including the infeasibility of critical experiments, immaturity of theory, or the nonlinearity of system behavior, but is fundamentally a matter of not knowing enough to make predictions. For such systems, a methodology based on consolidating all known information into a single model and using it to make best estimate predictions can be highly misleading.
In the process of constructing a computer model of such a system, some number of "guesses" must be made. Running such a model is a computational experiment that reveals how the system would behave if those guesses were correct. Such computational experiments, whose outputs cannot be regarded as predictions, can be used to examine ranges of possible outcomes, to suggest hypotheses to explain puzzling data, to discover significant phases, classes, or thresholds among the ensemble of plausible models, or to support reasoning based upon an analysis of risks, opportunities, or scenarios (Bankes 1993).
While the use of computational experiments is becoming increasingly popular, appreciation of the significant methodological differences between consolidative/predictive modeling strategies and exploratory modeling is not widespread. This creates problems of two kinds. First, the results of an "interesting" computational experiment are on occasion used as though they were the outputs of a validated predictive model, leading to confusion, self-deception, and potentially intellectual fraud. Second, the unexamined adoption of consolidative modeling methodologies and associated software tools presents a barrier to the facile use of computational experiments to support reasoning.
Exploratory modeling is primarily useful for situations where insufficient information exists to build a veridical model of the system of interest. In such situations, the normal modeling approach of building a best estimate model based on existing facts, and using that model to learn new properties of the system, fails because the "best estimate" model does not correctly capture important features of the target system. In such a situation, it is still possible to reason over the information that is known. Such incomplete information can imply properties true of all models consistent with it. Consequently, insights may be gained through examining the results of structured constellations of computational experiments that cannot be obtained by examining one "ideal" experiment. This situation is entirely analogous to that faced by laboratory experimenters, where series of experiments are often required to inform an investigation. Our cultural prejudice toward viewing models as valid images of reality has concealed this analogy for some time. Nonetheless, successful examples of the use of computational experiments to support reasoning reveal the necessity of examining large numbers of cases.
In effect, data about a system of interest serves to constrain the ensemble of plausible models, and our uncertainty corresponds to the size of that ensemble. When sufficient information exists to strongly constrain the ensemble of plausible models, reasoning can be supported by examining the properties of one exemplary case. As computer power becomes more abundant, we have a new possibility of reasoning about systems in which the ensemble of plausible models is less well constrained. Such reasoning must involve examining large numbers of examples drawn from this ensemble. The methodological question of how to select a limited sample from a large or infinite number of modeling experiments is only now beginning to be addressed.
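The idea that data constrains the ensemble of plausible models can be sketched as follows. The toy growth model, the observed value, and the tolerance are all hypothetical numbers chosen for illustration:

```python
import random

# Sketch: data constrains an ensemble of plausible models. Each "model"
# here is a guessed growth rate; only guesses that reproduce a
# (hypothetical) observed measurement within tolerance stay in the
# ensemble. All numbers are invented for illustration.

random.seed(0)

def model_output(rate, t=10):
    """Toy exponential-growth model: population after t steps."""
    return 100 * (1 + rate) ** t

observed = 200          # hypothetical measurement at t = 10
tolerance = 20          # hypothetical measurement error bound

candidates = [random.uniform(0.0, 0.2) for _ in range(1000)]
plausible = [r for r in candidates
             if abs(model_output(r) - observed) <= tolerance]

# Our uncertainty corresponds to the spread of the surviving ensemble.
print(len(plausible), min(plausible), max(plausible))
```

Stronger data (a tighter tolerance) shrinks the surviving ensemble toward a single exemplary case; weaker data leaves a large ensemble over which reasoning must range.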
Potential applications for exploratory modeling are easy to identify in the decision sciences, as the need to reason with incomplete information in the presence of uncertainty is quite real for human decision makers. Techniques for reasoning with incomplete information can be augmented by exploratory modeling using computational experiments. Risk-averse analyses based upon compiling lists of disaster scenarios can be supported by a search for failure modes among the ensemble of plausible models. The use of decision trees to reason over a universe of cases can be supported by a search through an ensemble of models to discover the main phases or regions of model space that have salient qualitative differences in outcomes.
The problem of how to cleverly select the finite sample of models and cases to examine from the infinite set of possibilities is the central problem of exploratory modeling methodology. A wide range of research strategies are possible, including sampling based on researcher intuition, structured case generation by Monte Carlo or factorial experimental design methods, search for extremal points of cost functions, or more complex sampling methods that combine human insight with the search for regions of "model space" with qualitatively different behavior. Exploration can be over both real valued parameters and nonparametric uncertainty such as that between different graph structures, functions, or problem formulations.
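Two of these sampling strategies, full factorial design and Monte Carlo sampling, can be sketched for a hypothetical two-parameter model space. The parameter names, levels, and ranges are illustrative assumptions:

```python
import itertools
import random

# Two sampling strategies over a hypothetical two-parameter model space.
# Parameter names, levels, and ranges are invented for illustration.

param_levels = {
    "growth_rate": [0.01, 0.05, 0.10],
    "decay_rate": [0.2, 0.5],
}

# Full factorial design: every combination of the chosen levels.
factorial = [dict(zip(param_levels, combo))
             for combo in itertools.product(*param_levels.values())]

# Monte Carlo design: independent random draws from the parameter ranges.
random.seed(1)
monte_carlo = [{"growth_rate": random.uniform(0.01, 0.10),
                "decay_rate": random.uniform(0.2, 0.5)}
               for _ in range(6)]

print(len(factorial))   # 3 levels x 2 levels = 6 cases
```

Factorial designs guarantee systematic coverage of chosen levels but grow combinatorially with the number of parameters; Monte Carlo sampling trades that guarantee for a fixed experiment budget in high-dimensional spaces.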
VALIDATION AND EXPLORATORY MODELING Historically, a modeling exercise is viewed as successful once a model has been validated by predicting the results of experiments. Clearly, this means of quality assurance is not available for exploratory modeling, since by definition this sort of validation is not a possibility. Instead, quality assurance in exploratory modeling can be thought of as having two components: 1) that the experiments are "well crafted" (this is often called verification), and 2) that the strategy of sampling from the ensemble of models is consistent with the use to which the results of exploration are put. Thus, in exploratory modeling, rather than validate models, one must validate research strategies. This corresponds to the reality of laboratory research, where individual experiments must be correctly executed, but beyond that the critical judgment is whether the experiments performed actually support the conclusions drawn by the experimenter.
In order for an "exploratory" model to be helpful, that is to say, in order for its use to be valid, justification for using the model must be provided outside of that model. There must be a research strategy that somehow compensates for the presence of the guesses, and that provides a context in which the outputs of the model are informative in spite of the fact that it does not predict system behavior. There are a variety of ways that such a research strategy can be constructed. Examples include determining optimal or average outcomes over a distribution of "guesses"; surveying a range of guesses, compiling a finite list of qualitatively different behaviors or outcomes, and reasoning over these to suggest policy; looking for worst or best cases; and a fortiori arguments.
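Two of these strategies, averaging outcomes over a distribution of guesses and looking for the worst case, can be sketched with a toy model. The model, its parameters, and the distribution of guesses are all invented for illustration:

```python
import random

# Sketch of two research strategies: average the outcome over a
# distribution of guesses, and find the worst case among them. The model,
# the policy level, and the guess distribution are invented for
# illustration.

random.seed(2)

def outcome(policy_level, guessed_elasticity):
    """Toy model: net benefit of a policy given an unknown elasticity."""
    return policy_level * guessed_elasticity - 0.1 * policy_level ** 2

guesses = [random.uniform(0.5, 1.5) for _ in range(500)]

policy = 3.0
average_case = sum(outcome(policy, g) for g in guesses) / len(guesses)
worst_case = min(outcome(policy, g) for g in guesses)

print(round(average_case, 2), round(worst_case, 2))
```

A policy whose worst case over the whole ensemble of guesses remains acceptable supports an a fortiori argument: no additional knowledge about the true elasticity could overturn the conclusion.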
Note that even for models that are used to predict system behavior, there is always some residual difference between model prediction and measured outcomes in validating experiments. How small this residual difference must be for the model to count as "validated" depends on the way the model is to be used. Further, for any model there are limitations to the range of uses for which the model is validated. Thus, no model is perfectly validated, and model validation must be assessed in the context of its intended use. Models and uses must be validated in tandem. We have become accustomed to thinking about "model validation" only because, for engineering models, there is an understood use: that of making best estimate predictions of target system performance.
For exploratory modeling, validation centers on three aspects of the analysis: the specification of an ensemble of models that will be the basis for exploration, the strategy for sampling from this ensemble, and the logic used to connect experimental results to study conclusions. The specification of an ensemble of models of interest can be questioned in terms of the reasonability of the assumptions this specification embodies, with the biggest errors being those that unfairly exclude models from consideration. The strategy for sampling from this ensemble must clearly be rationalized in terms of the logic of the analysis if the results of computational experiments are to be useful. Thus, while individual models are subject to verification, it is studies that must be validated. To speak of the validation of models in an exploratory modeling context is incorrect and misleading. This situation is akin to that of actual experimental science, where the experimental equipment must be thoroughly checked to ensure that experimental results are as measurement shows them to be (verification), but the validity of the research conclusions is based on a logic that connects them to the experiments that were actually conducted.
XPLORE: AN ENVIRONMENT FOR EXPLORATORY MODELING We are currently developing a prototype software environment called XPLORE that better supports the use of exploratory modeling. In this environment, it is possible to syntactically define the ensemble of models of interest, and to create algorithms for automatically generating computational experiments drawn from that ensemble. This allows the user to specify compound computational experiments that result in the automatic generation of large numbers of individual computational experiments. The outcomes of computational experiments, whether atomic or compound, can be displayed textually. Typically, however, graphical display is required by the volume of data produced, and is desirable because appropriate visualizations are far better at producing insight. A large number of standard graphical tools can be applied to visualize the results of computational experiments by mapping the axes of information in a visualization to fields in a results database. Hand-crafted visualizations may be produced for particular applications. The exploratory modeling environment provides for the application of standard graphical displays such as those available through scientific visualization packages or statistical modeling systems (Tierney 1990). We have also been developing customized visualizations significant for policy studies. Figures 1 and 2 display visualizations produced for two applications that have been supported by prototype versions of the environment, one a study of drug policy and the other a study of global warming abatement policy.
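Since the internal design of XPLORE is not detailed here, the following is only a generic sketch of the underlying idea: a compound experiment, defined syntactically as a set of parameter ranges, is expanded automatically into individual runs whose outcomes populate a results database for later visualization. The model and parameters are invented for illustration:

```python
import itertools

# Generic sketch of a compound computational experiment: a syntactic
# specification (parameter names mapped to value ranges) is expanded
# automatically into individual runs. The toy model and parameter values
# are invented for illustration.

def toy_model(a, b):
    """Stand-in for one individual computational experiment."""
    return a * b - b ** 2

compound_spec = {"a": [1, 2, 3], "b": [0.5, 1.0]}

results = []
for combo in itertools.product(*compound_spec.values()):
    params = dict(zip(compound_spec, combo))
    results.append({**params, "outcome": toy_model(**params)})

# Each row of the results table can be mapped to a point in a
# visualization (display axes <- table fields).
print(len(results))   # 6 individual experiments from one compound spec
```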
Figure 1 is a screen dump of a visualization that displays a 2D slice through an N-dimensional space, with the policy outcome color-coded on the screen at the appropriate points. In this case, the color at each point in parameter space corresponds to the identity of the notional drug-control program with the greatest marginal cost-effectiveness, given the assumptions corresponding to this point. This display shows the results of a compound experiment in which five parameters were systematically varied. For this display, the user selects any two of these parameters to be used as axes; the other parameters can be manipulated with sliders. With this display, the user can explore the N-dimensional parameter space by manipulating the slider combinations.
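The slice computation behind such a display can be sketched as follows. The three notional programs and their scoring functions are invented for illustration, not taken from the drug policy study:

```python
# Sketch of the 2D-slice computation behind such a display: fix all but
# two parameters at "slider" values, then for each grid cell record which
# of several notional programs scores best. Programs and scores are
# invented for illustration.

def program_scores(x, y, z_fixed):
    """Toy marginal cost-effectiveness of three notional programs."""
    return {"treatment": x + z_fixed,
            "enforcement": y,
            "prevention": 0.5 * (x + y)}

def best_program(x, y, z_fixed):
    """Name of the highest-scoring program at this point."""
    scores = program_scores(x, y, z_fixed)
    return max(scores, key=scores.get)

def slice_2d(xs, ys, z_fixed):
    """Rows indexed by y, columns by x: best program per grid cell."""
    return [[best_program(x, y, z_fixed) for x in xs] for y in ys]

grid = slice_2d([0.0, 1.0], [0.0, 1.0], z_fixed=0.2)
print(grid)
```

Moving a slider corresponds to recomputing the grid with a new fixed value, so the boundaries between colored regions shift as assumptions change.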
Figure 2 is a view of a similar graphical tool utilizing a 3D surface display. This graphic was produced as part of a study of global warming abatement policy. It displays surfaces of Type I and Type II error rates against a variety of assumptions about both the climate system and the algorithm that decides on policy in response to new climate data. By manipulating sliders, the user can vary assumptions and examine the impacts on the rates of both types of errors.
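The error-rate computation behind such surfaces can be sketched by simulation. The trigger rule, noise level, and threshold below are illustrative assumptions, not taken from the global warming study:

```python
import random

# Sketch of the error-rate computation behind such surfaces: a notional
# rule declares "warming" when a noisy trend estimate exceeds a threshold;
# Type I and Type II rates are estimated by simulation under assumed true
# trends. All numbers are invented for illustration.

random.seed(3)

def trigger_rate(true_trend, noise_sd, threshold, n=2000):
    """Fraction of simulated trend estimates exceeding the threshold."""
    hits = sum(true_trend + random.gauss(0, noise_sd) > threshold
               for _ in range(n))
    return hits / n

threshold, noise_sd = 0.1, 0.05
# Type I: false alarm when the true trend is zero.
type_i = trigger_rate(true_trend=0.0, noise_sd=noise_sd, threshold=threshold)
# Type II: miss when a real trend of 0.2 is present.
type_ii = 1 - trigger_rate(true_trend=0.2, noise_sd=noise_sd,
                           threshold=threshold)

print(type_i, type_ii)
```

Sweeping the threshold and noise assumptions over ranges, as the sliders do, traces out the trade-off surfaces between the two error types.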
By providing users with the ability to specify compound computational experiments, and to graphically view the outcomes of these compound experiments, the XPLORE environment will provide a technology base that will allow users to shift their focus from developing and validating single models, to developing and exploring ensembles of modeling experiments, and validating reasoning based upon them.
References
[1] Bankes, S. 1993. "Exploratory Modeling for Policy Analysis." Operations Research 41 (3): 435-449.
[2] Tierney, L. 1990. LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics. New York: John Wiley & Sons.




