

COMPUTATIONAL EXPERIMENTS AND EXPLORATORY MODELING*
Steven Bankes
We are at the beginning of an era of unprecedented computational resources, in comparison with which all of history up to the present must be regarded as an era of extreme computational poverty. This reality means that those parts of our scientific heritage that were motivated by computational poverty, no matter how brilliant, are ceasing to be relevant to the solution of the problems of the next century. In an era where computation is essentially free, it will often be foolish to spend effort deciding which single computation to perform. Instead, one should perform many computations and let them guide one's thinking to the issues that are truly crucial.
One effect computational poverty has had on our scientific heritage has been to motivate a fascination with "best" models. In an era when calculation was difficult and expensive, most successful science involved the discovery of single useful models for a system of interest. Consequently, techniques for discovering the "best" model for a system make up a major part of our intellectual heritage. These classical approaches to modeling are data- and knowledge-driven: build a model consolidating all that is known, with the goal of predicting the outcomes of experiments. Such "consolidative" methodologies can produce important science, but provide less help for reasoning in situations where only partial information is available. When too little is known to build a predictive model, or when validating experiments are not possible, the "best" model often provides little value. However, given adequate computational resources, alternative exploratory methodologies involving multiple plausible models can be useful for reasoning from partial information.
One pioneering study demonstrating the potential of reasoning using large numbers of alternative models is that of Suguru Araki at the Jet Propulsion Laboratory, who constructed a parameterized simulation model of the dynamics of the rings of Saturn. While much has been learned about the nature of the rings, there is still a great deal we do not know. Araki's model represents attributes about which there is significant uncertainty, such as the sizes and mechanical properties of the particles that make up the rings. Consequently, his model cannot predict the outcomes of future measurements. Indeed, his model can be thought of as a "model schema" that requires that guesses be made about all such uncertain attributes in order to create a particular model. One such model run is in effect a computational experiment that reveals how the rings should behave if those guesses were true. By exploring a large number of such computational experiments, one can discover which combinations of guesses are consistent with known information. In the case of Araki's model, a geometrically complex region of the parameter space describing the "guesses" turned out to produce unstable rings. This region can be visualized graphically, and as a consequence new information can be derived. This new information was inherent in the laws of physics and the observation that the rings are stable and made up of particles, but was not explicitly known until an extensive set of computational experiments allowed the pattern to be seen.
Araki's approach, comparing the results of alternative simulation models to available data about the rings of Saturn, is an example of exploratory data modeling. The goal of Araki's exploration was not the model which best fit the data, but rather a better understanding of how the available data constrains the plausibility of alternative models.
While Araki's research was based on simulation, exploratory data modeling can be pursued using any model formalism. For a given model formalism, there is an implied set or ensemble of individual models that could be used, resulting in a computational experiment that might reveal something about the data. Depending on the formalism, these computational experiments may or may not involve fitting parameters. For example, if I decide to model a dataset using linear or loglinear regression equations, I have in effect selected an ensemble of such equations that might be tried on the data during the process of exploratory data modeling.
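The enumeration of such an ensemble is easy to make concrete. The sketch below (in Python, with made-up data and variable names) treats every nonempty subset of three candidate predictors as one member of an ensemble of regression equations, fits each member, and records its residual error: seven computational experiments generated mechanically from a single modeling choice.

```python
# Exploratory data modeling: fit every member of an ensemble of
# regression specifications, rather than hunting for one "best" model.
# The data here are synthetic, purely for illustration.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))                # three candidate predictors
y = 2.0 * X[:, 0] + rng.normal(size=n)     # true signal uses only the first

def fit(Xs, y):
    """Least-squares fit with intercept; returns mean squared residual."""
    A = np.column_stack([np.ones(len(y)), Xs])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ coef) ** 2))

# The ensemble: every nonempty subset of predictors defines one model.
ensemble = [combo
            for k in range(1, 4)
            for combo in itertools.combinations(range(3), k)]

results = {combo: fit(X[:, combo], y) for combo in ensemble}
for combo, mse in sorted(results.items(), key=lambda kv: kv[1]):
    print(combo, round(mse, 3))
```

The output is not one fitted equation but a map of how residual error varies across the whole ensemble, which is the raw material for the exploratory questions discussed below.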
Having specified an ensemble of possible modeling experiments, any specific modeling experiment can be thought of as a sample from this ensemble. The methodological questions for exploratory modeling then become: 1) what is the goal of exploration, and 2) what strategy for sampling from the ensemble of possible computational experiments can serve this goal? In the current practice of data modeling, the goal of exploration is generally assumed to be the identification of the model that best fits the data. There may be circumstances in which the emphasis on "best estimate", "minimal error", "maximal likelihood", or some other criterion for picking a single model is important. However, in an era where it may be practical to perform a billion regressions on a data set, it is not at all clear that the preoccupation with a single modeling experiment isn't a consequence of our heritage of computational poverty. For many purposes, the goal of exploration can pragmatically be a collection of models that are salient in the context of the research question.
The fixation on discovering best models has limited investigation of the interesting problem of how to sample from an ensemble of models or computational experiments. In typical data modeling exercises, the researcher conducts a "specification search" using a variety of informal heuristics and guesswork, looking for a model with a residual error small enough to justify ending the search. This protocol can be rewarding but also has drawbacks. The search for a "best" model may not result in a sample from the ensemble of models that brings the best information to bear on the question the researcher is ultimately trying to answer. The often informal heuristics used in the search may be inappropriate for the goals of the analysis. And human-mediated iterative model revision may not be the most efficacious procedure for generating the computational experiments that are selected.
There has been some interesting work on automating specification search via expert system technology. For many problems, however, specification search may not be the proper approach to sampling. The strategy for sampling from an ensemble of computational experiments must be driven by the question being asked. My RAND colleagues and I have been experimenting with approaches to question-driven exploratory modeling. Policy decisions must often be made about systems that are profoundly uncertain. An honest sensitivity analysis of any model of such a system can often result in the conclusion that all outcomes are possible. This answer is not very helpful to policy makers. As an alternative, we have been experimenting with oversampling from ensembles of models in the vicinity of thresholds where the final policy recommendation would come out differently. More generally, we are interested in using the logical structure of the argument by which a decision will be made to indicate the critical computational experiments to perform. Thus, what is called for is not the best-estimate model, but the models whose results give the best leverage in reaching a decision.
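A toy illustration of such question-driven oversampling: the Python sketch below uses an invented one-parameter policy model, sweeps the uncertain parameter coarsely, and then concentrates further computational experiments inside the interval where the recommendation flips, locating the decision threshold far more precisely than a uniform sample of the same size would.

```python
# Question-driven sampling: oversample near the threshold where the
# policy recommendation changes. The "policy model" here is a made-up
# stand-in, not any particular RAND model.
import numpy as np

def recommend(growth):
    """Toy policy model: invest iff projected payoff is positive."""
    payoff = 3.0 * growth - 1.0        # recommendation flips at growth = 1/3
    return payoff > 0

# Phase 1: coarse sweep across the uncertain parameter.
coarse = np.linspace(0.0, 1.0, 21)
flips = [(a, b) for a, b in zip(coarse, coarse[1:])
         if recommend(a) != recommend(b)]

# Phase 2: oversample inside each interval where the recommendation flips.
fine = np.concatenate([np.linspace(a, b, 50) for a, b in flips])
boundary = fine[np.argmax([recommend(g) for g in fine])]  # first g recommended
print(flips, round(float(boundary), 3))
```

The computational budget goes where the decision is actually sensitive, rather than being spread evenly over regions where the recommendation is already settled.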
As a simple example of another protocol, one might establish a threshold goodness of fit, such that any model performing that well or better would be considered "interesting". It then becomes reasonable to ask for the distribution of models that perform within that threshold. The strategy for sampling from an ensemble of models in order to assess the distribution of "interesting" models must clearly differ from that employed to seek a "best" model. If we can establish a topology on the ensemble of models such that nearby models have similar residual errors, then the "interesting" models will tend to lie in contiguous regions. In that case, one might choose to oversample in the vicinity of region boundaries in order to establish their location as accurately as possible given computational limits. Even where such a topology cannot be established, a diffuse Monte Carlo sample to estimate the range of models that satisfy the threshold criterion might in some situations be preferred to the discovery of a single model with small residual error.
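The threshold protocol can be sketched directly. In the illustrative Python below (the one-parameter model family, noise level, and threshold are all invented for the example), a diffuse Monte Carlo sample over candidate slopes yields not a single best fit but the whole region of models that fit "well enough".

```python
# Threshold protocol: keep every sampled model whose residual error
# falls below a plausibility threshold. Data and threshold are synthetic.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
y = 1.5 * x + rng.normal(scale=0.1, size=x.size)   # noisy data, true slope 1.5

THRESHOLD = 0.02    # any model with MSE below this is "interesting"

slopes = rng.uniform(0.0, 3.0, size=10_000)        # diffuse sample of models
mse = np.array([np.mean((y - s * x) ** 2) for s in slopes])
interesting = slopes[mse < THRESHOLD]

# The result is a *set* of plausible models, not a point estimate.
print(len(interesting), interesting.min(), interesting.max())
```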
Suppose, for example, we are analyzing financial time series in order to make wise investment decisions. "Betting the farm" on the predictions of a model that best fit the series seems an imprudent choice. Instead, the inherent noise in the data suggests a threshold goodness of fit such that any model fitting this well might be deemed plausible. A desirable sampling strategy might be one that generated the greatest diversity of alternative plausible futures. These could then be used to construct a mixed investment strategy that maximized safety as well as gain.
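One plausible way to operationalize "greatest diversity" is greedy farthest-point selection over the models' forecast trajectories, as in the sketch below; the forecasts are synthetic stand-ins for the outputs of models that all cleared the plausibility threshold.

```python
# Diversity sampling: from a pool of plausible models, pick a small
# subset whose forecasts are maximally spread out (greedy farthest-point
# selection). The forecast matrix is random, purely for illustration.
import numpy as np

rng = np.random.default_rng(2)
forecasts = rng.normal(size=(100, 12))  # 100 plausible models x 12 periods

def diverse_subset(F, k):
    """Greedily pick k rows of F that are far apart in Euclidean distance."""
    chosen = [0]
    while len(chosen) < k:
        # Distance from every model to its nearest already-chosen model.
        d = np.min(np.linalg.norm(F[:, None, :] - F[None, chosen, :], axis=2),
                   axis=1)
        chosen.append(int(np.argmax(d)))
    return chosen

picks = diverse_subset(forecasts, 5)
print(picks)
```

The five selected trajectories span the range of plausible futures, which is what a mixed investment strategy needs to hedge against.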
Once an ensemble of models is syntactically represented and a search or sampling strategy encoded as an algorithm, individual computational experiments can be generated by the computer. Depending on the complexity of the individual experiments, thousands, millions, or even billions of computational experiments could be generated at the push of a button. Further, when multiple processors are available, there is no reason that computational experiments need to be done one at a time. Combined with visualization tools that allow one to skim the results, this creates the possibility of compound computational experiments. Such compound experiments ask the question, "What do I find if I model this data, with that formalism, using that model sampling strategy?"
Computer-driven specification search is already becoming a reality. For example, there is a small industry of researchers applying genetic algorithms to search across ensembles of regression equations, neural networks, or other model formalisms, in search of order in financial market data. That this sort of activity is at present often uninformed by statistical theory does not change its potential impact on the way computer-based analysis is conducted. As computer power continues to grow, we may find that the level of abstraction at which humans most effectively interact with the machines will grow as well. By compounding computational experimentation to the limits of available computing resources, the machines can be kept busy, freeing the user to ask ever more global questions.
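A toy version of such a search is easy to write down. The sketch below (invented data; a deliberately minimal genetic algorithm, not any particular commercial system) encodes each regression specification as a bit-mask over candidate predictors and evolves a population of specifications toward lower residual error.

```python
# A minimal genetic algorithm over an ensemble of regression
# specifications: each genome is a bit-mask saying which predictors
# enter the equation; fitness is (negative) in-sample residual error.
import numpy as np

rng = np.random.default_rng(3)
n, p = 300, 8
X = rng.normal(size=(n, p))
y = X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=n)  # true cols: 1, 4

def mse(mask):
    """Residual error of the least-squares fit using the masked columns."""
    cols = np.flatnonzero(mask)
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ coef) ** 2))

pop = rng.integers(0, 2, size=(20, p))           # random initial specifications
for _ in range(30):
    scores = np.array([mse(m) for m in pop])
    parents = pop[np.argsort(scores)[:10]]       # keep the better half
    kids = parents[rng.integers(0, 10, size=10)].copy()
    flip = rng.random(kids.shape) < 0.1          # point mutation
    kids[flip] ^= 1
    pop = np.vstack([parents, kids])

best = pop[int(np.argmin([mse(m) for m in pop]))]
print(sorted(np.flatnonzero(best).tolist()))
```

Note that fitness here is in-sample error only; as the next paragraph argues, interpreting the "significance" of whatever such a search finds is exactly where the statistical difficulty lies.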
The deep challenges of statistical theory will not be eliminated by computational plenty, but rather transformed. For example, problems associated with Occam's Razor have generally been described with regard to a particular model fitted to data. In an era of cheap computational experiments, this issue clearly rests in the selection of search or sampling strategies and in the proper interpretation of the results of an exploratory analysis. Standard tests for significance (including cross-validation tests) are typically applied to the performance of single models against the data, and do not take into account the specification search that resulted in this particular model being tested. This was of little consequence when specification search was human-mediated and only a handful of models could be tested. Once specification search becomes computer-mediated, billions of such tests may be conducted, and significance tests applied to individual models can become misleading. Their recalibration to take into account the process of model specification is an important research topic.
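The danger is easy to demonstrate. In the sketch below (sizes chosen for illustration), candidate predictors are tested against pure noise; the best apparent correlation climbs as more specifications are tried, even though there is nothing to find — which is why a per-model significance test is misleading after a large automated search.

```python
# Why significance tests need recalibration after specification search:
# the best of many models fitted to pure noise looks "significant" by a
# naive single-model criterion.
import numpy as np

rng = np.random.default_rng(4)
n = 100
y = rng.normal(size=n)                   # pure noise: nothing to find

best_r = []
for n_models in (1, 100, 10_000):
    X = rng.normal(size=(n, n_models))   # n_models candidate specifications
    # Apparent correlation of each candidate column with y.
    r = np.abs(X.T @ y) / (np.linalg.norm(X, axis=0) * np.linalg.norm(y))
    best_r.append(float(r.max()))        # best-looking model in the batch

print(best_r)   # the best apparent correlation grows with models tried
```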
Statistics arose from a pragmatic need to understand the implications of data. Computation is radically changing the practicalities of modeling, and with it the relative importance of different directions for statistical research. In an era of computational experiments, improved protocols for using them to solve real problems will prove to be of central importance.
Further Reading
J. Adams, 1990, "Evaluating Regression Strategies", Proceedings of the Statistical Computing Section, American Statistical Association Meeting, August 1990.
S. Araki, 1991, "Dynamics of Planetary Rings", American Scientist, vol. 79, pp. 44-59.
S. Bankes, 1993, "Exploratory Modeling for Policy Analysis", Operations Research, vol. 41, no. 3, pp. 435-449.
D. Campbell, J. Crutchfield, D. Farmer, and E. Jen, 1985, "Experimental Mathematics: The Role of Computation in Nonlinear Science", Communications of the ACM, vol. 28, pp. 374-384.
E. Leamer, 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, John Wiley & Sons, New York.