
Goals and use

The goal of the Checklist for Model Quality Assistance is to assist in the quality control process for environmental modelling. The point of the checklist is not that a model can be classified as 'good' or 'bad', but that there are 'better' and 'worse' forms of modelling practice. The idea behind the checklist is that one should guard against poor practice because it is much more likely to produce poor or inappropriate model results. Further, model results are not 'good' or 'bad' in general (it is impossible to 'validate' a model in practice), but are 'more' or 'less' useful when applied to a particular problem. The checklist is thus intended to help guard against poor practice and to focus modelling on the utility of results for a particular problem. That is, it should provide insurance against pitfalls in process and irrelevance in application.

Large, complex environmental models are challenging to develop and test. In response, considerable effort has gone into characterizing the uncertainties associated with these models and their projections. However, uncertainty estimates for models of such complexity are necessarily incomplete and provide only partial guidance on the quality of the results. The conventional way to ensure quality in modelling is to validate the model against observed outcomes. Unfortunately, the data are simply not available to carry out rigorous evaluations of many models (Risbey et al., 1996).

The lack of validation data is especially critical for complex models spanning human and natural systems, because such models require: socio-economic data that have frequently not been collected; data on the value dimensions of problems, which are hard to define and quantify; projections of technical change, which must often be guessed at; data on aggregate parameters such as energy efficiency, which are difficult to measure and collect for all the relevant economies; geophysical data at fine spatial and temporal scales worldwide, which are not generally available; data pertinent to non-marginal changes in socio-economic systems, which are difficult to collect; and experience and data pertaining to system changes of the kind simulated in the models, for which we have no precedent or access.

Without the ability to validate the models directly, other forms of quality assessment must be used. Unfortunately, there are few ready-made solutions for this purpose. For complex coupled models there are many pitfalls in the modelling process, and some form of rigour is all that remains to ensure quality: a modeller has to be a good craftsperson (Ravetz, 1971; 1999). Discipline is maintained by controlling the introduction of assumptions into the model and by maintaining good 'practice'. What is needed is a form of heuristic that encourages self-evaluative systematisation and reflexivity about pitfalls. The method of systematisation should not only indicate how the modellers are doing; it should also provide diagnostic help as to where problems may occur and why.

Risbey et al. (2001) developed a model quality assistance checklist for this purpose, and it is the checklist used in this project.

The philosophy underlying the checklist is that there is no single metric for assessing model performance and that, for most intents and purposes, there is no such thing as a 'correct' model, or at least no way to determine whether it is correct. Rather, models need to be assessed in relation to particular functions, and that assessment is ultimately about quality, where quality relates a process or product (in this case a model) to a given function. As noted above, the aim is to guard against poor practice and to focus modelling on the utility of results for a particular problem. The questions in the checklist are designed to uncover at least some of the more common pitfalls in modelling practice and in the application of model results in policy contexts. The output from the checklist is both indirect, via reflections prompted by the modeller's self-assessment, and direct, in the form of a set of potential pitfalls triggered on the basis of the modeller's responses.

The checklist is structured as follows. First, there is a set of questions to probe whether quality assistance is likely to be relevant to the intended application. If quality is not at stake, a checklist such as this one serves little purpose. The next section sets the context for use of the checklist by describing the model, the problem that it is addressing, and some of the issues at stake in the broader policy setting for this problem. The checklist then addresses 'internal' quality issues, that is, the processes for developing, testing, and running the model practised within the modelling group. A section on 'users' addresses the interface between the modelling group and outside users of the model. This section examines issues such as the match between the production of information from the model and the requirements of the users for that information. A section on 'use in policy' addresses issues that arise in translating model results to the broader policy domain, including the incorporation of different stakeholder groups into the discussion of these results. The final section provides an overall assessment of quality issues arising from use of the checklist.
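As a purely illustrative sketch, the section structure just described could be written down as a simple data structure. Only the section names and purposes come from the description above; the representation and the one-line summaries are assumptions made for illustration, not part of the published checklist.

```python
# Hypothetical encoding of the checklist's section structure.
# Section names and purposes follow the description above; everything
# else here is an assumption made for illustration.
CHECKLIST_SECTIONS = [
    ("screening", "Is quality assistance relevant to the intended application?"),
    ("context", "Describe the model, the problem addressed, and the policy setting."),
    ("internal quality", "Processes for developing, testing, and running the model."),
    ("users", "Match between model output and the information users require."),
    ("use in policy", "Translating model results to the broader policy domain."),
    ("overall assessment", "Summary of quality issues raised by the checklist."),
]
```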

The automated version of the checklist also contains an algorithm to produce a list of pitfalls based on the answers given.
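As a rough sketch of how such an algorithm might work, each pitfall can be expressed as a rule over the recorded answers. The question keys, answer values, and pitfall wording below are invented for illustration; the actual rule set of the automated checklist is not reproduced here.

```python
# Minimal sketch of answer-driven pitfall triggering. The rules, question
# keys, and pitfall wording are hypothetical examples only.
def triggered_pitfalls(answers: dict) -> list:
    """Return the pitfalls whose triggering condition matches the answers."""
    rules = [
        # (condition over answers, pitfall description)
        (lambda a: a.get("validation_data_available") == "no",
         "Model results cannot be checked against observations; "
         "conclusions may rest on untested assumptions."),
        (lambda a: a.get("user_requirements_documented") == "no",
         "Model output may not match the information users actually need."),
        (lambda a: a.get("stakeholders_consulted") == "no",
         "Policy translation may omit perspectives of affected stakeholder groups."),
    ]
    return [pitfall for condition, pitfall in rules if condition(answers)]

# Example use: answers collected from the checklist questions.
example = {"validation_data_available": "no", "stakeholders_consulted": "yes"}
print(triggered_pitfalls(example))
```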