Tool catalogue

1.	Sensitivity analysis

2.	Error propagation equation ("TIER 1")

3.	Monte Carlo analysis ("TIER 2")

5.	Expert elicitation for uncertainty quantification

6.	Scenario analysis

7.	PRIMA: A framework for perspective-based uncertainty management

8.	Checklist for model quality assistance

9.	A method for critical review of assumptions in model-based assessments

Goals and use

This method aims to systematically identify, prioritise and analyse the importance and strength of assumptions in the quantification of environmental indicators under various scenarios (such as in the Netherlands Environmental Outlook). These indicators are typically based on chains of soft-linked computer model calculations that start with scenarios for population and economic growth. The models in the chain vary in complexity, and the calculation chains behind indicators often involve many analysts from several disciplines. Many assumptions have to be made in combining research results in these calculation chains, especially since the output of one computer model often does not fit the input requirements of the next model (scales, aggregation levels). Assumptions are also frequently applied to simplify parts of the calculations. Assumptions can be made explicitly or implicitly.

Assumptions can to some degree be value laden. This method distinguishes 4 types of value-ladenness of assumptions: value-ladenness in a socio-political sense (e.g., assumptions may be coloured by political preferences of the analyst), in a disciplinary sense (e.g., assumptions are coloured by the discipline in which the analyst was educated), in an epistemic sense (e.g., assumptions are coloured by the approach that the analyst prefers) and in a practical sense (e.g., the analyst is forced to make simplifying assumptions due to time constraints).

The method can be applied by the analysts carrying out the environmental assessment. However, each analyst has limited knowledge and perspectives with regard to the assessment topic, and in consequence will have some 'blind spots'. Therefore, other analysts (peers) are preferably involved in the method as well. Stakeholders, with their specific views and knowledge, can also be involved. This can, for instance, be organised in the form of a workshop. The group of persons involved in the assumption analysis will be referred to as 'the participants'.

The Method

The method involves 7 steps:

ANALYSIS

1.	Identify explicit and implicit assumptions in the calculation chain
2.	Identify and prioritise key-assumptions in the chain
3.	Assess the potential value-ladenness of the key-assumptions
4.	Identify 'weak' links in the calculation chain
5.	Further analyse potential value-ladenness of the key-assumptions

REVISION

6.	Revise/extend assessment
- sensitivity analysis key assumptions
- diversification of assumptions
- different choices in chain

COMMUNICATION

7.	Communication
- key-assumptions
- alternatives and underpinning of choices regarding assumptions made
- implications in terms of robustness of results

All steps will be elaborated on below.

Step 1 - Identify explicit and implicit assumptions in the calculation chain

In the first step implicit and explicit assumptions in the calculation chain are identified by the analyst by systematic mapping and deconstruction of the calculation chain, based on document analysis, interviews and critical review. The resulting list of assumptions is then reviewed and completed in a workshop.

The aggregation level of the assumptions on the assumption list may vary. An assumption can refer to a specific detail in the chain ("The assumption that factor x remains constant"), as well as refer to a cluster of assumptions on a part of the chain ("Assumptions regarding sub-model x").

Step 2 - Identify and prioritise key-assumptions in the chain

In step 2 the participants identify the key-assumptions in the chain. The assumptions identified in step 1 are prioritised by taking into account the influence of the assumptions on the end results of the assessment. Ideally, this selection is based on a quantitative sensitivity analysis. Since such an analysis will often not be attainable, the participants are asked to estimate the influence of the assumptions on the outcomes of interest of the assessment. An expert elicitation technique can be used in which the experts bring forward their opinions and argumentation on whether an assumption has a high or low influence on the outcome. After this group discussion, each participant indicates a personal estimate of the magnitude of the influence. A group ranking is established by aggregating the individual scores.
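
The aggregation into a group ranking could be sketched as follows. This is a minimal illustration, not a prescribed procedure: the method does not specify the aggregation rule, so the use of the mean, the 0 to 10 influence scale (higher meaning more influence) and all names such as `group_ranking` and `A1` are assumptions made for the example.

```python
from statistics import mean


def group_ranking(estimates_by_participant):
    """Rank assumptions by mean estimated influence, highest first.

    estimates_by_participant maps participant -> {assumption: estimate}.
    Using the mean is an assumed aggregation rule, chosen for illustration.
    """
    assumptions = sorted(
        {a for est in estimates_by_participant.values() for a in est}
    )
    means = {
        a: mean(est[a] for est in estimates_by_participant.values() if a in est)
        for a in assumptions
    }
    # Descending mean; ties are broken alphabetically so the order is stable.
    return sorted(means.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical workshop estimates on an assumed 0 (low) to 10 (high) scale.
estimates = {
    "participant_1": {"A1": 8, "A2": 3, "A3": 5},
    "participant_2": {"A1": 6, "A2": 4, "A3": 7},
    "participant_3": {"A1": 7, "A2": 2, "A3": 6},
}
ranking = group_ranking(estimates)
for assumption, score in ranking:
    print(assumption, score)
```

Other aggregation rules (for example the median, or a sum of ranks) would fit the description equally well; the choice should be agreed with the participants beforehand.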

Step 3 - Assess the potential value-ladenness of the key-assumptions

To assess the potential value-ladenness of assumptions, a 'pedigree matrix' is used that contains criteria by which the potential value-ladenness of assumptions can be reviewed. The pedigree matrix is presented in Table 1 and will be discussed in detail later on.

For each key-assumption all pedigree criteria are scored by the participants. Here again, a group discussion takes place first, so that the participants can remedy each other's blind spots and exchange arguments.

The order in which the key-assumptions are discussed in the workshop is determined by the group ranking established in step 2 of the method, starting with the assumption with the highest rank.

Step 4 - Identify 'weak' links in the calculation chain

The pedigree matrix is designed such that assumptions that score low on the pedigree criteria have a high potential for value-ladenness. Assumptions that, besides a low score on the criteria, also have a high estimated influence on the results of the assessment can be viewed as problematic weak links in the calculation chain.
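
As a rough sketch, weak links can be selected by combining the two conditions above. The example assumes scores on a 0 to 2 scale in which a low influence score means a large influence (the convention of the pedigree matrix); the use of the mean pedigree score, the thresholds and the names `find_weak_links`, `A1`, `A2` and `A3` are hypothetical choices for illustration only.

```python
from statistics import mean


def find_weak_links(scored, max_pedigree=1.0, max_influence_score=1.0):
    """Return assumptions with a low pedigree and a large influence.

    scored maps assumption -> {"pedigree": [scores per criterion],
                               "influence": influence score}.
    Low scores signal high potential value-ladenness or large influence.
    Both thresholds are assumed values, not part of the method.
    """
    weak = []
    for name, scores in scored.items():
        low_pedigree = mean(scores["pedigree"]) <= max_pedigree
        large_influence = scores["influence"] <= max_influence_score
        if low_pedigree and large_influence:
            weak.append(name)
    return weak


# Hypothetical group scores for three key-assumptions (six criteria each).
scored = {
    "A1": {"pedigree": [0, 1, 1, 0, 1, 0], "influence": 0},
    "A2": {"pedigree": [2, 2, 1, 2, 2, 2], "influence": 0},
    "A3": {"pedigree": [0, 0, 1, 1, 0, 1], "influence": 2},
}
weak = find_weak_links(scored)
print(weak)
```

Here only `A1` combines a low pedigree with a large influence; `A2` is influential but well underpinned, and `A3` is weakly underpinned but has little influence.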

Step 5 - Further analyse potential value-ladenness of the key-assumptions

In step 5, the nature of the potential value-ladenness of the individual key-assumptions is explored. Based on inspection of the diagrams visualizing the pedigree scores (or the table of pedigree scores), it can be analysed:

- what types of value-ladenness possibly play a role and to what extent
- to what extent there is disagreement on the pedigree scores among the participants
- whether changing assumptions is feasible and desirable
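
The first two questions can also be explored directly from the table of scores. The sketch below is illustrative only: it assumes a 0 to 2 scale, maps each criterion to the types of value-ladenness listed in Table 1, and uses a hypothetical threshold and hypothetical workshop scores.

```python
# Criteria mapped to types of value-ladenness, following Table 1.
CRITERION_TYPES = {
    "influence of situational limitations": ["practical"],
    "plausibility": ["epistemic"],
    "choice space": ["epistemic"],
    "agreement among peers": ["disciplinary", "epistemic"],
    "agreement among stakeholders": ["socio-political"],
    "sensitivity to view and interests of the analyst": ["socio-political"],
}


def summarise_pedigree(scores):
    """Per criterion: lowest score given and the spread between participants."""
    summary = {}
    for criterion, values in scores.items():
        summary[criterion] = {
            "types": CRITERION_TYPES[criterion],
            "lowest": min(values),
            "disagreement": max(values) - min(values),
        }
    return summary


def types_at_play(summary, threshold=1):
    """Types linked to a criterion whose lowest score is at or below threshold.

    The threshold is an assumed cut-off for 'possibly plays a role'.
    """
    found = set()
    for info in summary.values():
        if info["lowest"] <= threshold:
            found.update(info["types"])
    return sorted(found)


# Hypothetical scores of three participants for one key-assumption.
workshop_scores = {
    "influence of situational limitations": [2, 2, 2],
    "plausibility": [1, 2, 1],
    "choice space": [2, 2, 2],
    "agreement among peers": [2, 2, 2],
    "agreement among stakeholders": [0, 1, 2],
    "sensitivity to view and interests of the analyst": [0, 0, 1],
}
summary = summarise_pedigree(workshop_scores)
print(types_at_play(summary))
```

Whether changing an assumption is feasible and desirable remains a judgement for the participants; the summary only indicates where that discussion is most needed.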

Step 6 - Revise/extend assessment

Based on the analysis in step 5, it can be decided to change or broaden the assessment. As a minimum option, the assessment can be extended with a sensitivity analysis, which gives more information on the influence of weak links in the assessment.
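
A minimal one-at-a-time sensitivity analysis might look like the sketch below. The calculation chain `toy_chain`, its parameters and the 10% perturbation are entirely hypothetical stand-ins for a real chain of models; the point is only to show how the relative effect of each assumed value on the outcome can be compared.

```python
def toy_chain(population, income_index, elasticity, emission_factor):
    """Hypothetical two-link chain: activity level, then emissions."""
    activity = population * income_index ** elasticity
    return activity * emission_factor


def one_at_a_time(model, baseline, rel_change=0.10):
    """Relative change in the outcome when each input is raised in turn."""
    base = model(**baseline)
    effects = {}
    for name, value in baseline.items():
        # Perturb a single input and keep all others at their baseline value.
        perturbed = dict(baseline, **{name: value * (1 + rel_change)})
        effects[name] = (model(**perturbed) - base) / base
    return effects


# Assumed baseline values, for illustration only.
baseline = {
    "population": 17.0,
    "income_index": 1.5,
    "elasticity": 0.8,
    "emission_factor": 2.0,
}
effects = one_at_a_time(toy_chain, baseline)
for name, effect in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {effect:+.1%}")
```

One-at-a-time perturbation ignores interactions between assumptions; where these matter, the Monte Carlo analysis listed in this catalogue is the more suitable tool.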

Besides a sensitivity analysis, specific assumptions can be revised or diversified. In the case of revising an assumption, the assumption is replaced by a different assumption. In some cases however, it will be difficult or undesirable to choose between alternative assumptions, since there might be differing views on the issue. If these assumptions have a high influence on the assessment as a whole, it can be decided to diversify the assumptions: the calculation chain is 'calculated' using several alternative assumptions in addition to the existing ones. In this way several assessments are formed, with differing outcomes, depending on what assumptions are chosen.
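
Diversification can be sketched as running the chain once for every combination of alternative assumptions. The model `chain`, the alternatives and their labels are hypothetical; a real calculation chain would replace the single function used here.

```python
from itertools import product


def chain(emission_factor, elasticity, population=17.0, income_index=1.5):
    """Hypothetical calculation chain producing one outcome of interest."""
    return population * income_index ** elasticity * emission_factor


def diversify(model, alternatives):
    """Evaluate the model for every combination of alternative assumptions.

    alternatives maps parameter -> {label: value}.
    Returns {(label, ...): outcome}, with labels in parameter order.
    """
    names = list(alternatives)
    results = {}
    for labels in product(*(alternatives[n] for n in names)):
        params = {n: alternatives[n][label] for n, label in zip(names, labels)}
        results[labels] = model(**params)
    return results


# Assumed alternatives for two key-assumptions, for illustration only.
alternatives = {
    "emission_factor": {"constant": 2.0, "declining": 1.6},
    "elasticity": {"low": 0.6, "high": 1.0},
}
outcomes = diversify(chain, alternatives)
for labels, value in outcomes.items():
    print(labels, round(value, 1))
```

The spread between the resulting outcomes gives a first impression of how robust the assessment is to the choice between the alternative assumptions.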

Step 7 - Communication

It is important to be explicit about potential value-ladenness in the chain and the effects of potentially value-laden assumptions on the outcomes of the assessment. Analogous to a patient information leaflet, the presentation of the assessment results should be accompanied by information on:

- what are the key-assumptions in the calculation chain
- what are the weak links in the chain
- what were the alternatives and what is the underpinning of the choices that were made regarding assumptions
- what is the robustness of the outcomes of interest in view of the key assumptions

The Pedigree Matrix

The pedigree matrix for assessing the potential value-ladenness of assumptions is presented in Table 1. For a general introduction to the concept of a pedigree matrix, we refer to the description of the NUSAP system in this tool catalogue. The criteria are discussed below.

| Type of value-ladenness | Criterion | High score | Intermediate score | Low score |
|---|---|---|---|---|
| Practical | Influence of situational limitations | choice of assumption hardly influenced | choice of assumption moderately influenced | totally different assumption had there not been limitations |
| Epistemic | Plausibility | the assumption is plausible | the assumption is acceptable | the assumption is fictive or speculative |
| Epistemic | Choice space | hardly any alternative assumptions available | limited choice from alternative assumptions | ample choice from alternative assumptions |
| Disciplinary, epistemic | Agreement among peers | many would have made the same assumption | several would have made the same assumption | few would have made the same assumption |
| Socio-political | Agreement among stakeholders | many would have made the same assumption | several would have made the same assumption | few would have made the same assumption |
| Socio-political | Sensitivity to view and interests of the analyst | choice of assumption hardly sensitive | choice of assumption moderately sensitive | choice of assumption sensitive |
| - | Influence on outcomes of interest | the assumption has little influence on the outcome of interest | the assumption has a substantial influence on an intermediate variable and/or a moderate influence on the outcome of interest | the assumption has a large influence on the outcome of interest |

Table 1: Pedigree matrix for the assessment of the potential value-ladenness of assumptions

Influence of situational limitations

The choice of an assumption can be influenced by situational limitations, such as limited availability of data, money, time, software, tools, hardware and human resources. In the absence of these restrictions, the analyst might have made a different assumption.

Although these limitations might indirectly be of a socio-political nature (e.g., the institute the analyst works for has other priorities and has a limited budget for the analyst's work), from the analyst's point of view they are given. This criterion can therefore be seen as primarily capturing value-ladenness in a practical sense.

Plausibility

Although it is often not possible to assess whether the approximation created by the assumption is in accordance with reality, mostly an (intuitive) assessment can be made of the plausibility of the assumption.

If an analyst has to revert to fictive or speculative assumptions, the room for epistemic value-ladenness will often be larger. To some extent a fictive or speculative assumption also leaves room for potential disciplinary and socio-political value-ladenness. This is, however, dealt with primarily in the criteria 'agreement among peers' and 'agreement among stakeholders' respectively.

Choice space

The choice space indicates the degree to which alternatives were available to choose from when making the assumption. In general, a large choice space leaves more room for the epistemic preferences of the analyst, so the potential for value-ladenness in an epistemic sense is often larger. A large choice space will to some extent also leave more room for disciplinary and socio-political value-ladenness. These are, however, primarily dealt with in the criteria 'agreement among peers' and 'agreement among stakeholders' respectively.

Agreement among peers

An analyst makes the choice for a certain assumption based on his or her knowledge and perspectives regarding the issue. Other analysts might have made different assumptions. The degree to which the choice of peers is likely to coincide with the analyst's choice is expressed in the criterion 'agreement among peers'. These choices may be partly determined by the disciplinary training of the peers, and by their epistemic preferences. This criterion can thus be seen as connected to value-ladenness in a disciplinary sense and in an epistemic sense. [12]

Agreement among stakeholders

Stakeholders, though mostly not actively involved in carrying out assessments, might also choose a different assumption if they were asked to give their view. The degree to which it is likely that stakeholders agree with the analyst's choice is expressed in the criterion 'agreement among stakeholders'. This will often have to do with the socio-political perspective of the stakeholders on the issue at hand, and this criterion can therefore be seen as referring to value-ladenness in a socio-political sense.

Sensitivity to view and interests of the analyst

Some assumptions may be influenced, consciously or unconsciously, by the view and interests of the analyst making the assumption. The analyst's epistemic preferences, and his or her cultural, disciplinary and personal background may influence the assumption that is eventually chosen. The influence of the analyst's disciplinary background on the choices regarding an assumption and the influence of his or her epistemic preferences are taken into account in the criteria 'agreement among peers', 'plausibility' and 'choice space'. In this criterion the focus is on the room for value-ladenness in a socio-political sense.

Influence on results

To pinpoint important value-laden assumptions in the calculation chain, it is not only important to analyse the potential value-ladenness of the assumptions, but also to assess their influence on the end result of the assessment. Ideally, a sensitivity analysis is carried out to assess the influence of each of the assumptions on the results. In most cases, however, this will not be attainable because it would require building new models. This is why the pedigree matrix includes the criterion 'influence on outcomes of interest'.

The score levels for each criterion are arranged in such a way that the lower the score, the more value-laden the assumption potentially is.

[12] There is a link to controversy, as not all peers would agree to the same assumption if there were controversy regarding the issue of the assumption. However, if the majority of peers would choose the same assumption, the score would still be 2 ('many peers would have made the same assumption'). The occurrence of controversies in the scientific field is thus not always visible in the score. Reasoned the other way around, a score of 0 ('few peers would have made the same assumption') does not imply that there are controversies surrounding the assumption: it is possible that all peers agree on the issue, yet that the analyst for some reason has chosen a different assumption. The same applies to the criterion 'agreement among stakeholders'.

Visualising pedigree scores

When all participants have scored the assumptions on the criteria, the scores can be presented in a table. In order to facilitate a quick overview of the results, diagrams can be used that aggregate the scores of the individual experts without averaging them, and in such a way that expert disagreement on the scores is visualised.

One diagram is made for each assumption. The diagram is divided into 6 triangular segments, each segment representing one criterion (Figure 1). The scale in each segment is such that zero is in the centre of the diagram and two on the border. For each criterion, the area of the corresponding segment from the centre of the diagram up to the minimum score given in the group is coloured green. If there is no consensus on the score for a given criterion, the area in that segment between the minimum and the maximum score in the group is coloured amber. The remaining area, if any, from the maximum score to the outside border of the diagram is coloured red.
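
The colouring rule for a single segment can be written down compactly. The sketch below assumes the 0 to 2 scale described above and returns the radial extent of each colour band; the function name `colour_bands` and the example scores are illustrative, and no plotting is attempted.

```python
def colour_bands(scores, scale_max=2):
    """Radial colour bands for one criterion segment of the diagram.

    scores are the individual scores given by the group for one criterion.
    Each band is a (start, end) pair measured from the centre (0) to the
    border (scale_max); a band with start == end is empty.
    """
    lowest, highest = min(scores), max(scores)
    return {
        "green": (0, lowest),          # centre up to the minimum score
        "amber": (lowest, highest),    # range of disagreement in the group
        "red": (highest, scale_max),   # maximum score up to the border
    }


# Hypothetical scores of four participants for one criterion.
bands = colour_bands([1, 2, 1, 1])
print(bands)
```

A unanimous score of 0 gives an entirely red segment and a unanimous score of 2 an entirely green one, in line with the traffic-light convention.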

Figure 1 Left: an example diagram. Right: explanation of the colours. Note that this example used a 3 point scale. Based on our experiences we recommend using a 5 point scale as in Table 1 (Kloprogge et al. 2004).

The convention follows a traffic-light analogy: if an assumption unanimously scores 0 on all criteria, the entire diagram is red. As scores improve, more and more green enters the diagram, whereas expert disagreement on scores is reflected in amber. At the other extreme, if an assumption unanimously scores 2 on all criteria, the entire diagram is green. The scores for each criterion are such that in all cases more green in the diagram corresponds to lower potential value-ladenness and more red to higher potential value-ladenness.

A further nuance has been made to account for outliers: in some cases a single outlier score in the group distorts the green area in the diagram. In these cases, a light-green area indicates what the green area would look like if that outlier were omitted.
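
The light-green area could be derived as sketched below. The text does not define when a score counts as an outlier, so the rule used here (the lowest score is given by exactly one participant in a group of at least three) is purely an assumption for illustration, as is the name `light_green_band`.

```python
def light_green_band(scores):
    """Extra green area that would appear if a single low outlier were omitted.

    Assumed outlier rule: the lowest score occurs exactly once and the
    group has at least three scores. Returns (start, end) or None.
    """
    ordered = sorted(scores)
    if len(ordered) < 3 or ordered[0] == ordered[1]:
        return None  # no single lowest score, so no outlier under this rule
    # Green would extend from the outlier's score up to the next lowest score.
    return (ordered[0], ordered[1])


# Hypothetical group in which one participant scores much lower than the rest.
print(light_green_band([0, 2, 2, 2]))
print(light_green_band([1, 1, 2]))
```

In practice the participants or the facilitator would decide case by case whether a deviating score should be treated as an outlier.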

By looking at the red areas, the extent to which the different types of value-ladenness may have played a role in the production process of the assumption can be assessed. Green areas indicate that the participants think value-ladenness with regard to the criteria at hand played a small role in the production process; red areas indicate that they think value-ladenness played a large role. Amber areas indicate disagreement among the participants on these matters.