Guidance on application
In their report "Guiding Principles for Monte Carlo Analysis" (EPA, 1997), the EPA presents 16 good practice guidelines for Monte Carlo assessment. The guidelines are listed below (we have modified the phrasing slightly to keep terminology consistent within the guidance documents):
Selecting Input Data and Distributions for Use in Monte Carlo Analysis
- Conduct preliminary sensitivity analyses or numerical experiments to identify model structures, model input assumptions and parameters that make important contributions to the assessment and its overall uncertainty.
- Restrict the use of probabilistic assessment to significant parameters.
- Use data to inform the choice of input distributions for model parameters.
  - Is there any mechanistic basis for choosing a distributional family?
  - Is the shape of the distribution likely to be dictated by physical or biological properties or other mechanisms?
  - Is the variable discrete or continuous?
  - What are the bounds of the variable?
  - Is the distribution skewed or symmetric?
  - If the distribution is thought to be skewed, in which direction?
  - What other aspects of the shape of the distribution are known?
- Proxy data can be used to develop distributions when they can be appropriately justified.
- When obtaining empirical data to develop input distributions for model parameters, the basic tenets of environmental sampling should be followed. Further, particular attention should be given to the quality of information at the tails of the distributions.
- Depending on the objectives of the assessment and the availability of empirical data to estimate PDFs, expert elicitation can be applied to draft probability density functions. When expert judgment is employed, the analyst should be very explicit about its use.
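The first guideline in this list, a preliminary sensitivity analysis, can be sketched with a simple rank-correlation screen. The example below is a minimal illustration, not the method prescribed by the EPA report: the exposure model `toy_model`, its three inputs and their distributions are all hypothetical, and Spearman rank correlation is only one of several possible screening measures.

```python
import random
import statistics


def ranks(values):
    """Return the rank (1..n) of each value; ties are not handled specially."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def toy_model(conc, intake, weight):
    # Hypothetical exposure model, for illustration only.
    return conc * intake / weight


def screen_inputs(n=5000, seed=1):
    """Sample the (assumed) input distributions and rank inputs by influence."""
    rng = random.Random(seed)
    inputs = {
        "conc": [rng.lognormvariate(0.0, 1.0) for _ in range(n)],
        "intake": [rng.uniform(1.9, 2.1) for _ in range(n)],
        "weight": [rng.gauss(70.0, 10.0) for _ in range(n)],
    }
    output = [
        toy_model(c, i, w)
        for c, i, w in zip(inputs["conc"], inputs["intake"], inputs["weight"])
    ]
    return {name: spearman(vals, output) for name, vals in inputs.items()}


if __name__ == "__main__":
    for name, rho in screen_inputs().items():
        print(f"{name:<8}{rho:+.2f}")
```

Inputs whose rank correlation with the output is close to zero are candidates for being fixed at a point value, in line with the guideline to restrict probabilistic treatment to significant parameters.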
Evaluating Variability and Knowledge Limitations
- It is useful to distinguish between uncertainty stemming from intrinsic variability and heterogeneity of the parameters on the one hand and uncertainty stemming from knowledge limitations on the other hand. Try to separate them in the analysis where possible to provide greater accountability and transparency. The decision about how to track them separately can only be made on a case-by-case basis for each variable.
- Two dimensional Monte Carlo techniques allow for the separate treatment of variability and epistemological uncertainty. There are methodological differences regarding how uncertainty stemming from variability and uncertainty stemming from knowledge limitations are addressed in a Monte Carlo analysis.
- Variability depends on the averaging time, averaging space, or other dimensions in which the data are aggregated.
- Standard data analysis tends to understate uncertainty from knowledge limitations by focusing solely on random error within a data set. Conversely, standard data analysis tends to overstate variability by implicitly including measurement errors.
- Various types of model errors can represent important sources of uncertainty. Alternative conceptual or mathematical models are a potentially important source of uncertainty. A major threat to the accuracy of a variability analysis is a lack of representativeness of the data.
- Methods should investigate the numerical stability of the moments and the tails of the distributions.
- Data gathering efforts should be structured to provide adequate coverage at the tails of the input distributions.
- The assessment should include a narrative and qualitative discussion of the quality of information at the tails of the input distributions.
- There are limits to the assessor's ability to account for and characterize all sources of uncertainty. The analyst should identify areas of uncertainty and include them in the analysis, either quantitatively or qualitatively.
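The two-dimensional Monte Carlo technique mentioned in this list can be sketched as a nested loop: the outer loop samples parameters that are uncertain because of knowledge limitations, and the inner loop samples variability given those parameters. The example below is only a sketch under assumed inputs: the lognormal variability distribution, the distributions placed on its parameters `mu` and `sigma`, and the sample sizes are all hypothetical choices.

```python
import random


def percentile(sorted_values, p):
    """Simple order-statistic percentile of an already sorted list (0 < p < 1)."""
    k = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[k]


def two_dimensional_mc(n_outer=200, n_inner=1000, seed=42):
    """Return uncertainty bounds on the 95th percentile of a variable quantity."""
    rng = random.Random(seed)
    p95_per_outer = []
    for _ in range(n_outer):
        # Outer loop: knowledge limitations about the distribution parameters
        # (assumed distributions, for illustration only).
        mu = rng.gauss(0.0, 0.2)
        sigma = rng.uniform(0.4, 0.6)
        # Inner loop: variability across individuals, given mu and sigma.
        population = sorted(
            rng.lognormvariate(mu, sigma) for _ in range(n_inner)
        )
        p95_per_outer.append(percentile(population, 0.95))
    p95_per_outer.sort()
    # Spread of the 95th percentile across outer draws reflects uncertainty.
    return {
        "p95_lower": percentile(p95_per_outer, 0.05),
        "p95_median": percentile(p95_per_outer, 0.50),
        "p95_upper": percentile(p95_per_outer, 0.95),
    }


if __name__ == "__main__":
    for name, value in two_dimensional_mc().items():
        print(f"{name:<12}{value:.2f}")
```

Keeping the two loops separate makes it possible to report a variability statistic (here the 95th percentile) together with an uncertainty interval around it, instead of a single blended distribution.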
Presenting the Results of a Monte Carlo Analysis
- Provide a complete and thorough description of the model or calculation scheme and its equations, including a discussion of the limitations of the methods and the results.
- Provide detailed information on the input distributions selected. This information should identify whether the input represents largely variability, largely uncertainty, or some combination of both. Further, information on goodness-of-fit statistics should be discussed.
- A PDF plot is useful for displaying:
  - The relative probability of values;
  - The most likely values (e.g., modes);
  - The shape of the distribution (e.g., skewness, kurtosis); and
  - Small changes in probability density.
- A CDF plot is good for displaying:
  - Fractiles, including the median;
  - Probability intervals, including confidence intervals;
  - Stochastic dominance; and
  - Mixed, continuous, and discrete distributions.
- Provide detailed information and graphs for each output distribution.
- Discuss the presence or absence of dependencies and correlations.
- Calculate and present point estimates.
- A progressive disclosure of information style in presentation, in which briefing materials are assembled at various levels of detail, may be helpful. Presentations should be tailored to address the questions and information needs of the audience.
- Avoid excessively complicated graphs. Keep graphs intended for a glance (e.g., overhead or slide presentations) relatively simple and uncluttered. Graphs intended for publication can include more complexity.
- Avoid perspective charts (3-dimensional bar and pie charts, ribbon charts) and pseudo-perspective charts (2-dimensional bar or line charts).
- Color and shading can create visual biases and are very difficult to use effectively. Use color or shading only when necessary and then, only very carefully. Consult references on the use of color and shading in graphics.
- When possible in publications and reports, graphs should be accompanied by a table of the relevant data.
- If probability density or cumulative probability plots are presented, present both, with one above the other on the same page, with identical horizontal scales and with the location of the mean clearly indicated on both curves with a solid point.
- Do not depend on the audience to correctly interpret any visual display of data. Always provide a narrative in the report interpreting the important aspects of the graph.
- Descriptive statistics and box plots generally serve the less technically oriented audience well. Probability density and cumulative probability plots are generally more meaningful to risk assessors and uncertainty analysts.
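Several of the presentation guidelines above (present point estimates, report fractiles, accompany graphs with a table of the relevant data) can be supported by a small summary routine. The sketch below is illustrative only: the function names and the lognormal stand-in for a model output are assumptions, not part of the EPA report.

```python
import random
import statistics


def simulate_output(n=10000, seed=7):
    """Stand-in for a Monte Carlo output sample (hypothetical lognormal)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(0.0, 0.5) for _ in range(n)]


def summarize(sample):
    """Point estimates and fractiles to report alongside PDF and CDF plots."""
    # quantiles(n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = statistics.quantiles(sample, n=100)
    return {
        "mean": statistics.fmean(sample),
        "median": statistics.median(sample),
        "p05": cuts[4],
        "p95": cuts[94],
    }


def format_table(summary):
    """Render the summary as a plain-text table for inclusion in a report."""
    lines = [f"{'statistic':<10}{'value':>10}"]
    for name, value in summary.items():
        lines.append(f"{name:<10}{value:>10.3f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(summarize(simulate_output())))
```

A table like this does not replace the narrative interpretation the guidelines call for; it only gives readers the numbers behind the curves.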
For a full discussion of these 16 guidelines we refer to the EPA report (EPA, 1997).
The EPA report also gives some guidance on the issue of constructing adequate probability density functions using proxy data, fitting distributions, using default distributions and using subjective distributions. Important questions in this process are:
- Is there prior knowledge about mechanisms?
- Are the proxy data of acceptable quality and representativeness to support reliable estimates?
- What uncertainties and biases are likely to be introduced by using proxy data?
- How are the biases likely to affect the analysis and can the biases be corrected?
In identifying plausible distributions to represent variability, the following characteristics of the variable should be taken into account:
- Nature of the variable (discrete or continuous)
- Physical or plausible range of the variable (e.g., takes on only positive values)
- Symmetry of the distribution (e.g., is the shape of the distribution likely to be dictated by physical or biological properties, such as logistic growth rates?)
- Summary statistics (frequently, knowledge of ranges can be used to eliminate inappropriate distributions; if the coefficient of variation is near 1.0, an exponential distribution might be appropriate)
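The characteristics listed above can be checked mechanically before any distribution is fitted. The sketch below is a rough screening aid under stated assumptions: the function name `screen_sample`, the returned keys and the tolerance of 0.1 around a coefficient of variation of 1.0 are hypothetical choices, and a coefficient of variation near 1.0 is only a hint toward the exponential family, not evidence for it.

```python
import random
import statistics


def screen_sample(sample, cv_tolerance=0.1):
    """Summarize simple characteristics that help narrow candidate families."""
    discrete = all(float(x).is_integer() for x in sample)
    positive_only = min(sample) > 0
    cv = None
    if positive_only:
        # Coefficient of variation: standard deviation divided by the mean.
        cv = statistics.stdev(sample) / statistics.fmean(sample)
    return {
        "discrete": discrete,
        "positive_only": positive_only,
        "range": (min(sample), max(sample)),
        "cv": cv,
        # For an exponential distribution the theoretical CV equals 1.
        "exponential_candidate": cv is not None
        and abs(cv - 1.0) <= cv_tolerance,
    }


if __name__ == "__main__":
    rng = random.Random(3)
    waiting_times = [rng.expovariate(2.0) for _ in range(5000)]
    for key, value in screen_sample(waiting_times).items():
        print(f"{key}: {value}")
```

Such a screen only eliminates clearly inappropriate families; the final choice still rests on mechanistic reasoning and on the quality of the data, as discussed above.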