Cost-Effectiveness Modeling


There has been a growing use of quality-adjusted life-years (QALYs) in the assessment of the cost-effectiveness of healthcare interventions. There are now many agencies around the world using evidence on the incremental cost per QALY to inform reimbursement decisions or clinical guidelines. The QALY provides a metric for valuing the impact of healthcare interventions on survival and health-related quality of life (HRQL) on a common scale. It achieves this by assigning a utility value for each health state on a scale where 1 is for full health and 0 for dead, with the possibility of negative values for states regarded as worse than dead. There are many different ways for deriving such health state utility values (HSUVs). At the same time, there has been an increasing use of decision-analytic models to provide the main vehicle for conducting the assessment of cost-effectiveness. These overcome the limitations of relying on single clinical trials, which often do not use measures for generating HSUVs, have a limited sample size (particularly for some rare events), insufficient follow-up periods, an unrealistic protocol and setting, and may be difficult to generalize from. Models provide a means of combining evidence from a variety of sources on the clinical efficacy of the interventions, resource use, costs of resources, and HSUVs in a way that addresses the decision problem in a more relevant way than a clinical trial. HSUVs are a key parameter in such models. There is a separate article on the derivation of HSUVs and the different instruments used. This article is concerned with the methodological issues associated with using HSUVs in cost-effectiveness models.

Many types of model are used to assess cost-effectiveness, including decision trees, Markov models, and discrete event simulation. All seek to represent reality in terms of the health states likely to be experienced by patients in the decision problem, the transition probabilities between those states, and the costs and utility values associated with each state.
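The structure just described (health states, transition probabilities, and a cost and utility value attached to each state) can be illustrated with a very small Markov cohort model. This is only a sketch under stated assumptions: the three states, the probabilities, the utilities, the costs, and the function name are all hypothetical, and discounting and half-cycle correction are omitted for brevity.

```python
def run_markov_cohort(start, transitions, utilities, costs, n_cycles):
    """Run a cohort through a Markov model; return (total QALYs, total cost).

    start: proportion of the cohort in each state at time zero.
    transitions[i][j]: probability of moving from state i to state j per cycle.
    utilities, costs: per-cycle utility value and cost attached to each state.
    """
    n = len(start)
    for row in transitions:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of transition probabilities must sum to 1")
    occupancy = list(start)
    total_qalys = 0.0
    total_cost = 0.0
    for _ in range(n_cycles):
        # Accrue outcomes for the time spent in each state during this cycle.
        total_qalys += sum(occupancy[i] * utilities[i] for i in range(n))
        total_cost += sum(occupancy[i] * costs[i] for i in range(n))
        # Move the cohort to its next-cycle distribution across states.
        occupancy = [
            sum(occupancy[i] * transitions[i][j] for i in range(n))
            for j in range(n)
        ]
    return total_qalys, total_cost


# Hypothetical three-state model: well, ill, dead (annual cycles).
transitions = [
    [0.85, 0.10, 0.05],
    [0.00, 0.80, 0.20],
    [0.00, 0.00, 1.00],
]
qalys, cost = run_markov_cohort(
    start=[1.0, 0.0, 0.0],
    transitions=transitions,
    utilities=[0.85, 0.60, 0.0],
    costs=[100.0, 2000.0, 0.0],
    n_cycles=2,
)
```

Running the same function for each strategy being compared and taking the differences in cost and QALYs gives the inputs to an incremental cost per QALY.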

The states may be defined in different ways including whether a patient has a condition, severity of condition, key events (e.g., fractures in the case of osteoporosis), adverse events, and various comorbidities. These events may occur multiple times and there may be cases of multiple conditions. This article addresses four sets of methodological issues around the use of HSUVs to populate such cost-effectiveness models. (1) The selection of the measure for generating the HSUVs that best meets the requirements of policy makers and measurement criteria like validity. (2) The source of HSUV data such as the main clinical efficacy trials or whether to seek more relevant values for the model population from observational datasets, or to search, review, and synthesize an ever-growing literature. (3) Suitable utility data using the required measure may not be available from relevant studies, and in these cases regression techniques may be used to map from various health or clinical measures onto the selected utility measure. (4) Technical problems in using HSUVs in cost-effectiveness models, including how to adjust values over time, estimate values for those not in the condition of interest, and estimate the impact of conditions (comorbidities) or adverse events. This article considers the technical issues alongside the common requirements of policy makers around the world. Many of the decisions are not technical ones alone but involve normative judgments that in many cases will be made by policy makers requiring cost-effectiveness evidence. This is intended to be a practical guide aimed at analysts who are building cost-effectiveness models.

What Measures Should Be Used?

There are four broad approaches for generating HSUVs: generic preference-based measures (also known as multiattribute utility instruments), condition-specific preference-based measures, bespoke vignettes, and patients' own valuations. The most widely used of these in recent years has been the generic preference-based measure of health. These measures have two components. One is a descriptive system composed of several multilevel dimensions. For example, the EQ-5D has five dimensions (mobility, self-care, usual activities, pain and discomfort, and anxiety and depression), each with three levels (a five-level version has recently been developed), and so defines 243 health states. The other is a set of values: each of these states has a value on the QALY scale that was obtained by interviewing a sample of the general population. The descriptive system is usually completed by patients or their proxies in clinical studies and so provides a direct link between QALY estimates and the reported experiences of patients. By collecting the EQ-5D or another such measure over time, it is possible to calculate the QALY gain in a trial setting (as the area under the curve) or to value the states used in the model by observing patients in different clinical states.
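The area-under-the-curve calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming that utility is measured at a few time points and changes linearly between them; the function name and the example scores are hypothetical.

```python
def qalys_area_under_curve(times_years, utilities):
    """Trapezoidal area under a utility-over-time profile (in QALYs)."""
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need matching lists with at least two points")
    total = 0.0
    for i in range(1, len(times_years)):
        interval = times_years[i] - times_years[i - 1]
        if interval <= 0:
            raise ValueError("times must be strictly increasing")
        # Assumes utility changes linearly between assessments.
        total += interval * (utilities[i] + utilities[i - 1]) / 2.0
    return total


# Hypothetical utility scores at baseline, 6 months, and 12 months.
treatment = qalys_area_under_curve([0.0, 0.5, 1.0], [0.60, 0.75, 0.80])
control = qalys_area_under_curve([0.0, 0.5, 1.0], [0.60, 0.65, 0.65])
qaly_gain = treatment - control
```

With these illustrative scores the treatment arm accrues 0.725 QALYs and the control arm 0.6375, a gain of 0.0875 QALYs over one year.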

These generic measures are designed for use in all conditions and patients. However, there are concerns that no one measure is sensitive or relevant to all conditions or patient groups. For this reason condition-specific preference-based measures have been developed by Brazier and colleagues. The concern with condition-specific measures is the lack of comparability between different instruments. This will be a problem where the model contains states from different conditions, as is often the case, and where the policymaker is making resource allocation decisions between conditions. Another approach has been to develop specific vignettes where there is no patient-reported information on the impact of a condition or its treatment. These vignettes can be specifically designed to describe the states in the model. However, in addition to the concern about comparability, vignettes do not have the direct link to evidence on patient experience that is achieved by the other two approaches, because they are not based on patient completion of a descriptive system but usually involve the views of experts (albeit informed by patient experience). The final approach avoids having to describe health states altogether and instead asks patients to value their own state using one of the preference elicitation techniques, such as time trade-off. Most agencies prefer health states to be described by patients and then valued by members of the general public, but one or two have specifically requested valuations directly from patients and this approach continues to be used.

A key problem is that these different approaches to valuing health produce different values. Indeed, different generic instruments have been shown to generate HSUVs that differ to a significant degree. The selection of instrument will have important implications for the incremental cost-effectiveness ratio. There is a literature on how to select the right measure in a given case, and this considers issues around the validity of the descriptive system for the condition, valuation methods, and source of the values. The decision about the right measure should not only consider these issues but will also be constrained in some cases by the policy makers to whom the model is going to be submitted. Some agencies have adopted a reference case that includes a preferred measure or approach. The most prescriptive has been the National Institute for Health and Clinical Excellence (NICE) in England, which states a preference for the EQ-5D; those submitting evidence need to demonstrate that the EQ-5D is not appropriate in order to submit cost-effectiveness models using HSUVs from other measures. In some other countries, there is merely a preference for a generic measure. In others still, no preference is expressed as to which type of measure should be used.

The final choice of measure used to derive the HSUVs will depend on some combination of the requirements of the policymaker, psychometric and other criteria, and also availability. In many cases, there is very limited evidence on HSUVs from a preferred measure or approach, and the analyst must make best use of available evidence. This may include the use of non-preference-based measures of health or clinical measures through the use of mapping (see Section Predicting Health State Utility Values When Preference-Based Data Are Not Available). It will increasingly involve reviewing a range of possible sources including trials, observational and routine datasets, and the literature.

Source Of Health State Utility Values

Clinical Trials

An appropriate source for the data on HSUVs may be the main clinical trial(s) used to inform the evidence on effectiveness. This enables the trial data to be used directly within the analysis of HRQL, eliminates concerns about the applicability of the health data to populations from which the effectiveness estimates are obtained and enables all the effects of treatment to be included directly in the estimate, including any side effects of treatment, without the need for adjustment. However, there may be concerns about the generalizability of effectiveness and/or HRQL data to the population in the model. There may be other circumstances where health state utility data are not best collected within the clinical trials, for example, if adverse events related to the condition or treatment are rare and not likely to be captured in the trials, or where the outcomes of interest are too long-term to be captured in a typical trial duration, or when the trial does not reflect common practice. In these circumstances observational studies may be more appropriate for capturing the impact of the event on HRQL.


Observational Sources

HSUVs are often sourced from observational studies conducted for the purpose. Such tailored studies have the advantage of being designed to populate a specific model and so can value the specific states defined in that model. However, this will often not be possible. Another data source is routine datasets such as general population health surveys (e.g., the Medical Expenditure Panel Survey in the USA and the Health Survey for England) or routine surveys of patient-reported outcomes (e.g., the UK Patient Reported Outcome Measures program). For any observational source a key concern will be the extent to which differences in HSUVs are caused by the condition. Patients who have had a recent fracture, for example, have a lower score than those who have not. However, the differences found in cross-sectional observational studies tend to exaggerate the impact of the fracture because they often do not take into account patients' prefracture health status. As with evidence on efficacy, longitudinal evidence is better than cross-sectional evidence, because the impact of specific events or disease onset can be estimated while controlling for prior health status and other covariates.

Reviewing The Literature

There are published lists of HSUVs for a wide range of conditions and this literature is growing all the time. There is a risk that model builders will be tempted to use the first suitable value or even to use those values that support the cost-effectiveness argument being made in a submission to a reimbursement authority. The larger the literature, the more prone the selection of values is to bias. For this reason, it is incumbent on analysts to justify their selection of values. This implies a need for HSUVs, like other important model parameter values, to be obtained from a systematic review of the literature in order to minimize bias and, through appropriate synthesis of the available values, to capture the uncertainty and improve the precision in the values used.

There are rarely the resources available to do a full systematic review involving searching, reviewing, and synthesizing the evidence. Furthermore, reviewing HSUV studies does not follow the conventional hierarchy of evidence used for clinical effectiveness. Simply looking for HSUVs within a search for efficacy evidence will fail to retrieve many, if not most, published HSUVs for the health states in the model, because randomized controlled trials are often not the main source of HSUVs and the models may include other conditions and adverse events. For example, a model examining the cost-effectiveness of strategies for managing osteoporosis had states for various fractures (e.g., hip, vertebra, and shoulder), breast cancer, coronary heart disease, and no event. A systematic literature review by Peasgood and others on the impact of osteoporosis fractures identified 27 articles, from an initial set of 1000 papers, reporting potentially relevant HSUVs for the model. As can be seen in Figure 1, there is a substantial difference in the HSUVs reported for the same time periods, and although there is a trend for recovery following hip fracture, none achieve the prefracture values, and one study reported a decline in HSUVs over a period of 4–17 months.

Cost-Effectiveness Modeling Figure 1

The key considerations in searching and reviewing HSUVs are: (1) Do the HSUVs meet the methodological requirements of the policymaker – in the case of NICE, the focus may be on obtaining EQ-5D values (using the UK tariff of values), (2) have the HSUVs been obtained from a population relevant to the population in the model (e.g., in terms of severity of condition, age, and gender), and (3) what is the quality of the study including recruitment and response rates? These considerations do not operate in a dichotomous way because the analyst is looking for the best estimates and not necessarily the perfect ones, and these requirements may be relaxed depending on the available evidence base. Concerns about the relevance or quality of data should be fully explored in the cost-effectiveness model through the use of sensitivity analyses.

There are a number of search strategies for identifying HSUVs. However, a full search of the literature may yield many hundreds of values, and so the reviewer may wish to use more focused search strategies limited to identifying existing reviews or key papers and following up references in those articles, as described by Papaioannou.

For many conditions, there are a large number of HSUVs available in the literature and considerable variation in the values for what seem to be similar states. A review of values for use in a cost-effectiveness model of osteoporosis, for example, found values for hip fracture to vary from 0.28 to 0.72 and for vertebral fracture from 0.31 to 0.8. This leaves considerable scope for discretion in the selection of values for an economic model. The variation was partly due to differences in methods. In this example, the values were limited to the EQ-5D for populating the cost-effectiveness model because the submission was for NICE. The values still varied considerably between studies. This may have been due to the different source countries, with much of the data coming from Sweden. It may also have been due to the very low response rate in some studies. There has been limited research into the synthesis of HSUVs using techniques similar to those used for clinical efficacy, including simple pooling or metaregression. Such work is at an early stage, and the number of studies available for a given condition tends to be too small and the studies too heterogeneous. For this reason, current practice often involves selecting the single study that provides the most relevant values.

In practice, there may be few or no relevant HSUVs available for the cost-effectiveness model, but there may be trials or observational datasets that have collected HRQL or clinical data on relevant patients. The next section considers an increasingly used solution to this problem: mapping the relationship between the HRQL or clinical measure and the required preference-based measure.

Predicting Health State Utility Values When Preference-Based Data Are Not Available

When the required preference-based utility measure is not collected in the clinical effectiveness studies or any relevant observational source, a mapping exercise can be undertaken to predict the required values (e.g., EQ-5D) from an alternative HRQL or clinical measure collected in the key study or studies. This exercise (Figure 2) requires an external dataset that includes both the preferred preference-based data (the dependent variable (DV)) and at least one other variable (the independent variable (IV)) that is also available from the key clinical effectiveness or observational study. The data in the external dataset are used to estimate a statistical relationship, known as a statistical regression model, which can then be used to predict the required preference-based utility scores from the data available in the clinical effectiveness study.

Cost-Effectiveness Modeling Figure 2

The statistical regression model can take many different forms depending on the relationship between the variables and the underlying distributions of the data. The simplest model is a linear function (y = a + bx + e), where y is the DV (the preference-based HSUV), a is the intercept, b is the vector of coefficients for the IVs (x), and e is the error term. These regression models can be used to predict the DV in any dataset that includes the IVs. If some of the IVs are missing from the second dataset, the mean values from the external dataset used to obtain the statistical relationship can be used as proxies.
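The simplest case, a linear function with a single IV, can be sketched as follows. This is an illustration only: the external dataset, the clinical scores, and the function names are hypothetical, and a real mapping study would normally use several IVs and a dedicated statistics package.

```python
def fit_linear_mapping(x_values, y_values):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x_values)
    if n != len(y_values) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    if sxx == 0:
        raise ValueError("the IV must vary across observations")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_utility(a, b, x):
    """Predicted DV for a given IV value, ignoring the error term."""
    return a + b * x


# Hypothetical external dataset: a clinical score (IV) and observed utility (DV).
external_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
external_utilities = [0.88, 0.81, 0.69, 0.62, 0.50]
a, b = fit_linear_mapping(external_scores, external_utilities)

# Apply the fitted relationship to scores recorded in the clinical study.
trial_scores = [1.5, 3.5]
predicted = [predict_utility(a, b, x) for x in trial_scores]
```

For these illustrative data the fitted intercept is 0.985 and the slope is -0.095, giving predicted utilities of 0.8425 and 0.6525 for the two trial scores.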

Using Clinical Variables And Progressive Conditions

Statistical regression models are also used to determine relationships between clinical variables and preference-based utility values when the cost-effectiveness models are driven by clinical variables that represent stages or progression in the primary health condition. In these instances it may be that, although the clinical effectiveness study collects the required preference-based data, the distribution of patients across disease severity is such that the subgroup sizes are too small to determine HSUVs for each of the individual stages of the condition. For example, ankylosing spondylitis is a chronic progressive condition, and the severity of the condition is described using two clinical measures: the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) and the Bath Ankylosing Spondylitis Functional Index (BASFI). Both measures range from 0 to 10, which represent no disease activity or functional impairment and maximum disease activity or functional impairment, respectively. Figure 3 shows how the preference-based utility values (the EQ-5D) vary by BASDAI and BASFI scores using the function EQ-5D = 0.9235 - 0.0402 * BASDAI - 0.0432 * BASFI, which was obtained using ordinary least squares regression.

Cost-Effectiveness Modeling Figure 3

Figure 4 shows the BASDAI/BASFI profile (primary y-axis) and the corresponding EQ-5D values (secondary y-axis) plotted over time (x-axis) as would be used in a cost-effectiveness model. The figure shows individuals entering the model with average BASDAI/BASFI scores of 7 at baseline (time = 0). They initially respond to treatment, and their BASDAI/BASFI scores improve to an average of 4. After 4 years they stop responding to treatment and their BASDAI/BASFI scores revert to the baseline score of 7. These scores gradually worsen as the condition progresses until reaching the maximum possible score (BASDAI/BASFI equals 10) at 17 years. The BASDAI/BASFI scores remain at this level until the patient dies (time = 26 years). Using the function described earlier to predict EQ-5D values from the BASDAI and BASFI scores, the predicted EQ-5D values are 0.241 at baseline and 0.544, 0.241, -0.062, and 0 at 4–7 years, 17 years, 26 years, and after 26 years, respectively.

Cost-Effectiveness Modeling Figure 4

Multiple Health States

For a simple cost-effectiveness model involving few health states, the mean (and variance) of the preference-based utility values for each of the health states may be sufficient to describe the average HRQL and associated uncertainty for the health condition. However, when the cost-effectiveness model includes numerous distinct health states and additional predictors of health status, a statistical regression model and its associated covariance matrix can be used to ensure that correlations between the preference-based utility values are maintained when exploring uncertainty in the probabilistic sensitivity analyses.
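One way of maintaining those correlations is to sample the regression coefficients jointly from a multivariate normal distribution defined by the coefficient estimates and their covariance matrix. The sketch below does this with a hand-written Cholesky factorization so that it needs only the standard library. The point estimates are the coefficients of the BASDAI/BASFI function quoted earlier; the covariance matrix, the health state, and the function names are hypothetical, and the multivariate normal is itself an assumption about the coefficients' joint distribution.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_coefficients(means, covariance, rng):
    """Draw one correlated set of regression coefficients."""
    lower = cholesky(covariance)
    z = [rng.gauss(0.0, 1.0) for _ in means]
    return [
        means[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
        for i in range(len(means))
    ]


# Point estimates: intercept, BASDAI coefficient, BASFI coefficient.
means = [0.9235, -0.0402, -0.0432]
# Hypothetical covariance matrix; a real one comes from the fitted regression.
covariance = [
    [4.0e-4, -3.0e-5, -2.0e-5],
    [-3.0e-5, 2.5e-5, -1.0e-5],
    [-2.0e-5, -1.0e-5, 2.5e-5],
]

rng = random.Random(1)
draws = [sample_coefficients(means, covariance, rng) for _ in range(5000)]
# Predicted utility for one hypothetical state (BASDAI 4, BASFI 5) in each draw.
utilities = [c[0] + c[1] * 4 + c[2] * 5 for c in draws]
```

Because every health state in a given simulation is valued with the same draw of coefficients, the sampled HSUVs move together across states rather than independently.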

One example is a cost-effectiveness model exploring the potential benefits of pharmaceutical interventions used to induce weight loss in obese patients. Obese patients are at increased risk of comorbidities (e.g., type 2 diabetes, cancer(s), heart attacks, and strokes; Figure 5), and the effectiveness of interventions is quantified in terms of changes in body mass index. To model this, analysts would need HSUVs for each of the comorbidities differentiated by body mass index and potentially age and/or gender. It is unlikely that this level of detailed information for each of the different subgroups would be available from clinical effectiveness studies. In this case, a statistical regression model obtained from a large external dataset could be used to predict the values required for each of the health states in the cost-effectiveness model.

Cost-Effectiveness Modeling Figure 5

Double Mapping

There are occasions when it is not possible to obtain an external dataset that includes both the required preference-based utility measure and one or more of the variables collected in the clinical study. In these instances, although not ideal, it is possible to obtain preference-based utility values using a process known as ‘double mapping’. Double mapping involves the use of two external datasets, with one statistical regression model obtained from each.

For example, a clinical study in patients with psoriatic arthritis, a chronic progressive condition, did not collect HRQL data but did collect information on demography (age, gender, and current and previous pharmaceutical treatments). In the cost-effectiveness model, the Health Assessment Questionnaire (HAQ: range 0–3, 3 = worst) was used to describe both the initial benefits of treatment and the long-term progression of the condition. Two external datasets were available (Figure 6). The first dataset (external dataset 1) had data on demography (age, gender, and current and previous pharmaceutical treatments) and HAQ but did not have any HRQL data. The second dataset (external dataset 2) had HAQ and the required preference-based data (EQ-5D) but did not have data on demography. The cost-effectiveness model required a relationship linking HRQL data to HAQ, the clinical variable used to describe the benefits of treatment and the long-term progression of the condition.

Cost-Effectiveness Modeling Figure 6

The process used to predict EQ-5D scores in the cost-effectiveness model is described in Figure 6. Step 1: External dataset 1 was used to obtain a statistical regression model 1 mapping demography (age, gender, and pharmaceutical treatments) onto HAQ. Step 2: The statistical regression model 1 was used to predict HAQ using the data on demography (age, gender, and pharmaceutical treatments) in the clinical study. Step 3: External dataset 2 was used to obtain the statistical regression model 2 mapping HAQ onto EQ-5D. Step 4: The predicted HAQ scores from the clinical study were used to predict EQ-5D in the cost-effectiveness model using the statistical regression model 2.
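The four steps can be sketched as two chained regressions. This is a deliberately simplified illustration: all data are hypothetical, age stands in for the full set of demographic variables, and the helper name fit_ols is invented for the example.

```python
def fit_ols(x_values, y_values):
    """Least squares fit of y = a + b*x for a single predictor; returns (a, b)."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data. External dataset 1: age and HAQ (no utility data).
ages_1 = [40.0, 50.0, 60.0, 70.0]
haq_1 = [0.6, 0.9, 1.2, 1.5]
# External dataset 2: HAQ and EQ-5D (no demographic data).
haq_2 = [0.0, 1.0, 2.0, 3.0]
eq5d_2 = [0.85, 0.60, 0.35, 0.10]
# Clinical study: demographic data only.
study_ages = [45.0, 65.0]

# Step 1: regression model 1 maps demography onto HAQ.
a1, b1 = fit_ols(ages_1, haq_1)
# Step 2: predict HAQ for the patients in the clinical study.
predicted_haq = [a1 + b1 * age for age in study_ages]
# Step 3: regression model 2 maps HAQ onto EQ-5D.
a2, b2 = fit_ols(haq_2, eq5d_2)
# Step 4: predict EQ-5D from the predicted HAQ scores.
predicted_eq5d = [a2 + b2 * haq for haq in predicted_haq]
```

These point predictions ignore the uncertainty in both regressions, which would also need to be carried through to the sensitivity analyses.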

Predictive Ability

Ideally, any statistical model would be validated in an external dataset before use in a cost-effectiveness model. However, in the majority of cases, regressions are performed because the actual data are not available in a particular group, and therefore it is not possible to validate results in this way. Regression models that have HRQL measures as the DV typically underestimate values at the top of the index and overestimate values at the bottom. Consequently, it is important to demonstrate the accuracy of the predicted values across the full range of the index. If the objective of the regression is to obtain a model to predict values in a cost-effectiveness model, then it is also useful to assess the ability of the regression model to predict incremental values accurately. The predicted values are typically assessed using the mean absolute error and root mean squared error. However, these summary scores can mask inaccuracies at the extremes of the index, and the predicted values should also be assessed by subgrouping across the range of actual values.
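The two summary statistics, and the subgrouping across the range of actual values, can be sketched as follows. The observed and predicted values, the band edges, and the function names are hypothetical and chosen only to show how an acceptable overall error can hide larger errors at the extremes.

```python
import math


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def errors_by_range(actual, predicted, cut_points):
    """MAE within bands of the actual values; cut_points are ascending edges."""
    edges = [float("-inf")] + list(cut_points) + [float("inf")]
    results = {}
    for low, high in zip(edges[:-1], edges[1:]):
        pairs = [(a, p) for a, p in zip(actual, predicted) if low <= a < high]
        if pairs:
            band_actual, band_predicted = zip(*pairs)
            results[(low, high)] = mean_absolute_error(band_actual, band_predicted)
    return results


# Hypothetical observed and predicted utility values.
actual = [-0.10, 0.20, 0.55, 0.70, 0.95, 1.00]
predicted = [0.15, 0.30, 0.55, 0.68, 0.85, 0.88]

overall_mae = mean_absolute_error(actual, predicted)
overall_rmse = root_mean_squared_error(actual, predicted)
band_mae = errors_by_range(actual, predicted, cut_points=[0.5, 0.9])
```

In this illustration the overall mean absolute error is about 0.10, but it is 0.175 for actual values below 0.5 and 0.11 for values of 0.9 or above, compared with 0.01 in the middle band.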

Applying Health State Utility Values In Cost-Effectiveness Models

Baseline Or Counterfactual Health States

Decision-analytic models in healthcare typically assess the benefits of interventions in terms of the incremental QALY gain associated with alleviating a health condition or avoiding a clinical event. To calculate this, in addition to requiring the HSUVs associated with the condition or event, the analyst will also need the baseline or counterfactual HSUVs to represent the HRQL associated with not having the particular health condition or event. For example, if modeling the benefits of introducing a screening program for breast cancer, analysts would require the mean HSUVs from a cohort with a history of breast cancer (including longer term data to model any potential changes in HRQL as the condition progresses) and the mean HSUVs for patients who do not have breast cancer. Similarly, when modeling an intervention that has the potential to avoid subsequent cardiovascular events in patients with acute coronary syndrome, for example, a stroke, the analyst would need to know the mean HSUV for patients who have experienced a stroke and the mean HSUVs for individuals who have not experienced a stroke but have a history of acute coronary syndrome.

A patient without a particular condition is unlikely to have an HSUV of one, so assuming full health as the baseline would tend to overstate the benefit of avoiding the condition. A better approach would be to use a normative dataset. Furthermore, the values of those with a condition are likely to change over time. HSUVs from the general population, for example, show a negative relationship with age (i.e., as age increases, the average HRQL decreases, Figure 7). This is due to several factors, such as a general decline in health directly attributable to age and an increase in prevalent health conditions, which are in general correlated with age. As many cost-effectiveness models use a lifetime horizon to accrue the costs and QALYs associated with interventions, it is reasonable to assume that the baseline or counterfactual HSUVs within the model will not remain constant over the full horizon modeled. Although there is a substantial volume of HSUVs in the literature describing the HRQL for specific health conditions, corresponding data for individuals without a specific health condition are more difficult to obtain without access to very large datasets. Unless the health condition is particularly prevalent, or unless it has a substantial effect on HRQL, removing the cohort with a specific health condition will not have a substantial effect on the mean HSUVs obtained from the general population. In many instances, if condition-specific baseline data are not available, it is therefore possible to use data from the general population as proxy scores to represent the baseline or counterfactual HSUVs in the decision-analytic model.

Cost-Effectiveness Modeling Figure 7

Adjusting Or Combining Health States

Healthcare decision-analytic models depict the typical clinical pathway followed by patients in normal clinical practice. As such they can become quite complex involving multiple health states, which represent the primary health condition with additional health states representing either comorbidities (e.g., when an additional condition exists concurrently alongside the primary condition), or an adverse event associated with the intervention or treatment (e.g., nausea is a side effect of treatments for cancer, whereas patients receiving aspirin for hypertension are at increased risk of hemorrhagic strokes). Ideally, each individual health state within a decision-analytic model would be populated with HSUVs obtained from cohorts with the exact condition defined by the health state. For example, it has been demonstrated that statins, which are typically given to manage cholesterol levels in patients with or at high risk of cardiovascular disease, have a beneficial effect on inflammation, thus may provide an additional benefit in patients with rheumatoid arthritis. To assess the benefits of statin treatment in a cohort with both cardiovascular disease and rheumatoid arthritis, the analyst would need HSUVs obtained from patients with both these conditions. However, many clinical effectiveness studies use very strict exclusion criteria relating to comorbidities and/or concurrent medications. As a consequence, the people who represent typical patients with comorbidities are excluded from studies, and analysts frequently combine the mean data obtained from cohorts with the single conditions to estimate the mean HSUVs for a cohort with more than one condition.

The methods used to combine the data can have a substantial effect on the results generated from decision-analytic models, and it has been shown that the results can vary to such an extent that they could potentially influence a policy decision based on a cost per QALY threshold. There are a number of different ways to estimate the mean HSUV for the combined health condition using the mean HSUVs from the single health conditions. Traditional techniques include the additive, multiplicative, and minimum methods. The first two apply a constant absolute and a constant relative effect, respectively, whereas the minimum method ignores any additional effect on HRQL associated with the second health condition and uses the minimum of the mean HSUVs obtained from cohorts with the single conditions, as shown in Figure 8. Additional methods that have recently been tested include regressing the mean HSUVs from cohorts with single conditions onto the mean HSUVs from cohorts with comorbidities using ordinary least squares regression. Although this research is in its infancy, the early results look promising. However, based on the current evidence base, researchers recommend that the multiplicative method is used to estimate HSUVs for comorbidities, using an age-adjusted baseline as a minimum when calculating the multiplier used.

Cost-Effectiveness Modeling Figure 8

Worked Example

Females with condition A have a mean EQ-5D score of 0.69 and a mean age of 73 years, and females with condition B have a mean EQ-5D score of 0.70 and a mean age of 80 years. Using the data from the general population (Figure 8) as the baseline, these data are combined to determine what the EQ-5D score is for females with both condition A and condition B. Using data from the general population, at the age of 73 years and 80 years, the mean EQ-5D score for females is 0.7550 and 0.7177, respectively. The multipliers for conditions A and B are 0.9138 ( = 0.69/0.7550) and 0.9754 ( = 0.70/0.7177). The baseline data are then adjusted using these multipliers to estimate the age-adjusted EQ-5D score for the combined conditions A and B as shown in Figure 8.
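The arithmetic of the worked example, together with the additive and minimum methods for comparison, can be sketched as follows. The condition and baseline values are those given above; the age at which the combined cohort is evaluated (80 years), the function names, and the exact form of the additive adjustment are assumptions made for illustration. The multipliers computed from the rounded inputs agree with those quoted above to about three decimal places.

```python
def multiplier(condition_utility, baseline_utility):
    """Relative effect of a condition against an age-matched baseline."""
    return condition_utility / baseline_utility


def combine_multiplicative(baseline, utility_a, baseline_a, utility_b, baseline_b):
    # Apply each condition's relative effect to the baseline in turn.
    return (
        baseline
        * multiplier(utility_a, baseline_a)
        * multiplier(utility_b, baseline_b)
    )


def combine_additive(baseline, utility_a, baseline_a, utility_b, baseline_b):
    # Subtract each condition's absolute decrement from the baseline.
    return baseline - (baseline_a - utility_a) - (baseline_b - utility_b)


def combine_minimum(utility_a, utility_b):
    # Ignore any additional effect of the second condition.
    return min(utility_a, utility_b)


# Values from the worked example (females, general population baseline).
utility_a, baseline_a = 0.69, 0.7550  # condition A, mean age 73 years
utility_b, baseline_b = 0.70, 0.7177  # condition B, mean age 80 years

multiplier_a = multiplier(utility_a, baseline_a)
multiplier_b = multiplier(utility_b, baseline_b)

# Assumption: the combined cohort is evaluated at age 80 (baseline 0.7177).
baseline_combined = 0.7177
combined_multiplicative = combine_multiplicative(
    baseline_combined, utility_a, baseline_a, utility_b, baseline_b
)
combined_additive = combine_additive(
    baseline_combined, utility_a, baseline_a, utility_b, baseline_b
)
combined_minimum = combine_minimum(utility_a, utility_b)
```

Under these assumptions the three methods give roughly 0.640 (multiplicative), 0.635 (additive), and 0.69 (minimum) for the combined condition at age 80.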

Adverse Events

When considering adverse events for inclusion in cost-effectiveness models, it is important to distinguish between acute events and chronic sequelae. Although the inclusion of decrements on HRQL associated with grade 3–4 adverse events is particularly important, the cohort used for the main HSUVs may have included a proportion of patients who had experienced grade 1–2 adverse events and care should be taken to ensure these are not double counted. As in the preceding section, treating the decrement associated with the adverse event as a constant value may be inappropriate and based on the current evidence, the HSUVs should be multiplied (adjusting for age wherever possible) when combining these data.


All results generated from cost-effectiveness models used to inform policy decision making in healthcare are subject to uncertainty. The uncertainty is examined and reported using sensitivity analyses. One-way sensitivity analysis is a procedure in which the central estimates for key parameters in the model are varied one at a time (generally using the 95% confidence intervals), which informs readers which variables drive the results generated by the model. Probabilistic sensitivity analysis is a method of varying all variables simultaneously to assess the overall uncertainty in the model. A large number of Monte Carlo simulations (e.g., 5000) are run, each using random numbers to sample a value for every parameter from its distribution. The model generates a new result for each set of sampled values, and each of the 5000 results is stored. The recorded results are then used to illustrate the overall variability in the model results.
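Both procedures can be sketched with a deliberately simple model. Everything in the block below is hypothetical and chosen only for illustration: the toy model, the parameter names, the central estimates, and the standard errors. Normal distributions are used for all parameters to keep the sketch short.

```python
import random
import statistics


def toy_model(delta_cost, utility_gain, years):
    """Hypothetical model returning (incremental cost, incremental QALYs)."""
    return delta_cost, utility_gain * years


# Hypothetical central estimates and standard errors.
BASE = {"delta_cost": 2450.0, "utility_gain": 0.05, "years": 2.0}
SE = {"delta_cost": 300.0, "utility_gain": 0.01}


def one_way(name):
    """Vary one parameter to its 95% confidence limits, holding the rest fixed,
    and return the cost per QALY at each limit."""
    out = {}
    for label, z in (("low", -1.96), ("high", 1.96)):
        params = dict(BASE, **{name: BASE[name] + z * SE[name]})
        cost, qalys = toy_model(**params)
        out[label] = cost / qalys
    return out


def run_psa(n_sims=5000, seed=1):
    """Sample all uncertain parameters simultaneously and store every result."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        sampled = dict(BASE)
        for name, se in SE.items():
            sampled[name] = rng.gauss(BASE[name], se)
        results.append(toy_model(**sampled))
    return results


owsa_utility = one_way("utility_gain")
psa_results = run_psa()
mean_cost = statistics.fmean([c for c, _ in psa_results])
mean_qalys = statistics.fmean([q for _, q in psa_results])
```

The stored pairs in `psa_results` are the raw material for a scatter plot of incremental costs against incremental QALYs.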

Figure 9 shows a scatter plot of the incremental costs (y-axis) and incremental QALYs (x-axis) generated from a cost-effectiveness model. The red points represent the individual results generated when there is relatively little uncertainty in the parameters used in the model. The blue points represent the individual results generated when there is considerable uncertainty, and thus cover a broader area. The mean result (£24 500 per QALY) is the same in both cases. Using a cost per QALY threshold of £30 000 per QALY (the diagonal line), 41% of the results from the model with a high level of uncertainty lie above this threshold, compared with just 7% of the results from the model with less uncertainty.

Cost-Effectiveness Modeling Figure 9
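The proportion of simulated results lying above the threshold line can be computed directly from the stored results. The sketch below is illustrative only: the function name and the four example points are hypothetical and are not the data behind Figure 9.

```python
def share_above_threshold(results, threshold):
    """Proportion of (incremental cost, incremental QALY) pairs lying above the
    threshold line. Comparing cost with threshold * QALYs, rather than comparing
    ratios, also handles points with zero or negative incremental QALYs."""
    above = sum(1 for d_cost, d_qalys in results if d_cost > threshold * d_qalys)
    return above / len(results)


# Four hypothetical simulation results: (incremental cost, incremental QALYs).
example = [(1000.0, 0.05), (1000.0, 0.02), (500.0, 0.05), (2000.0, 0.05)]
share = share_above_threshold(example, threshold=30000.0)
```

Two of the four hypothetical points lie above the line, so `share` is 0.5. Applied to the full set of probabilistic results, the same calculation yields proportions of the kind quoted above.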

When looking at the uncertainty associated with the HSUVs, the distribution used to characterize the variables for the probabilistic sensitivity analyses should be chosen to represent the available evidence rather than selected arbitrarily. HRQL data, in particular preference-based utility data, are generally not normally distributed. They are typically skewed, bimodal or trimodal, bounded by the limits of the preference-based index, and can involve negative values representing health states considered to be worse than dead. Despite this, in the majority of decision-analytic models, the uncertainty in the mean HSUV can be adequately described by sampling from a normal distribution. Exceptions to this rule include patient-level simulation models that use data from cohorts with wide variations in HSUVs and a relatively low or high mean value. In these cases an alternative approach would be to describe the utility values as decrements from full health (i.e., 1 minus the HSUV) and then sample from a lognormal or gamma distribution, which would give a sampled utility decrement on the interval (0, ∞). If a lower constraint is required (e.g., −0.594 for the UK EQ-5D index), the standard beta distribution could be scaled upwards using a height parameter (λ), producing a distribution on a (0, λ) scale.
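The two alternatives can be sketched as follows. The block assumes that the distributions are fitted by the method of moments from a mean and a standard error, which is a common choice but is not stated in the text. The function names and the input values (a mean HSUV of 0.30 with a standard error of 0.05) are hypothetical.

```python
import random


def gamma_params(mean, se):
    """Method-of-moments shape and scale for a gamma distribution."""
    return (mean / se) ** 2, se ** 2 / mean


def beta_params(mean, se):
    """Method-of-moments alpha and beta for a standard beta distribution."""
    common = mean * (1.0 - mean) / se ** 2 - 1.0
    return mean * common, (1.0 - mean) * common


def sample_utility_gamma(rng, mean_hsuv, se):
    """Sample the decrement from full health on (0, inf); the utility is 1 minus it."""
    shape, scale = gamma_params(1.0 - mean_hsuv, se)
    return 1.0 - rng.gammavariate(shape, scale)


def sample_utility_scaled_beta(rng, mean_hsuv, se, lower=-0.594):
    """Sample the decrement on (0, lam), where lam = 1 - lower, so that the
    utility cannot fall below the lower limit of the index."""
    lam = 1.0 - lower
    alpha, beta = beta_params((1.0 - mean_hsuv) / lam, se / lam)
    return 1.0 - lam * rng.betavariate(alpha, beta)


rng = random.Random(42)
gamma_draws = [sample_utility_gamma(rng, 0.30, 0.05) for _ in range(5000)]
beta_draws = [sample_utility_scaled_beta(rng, 0.30, 0.05) for _ in range(5000)]
```

Under these assumptions the gamma draws can never exceed 1 but have no lower limit, whereas the scaled beta draws are confined to the range of the index.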


The use of HSUVs in cost-effectiveness models has not received much attention in the literature, and there are often no relevant HSUVs to be found in the literature, observational sources, or even trials. This article has provided practical guidance to those seeking to build cost-effectiveness models. In the near future, it is expected that there will be further developments in the field, including in methods of mapping, in the synthesis of HSUVs across studies, and in the measures themselves. Policymakers' requirements may also change over time.

