Health Econometrics

Empirical analysis of data describing relationships involving health – health econometrics – arises in a wide variety of important scholarly and policy contexts. The econometric analysis of data on topics as diverse as health insurance, substance use, provider behavior, chronic disease, evaluation, market structures, regulation, medical technologies, labor supply, and others is encountered routinely in every issue of leading field journals like the Journal of Health Economics, Health Economics, and others.

Reflecting the increased prominence of both conceptual and applied health econometrics research is an increasing array of professional activities devoted specifically to health econometrics. For over 20 years, researchers at the University of York and local sites all across the European Union have organized annual meetings on health economics and health econometrics. More recently, specialized health econometrics conferences and workshops have regularly been organized in the US, Italy, and elsewhere. Beyond these, sessions and preconference courses dedicated to health econometrics have been among the most popular and well-attended activities at meetings of major health economics organizations like the International Health Economics Association, the American Society of Health Economists, and others.

The methods of health econometrics are deployed to address a wide variety of questions. At their essence, many are concerned with the estimation of treatment effects, broadly construed. These can arise in narrow small-N contexts like the evaluation of clinical interventions as well as in broad population or large-N contexts like the implementation of tax, regulatory, or other public policy interventions. Recent emphases on ‘comparative effectiveness’ and the empirical methods used to understand the relative value of interventions have underscored the importance of linking relevant decision-making contexts to reliable and robust analytical methods that can be deployed to inform such decisions.

How to deliver informative estimates of treatment effects from the observational data often used to address such questions is one of the central problems of applied health econometrics. Such observational data are now drawn from an increasingly wide set of sources: population and community surveys, administrative data describing program participation, electronic medical records, and others. Regardless of the particular data, there is widespread recognition that many of the treatments at issue are endogenous with respect to the outcomes of interest (i.e., are correlated with unobserved determinants of such outcomes, known as ‘confounding’ in the epidemiological literature). To circumvent the problems that arise with endogenous treatments, quasi-experimental methods are often utilized. Instrumental variable methods, longitudinal or panel data analyses, and others are deployed with assumptions sufficient to generate consistent estimates of parameters of interest (whether the assumptions are reasonable and/or hold in the context of the particular study are separate but important questions). One such assumption is the correct specification of a model for the data at hand. Economic theory, or any other theory for that matter, often has a hard time predicting even the directions of covariate effects, and it provides little guidance as to the appropriate functional form for the data at hand. Therefore, a good deal of the health econometrics literature has focussed on ascertaining appropriate models using various goodness of fit measures. A good discussion of these issues can be found in the chapter by Manning on modeling healthcare expenditures, which are known for their idiosyncrasies. Appropriate specification of a model is then followed by identification of the parameter of interest, often a treatment effect parameter.
Geographic variation in constraint sets has been one prominent identification strategy (Rosenzweig and Schultz, 1983), and indeed was – to our knowledge – the approach that introduced instrumental variable analysis to clinical and related audiences (McClellan et al., 1994, in the context of differential distance instruments). More recently, approaches like propensity score or control function methods have become popular in health services research even though the extent to which such methods fail to circumvent problems arising from confounding is often underappreciated.
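
The endogeneity problem and the instrumental variable remedy described above can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: the data-generating process, the coefficient values, and the names `simulate`, `ols_estimate`, and `iv_estimate` are hypothetical and are not drawn from any study cited here. The treatment depends on an unobserved confounder that also affects the outcome, and the instrument is assumed to be valid (it shifts treatment but affects the outcome only through treatment).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, true_effect=1.0, seed=0):
    """Hypothetical data-generating process with an endogenous treatment."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)   # unobserved confounder (e.g., illness severity)
        zi = rng.gauss(0, 1)  # instrument (e.g., differential distance)
        # Treatment depends on the instrument AND on the confounder.
        di = 0.8 * zi + u + rng.gauss(0, 1)
        # Outcome depends on treatment and on the same confounder.
        yi = true_effect * di + 2.0 * u + rng.gauss(0, 1)
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y


z, d, y = simulate()

# Naive regression slope of outcome on treatment: biased by confounding.
ols_estimate = cov(d, y) / cov(d, d)

# Simple instrumental variable estimator: cov(z, y) / cov(z, d).
iv_estimate = cov(z, y) / cov(z, d)

print(f"true effect: 1.0, OLS: {ols_estimate:.2f}, IV: {iv_estimate:.2f}")
```

Under these assumed parameters the naive slope converges to roughly 1.76 in large samples rather than the true value of 1, whereas the instrumental variable estimate should be close to 1. The result depends entirely on the assumed validity of the instrument, which in real applications is exactly the kind of assumption whose reasonableness must be argued separately.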

In this context it is often useful to bear in mind that the ‘gold standard’ of the randomized clinical trial against which observational data analysis is frequently held is itself an emperor that often wears little clothing. Within-trial behaviors like attrition, non-adherence, etc. (Efron and Feldman, 1991; Lamiraud and Geoffard, 2007) will typically jeopardize both the internal and external validity of results and inferences based on such data. Flora are typically compliant with treatment protocols, but human fauna will often fail to be. Whereas the randomized trial provides a solid conceptual foundation for thinking about an ideal data-generating experiment (see Permutt and Hebel, 1989, for a specific example executed in an instrumental variable context), its actual implementation often falls short of the ideal. When contemplating the analysis of health (or any other) data, it can generally be more helpful to appreciate that such data are themselves often generated by purposive decisions of data suppliers and demanders (Philipson, 1997).

In many instances, the particular nature of the data to be analyzed by health econometricians sets health econometrics apart from other domains of applied econometrics. Many of the measurement and sampling approaches used to describe health-related phenomena as well as the consumer, producer, and market decisions and processes from which such data arise are more or less unique to health economics. Econometric methods used to analyze such outcomes data – censored, bounded, discrete, ordered, etc. – have often been developed by analysts working primarily in health economics (Newhouse, 1987). Even so, health econometricians have sometimes failed to be sufficiently sensitive to the fundamental measurement features of the data they analyze, e.g., estimating moments of ordinal scale outcomes like self-reported health status obtained using Likert scale or analogous strategies (Stevens, 1946).
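
The concern about estimating moments of ordinal outcomes can be made concrete with a small example. The sketch below uses invented responses on a five-category self-reported health item; the two groups, the codings, and the helper `coded_mean` are hypothetical. Because an ordinal scale fixes only the order of the categories, any order-preserving recoding is equally legitimate, yet the comparison of group means can reverse under such a recoding.

```python
from statistics import mean

# Hypothetical responses: categories 1 (poor) through 5 (excellent).
group_a = [3, 3, 3, 3]
group_b = [2, 2, 2, 5]

# Two order-preserving numeric codings of the same five categories.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}


def coded_mean(responses, coding):
    """Mean of the responses after mapping categories to numeric codes."""
    return mean(coding[r] for r in responses)


for label, coding in [("equal spacing", equal_spacing),
                      ("stretched top", stretched_top)]:
    mean_a = coded_mean(group_a, coding)
    mean_b = coded_mean(group_b, coding)
    higher = "A" if mean_a > mean_b else "B"
    print(f"{label}: mean A = {mean_a}, mean B = {mean_b}, higher = {higher}")
```

With equal spacing group A has the higher mean, and with the stretched coding group B does, even though the underlying responses are unchanged. Statistics that depend only on order, such as medians or the category frequencies themselves, are not affected by the choice of coding.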

Regardless of the particular questions at hand, the ability to move from conceptualizing such analysis to implementing it has required both individual-level (or micro-) data describing the choices and outcomes of health-producing consumers and suppliers observed over space, over time, or both, as well as a rapid evolution of analytical and data management software (e.g., Stata, Limdep, R, and others; see Renfro, 2004 for a general discussion) that has permitted such data to be analyzed using state-of-the-art methodologies. Given the sensitive nature of many topics with which health economists deal at the household, institution, market, and population levels, ideal data may sometimes not be available for analysis owing to a variety of privacy protection protocols that have legal standing in most countries. Nonetheless, the progress that has been made in advancing empirical understanding of such phenomena is remarkable.

Interested readers may find as a useful starting point Andrew Jones’s (2000) seminal and comprehensive overview of health econometrics topics. The articles in this section complement in some respects Jones’s overview and, in the light of the ongoing rapid pace of conceptual and methodological developments in the field, bring some of the topics he addressed over 10 years ago into newer light.


  1. Efron, B. and Feldman, D. (1991). Compliance as an explanatory variable in clinical trials. Journal of the American Statistical Association 86, 9–17.
  2. Jones, A. M. (2000). Health econometrics. In Culyer, A. J. and Newhouse, J. P. (eds.) Handbook of health economics, 1st ed., vol. 1, ch. 6, pp. 265–344. Amsterdam: Elsevier.
  3. Lamiraud, K. and Geoffard, P. Y. (2007). Therapeutic non-adherence: A rational behavior revealing patient preferences? Health Economics 16, 1185–1204.
  4. McClellan, M., McNeil, B. J. and Newhouse, J. P. (1994). Does more intensive treatment of acute myocardial infarction in the elderly reduce mortality? Analysis using instrumental variables. Journal of the American Medical Association 272, 859–866.
  5. Newhouse, J. P. (1987). Health economics and econometrics. American Economic Review Papers and Proceedings 77, 269–274.
  6. Permutt, T. and Hebel, J. R. (1989). Simultaneous-equation estimation in a clinical trial of the effect of smoking on birth weight. Biometrics 45, 619–622.
  7. Philipson, T. (1997). Data markets and the production of surveys. Review of Economic Studies 64, 47–72.
  8. Renfro, C. G. (2004). Econometric software: The first fifty years in perspective. Journal of Economic and Social Measurement 29, 9–107.
  9. Rosenzweig, M. R. and Schultz, T. P. (1983). Estimating a household production function: Heterogeneity, the demand for health inputs, and their effects on birth weight. Journal of Political Economy 91, 723–746.
  10. Stevens, S. S. (1946). On the theory of scales of measurement. Science 103, 677–680.