British Academy: The UK's National Academy for the Humanities and Social Sciences
Enquiry, Evidence and Facts: An Interdisciplinary Conference
Model Contingent Interpretation of Evidence
Professor Andrew Chesher
Centre for Microdata Methods and Practice & Institute for Fiscal Studies (IFS)
University College London, Gower Street, London WC1E 6BT
An abstract presented to the conference
‘Enquiry, Evidence and Facts: An Interdisciplinary Conference’
at the British Academy, London, on 14 December 2007
Biography
Andrew Chesher is the Director of the ESRC Research Centre for Microdata Methods and Practice and Professor of Economics at University College London.
Before he joined UCL in 1999 he was, for 15 years, Professor of Econometrics at the University of Bristol. He is a Research Fellow of the Institute for Fiscal Studies, Fellow of the British Academy, Fellow of the Econometric Society and Co-Editor of the Econometric Society Monograph Series.
His research interests cover many areas of microeconometric theory and practice. Since around 2000 much of his research has focused on the analysis of the identifying power of weakly restrictive semi- and non-parametric models of economic and social processes.
Abstract
In a number of fields, data (evidence) are generated by a process of which there is imperfect knowledge, and one wishes to extract information about some feature of that process from the data it generates. Data are realisations of random variables which have a probability distribution. Observationally equivalent processes generate identical distributions for observed variables. If a feature of interest varies across observationally equivalent processes then the feature is not identifiable. A model identifies a feature if the feature's value does not vary within sets of observationally equivalent processes that the model admits. Weakly restrictive models may not be falsifiable, so interpretation of data is contingent on the truth of a model's restrictions.
One central research question is: what knowledge of a process is in principle obtainable from the data (evidence) it generates? In particular, what constitutes a weakly restrictive model for particular problems, and how may inference be conducted in such models? Formal models of processes can provide restrictions which can be sufficient to identify features of a data generating process (structural features).
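The definitions of observational equivalence and identification given in this abstract can be written compactly. The following is only one possible formalisation: the symbols S, F_S, M and θ are notation assumed here for illustration and do not appear in the abstract itself (the display uses the amsmath package).

```latex
% Notation below is assumed for illustration; it is not taken from the abstract.
% S            : a data generating process (structure)
% F_S          : distribution of the observed variables generated by S
% \mathcal{M}  : a model, i.e. the set of processes that it admits
% \theta(S)    : the value of a feature of interest at the process S
\begin{align*}
  &\text{Observational equivalence:}
    && S \sim S' \iff F_S = F_{S'} \\
  &\text{Identification of } \theta \text{ by } \mathcal{M}\text{:}
    && \forall\, S, S' \in \mathcal{M}:\;
       F_S = F_{S'} \implies \theta(S) = \theta(S')
\end{align*}
```

On this reading, a feature is not identified by a model when two processes that the model admits generate the same distribution of observables but assign the feature different values.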
This project has investigated the nature of the restrictions that are required to identify interesting features and has sought to determine minimally restrictive models for particular structural features. A related but separate research question is how data can be processed to give information about identified structural features. The project has developed methods for estimation and inference in the context of models embodying weak identifying restrictions. It builds on a stream of research on identification and inference that started in econometrics in the 1920s and has been pursued since in economics and other areas of social science.
To convey the main ideas behind this project, we consider the following example. Suppose that there are observable random variables (X, Y) and a vector of covariates Z. Consider models defined by:
Y = h(X,U)
X = g(Z,V)
subject to (weak) restrictions, e.g. additivity, monotonicity, multiplicity of sources of variation. There is interest in features of h. It is necessary to make assumptions regarding the stochastic relationships between variables. For example, one may assume that unobservables (U, V) and Z are (locally) independent.
As an empirical example, consider Y = wages, X = university education, U = unobserved factors that affect wages, V = unobserved factors that affect schooling. Most often, U and V are correlated. This suggests that, in order to learn about the causal relationship between wages and education, we need to find a variable Z that is locally independent of (U,V) (an instrumental variable). One variable suggested in the economics literature, using US data, is the distance to the nearest university when an individual was of college enrolment age.
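The role of the instrument can be illustrated with a small simulation. The sketch below is not the project's method: it assumes a linear, additive special case of the model, Y = beta*X + U and X = pi*Z + V, with (U, V) jointly normal, correlated, and fully independent of Z. The parameter values, function names and the use of NumPy are all assumptions made for illustration. It compares the ordinary least squares slope with the simple instrumental variables ratio cov(Z, Y) / cov(Z, X).

```python
import numpy as np


def simulate(n=200_000, beta=1.0, pi=1.0, rho=0.8, seed=0):
    """Draw (X, Y, Z) from an assumed linear special case of Y = h(X,U), X = g(Z,V)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument, drawn independently of (U, V)
    cov_uv = np.array([[1.0, rho], [rho, 1.0]])
    # U and V are correlated, which is what makes X endogenous.
    u, v = rng.multivariate_normal(np.zeros(2), cov_uv, size=n).T
    x = pi * z + v      # assumed linear form of g
    y = beta * x + u    # assumed linear form of h
    return x, y, z


def slope_ols(x, y):
    """Least squares slope of Y on X: cov(X, Y) / var(X)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def slope_iv(x, y, z):
    """Simple instrumental variables slope: cov(Z, Y) / cov(Z, X)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


x, y, z = simulate()
ols = slope_ols(x, y)
iv = slope_iv(x, y, z)
print(f"true beta = 1.0, OLS = {ols:.3f}, IV = {iv:.3f}")
```

In this assumed design the least squares slope converges to beta + cov(X, U) / var(X) = 1 + 0.8 / 2 = 1.4 rather than to beta, while the instrumental variables ratio converges to beta. The nonseparable models studied in the project are much less restrictive than this linear case.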
This project has developed identification results and inference methods for models including the one described above. Some of those results are contained in the following papers.
Selected Research Outcomes
- Andrew Chesher, Endogeneity and discrete outcomes, March 2007, Cemmap Working Papers, CWP05/07.
This paper studies models for discrete outcomes which permit explanatory variables to be endogenous. These models have only set identifying power when the outcome is discrete.
- Andrew Chesher, Instrumental values, January 2007, Journal of Econometrics, July, Vol. 139, No. 1, pp. 15-34.
This paper studies identification of partial differences of nonseparable structural functions.
- Andrew Chesher, Nonparametric identification under discrete variation, September 2005, Econometrica, Vol. 73, No. 5, pp. 1525-1550.
This paper provides weak conditions under which there is nonparametric interval identification of local features of a structural function that depends on a discrete endogenous variable and is nonseparable in latent variates.
- Andrew Chesher, Identification in nonseparable models, September 2003, Econometrica, Vol. 71, No. 5, pp. 1405-1441.
Weak nonparametric restrictions are developed, sufficient to identify the values of derivatives of structural functions in which latent random variables are nonseparable.
- Joel L. Horowitz and Sokbae Lee, Nonparametric instrumental variables estimation of a quantile regression model, July 2007, Econometrica, Vol. 75, No. 4, pp. 1191-1208.
This paper considers nonparametric estimation of a regression function that is identified by requiring a specified quantile of the regression "error" conditional on an instrumental variable to be zero.
- Sokbae Lee, Identification of a competing risks model with unknown transformations of latent failure times, December 2006, Biometrika, Vol. 93, No. 4, pp. 996-1002.
This paper is concerned with identification of a competing risks model with unknown transformations of latent failure times.