Unobserved heterogeneity in logistic regression

Abstract

Regression is used to estimate the relation between explanatory variables and an explained variable. Often we do not observe all the variables that influence the explained variable. In a linear regression model this does not bias the estimated effects of the measured variables, as long as the expected value (mean) of the unmeasured variables is zero and does not change across values of the measured variables (i.e. the unmeasured and measured explanatory variables are uncorrelated). The idea behind this is simple: the estimated regression will sometimes overestimate and sometimes underestimate the explained variable, but these positive and negative errors cancel each other out. Unfortunately, this canceling out does not occur in logistic regression: leaving out a variable changes the estimated effects of the measured variables even when the two are uncorrelated. This paper illustrates this using a simple example.
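The claim can be checked with a small exact calculation that needs no simulated data. The sketch below is only an illustration and is not the example used in the paper: the binary measured variable x, the unmeasured variable u (taking the values -1 and 1 with equal probability, independently of x) and all coefficient values are assumptions chosen for convenience. It averages the expected outcome over u and compares the effect of x that remains with the effect of x in the full model.

```python
import math


def expit(z):
    """Inverse logit: turn a log odds into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


# Assumed unmeasured variable: mean zero and independent of x.
U_VALUES = (-1.0, 1.0)
U_PROBS = (0.5, 0.5)


def average_over_u(conditional_mean, x):
    """Expected outcome given x only, i.e. with u averaged out."""
    return sum(p * conditional_mean(x, u) for u, p in zip(U_VALUES, U_PROBS))


# Hypothetical linear model: E[y | x, u] = A0 + A1*x + A2*u
A0, A1, A2 = 0.5, 0.2, 0.1


def linear_mean(x, u):
    return A0 + A1 * x + A2 * u


# Hypothetical logistic model: Pr(y = 1 | x, u) = expit(B0 + B1*x + B2*u)
B0, B1, B2 = 0.0, 1.0, 2.0


def logistic_mean(x, u):
    return expit(B0 + B1 * x + B2 * u)


# Effect of x once u is left out of each model.
linear_effect = average_over_u(linear_mean, 1) - average_over_u(linear_mean, 0)
marginal_log_or = (logit(average_over_u(logistic_mean, 1))
                   - logit(average_over_u(logistic_mean, 0)))

print("linear:   effect of x with u =", A1, " without u =", linear_effect)
print("logistic: log odds ratio of x with u =", B1, " without u =", marginal_log_or)
```

With these assumed values the linear effect of x is the same whether or not u is in the model, because the contributions of u = -1 and u = 1 cancel. The log odds ratio of x without u is roughly 0.45, well below the value of 1 in the full model, even though u is independent of x: the logistic curve is nonlinear, so averaging probabilities over u is not the same as averaging log odds.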

Full text

Unobserved heterogeneity in logistic regression