Comparing observed and theoretical distributions

In this talk, I discuss tools for comparing the observed distribution of a variable with the theoretical distribution assumed by a model. I focus on the situation where a model assumes a certain distribution for the explained/dependent/y variable, and one or more parameters of that distribution, often the mean, change when one or more explanatory/independent/x variables change. The challenge is that the dependent variable, taken over all observations, then no longer follows a single theoretical distribution; it follows a mixture of these theoretical distributions, one component for each combination of values of the explanatory variables. In linear regression, we can circumvent this difficulty by looking at the residuals, which should follow a normal distribution. However, this workaround does not generalize to other models. I will show the margdistfit package, which graphically compares the observed distribution of the dependent variable with the theoretical mixture distribution implied by the model.
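As a rough illustration of the idea, the sketch below simulates a count outcome whose mean depends on x, fits a Poisson model, and then calls margdistfit as a post-estimation command. The data, seed, coefficients, and variable names are made up for this example and do not come from the talk. The pp option is assumed from memory of the package's help file, so check help margdistfit before relying on it.

```stata
* Illustration only: simulated data, not taken from the presentation
clear
set seed 12345
set obs 1000

* x shifts the mean of y, so y is marginally a mixture of Poisson distributions
generate x = rnormal()
generate y = rpoisson(exp(0.5 + 0.8*x))

* Fit the model that assumes a Poisson distribution conditional on x
poisson y x

* Compare the observed distribution of y with the mixture implied by the model
* (the pp option is an assumption; see help margdistfit for the available plots)
margdistfit, pp
```

Because y is Poisson only conditional on x, a plot of y against a single Poisson distribution would suggest a poor fit even though the model is correct. The comparison with the mixture avoids that problem.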

To view this presentation, take the following steps:

  1. Extract the files in the .zip file to a separate directory.
  2. Start Stata.
  3. Make sure that -hangroot- and -margdistfit- are installed. This can be done by typing ssc install hangroot and ssc install margdistfit.
  4. Use cd to change to the directory where you extracted the files from the .zip file.
  5. Type view main.smcl.
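
Steps 3 to 5 can be typed in the Stata Command window as sketched below. The directory path is a hypothetical placeholder; replace it with the directory where you extracted the files.

```stata
* Install the two user-written packages from SSC (only needed once)
ssc install hangroot
ssc install margdistfit

* Hypothetical path: replace with the directory holding the extracted files
cd "C:/talks/MLB_Berlin_2012"

* Open the presentation in the Stata Viewer
view main.smcl
```

If a package is already installed, ssc install reports this and leaves the existing copy unchanged. Adding the replace option forces an update.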

The presentation

MLB_Berlin_2012.zip