obsofint: Stata module to display observations of interest

Authors: Maarten L. Buis and Ronnie Babigumira


obsofint is intended to help scan a large number of variables for unusual observations. An observation is unusual if its value on a given variable is much smaller or much larger than the bulk of the data. These observations of interest are identified using a criterion adapted from the commonly used Tukey bounds, which are used in, for example, boxplots. In our experience, the Tukey bounds flag too many values as extreme if a variable is either skewed or has a spike (a value that is very common). The rule used in obsofint is meant to produce fewer false positives for skewed or spiked variables.
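For comparison, the classic Tukey bounds that the description refers to can be computed by hand. The following is a minimal sketch, assuming the conventional multiplier of 1.5 times the interquartile range; it is not the adapted rule that obsofint implements, which is not spelled out here. The local macro names (iqr, lower, upper) are illustrative only.

```stata
* Classic Tukey bounds for price in the auto data.
* NOTE: this is the textbook rule, NOT obsofint's adapted criterion.
sysuse auto, clear

* summarize, detail stores the quartiles in r(p25) and r(p75)
quietly summarize price, detail
local iqr   = r(p75) - r(p25)
local lower = r(p25) - 1.5*`iqr'
local upper = r(p75) + 1.5*`iqr'

* List the observations that fall outside the bounds
list make price if (price < `lower' | price > `upper') & !missing(price)
```

Running this on a skewed variable such as price and comparing the listing with the obsofint report for the same variable gives a sense of how the two rules differ.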

Supporting material


. sysuse auto, clear
(1978 Automobile Data)

. obsofint, idlist(make)
variable make is a string variable and will not be checked
-------------------------------------------------------------------------------
Observations of interest report for price; Price

  +---------------------------------+
  | obs_nr   make             price |
  |---------------------------------|
  |     12   Cad. Eldorado   14,500 |
  |     13   Cad. Seville    15,906 |
  +---------------------------------+