```                               Graphics in Stata
Graphs in the background

Maarten Buis
office F532
maarten.buis@uni.kn
office hours by appointment

```
```    histograms and bar graphs
```

```       Highlighting sub-populations in a histogram
```

```       Highlighting bars
```
```    scatterplots
```

```       Highlighting sub-populations in a scatter plot
```

```       Line on top of a scatter plot
```

```       By graph
```
```    line plots
```

```       Line graph
```

```-------------------------------------------------------------------------------
histograms and bar graphs
-------------------------------------------------------------------------------

Highlighting sub-populations in a histogram

We can compare distributions using histogram with the by() option.

```
```
. sysuse nlsw88, clear
(NLSW, 1988 extract)

. twoway histogram wage,     ///
>     width(1) freq          ///
>     name(hist1, replace)

```
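
The command above draws a single histogram for the whole sample. To compare distributions with by(), as the text mentions, something like the following could be used. This is a minimal sketch: splitting on collgrad and the graph name hist_by are assumptions, not part of the original log.

```stata
sysuse nlsw88, clear

* one panel per value of collgrad (assumed grouping variable)
twoway histogram wage,     ///
    width(1) freq          ///
    by(collgrad)           ///
    name(hist_by, replace)
```
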
```    We can also highlight the part of the histogram that consists of college graduates.

```
```
. twoway histogram wage,                       ///
>     width(1) freq                            ///
>     bcolor(gs14)  blw(*.4) blcolor(black) || ///
> histogram wage if collgrad,                  ///
>     width(1) freq legend(off)                ///
>     bcolor(gs6) blw(*.4) blcolor(black)      ///
>     title(distribution of hourly wage)       ///
>     name(hist2, replace)

```
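
The overlay only reads correctly if both histograms use the same bins. By default the first bin starts at the smallest observed value in the sample being plotted, and that value can differ between the full sample and the college graduates. A hedged variant that pins the bin origin with start() is sketched below; the value 0 is an assumption that works here because wages are positive, and the graph name is hypothetical.

```stata
sysuse nlsw88, clear

* same width() and start() in both layers, so the bars line up
twoway histogram wage,                         ///
    width(1) start(0) freq                     ///
    bcolor(gs14) blw(*.4) blcolor(black) ||    ///
histogram wage if collgrad,                    ///
    width(1) start(0) freq legend(off)         ///
    bcolor(gs6) blw(*.4) blcolor(black)        ///
    name(hist_aligned, replace)
```
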
```-------------------------------------------------------------------------------
histograms and bar graphs
-------------------------------------------------------------------------------

Highlighting bars

We can use graph hbar to display the mean wage by industry.

```
```
. sysuse nlsw88, clear
(NLSW, 1988 extract)

. graph hbar (mean) wage,                       ///
>     over(industry, descending sort(1))        ///
>     name(bar1, replace)

```
```
Sometimes we want to highlight one bar.

This is particularly common for presentations.

```
```
. collapse (mean) wage, by(industry)

. separate wage, by(industry == 12)

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
wage0           float   %9.0g                 wage, !(industry == 12)
wage1           float   %9.0g                 wage, industry == 12

. graph hbar (asis) wage0 wage1,                   ///
>     over(industry, descending sort(wage)) nofill ///
>     bar(1, bfcolor(none)) legend(off)            ///
>     ytitle(mean wage)                            ///
>     name(bar2, replace)

```
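
separate creates one variable per value of the by() expression, so each set of bars can be styled on its own. The sketch below repeats the steps above and additionally gives the highlighted bar a fill color; the color cranberry and the graph name are assumptions chosen for illustration.

```stata
sysuse nlsw88, clear
collapse (mean) wage, by(industry)

* wage0 holds all other industries, wage1 only industry 12
separate wage, by(industry == 12)

graph hbar (asis) wage0 wage1,                     ///
    over(industry, descending sort(wage)) nofill   ///
    bar(1, bfcolor(none))                          ///
    bar(2, color(cranberry))                       ///
    legend(off) ytitle(mean wage)                  ///
    name(bar_color, replace)
```
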
```-------------------------------------------------------------------------------
scatterplots
-------------------------------------------------------------------------------

Highlighting sub-populations in a scatter plot

We may want to highlight a certain sub-population in a scatter plot.

```
```
. sysuse nlsw88, clear
(NLSW, 1988 extract)

. graph drop _all

. scatter wage grade, ///
>     name(scatter1, replace)

```
```    Grade is discrete, so many points overlap, but we can show the individual
observations by adding the jitter() option.

```
```
. scatter wage grade, jitter(2) ///
>     name(scatter2, replace)

```
```    The jitter() option adds random noise in both the x and y directions, but
we only need jitter in the x-direction.

```
```
(2 missing values generated)

. scatter wage grade2, ///
>     name(scatter3, replace)

```
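
The log shows only the note about missing values, not the command that produced it. One way to jitter in the x-direction only is to add a little random noise to grade ourselves. This is a sketch under assumptions: the uniform noise, its width of 0.4, and the seed are illustrative choices; the variable name grade2 is taken from the later commands.

```stata
sysuse nlsw88, clear
set seed 12345                     // assumed seed, for reproducibility

* shift each grade by noise in (-0.2, 0.2); wage stays untouched
gen grade2 = grade + 0.4*(runiform() - 0.5)

scatter wage grade2, ///
    name(scatter3_sketch, replace)
```
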
```    It would be nice if we could copy the variable label from grade to the
jittered variable.

```
```

. scatter wage grade2, ///
>     name(scatter4, replace)

```
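
A variable label can be copied with the extended macro function `: variable label`. A minimal sketch, assuming the jittered variable is called grade2 as in the later commands; the jitter line itself and the graph name are illustrative assumptions.

```stata
sysuse nlsw88, clear
gen grade2 = grade + 0.4*(runiform() - 0.5)   // assumed jitter

* fetch the label of grade and attach it to grade2
local lab : variable label grade
label variable grade2 "`lab'"

scatter wage grade2, ///
    name(scatter4_sketch, replace)
```
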
```    It makes more sense to display wage on a log scale.

```
```
. scatter wage grade2, yscale(log) ///
>     name(scatter5, replace)

```
```    But we need to adjust the axis labels a bit.

```
```
. scatter wage grade2, yscale(log) ///
>     ylab(1.25 2.5 5 10 20 40)    ///
>     name(scatter6, replace)

```
```    Now we can highlight the black respondents.

```
```
. scatter wage grade2, yscale(log)       ///
>     msymbol(Oh) mcolor(gs10) ||        ///
> scatter wage grade2 if race == 2 ,     ///
>     msymbol(O) mcolor(gs2)             ///
>     yscale(log) legend(off)            ///
>     ylab(1.25 2.5 5 10 20 40)          ///
>     name(scatter7, replace)

```
```-------------------------------------------------------------------------------
scatterplots
-------------------------------------------------------------------------------

Line on top of a scatter plot

We can plot a line on top of a scatter plot.

```
```
. graph drop _all

. sysuse nlsw88, clear
(NLSW, 1988 extract)

.
(2 missing values generated)

.
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
       grade |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
     omitted |    0   15353.13   1025.547    0.000
      linear |    1   14345.16     17.577    0.001   1
       m = 1 |    2   14345.16     17.577    0.000   1
       m = 2 |    4   14327.59      0.000       --   3 3
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =      2,244
                                                LR chi2(2)        =    1025.55
                                                Prob > chi2       =     0.0000
Log likelihood =  -7163.793                     Pseudo R2         =     0.0668

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     grade_1 |   .5388408   .0445881    12.08   0.000     .4514496    .6262319
     grade_2 |  -.5224457   .0572099    -9.13   0.000     -.634575   -.4103163
       _cons |   1.035362   .0659192    15.71   0.000     .9061632    1.164562
------------------------------------------------------------------------------

. predict wagehat
(option n assumed; predicted number of events)
(2 missing values generated)

. twoway scatter wage grade2,                         ///
>            yscale(log) ylab(1.25 2.5 5 10 20 40) || ///
>        line wagehat grade, sort legend(off)         ///
>             name(fit1, replace)

```
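
The commands that produced the fractional polynomial output are not shown in the log. A sketch of how such a fit could be obtained, borrowing the fp prefix syntax used later in this document: shifting grade by 1 (ed2) is an assumption made here so that the covariate is strictly positive, and the names wagehat_s and fit_sketch are hypothetical.

```stata
sysuse nlsw88, clear
gen ed2 = grade + 1                // assumed shift: grade can be 0

* search over fractional polynomial powers of ed2 in a Poisson model
fp <ed2>, replace : poisson wage <ed2>
predict wagehat_s                  // predicted mean wage

twoway scatter wage grade, mcolor(gs10)               ///
           yscale(log) ylab(1.25 2.5 5 10 20 40) ||   ///
       line wagehat_s grade, sort legend(off)         ///
           lpattern(solid)                            ///
           name(fit_sketch, replace)
```
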
```    The scatter plot is a bit too dominant, so we grey out the markers.

```
```
. twoway scatter wage grade2, mcolor(gs10)            ///
>            yscale(log) ylab(1.25 2.5 5 10 20 40) || ///
>        line wagehat grade, sort legend(off)         ///
>             name(fit2, replace)

```
```    The line can be made solid.

```
```
. twoway scatter wage grade2, mcolor(gs10)            ///
>            yscale(log) ylab(1.25 2.5 5 10 20 40) || ///
>        line wagehat grade, sort legend(off)         ///
>            lpattern(solid)                          ///
>            name(fit3, replace)

```
```    We can also make the line thicker.

```
```
. twoway scatter wage grade2, mcolor(gs10)            ///
>            yscale(log) ylab(1.25 2.5 5 10 20 40) || ///
>        line wagehat grade, sort legend(off)         ///
>            lpattern(solid) lwidth(*3)               ///
>            name(fit4, replace)

```
```-------------------------------------------------------------------------------
scatterplots
-------------------------------------------------------------------------------

By graph

Let's look at the relationship between wage and education by industry.

```
```
. graph drop _all

. sysuse nlsw88, clear
(NLSW, 1988 extract)

(2 missing values generated)

.
. scatter wage gradej, by(industr, note("") compact) ///
>     yscale(log) ylab(1.25 2.5 5 10 20 40)          ///
>     name(byscatter1, replace)

```
```    We can sort the subgraphs by mean wage.

```
```
. bys industry : egen mwage = mean(wage)

. replace mwage = -mwage

. egen Industry = axis(mwage industry), label(industry)
(14 missing values generated)

. scatter wage gradej, by(Industr, note("") compact) ///
>     yscale(log) ylab(1.25 2.5 5 10 20 40)          ///
>     name(byscatter2, replace)

```
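
The axis() function is not part of official egen; as far as we know it is provided by the user-written egenmore package on SSC, so it may have to be installed first. The sketch below repeats the sorting step with that assumption made explicit. It plots the unjittered grade, and the graph name is hypothetical.

```stata
* ssc install egenmore             // run once if axis() is not found

sysuse nlsw88, clear
bys industry : egen mwage = mean(wage)
replace mwage = -mwage             // negate, so the highest mean comes first

* new variable ordered by mwage, carrying the value labels of industry
egen Industry = axis(mwage industry), label(industry)

scatter wage grade, by(Industry, note("") compact)   ///
    yscale(log) ylab(1.25 2.5 5 10 20 40)            ///
    name(bysorted_sketch, replace)
```
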
```    We can display the entire sample as a background graph.

```
```
. sysuse nlsw88, clear
(NLSW, 1988 extract)

. keep if !missing(industry, wage, grade)
(16 observations deleted)

. bys industry : gen id = _n

. fillin industry id

.
(7,646 missing values generated)

.
. bys industry : egen mwage = mean(wage)

. replace mwage = -mwage

. egen Industry = axis(mwage industry), label(industry)

.
. tempfile temp

. save `temp'

.
. keep wage industry gradej id

. reshape wide wage gradej, j(industry) i(id)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     9876   ->     823
Number of variables                   4   ->      25
j variable (12 values)         industry   ->   (dropped)
xij variables:
                                   wage   ->   wage1 wage2 ... wage12
-----------------------------------------------------------------------------

. merge 1:m id using `temp'

Result                           # of obs.
-----------------------------------------
not matched                             0
matched                             9,876  (_merge==3)
-----------------------------------------

.
. forvalues i = 1/12 {
2.         local backgr `backgr' scatter wage`i' gradej`i', ///
>                           msymbol(Oh) mcolor(gs12) ||
3. }

.
. twoway `backgr'                        ///
>            msymbol(O) mcolor(black)    ///
>            by(Industry, legend(off)    ///
>               compact note(""))        ///
>            yscale(log)                 ///
>            ylab(1.25 2.5 5 10 20 40)   ///
>            name(byscatter3, replace)

```
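
The trick above is that every by-group needs its own copy of all observations, stored in separate variables, so that the grey background can be drawn in each panel. The same idea on a smaller scale, as a sketch: the auto data, the grouping by foreign, and all names are assumptions chosen to keep the example short.

```stata
sysuse auto, clear
bys foreign : gen id = _n
fillin foreign id                  // every group gets every id
tempfile full
save `full'

* one row per id, with price/mpg of each group side by side
keep id foreign price mpg
reshape wide price mpg, i(id) j(foreign)

* attach the wide copy to every row of the long data
merge 1:m id using `full', nogenerate

twoway scatter price0 mpg0, msymbol(Oh) mcolor(gs12) ||   ///
       scatter price1 mpg1, msymbol(Oh) mcolor(gs12) ||   ///
       scatter price mpg, msymbol(O) mcolor(black)        ///
           by(foreign, legend(off) compact note(""))      ///
           name(bg_sketch, replace)
```

In each panel the two grey layers show all cars, and the black layer shows only the cars of that panel's group.
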
```-------------------------------------------------------------------------------
line plots
-------------------------------------------------------------------------------

Line graph

Let's look at a spaghetti plot.

```
```
. graph drop _all

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. gen ed2 = grade + 1
(2 missing values generated)

.
. bys industry : egen mwage = mean(wage)

. replace mwage = -mwage

. egen Industry = axis(mwage industry), label(industry)
(14 missing values generated)

.
. gen wagehat = .
(2,246 missing values generated)

. forvalues i = 1/12 {
2.         fp <ed2>, replace : poisson wage <ed2> if industry == `i'
3.     predict wage`i' if e(sample)
4.     replace wagehat = wage`i' if industry == `i'
5.     local gr `gr' line wage`i' grade , sort ||
6. }
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0     86.286      6.070    0.194
linear |    1     81.524      1.308    0.727   1
m = 1 |    2     81.178      0.962    0.618   0
m = 2 |    4     80.216      0.000       --   3 3
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =         17
LR chi2(2)        =       6.07
Prob > chi2       =     0.0481
Log likelihood = -40.108089                     Pseudo R2         =     0.0703

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   .0026413   .0015905     1.66   0.097     -.000476    .0057586
ed2_2 |  -.0008486   .0005314    -1.60   0.110    -.0018901    .0001929
_cons |   .7990031   .4674919     1.71   0.087    -.1172642     1.71527
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,229 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0     62.963      0.000    1.000
linear |    0     62.963      0.000    1.000   1
m = 1 |    1     62.963      0.000    1.000   -2
m = 2 |    2     62.963      0.000       --   -2 -2
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =          4
LR chi2(0)        =      -0.00
Prob > chi2       =          .
Log likelihood = -31.481608                     Pseudo R2         =    -0.0000

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |          0  (omitted)
ed2_2 |          0  (omitted)
_cons |   2.731089   .1276208    21.40   0.000     2.480957    2.981221
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,242 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0    178.590     26.619    0.000
linear |    1    157.834      5.863    0.118   1
m = 1 |    2    155.601      3.630    0.163   3
m = 2 |    4    151.971      0.000       --   -2 -1
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =         29
LR chi2(2)        =      26.62
Prob > chi2       =     0.0000
Log likelihood = -75.985466                     Pseudo R2         =     0.1491

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   659.2548   169.8953     3.88   0.000     326.2661    992.2436
ed2_2 |  -129.3304   29.20694    -4.43   0.000     -186.575   -72.08586
_cons |   7.916284   1.217063     6.50   0.000     5.530885    10.30168
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,217 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0   2371.656    191.684    0.000
linear |    1   2193.047     13.076    0.004   1
m = 1 |    2   2181.346      1.374    0.503   3
m = 2 |    4   2179.972      0.000       --   .5 .5
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =        366
LR chi2(2)        =     191.68
Prob > chi2       =     0.0000
Log likelihood = -1089.9859                     Pseudo R2         =     0.0808

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |  -8.405453   1.958896    -4.29   0.000    -12.24482   -4.566087
ed2_2 |   2.014951   .4257893     4.73   0.000      1.18042    2.849483
_cons |   13.60704   3.119369     4.36   0.000     7.493194     19.7209
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(1,880 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0    614.541      5.878    0.208
linear |    1    612.125      3.462    0.326   1
m = 1 |    2    610.776      2.113    0.348   -2
m = 2 |    4    608.664      0.000       --   -2 -2
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =         90
LR chi2(2)        =       5.88
Prob > chi2       =     0.0529
Log likelihood = -304.33176                     Pseudo R2         =     0.0096

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |  -994.1185   674.1365    -1.47   0.140    -2315.402    327.1648
ed2_2 |   439.7353   314.1076     1.40   0.162    -175.9043    1055.375
_cons |   1.626314   .7887816     2.06   0.039     .0803306    3.172298
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,156 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0   2156.287     73.605    0.000
linear |    1   2086.225      3.542    0.315   1
m = 1 |    2   2085.079      2.397    0.302   .5
m = 2 |    4   2082.682      0.000       --   3 3
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =        333
LR chi2(2)        =      73.60
Prob > chi2       =     0.0000
Log likelihood = -1041.3412                     Pseudo R2         =     0.0341

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   .0021576   .0005586     3.86   0.000     .0010628    .0032524
ed2_2 |  -.0006669    .000184    -3.62   0.000    -.0010276   -.0003062
_cons |   .8044136   .1905101     4.22   0.000     .4310207    1.177807
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(1,913 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0   1737.107     73.345    0.000
linear |    1   1666.249      2.487    0.478   1
m = 1 |    2   1665.643      1.882    0.390   0
m = 2 |    4   1663.762      0.000       --   -2 -2
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =        192
LR chi2(2)        =      73.35
Prob > chi2       =     0.0000
Log likelihood = -831.88086                     Pseudo R2         =     0.0422

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   648.0126   294.5432     2.20   0.028     70.71854    1225.307
ed2_2 |  -377.2919    139.903    -2.70   0.007    -651.4967   -103.0871
_cons |   4.053157   .3835413    10.57   0.000      3.30143    4.804884
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,054 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0    641.527     49.335    0.000
linear |    1    595.436      3.244    0.355   1
m = 1 |    2    595.436      3.244    0.197   1
m = 2 |    4    592.192      0.000       --   3 3
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =         86
LR chi2(2)        =      49.33
Prob > chi2       =     0.0000
Log likelihood =  -296.0959                     Pseudo R2         =     0.0769

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   .0025188   .0008848     2.85   0.004     .0007846    .0042531
ed2_2 |  -.0007744   .0002898    -2.67   0.008    -.0013425   -.0002064
_cons |   .7632976   .3117176     2.45   0.014     .1523424    1.374253
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,160 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0    423.050      4.860    0.302
linear |    1    418.671      0.480    0.923   1
m = 1 |    2    418.591      0.400    0.819   2
m = 2 |    4    418.191      0.000       --   -2 -1
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =         97
LR chi2(2)        =       4.86
Prob > chi2       =     0.0881
Log likelihood = -209.09541                     Pseudo R2         =     0.0115

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   130.0095   106.6677     1.22   0.223    -79.05536    339.0743
ed2_2 |  -30.43477   20.82213    -1.46   0.144    -71.24539    10.37584
_cons |   3.075397   .9746499     3.16   0.002     1.165118    4.985675
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,149 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0    100.921      5.516    0.238
linear |    1     99.544      4.139    0.247   1
m = 1 |    2     98.785      3.379    0.185   3
m = 2 |    4     95.405      0.000       --   -2 -2
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =         17
LR chi2(2)        =       5.52
Prob > chi2       =     0.0634
Log likelihood = -47.702622                     Pseudo R2         =     0.0547

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   1538.059   669.9163     2.30   0.022     225.0476    2851.071
ed2_2 |  -783.1929   337.3305    -2.32   0.020    -1444.348   -122.0373
_cons |   4.459298   1.073041     4.16   0.000     2.356176     6.56242
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,229 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0   5214.918    468.916    0.000
linear |    1   4752.161      6.159    0.104   1
m = 1 |    2   4752.161      6.159    0.046   1
m = 2 |    4   4746.002      0.000       --   3 3
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =        823
LR chi2(2)        =     468.92
Prob > chi2       =     0.0000
Log likelihood = -2373.0012                     Pseudo R2         =     0.0899

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |   .0015904   .0003063     5.19   0.000     .0009901    .0021908
ed2_2 |  -.0004678   .0000987    -4.74   0.000    -.0006613   -.0002743
_cons |   .9341858   .1233811     7.57   0.000     .6923632    1.176008
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(1,423 missing values generated)
(fitting 44 models)
(....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%)

Fractional polynomial comparisons:
--------------------------------------------------------------------
ed2 |   df    Deviance   Dev. dif.   P(*)   Powers
-------------+------------------------------------------------------
omitted |    0   1068.569     98.960    0.000
linear |    1    972.108      2.499    0.475   1
m = 1 |    2    969.831      0.222    0.895   3
m = 2 |    4    969.609      0.000       --   3 3
--------------------------------------------------------------------
(*) P = sig. level of model with m = 2 based on chi^2 of dev. dif.

Poisson regression                              Number of obs     =        176
LR chi2(2)        =      98.96
Prob > chi2       =     0.0000
Log likelihood = -484.80427                     Pseudo R2         =     0.0926

------------------------------------------------------------------------------
wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ed2_1 |  -.0002407   .0008374    -0.29   0.774    -.0018821    .0014006
ed2_2 |   .0001278   .0002707     0.47   0.637    -.0004028    .0006583
_cons |    1.83251     .32752     5.60   0.000     1.190583    2.474438
------------------------------------------------------------------------------
(option n assumed; predicted number of events)
(2,070 missing values generated)

. twoway line wagehat grade, connect(L) name(spagetti1, replace)

```
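
connect(L) joins consecutive points only while the x variable is ascending, so in data sorted by group and x a single line command draws one line per group. A self-contained sketch of that idea follows. It uses raw mean wages per industry and grade rather than the fractional polynomial predictions; that substitution and the graph name are assumptions for illustration.

```stata
sysuse nlsw88, clear
collapse (mean) wage, by(industry grade)
drop if missing(industry, grade)

* sort so that grade ascends within industry; connect(L) then breaks
* the line whenever grade drops back, i.e. at the start of a new industry
sort industry grade
twoway line wage grade, connect(L) yscale(log) ///
    name(spag_sketch, replace)
```
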
```    Maybe this becomes more informative if we plot these lines by industry,
with all the lines as a grey background.

```
```
. bys Industry : gen id = _n

. fillin Industry id

. tempfile temp

. save `temp'

. forvalues i = 1/12 {
2.         bys id (wage`i') : replace wage`i' = wage`i'[1]
4. }
(8,463 missing values generated)
(8,502 missing values generated)
(8,424 missing values generated)
(5,538 missing values generated)
(8,307 missing values generated)
(5,967 missing values generated)
(7,306 missing values generated)
(8,190 missing values generated)
(8,125 missing values generated)
(8,411 missing values generated)
(13 missing values generated)
(7,566 missing values generated)

. keep if Industry == 1
(9,888 observations deleted)

. merge 1:m id using `temp'

Result                           # of obs.
-----------------------------------------
not matched                             0
matched                            10,712  (_merge==3)
-----------------------------------------

.
. local gr ""

. forvalues i = 1/12 {
2.         local gr `gr' line wage`i' grade`i' , ///
>         lpattern(solid) lcolor(gs12) sort ||
3. }

.
. twoway `gr'                                           ///
>        line wagehat grade ,                           ///
>            by(Industry, legend(off) compact note("")) ///
>            sort yscale(log) ylab(2.5 5 10 20)         ///
>            lpattern(solid) lcolor(black) lwidth(*3)   ///
>            name(spagetti2, replace)

```
```    We can also display a "rug" underneath to show the number of observations
on which these lines are based.

```
```
. gen pipe = "|"

. gen y = 1.8

(8,468 missing values generated)

. local gr ""

. forvalues i = 1/12 {
2.         local gr `gr' line wage`i' grade`i' , ///
>         lpattern(solid) lcolor(gs12) sort ||
3. }

.
. twoway `gr'                                           ///
>        line wagehat grade ,                           ///
>            by(Industry, legend(off) compact note("")) ///
>            sort yscale(log) ylab(2.5 5 10 20)         ///
>            lpattern(solid) lcolor(black) lwidth(*3)   ///
>            xtitle(grade) ytitle(predicted wage) ||    ///
```
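
The final command is cut off in the log, so the rug itself is not shown. One common way to draw a rug is a scatter plot with invisible markers that are labelled with a pipe character. The sketch below uses variables named pipe and y, as above, on a plain scatter plot; combining them with marker labels in this way is an assumption about what was intended, not a reconstruction of the missing lines.

```stata
sysuse nlsw88, clear
gen pipe = "|"
gen y = 1.8                        // vertical position of the rug

twoway scatter wage grade, msymbol(Oh) mcolor(gs10)     ///
           yscale(log) ylab(2.5 5 10 20 40) ||          ///
       scatter y grade, msymbol(none)                   ///
           mlabel(pipe) mlabpos(0) mlabcolor(black)     ///
           legend(off) ytitle(wage)                     ///
           name(rug_sketch, replace)
```
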
`-------------------------------------------------------------------------------`
`<< index `
```-------------------------------------------------------------------------------