Stata level 2
Tips and tricks for working with Stata
-------------------------------------------------------------------------------
Stata basics
Passing commands to Stata
Work flow
Navigation
Customize your Stata
information available in a Stata dataset
Additional information available in a stata session
returned results
macros
Macro
what is in a macro?
Macros containing numbers
Scalars
compound quotes
= or no =
local versus global
tempname tempfile tempvar
Passing local macros to another .do file
Leaving local macros behind
c_local
extended macro functions
looping
An example loop
different types of loops
Try it yourself
prefixes
by
statsby
finding and manipulating groups of variables
finding variables
numerical precision
binary versus decimal
how a number is stored
rounding errors: adding and subtracting
storage types
possible problems
programming
defining a program in a .do file
Why would you want to do that?
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
Goal
Starting
Add some lines to our .html file to make it prettier
add a replace option
Let the user specify a meaningful title
What should the title be when the title() option is not specified?
Add a list of variable names to our codefile
Adding variable labels to our variable list
Add data label to codebook
Add data notes to codebook
Splitting the program up in smaller subroutines
Add variable notes to the variable list
Add a list of value labels to our codebook
Add frequencies to our labels
Numeric variables without value labels
display format for summary statistics
Add other summary statistics
String variables
Add links between the variable list and the value label list
Turn this into an .ado file
What is still left?
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- Stata basics
-------------------------------------------------------------------------------
Passing commands to Stata
There are three ways of telling Stata what you want it to do:
click on the menu. I do this very rarely, mainly to look for files or import files from Excel.
type in the command window. I do this a lot, but only for experimentation.
write your commands in a .do file. This is the main way of doing things. It allows me to keep track of what I am doing, and it keeps a paper trail of what I have done in case someone wants to replicate what I have done.
A result can only appear in my article or presentation if it is the result of running a .do file.
I do experiment in the command window, but the command I am happy with has to be copied into the .do file.
I mainly work in Stata's do-file editor, which you can start by typing doedit
-------------------------------------------------------------------------------
Work flow
Real research is too long for a single .do file. Instead you need to break it up into several smaller .do files.
Create one master .do file that executes these in turn; a .do file can execute another .do file (say cceast_dta01.do) by including the command do cceast_dta01.do
Have a naming system for your .do files. I typically have a small abbreviation of a project (e.g. cceast), and add to that either _dta or _ana for data preparation or analysis files. After that I add a number.
Numbering the files prevents names like "final", "really_final", "no_seriously_I_am_done", etc.
I have at least two directories: "working" and "posted". I can change anything in the working directory, but once something is in posted I cannot change it anymore.
I will put files in posted when I present my results at a conference or submit a paper to a journal. This ensures I can always replicate those results.
-------------------------------------------------------------------------------
Navigation
Stata has a working directory and system directories.
The working directory is where Stata will look for (data and .do) files when you don't specify the directory.
If you type pwd (print working directory), Stata will tell you what its current working directory is.
You can change the working directory using cd
You can also use relative paths. Say you are in the working directory, and your data is stored in posted/data, you can type use ../posted/data/datafile.dta.
This is useful when you work on different computers. If you set the directory only once, in the master .do file, and all other .do files use only relative paths, then you have to change the cd command only once in order to make it work on your other computers.
System directories are where Stata looks for its programs, help-files, etc.
You can find out where that is by typing sysdir
-------------------------------------------------------------------------------
Customize your Stata
You can set the scheme of your output window using Edit --> Preferences --> General Preferences
You can set the font by right-clicking on a window and choosing Font... I like Lucida Console
You can let Stata execute a couple of commands every time it starts up, by creating a .do file called profile.do and storing it in your PERSONAL folder (see: sysdir).
My profile.do reads:
noi di as txt _n"Current projects:"
noi di as txt "F4" as result " SS18 Stata_L2"
global F4 cd "D:\Mijn documenten\onderwijs\konstanz\ss18\stata_l2\";
exit
With this, pressing F4 makes Stata cd to the directory of this course.
-------------------------------------------------------------------------------
information available in a Stata dataset
The main place where we store and access information is the dataset, and specifically the values a variable takes.
. sysuse auto, clear
(1978 Automobile Data)
. browse
. list foreign rep78 in 1/10, nolabel
     +-----------------+
     | foreign   rep78 |
     |-----------------|
  1. |       0       3 |
  2. |       0       3 |
  3. |       0       . |
  4. |       0       3 |
  5. |       0       4 |
     |-----------------|
  6. |       0       3 |
  7. |       0       . |
  8. |       0       3 |
  9. |       0       3 |
 10. |       0       3 |
     +-----------------+
. display rep78
3
. display rep78[5]
4
The dataset can contain additional information in the form of variable, value, and data labels
. desc
Contains data from C:\Program Files (x86)\Stata15\ado\base/a/auto.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          13 Apr 2016 17:45
 size:         3,182                          (_dta has notes)
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
make            str18   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g      origin     Car type
-------------------------------------------------------------------------------
Sorted by: foreign
. fre rep78
rep78 -- Repair Record 1978
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   1     |          2       2.70       2.90       2.90
        2     |          8      10.81      11.59      14.49
        3     |         30      40.54      43.48      57.97
        4     |         18      24.32      26.09      84.06
        5     |         11      14.86      15.94     100.00
        Total |         69      93.24     100.00
Missing .     |          5       6.76
Total         |         74     100.00
-----------------------------------------------------------
desc also returns the type of the variable. This tells you whether a variable is a string or a numeric variable, and for a numeric variable the range of possible values.
In addition, desc provides information about the format of a variable, that is, how the values are supposed to be displayed. This can be particularly useful for finding time variables.
We can add comments to variables and datasets, in addition to the labels, using notes. A more general version of notes are characteristics. These are used to store information in the dataset when you xtset or stset the data.
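A minimal sketch of how notes and characteristics can be stored and read back. The note text and the characteristic name source are made up for this illustration; they are not part of auto.dta.

```stata
sysuse auto, clear
* list the notes that come with the dataset
notes _dta
* attach a (hypothetical) note to the variable mpg and list it
notes mpg: check against the manufacturer brochure
notes mpg
* define a (hypothetical) characteristic called source for mpg
char mpg[source] "brochure"
* list all characteristics of mpg; the note shows up here as well
char list mpg[]
* read the characteristic back into a local macro
local src : char mpg[source]
display "`src'"
```

Notes are themselves stored as characteristics, which is why char list also shows them.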
-------------------------------------------------------------------------------
Additional information available in a stata session
You can store information in macros
Stata also has the possibility to store information in scalars or matrices.
Many commands return results, which can be accessed until the next command is run.
-------------------------------------------------------------------------------
returned results
Many commands leave their results behind in memory. These are returned results.
. sum mpg
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         mpg |         74     21.2973    5.785503         12         41
. return list
scalars:
                  r(N) =  74
              r(sum_w) =  74
               r(mean) =  21.2972972972973
                r(Var) =  33.47204738985561
                 r(sd) =  5.785503209735141
                r(min) =  12
                r(max) =  41
                r(sum) =  1576
. di r(mean)
21.297297
. sum mpg, meanonly
. return list
scalars:
                  r(N) =  74
              r(sum_w) =  74
                r(sum) =  1576
               r(mean) =  21.2972972972973
                r(min) =  12
                r(max) =  41
Because each command can return results, you should not expect these to persist for long. If you need them, store the desired results immediately after the command, e.g. in a local macro.
There are various types of returned results:
returned results (r(something) and return list); these come from general, non-estimation commands.
ereturned results (e(something) and ereturn list); these come from estimation commands, like regress.
sreturned results (s(something) and sreturn list); these come from subprograms.
creturned results (c(something)) contain the values of system parameters and settings, along with certain constants such as the value of pi. A full list can be found at help creturn.
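A short sketch of why you want to store a returned result right away, and of where the different types live. The macro name m_mpg is just a name chosen for this example.

```stata
sysuse auto, clear
summarize mpg, meanonly
* store the mean immediately in a local macro
local m_mpg = r(mean)
* count is also an r-class command, so it replaces the r() results of summarize
count if mpg > `m_mpg'
display "cars above the mean of `m_mpg': " r(N)
* estimation commands leave their results in e(), see ereturn list
regress price mpg foreign
display e(r2)
* system values and constants are in c()
display c(pi)
```

After the count command r(mean) no longer exists, but the local macro still holds the mean.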
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- macros
-------------------------------------------------------------------------------
Macro
A macro is a shorthand, it is one thing standing for another
. sysuse auto, clear
(1978 Automobile Data)
. local xvars = "mpg foreign"
. reg price `xvars'
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.07
       Model |   180261702         2  90130850.8   Prob > F        =    0.0000
    Residual |   454803695        71  6405685.84   R-squared       =    0.2838
-------------+----------------------------------   Adj R-squared   =    0.2637
       Total |   635065396        73  8699525.97   Root MSE        =    2530.9
------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -294.1955   55.69172    -5.28   0.000    -405.2417   -183.1494
     foreign |   1767.292    700.158     2.52   0.014     371.2169    3163.368
       _cons |   11905.42   1158.634    10.28   0.000     9595.164    14215.67
------------------------------------------------------------------------------
. reg price mpg foreign
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.07
       Model |   180261702         2  90130850.8   Prob > F        =    0.0000
    Residual |   454803695        71  6405685.84   R-squared       =    0.2838
-------------+----------------------------------   Adj R-squared   =    0.2637
       Total |   635065396        73  8699525.97   Root MSE        =    2530.9
------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -294.1955   55.69172    -5.28   0.000    -405.2417   -183.1494
     foreign |   1767.292    700.158     2.52   0.014     371.2169    3163.368
       _cons |   11905.42   1158.634    10.28   0.000     9595.164    14215.67
------------------------------------------------------------------------------
In the second command of this example we defined a local macro called xvars, which stands for / contains the string "mpg foreign".
The syntax is
local macroname content
If we later want to refer to the contents of the local macro xvars we type `xvars', that is, the macro name between a left single quote (`) and a right single quote (').
When Stata sees a line it will first look for macros and replace them with their content.
So when Stata saw the third command, reg price `xvars', the first thing it did was look up what was in the local macro called xvars and replace `xvars' with its contents; only then did it try to execute the command.
So the third and fourth commands are equivalent as far as Stata is concerned.
-------------------------------------------------------------------------------
what is in a macro?
It can contain anything, but is always a string
. local mac "foo"
. di `mac'
foo not found
r(111);
I said that the content of a macro is always a string, then why did this return an error?
The double quotes are there to indicate that this is a string, but they are not part of the string.
So `mac' contains foo, not "foo".
So for the second line Stata saw di foo, and since there were no double quotes Stata assumed we wanted to look at a variable (or scalar) foo, could not find that variable and returned the error message.
This will work:
. local mac "foo"
. di "`mac'"
foo
-------------------------------------------------------------------------------
Macros containing numbers
Because the quotes are stripped, macros may also contain numbers.
They are stored as strings, but as soon as Stata replaces the name of the macro with its content it will see them as numbers, as there are no quotes around them.
. local mac 1
. di `mac'
1
However, numbers stored in macros are not quite as precise as numbers stored in scalars.
Scalars are stored in double precision (15-16 decimal digits), while locals have about 12 decimal digits, sometimes more, but never fewer than 11.
-------------------------------------------------------------------------------
Scalars
A scalar is a "container" containing one element, either a string or a number.
It is good practice to use tempnames for scalars as they share the same namespace as variables
Unlike a macro the name is not immediately replaced by its contents
. sum mpg, meanonly
. tempname m_mpg
. scalar `m_mpg' = r(mean)
. scatter mpg price, yline(`m_mpg')
invalid line argument, __000006
r(198);
In order to replace the scalar name with its contents you can type `=scalarname'
. sum mpg, meanonly
. tempname m_mpg
. scalar `m_mpg' = r(mean)
. scatter mpg price, yline(`=`m_mpg'')
-------------------------------------------------------------------------------
compound quotes
We can make the double quotes part of the string by surrounding them with compound quotes
. local mac `" "foo" "'
. di `mac'
foo
. di `"`mac'"'
 "foo" 
. di `"|`mac'|"'
| "foo" |
. di "`mac'"
foo" " invalid name
r(198);
The content of the macro `mac' is in this example <space>"foo"<space>.
So in the first di, Stata sees: di <space><space>"foo"<space>
In the second di, Stata sees: di <space>`"<space>"foo"<space>"'
These spaces are more visible when we surround them with pipes: now Stata sees di <space>`"|<space>"foo"<space>|"'
The final one gets into trouble because Stata does not know that the outer double quotes should "wrap around" the quotes in the macro.
-------------------------------------------------------------------------------
= or no =
If the content of a macro is a string, and quotes are stripped anyhow, then the quotes when defining the macro seem redundant
. local mac foo
. di "`mac'"
foo
However, that is only true if we do not include an "=".
Including an "=" means that what comes after the equal sign is an expression, and the result of evaluating that expression is to be the content of the macro.
So when Stata sees local mac = foo, Stata starts looking for the variable or scalar foo, whose value it can then put into the local macro `mac'. It cannot find it, and will return an error message
. local mac = foo
foo not found
r(111);
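The difference is easiest to see with content that is a valid expression. A small sketch, with a and b as arbitrary macro names:

```stata
* no "=": the text 1 + 1 is stored as is
local a 1 + 1
* with "=": the expression is evaluated first, so 2 is stored
local b = 1 + 1
* the quotes make display show the content of each macro as text
display "`a'"
display "`b'"
* without quotes display sees: display 1 + 1, and evaluates it itself
display `a'
```

The first display shows 1 + 1, the second shows 2, and the third also shows 2, because there the evaluation is done by display and not by local.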
-------------------------------------------------------------------------------
local versus global
We have thus far used local macros.
In a .do file they exist as long as that .do file is running, but disappear immediately afterwards.
So if you run a line of your .do file that defines the local macro and then separately run another line that uses that local macro, the local macro will no longer exist.
That sounds awkward, but it is actually extremely useful.
The alternative is a global macro: a macro that persists after you are done running a .do file.
In a data preparation or data analysis phase you can easily work for hours on end, have (lunch) breaks in between, etc. What happens when you use global macros? You defined them early in the morning and you may or may not have changed them somewhere along the way. You can easily imagine a situation where you think your global contains one thing, but it actually contains something else.
So global macros are dangerous, and generally considered bad practice.
-------------------------------------------------------------------------------
tempname tempfile tempvar
tempname, tempfile, and tempvar are special types of local macros.
tempname is a local macro containing a name for a matrix or scalar that is guaranteed not to exist, and that will be removed automatically when the .do file or program ends.
tempfile is a local macro containing a filename that is guaranteed not to exist, and that will be removed automatically when the .do file or program ends.
tempvar is a local macro containing a variable name that is guaranteed not to exist, and that will be removed automatically when the .do file or program ends.
These are good ways of storing intermediate results that you need to store for a very short time.
For example, I often use tempfile in combination with merge.
I prepare a dataset for merging, and store it as a tempfile
I open the other dataset, and merge the tempfile in.
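A minimal sketch of that tempfile-plus-merge pattern. Splitting auto.dta into two pieces is only an assumption made here to keep the example self-contained; in real work the two pieces would be separate datasets.

```stata
sysuse auto, clear
* prepare the dataset to be merged in: here just the prices per car
preserve
keep make price
tempfile prices
save "`prices'"
restore
* continue with the other dataset and merge the tempfile in
drop price
merge 1:1 make using "`prices'"
tabulate _merge
```

The file behind `prices' is cleaned up automatically, so no intermediate files are left lying around in the working directory.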
-------------------------------------------------------------------------------
Passing local macros to another .do file
You can run a .do file called foo.do by typing do foo.do
You can also run that .do file by typing do foo.do something else
This will make the following local macros available at the beginning of foo.do:
`0' containing: something else
`1' containing: something
`2' containing: else
foo.do could do some complicated/fiddly manipulations of a variable
You want to apply those to multiple variables, say var1 and var2
Now you can create another .do file that contains the lines:
do foo.do var1
do foo.do var2
If you find an error in foo.do, you only have to fix it once, which removes a lot of potential for inconsistencies and bugs.
-------------------------------------------------------------------------------
Leaving local macros behind
Say you have a .do file bar.do that calls a .do file foo.do
foo.do makes local macros that you want bar.do to have access to
Instead of the line do foo.do you can add the line include foo.do in bar.do
This will run foo.do as if it was actually part of bar.do. That way any local macros defined in foo.do will be available in bar.do
This is a good way to store settings that will be used in multiple .do files
A (dangerous) alternative is c_local.
-------------------------------------------------------------------------------
extended macro functions
extended macro functions are functions you can use when defining a macro
. local varlab : var label mpg
. di "`varlab'"
Mileage (mpg)
You can also use that on the fly
. di "`: var label price'"
Price
The most useful are functions for
extracting data attributes (labels, characteristics, variable types)
file names and file paths
formatting results
manipulating lists
matrices
parsing strings
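A few of these categories in one small sketch, using auto.dta. The macro names on the left are arbitrary choices for this example.

```stata
sysuse auto, clear
* data attributes: storage type, name of the value label, text of one label
local type   : type price
local vallab : value label foreign
local lab1   : label origin 1
* list manipulation: number of words in a list
local n      : word count mpg price weight
* formatting a result
local fmt    : display %4.2f 3.14159
display "`type' | `vallab' | `lab1' | `n' | `fmt'"
```

This should display int | origin | Foreign | 3 | 3.14. See help extended_fcn for the full list.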
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- looping
-------------------------------------------------------------------------------
An example loop
. forvalues i = 1/10 {
  2.     di `i'
  3. }
1
2
3
4
5
6
7
8
9
10
forvalues tells Stata that we want to loop
the i after forvalues is the name of a local macro that will exist when the loop runs
= 1/10 tells Stata the values the local i should take: 1 for the first time, 2 for the second time, ..., 10 for the tenth time, and then it stops.
{ and }: whatever is between these braces is going to be repeated
di `i' displays the content of the local macro i. So the first time it evaluates to di 1, the second time to di 2, etc.
Previously we talked about a .do file foo.do that can be used to manipulate var1 and var2. What if we have var1 till var50?
forvalues i = 1/50 {
    do foo.do var`i'
}
Before Stata executes a line it first replaces macros with their content.
So the first time around it sees do foo.do var1, the second time around it sees do foo.do var2, etc.
In this case it is necessary that there is no space between var and `i'
-------------------------------------------------------------------------------
different types of loops
forvalues i = 2(2)100
We loop over the values 2, 4, 6, ..., 100
This is the type of loop I use most
foreach var of varlist *
We loop over all variables in the dataset
foreach is useful for looping over specific kinds of lists, such as varlists, numlists, or the contents of a local macro
while `diff' > 1e-6
`diff' is probably a tempname for a scalar, and we continue the loop till this scalar is no longer larger than 1e-6 (0.000001)
This is useful when you want to iteratively optimise something. Often there are better suited commands for that, e.g. ml, nl, gmm, so while is very rarely used.
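The three types side by side, as a sketch. The while loop uses Newton steps for the square root of 2 purely as a stand-in for "iteratively optimise something"; the tempnames x and diff are choices made for this example.

```stata
sysuse auto, clear
* forvalues: loop over a sequence of numbers
forvalues i = 2(2)6 {
    display `i'
}
* foreach: loop over a list, here a varlist
foreach var of varlist weight length turn {
    summarize `var', meanonly
    display "`var': mean = " r(mean)
}
* while: repeat until the change between iterations is small enough
tempname x diff
scalar `x' = 1
scalar `diff' = 1
while `diff' > 1e-6 {
    scalar `diff' = abs((`x' + 2/`x')/2 - `x')
    scalar `x' = (`x' + 2/`x')/2
}
display `x'
```

With forvalues and foreach you know in advance how often the loop runs; with while you do not, so make sure the condition can actually become false.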
-------------------------------------------------------------------------------
Try it yourself
The variables weight, length, and turn all measure the size of the car. Maybe we want to combine those variables into one. In order to do so we need to make sure they are measured in the same unit.
One possibility would be the percentile score: the proportion of cars that is smaller. Here is how I would do this for weight
. egen i = rank(weight)
. count if !missing(weight)
  74
. gen p_weight = (i - .5)/r(N)
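One possible way to turn this into a loop over all three variables, as a sketch rather than the only solution. Using a tempvar for the rank instead of the variable i is a choice made here so that nothing has to be dropped between iterations.

```stata
sysuse auto, clear
foreach var of varlist weight length turn {
    * a fresh temporary variable for the ranks of this variable
    tempvar rank
    egen `rank' = rank(`var')
    * number of non-missing observations, left behind in r(N)
    count if !missing(`var')
    generate p_`var' = (`rank' - .5) / r(N)
}
summarize p_weight p_length p_turn
```

All three new variables lie strictly between 0 and 1, so they are now on the same scale.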
-------------------------------------------------------------------------------
Tips and tricks for working with Stata -- prefixes
-------------------------------------------------------------------------------
by
Say we want a new variable containing the mean price for every level of repair status
We could do this with a loop
. levelsof rep78
1 2 3 4 5
. local levs = r(levels)
. gen mprice = .
(74 missing values generated)
. foreach lev of local levs {
  2.     sum price if rep78 == `lev', meanonly
  3.     replace mprice = r(mean) if rep78 == `lev'
  4. }
(2 real changes made)
(8 real changes made)
(30 real changes made)
(18 real changes made)
(11 real changes made)
Alternatively we could use the by prefix
. bysort rep78 : egen mprice2 = mean(price)
We could use this to find the highest price within each level of repair status
. bys rep78 (price) : gen maxprice = price[_N]
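What would happen if price contained missing values? Missing values sort last, so price[_N] would then be missing. Sorting on an indicator for missing price first keeps those observations apart: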
. gen misprice = missing(price)
. bys rep78 misprice (price): gen maxprice2 = price[_N] if misprice == 0
-------------------------------------------------------------------------------
statsby
The statsby prefix allows you to execute a command for each level of a variable and store returned statistics in a dataset.
This is often useful for creating graphs of summary statistics
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. statsby m=r(mean) min=r(min) max=r(max) p75=r(p75) p25=r(p25) , ///
>         by(industry) clear: sum wage, d
(running summarize on estimation sample)
      command:  summarize wage, d
            m:  r(mean)
          min:  r(min)
          max:  r(max)
          p75:  r(p75)
          p25:  r(p25)
           by:  industry
Statsby groups
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
............
. list
     +--------------------------------------------------------------------------------+
     | industry                         m        min        max        p75        p25 |
     |--------------------------------------------------------------------------------|
  1. | Ag/Forestry/Fisheries     5.621121   1.811594   12.38325   7.589398   3.454104 |
  2. | Mining                    15.34959   5.016723   40.19808   24.93801   5.761177 |
  3. | Construction              7.564934   2.801002   30.19324   8.260865   4.830918 |
  4. | Manufacturing             7.501578   1.004952   40.19808   8.872785   4.508855 |
  5. | Transport/Comm/Utility    11.44335   3.526568   40.19808   12.11755    8.22866 |
  6. | Wholesale/Retail Trade    6.125896   2.012882   40.19808    6.76328   3.349436 |
  7. | Finance/Ins/Real Estate   9.843174   1.501798   40.19808   10.40257   5.233495 |
  8. | Business/Repair Svc        7.51579   1.571983   40.19808   9.462362   3.718949 |
  9. | Personal Services         4.401093   1.151368   22.97034   5.442833   3.001791 |
 10. | Entertainment/Rec Svc     6.724409   1.811594   13.17229   10.06441   3.220612 |
 11. | Professional Services     7.871186   1.032247   40.74659   9.798708    4.64573 |
 12. | Public Administration     9.148407   2.093397   40.19808   10.83736   6.352656 |
     +--------------------------------------------------------------------------------+
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- finding and manipulating groups of variables -------------------------------------------------------------------------------finding variables
Most datasets contain a great many variables. Finding the variables you are looking for can be a challenge.
The lookfor command can be useful: it searches for variables that have a specific string in their name or label.
. sysuse auto, clear
(1978 Automobile Data)
. lookfor repair
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
rep78           int     %8.0g                 Repair Record 1978
. return list
macros:
            r(varlist) : "rep78"
Say we want a list of all numeric variables. This is something ds can do.
. ds, has(type numeric)
price         headroom      length        gear_ratio
mpg           trunk         turn          foreign
rep78         weight        displacement
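Like lookfor, ds leaves the list of variables it found behind in r(varlist), so the list can be used in later commands. A minimal sketch, assuming the auto example dataset that ships with Stata; the local macro name numvars is only chosen for illustration.

```stata
* Assumes the auto example dataset that ships with Stata.
sysuse auto, clear

* ds leaves the names it found behind in r(varlist)
quietly ds, has(type numeric)
local numvars `r(varlist)'
display "`: word count `numvars'' numeric variables found"

* use the stored list like any other varlist
summarize `numvars'
```

Copying r(varlist) into a local macro right away is a cautious habit: the next r-class command overwrites the returned results.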
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- numerical precision -------------------------------------------------------------------------------binary versus decimal
If a computer stored numbers in decimal format, we would not be surprised that we could not store the number 1/3 exactly: we would have to stop storing 3s at some point, otherwise we would need an infinite amount of memory to store one number.
A computer, however, stores numbers in binary format. In binary, some numbers we would not consider problematic are actually like 1/3. The most common example is 0.1.
So a lot of numbers we think of as perfectly "normal" are, in a computer, actually rounded versions of those numbers.
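The rounding can be made visible by asking display for more digits than it shows by default. This is a small sketch rather than part of the original slides; the format %21.18f is just one choice that shows enough digits.

```stata
* By default Stata shows a rounded, friendly version
display 0.1

* With 18 decimals the stored value turns out to be slightly off
display %21.18f 0.1

* 0.5 has an exact binary representation, so nothing is lost here
display %21.18f 0.5
```

The second command shows 0.100000000000000006 and the third shows 0.500000000000000000.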
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- numerical precision -------------------------------------------------------------------------------how a number is stored
So how are numbers stored?
We could say that we store a number up to 6 digits after the decimal point (ignoring that numbers are actually stored in binary).
This is problematic:
We would store the number 1,000,000 with 13 significant digits,
while we would store the number 0.0001 with only 3 significant digits.
Instead a number is stored in three parts: the sign and two numbers, let's call them a and b.
If we stored the number in decimal format, the number stored would then be sign * a * 10^b.
So if we decided on 6 significant digits we would store the number 1,000,000 as +1*1.00000*10^6 and the number 0.0001 as +1*1.00000*10^-4.
In real computers both a and b are binary numbers and we don't use 10^b, but 2^b
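Stata can show this binary representation directly with the %21x display format. The sketch below is an illustration added here, not part of the original slides. In the output the sign comes first, the digits before the X are a written in hexadecimal, and the digits after the X are the exponent b (also in hexadecimal), so the stored number is sign * a * 2^b.

```stata
* %21x prints a stored double as sign, hexadecimal a, and exponent b
display %21x 1
display %21x 0.5

* 0.1 needs a repeating pattern in a, just like 1/3 does in decimal
display %21x 0.1
```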
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- numerical precision -------------------------------------------------------------------------------rounding errors: adding and subtracting
This way of storing numbers allows us to reliably store numbers within a very large range.
It has some quirks. For example, adding numbers that differ by many orders of magnitude can lead to quite large rounding errors.
Say we want to add 1,000,000 and 0.0001.
Then we are adding +1*1.00000*10^6 and +1*1.00000*10^-4.
In order to add them we need to make the exponents the same:
We would change +1*1.00000*10^-4 to +1*0.0000000001*10^6.
However, we only store 6 digits, so 0.0000000001 gets rounded to 0 and the sum is just 1,000,000.
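The same effect can be reproduced with Stata's real storage types. This is a sketch that uses the float() function to mimic storing a result as a float; each display shows 1 when the comparison is true and 0 when it is false.

```stata
* Rounded to float precision, the 0.0001 is lost next to 1,000,000
display float(1000000 + 0.0001) == 1000000

* Stata computes in double precision, where the sum is still distinguishable
display 1000000 + 0.0001 == 1000000

* A double runs out of digits as well, only much later
display 1e16 + 1 == 1e16
```

The three commands display 1, 0, and 1 respectively.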
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- numerical precision -------------------------------------------------------------------------------storage types
The problem with precision mainly occurs with fractional numbers; integers can be stored exactly as long as they are not too big.
There are three storage types for integers, which differ in the range of numbers they can store and the amount of memory needed to store them:
byte, which can store numbers between -127 and 100 and uses only one byte per number
int, which can store numbers between -32,767 and 32,740 and uses two bytes per number
long, which can store numbers between -2,147,483,647 and 2,147,483,620 and uses four bytes per number
There are two storage types for fractional numbers:
float has a precision of about 7 decimal digits and takes four bytes per number
double has a precision of about 16 decimal digits and takes eight bytes per number. A double can also store the largest range of numbers: between -8.988*10^307 and 8.988*10^307.
The default storage type for variables is float.
This is fine for storing data. We typically don't think we measured our variables accurately up to 7 digits.
Say we ask someone's income. A respondent does not have her or his income exactly in memory, so he or she will round when answering that question. We can probably trust the first two, maybe three, digits. So storing that with about 7 digits of accuracy is more than enough.
A float is probably not optimal for storing intermediate results of computations. We want to minimise the rounding errors that happen at each step, and if we store intermediate results as floats these errors can quickly add up. So for intermediate results in computations you are better off using doubles.
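The difference between the two fractional storage types can be seen by storing the same number both ways. A minimal sketch, assuming the default storage type has not been changed with set type; the variable names x and y are only for illustration.

```stata
clear
set obs 1
generate x = 0.1            // no type given: stored as float by default
generate double y = 0.1     // explicitly stored as double
describe x y

* the float is a much coarser approximation of 0.1 than the double
display %21.18f x[1]
display %21.18f y[1]
```

The first display shows 0.100000001490116119, the second 0.100000000000000006.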
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- numerical precision -------------------------------------------------------------------------------possible problems
logical statements with fractional numbers
Say we want to summarize a variable x for only the observations for which var == 0.1.
We type sum x if var == 0.1 and Stata will tell us that no observations meet that criterion, while if we look at the data we see several such observations.
var is probably stored as a float, but Stata does computations in double precision.
So Stata is comparing the float(0.1) from var to the double(0.1) in the expression and finds that they are not equal.
The solution is to avoid equality checks on fractional numbers, or to round the number in the expression to float precision with the float() function: var == float(0.1).
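The problem is easy to reproduce with a few made-up observations. This is a sketch with invented data, assuming the default storage type float; the variable names follow the text above.

```stata
clear
set obs 3
generate var = 0.1          // stored as float by default
generate x = _n

* compares the float in var with a double 0.1: no match
count if var == 0.1

* rounding 0.1 to float precision first gives the expected matches
count if var == float(0.1)
summarize x if var == float(0.1)
```

The first count reports 0 observations, the second reports 3.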
Storing larger numbers that need to be stored exactly
This is very common for ID variables. Say the first two digits stand for the country, the next two digits for the province, the next three digits for the city, the next four digits for the household, the next two digits for the person, and the next two digits for the wave.
Now we have 15 digits, and a float cannot store that. A double can.
If we have even larger numbers we can store them as strings.
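The three ways of storing such an ID can be compared side by side. This is a sketch with an invented 15-digit ID, assuming the default storage type float; the variable names are only for illustration.

```stata
clear
set obs 1
generate id_float = 123456789012345            // float: cannot hold 15 digits
generate double id_double = 123456789012345    // double: exact
generate str15 id_string = "123456789012345"   // string: exact for any length

* show all digits instead of scientific notation
format id_float id_double %15.0f
list
```

Only id_float differs from the number that was typed in. The cost of the string version is that the ID can no longer be used in numeric computations, which is rarely needed for an ID.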
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- programming -------------------------------------------------------------------------------defining a program in a .do file
. program drop _all
. program define hello
  1.     di "hello world"
  2. end
. hello
hello world
A program starts with program define and ends with end. Whatever is in between is anything that you could also include in a regular .do file.
So now you can write Stata programs.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Tips and tricks for working with Stata -- programming -------------------------------------------------------------------------------Why would you want to do that?
The most common application is to bootstrap some statistic.
Occasionally I also use a program to automate some extremely fiddly task that I need to do repeatedly.
Consider the following example
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. reg wage c.ttl_exp##c.ttl_exp grade
      Source |       SS           df       MS      Number of obs   =     2,244
-------------+----------------------------------   F(3, 2240)      =    129.89
       Model |  11018.1157         3  3672.70523   Prob > F        =    0.0000
    Residual |  63336.2148     2,240  28.2750959   R-squared       =    0.1482
-------------+----------------------------------   Adj R-squared   =    0.1470
       Total |  74354.3305     2,243  33.1495009   Root MSE        =    5.3174
------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ttl_exp |   .3148533    .106226     2.96   0.003     .1065415     .523165
             |
   c.ttl_exp#|
   c.ttl_exp |  -.0022075   .0042817    -0.52   0.606    -.0106039     .006189
             |
       grade |   .6455095   .0457626    14.11   0.000     .5557678    .7352511
       _cons |  -4.238754   .7752557    -5.47   0.000    -5.759049   -2.718459
------------------------------------------------------------------------------
. nlcom -_b[ttl_exp]/(2*_b[c.ttl_exp#c.ttl_exp])
       _nl_1:  -_b[ttl_exp]/(2*_b[c.ttl_exp#c.ttl_exp])
------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   71.31548   115.0695     0.62   0.535    -154.2166    296.8476
------------------------------------------------------------------------------
So the maximum wage is obtained after 71 years of experience, with a huge confidence interval. We may not trust that confidence interval, as nlcom assumes that the sampling distribution of the statistic is normal, and our statistic is something divided by a number that could easily be 0. So we can expect strange things to happen, and to be sure we would like to bootstrap this.
. program drop _all
. sysuse nlsw88
(NLSW, 1988 extract)
. program define toboot, rclass
  1.     version 14
  2.     syntax [if]
  3.     marksample touse
  4.     reg wage c.ttl_exp##c.ttl_exp grade if `touse'
  5.     return scalar max = -_b[ttl_exp]/(2*_b[c.ttl_exp#c.ttl_exp])
  6. end
. toboot
      Source |       SS           df       MS      Number of obs   =     2,244
-------------+----------------------------------   F(3, 2240)      =    129.89
       Model |  11018.1157         3  3672.70523   Prob > F        =    0.0000
    Residual |  63336.2148     2,240  28.2750959   R-squared       =    0.1482
-------------+----------------------------------   Adj R-squared   =    0.1470
       Total |  74354.3305     2,243  33.1495009   Root MSE        =    5.3174
------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ttl_exp |   .3148533    .106226     2.96   0.003     .1065415     .523165
             |
   c.ttl_exp#|
   c.ttl_exp |  -.0022075   .0042817    -0.52   0.606    -.0106039     .006189
             |
       grade |   .6455095   .0457626    14.11   0.000     .5557678    .7352511
       _cons |  -4.238754   .7752557    -5.47   0.000    -5.759049   -2.718459
------------------------------------------------------------------------------
. return list
scalars:
                r(max) =  71.31548299848835
. bootstrap max=r(max), reps(100) bca : toboot
(running toboot on estimation sample)
Jackknife replications (2244)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
(progress dots for the 2,244 jackknife replications omitted)
Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
Bootstrap results                               Number of obs     =      2,244
                                                Replications      =        100
      command:  toboot
          max:  r(max)
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         max |   71.31548   472.9631     0.15   0.880    -855.6752    998.3061
------------------------------------------------------------------------------
. estat bootstrap, bca
Bootstrap results                               Number of obs     =      2,244
                                                Replications      =        100
      command:  toboot
          max:  r(max)
------------------------------------------------------------------------------
             |    Observed               Bootstrap
             |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
         max |   71.315483  -97.31549   472.96311    38.58707   799.4429 (BCa)
------------------------------------------------------------------------------
(BCa)  bias-corrected and accelerated confidence interval
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Goal
We will write a program htmlcodebook that creates a codebook in html format for a specified Stata dataset.
The codebook will contain:
The name of the file with data label and notes
A list of variables with label and notes
A description of each variable: a table if there are value labels or the number of distinct values is small, summary statistics for unlabeled numeric variables with more values, and some examples for string variables.
. htmlcodebook using arc06.dta, saving(test.html) replace
Output written to test.html
Such a program is not written in one go, but in a long series of small steps. I have created a large number of small exercises that simulate that process in a guided way.
These exercises will cover a large part of the material discussed before.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Starting
clear all
program define htmlcodebook
version 14
syntax , SAVing(string)
tempname book
file open `book' using `saving', write
file write `book' "<!DOCTYPE html>"_n
file write `book' "<html>"_n
file write `book' "<body>"_n
file write `book' "<h1>title</h1>"_n
file write `book' "</body>" _n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
end
cd h:\stata_l2
htmlcodebook, saving(test.html)
type test.html
We start working in a .do file, and we start small: an .html file that displays only one line, "title".
It shows us what a program looks like, and how to write a file with Stata using the file command.
It is important that we start our .do file with clear all, which among other things drops all programs from memory (just like program drop _all). We are defining a program, and Stata will return an error message if a program with that name already exists in memory.
The version command makes sure that this program will continue to work in future versions of Stata.
We specify the possible options with the syntax command.
With the file command we can write a file.
The first line says what kind of html file this is
The second and last lines mark where the html file begins and ends.
The third and second-to-last lines mark where the body of the html file begins and ends.
The fourth line is what is displayed in the browser: "title" shown as a level 1 heading.
After we created the html file we let Stata display a link that will open it in a browser.
We can see the file in plain text with the type command.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Add some lines to our .html file to make it prettier
We can make the html file look prettier by including the following lines after <html> and before <body>.
<style> body { width: 650px; margin: auto; } </style>
It limits the width of the output and positions it in the middle of the screen.
Change the program to make it include those lines at the appropriate place.
htmlcodebook01.do
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------add a replace option
If we run our .do file multiple times, then the file test.html is created multiple times, and from the second run onwards file open fails because the file already exists.
Stata never overwrites something unless you explicitly say so, i.e. specify the replace option in the file command.
We want to copy that behavior: i.e.
add a replace option to our htmlcodebook command, and
only specify the replace option in file, when the user specified the replace option in htmlcodebook.
Look in help syntax to find out how to implement on/off options (the user specified or did not specify the replace option).
htmlcodebook02.do
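How an on/off option behaves can be tried out in a separate toy program before changing htmlcodebook. The sketch below is only an illustration: the program name demoreplace is made up, and it prints a message instead of opening a file. When the user types the option, syntax fills the local macro `replace' with the word replace; otherwise the macro is empty.

```stata
capture program drop demoreplace
program define demoreplace
    version 14
    syntax , SAVing(string) [replace]
    * `replace' contains "replace" if the option was typed, "" otherwise
    if "`replace'" == "" {
        display as txt "replace not specified: an existing `saving' is not overwritten"
    }
    else {
        display as txt "replace specified: an existing `saving' is overwritten"
    }
end

demoreplace, saving(test.html)
demoreplace, saving(test.html) replace
```

Because the macro contains either nothing or the literal word replace, it can be passed on unchanged as an option to another command.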
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Let the user specify a meaningful title
"title" is not a very meaningful title. We should add an option that allows users to specify their own title.
Such an option would ask for a string; see help syntax on how to implement such an option.
htmlcodebook03.do
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------What should the title be when the title() option is not specified?
If the user did not specify the title option then we need to make a sensible choice. I suggest "Codebook for" and then the file the user specified in using.
See help ifcmd on how to implement different things depending on whether or not an option was specified.
htmlcodebook04.do
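The combination of a string option and a default can again be tried out in a toy program first. A minimal sketch: the program name demotitle is made up, and it only displays the heading instead of writing it to a file. The syntax command parses using without checking that the file exists, so no data file is needed to run it.

```stata
capture program drop demotitle
program define demotitle
    version 14
    syntax using/ [, title(string)]
    * fall back on a default when the option was not specified
    if `"`title'"' == "" {
        local title "Codebook for `using'"
    }
    display as txt `"<h1>`title'</h1>"'
end

demotitle using arc06.dta
demotitle using arc06.dta, title(My survey)
```

The first call displays <h1>Codebook for arc06.dta</h1>, the second <h1>My survey</h1>.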
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Add a list of variable names to our codefile
clear all
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
qui use "`using'", clear
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
file write `book' "<h3>Variable list</h3>"_n //new
file write `book' "<ul>"_n //new
foreach var of varlist * { //new
file write `book' "<li>`var'</li>"_n //new
} //new
file write `book' "</ul>" //new
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
At the very least a codebook should contain a list of variable names
In the .do file we can see how we can create a list in html using the <ul> , </ul>, <li>, and </li> tags.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Adding variable labels to our variable list
Change this program such that
if there is a variable label then the variable list shows variable name : variable label
otherwise just the variable name
Use extended macro functions to find the variable label (see help extended_fcn)
Notice that this returns the variable name if no variable label is attached to that variable
htmlcodebook06.do
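The extended macro function can be explored on its own before it is built into the program. This is a sketch on the auto example dataset, with a made-up variable unlabeled added to have a case without a label. As a precaution it treats both an empty result and a result equal to the variable name as "no label"; that double check is an assumption of this sketch, not something the exercise requires.

```stata
sysuse auto, clear
generate unlabeled = 1      // a variable without a variable label

foreach var of varlist mpg unlabeled {
    local varlab : variable label `var'
    * only show the label if there is one
    if !inlist(`"`varlab'"', "", "`var'") {
        display as txt `"`var': `varlab'"'
    }
    else {
        display as txt "`var'"
    }
}
```

This displays "mpg: Mileage (mpg)" for the labeled variable and just "unlabeled" for the other one.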
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Add data label to codebook
Use extended macro functions to find the label belonging to the dataset.
If such a label exists, add it to the codebook underneath the title with <h2> and </h2> tags.
htmlcodebook07.do
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Add data notes to codebook
Notes are stored as characteristics, which can be accessed using the `: char' extended macro function.
The notes are named _dta[note1], _dta[note2], etc.
How many notes there are is stored in _dta[note0]
This is empty ("") if no notes were specified
Or contains the number of notes, when one or more notes exist
Add a list of data notes underneath the data label if such notes exist
htmlcodebook08.do
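Both the data label and the notes can be inspected interactively before writing them to the codebook. The sketch below builds a tiny dataset with invented label and notes so that it runs anywhere; in the program the same extended macro functions would be applied to the file the user specified.

```stata
clear
set obs 1
generate x = 1
label data "A tiny example dataset"
note: first note about the data
note: second note about the data

* the data label
local dlab : data label
display as txt `"`dlab'"'

* _dta[note0] holds the number of notes, or nothing if there are none
local n : char _dta[note0]
if "`n'" != "" {
    forvalues i = 1/`n' {
        local txt : char _dta[note`i']
        display as txt `"note `i': `txt'"'
    }
}
```

The check on `n' matters: without notes the macro is empty and forvalues would stop with an error.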
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Splitting the program up in smaller subroutines
clear all
program define Descfile //new
syntax using/, book(string) //new
//new
qui use "`using'", clear //new
//new
if `"`: data label'"' != "" { //new
file write `book' `"<h2>`:data label'</h2>"' _n //new
} //new
if "`: char _dta[note0]'" != "" { //new
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n //new
forvalues i = 1/`: char _dta[note0]' { //new
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n //new
} //new
file write `book' "</ul>"_n //new
} //new
end //new
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book') //new
file write `book' "<h3>Variable list</h3>"_n
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
}
file write `book' "</ul>"
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
Our program becomes bigger and bigger. To keep an overview, and to make it easier to spot errors and maintain the program, it is best to split it up into subroutines, each for a specific task.
Here we create a subroutine for describing a file.
StataCorp often starts the names of subroutines with a capital letter, and I have copied that convention.
Notice that the handle for our codebook, `book', is a tempname, so the local macro is only known to the program that created it; a subroutine cannot refer to `book' directly.
To solve this we define the handle in the main program.
This means that it exists as long as the main program is being executed, even if it calls another program, like our subroutine.
That way we can pass that temporary name on to subroutines as an option.
Use this trick to create another subroutine for describing the variables; call it Descvars.
htmlcodebook10.do
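The trick of passing a file handle to a subroutine can be isolated in a few lines. A minimal sketch with made-up program names Main and Sub and a made-up option handle(); it writes to a temporary file so that nothing is left behind on disk.

```stata
capture program drop Sub
capture program drop Main

program define Sub
    syntax , handle(string)
    * `handle' contains the temporary name created in Main
    file write `handle' "written by Sub" _n
end

program define Main
    tempname fh
    tempfile out
    file open `fh' using `"`out'"', write text
    file write `fh' "written by Main" _n
    Sub, handle(`fh')       // pass the content of the macro, not the macro itself
    file close `fh'
    type `"`out'"'
end

Main
```

The file shown by type contains both lines, so the subroutine wrote to the file that the main program opened.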
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Add variable notes to the variable list
Change Descvars such that notes belonging to that variable are displayed if they exist.
htmlcodebook11.do
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Add a list of value labels to our codebook
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues //new
syntax, book(string) file(string) //new
//new
foreach var of varlist * { //new
local vallab : val label `var' //new
if "`vallab'" != "" { //new
file write `book' `"<h4>`var'"' //new
local varlab : var label `var' //new
if `"`varlab'"' != "`var'" { //new
file write `book' `": `varlab'"' //new
} //new
file write `book' "</h4>"_n //new
qui uselabel `vallab' //new
file write `book' "<table>"_n //new
forvalues i = 1/`=_N' { //new
file write `book' "<tr>" //new
file write `book' `"<td>`=value[`i']'</td>"' //new
file write `book' `"<td>`=label[`i']'</td>"' //new
file write `book' "</tr>"_n //new
} //new
file write `book' "</table>"_n //new
qui use `file', clear //new
} //new
} //new
//new
end //new
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using') //new
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
Here we add, for each variable that has value labels, a list of those value labels to our codebook.
It uses the uselabel command, which replaces the data in memory with a dataset of value labels. That is why the original file is loaded again after each table.
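What uselabel leaves behind is easiest to see by running it interactively. A small sketch on the auto example dataset, where the variable foreign has the value label origin attached:

```stata
sysuse auto, clear

* find the name of the value label attached to foreign
local vallab : value label foreign
display as txt "`vallab'"

* replace the data in memory with one row per labeled value
uselabel `vallab', clear
list lname value label
```

The resulting dataset has one observation per labeled value, with the variables value and label that Descvalues loops over.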
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------------------------------------------------------------------------------- Application: writing your own program for creating a codebook in .html -------------------------------------------------------------------------------Add frequencies to our labels
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
foreach var of varlist * {
local vallab : val label `var'
if "`vallab'" != "" {
file write `book' `"<h4>`var'"'
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq //new
tempfile freqtable //new
contract `var', freq(`freq') //new
rename `var' value //new
qui save `freqtable' //new
qui uselabel `vallab'
qui merge 1:1 value using `freqtable' //new
file write `book' "<table>"_n
file write `book' "<tr>" //new
file write `book' "<th>value</th>" //new
file write `book' "<th>label</th>" //new
file write `book' "<th>frequency</th>" //new
file write `book' "</tr>"_n //new
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"' //new
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
qui use `file', clear
}
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
contract changes the data into one observation per value of `var', with a new variable `freq' that contains the number of observations with that value.
We merge that with the dataset created by uselabel.
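The contract-then-merge step can be tried on its own. This is a sketch that again assumes the auto dataset shipped with Stata; the frequency variable is simply called n here instead of a temporary variable.

```stata
* Illustration only: run as a .do file, because tempfile needs that context
sysuse auto, clear
tempfile freqtable
contract foreign, freq(n)          // one observation per value of foreign
rename foreign value               // uselabel calls the value variable "value"
quietly save `freqtable'
quietly uselabel origin            // value labels as a dataset
quietly merge 1:1 value using `freqtable'
list value label n
```

Labels that never occur in the data end up with a missing frequency after the merge, and values without a label end up with an empty label.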
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
Numeric variables without value labels
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
local N = _N
foreach var of varlist * {
file write `book' `"<h4>`var'"'
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq
contract `var', freq(`freq')
local vallab : val label `var'
if "`vallab'" != "" {
tempfile freqtable
rename `var' value
qui save `freqtable'
qui uselabel `vallab'
qui merge 1:1 value using `freqtable'
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>label</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else { //new
capture confirm numeric variable `var' //new: checked before the if, so that else if directly follows the if block
if _N <= 10 { //new
file write `book' "<table>"_n //new
file write `book' "<tr>" //new
file write `book' "<th>value</th>" //new
file write `book' "<th>frequency</th>" //new
file write `book' "</tr>"_n //new
forvalues i = 1/`=_N' { //new
file write `book' "<tr>" //new
file write `book' `"<td>`=`var'[`i']'</td>"' //new
file write `book' `"<td>`=`freq'[`i']'</td>"' //new
file write `book' "</tr>"_n //new
} //new
file write `book' "</table>"_n //new
} //new
else if !_rc { //new
file write `book' "<table>" _n //new
qui count if !missing(`var') //new
local distinct = r(N) //new
qui sum `var' [fw=`freq'], detail //new
file write `book' "<tr>"_n //new
file write `book' `"<th>valid/missing obs.</th>"' //new
file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"' //new
file write `book' "</tr>"_n //new
file write `book' "<tr>" //new
file write `book' "<th>distinct values</th>" //new
file write `book' "<td>`distinct'</td>" //new
file write `book' "</tr>"_n //new
file write `book' "<tr>" //new
file write `book' "<th>minimum</th>" //new
file write `book' "<td>`r(min)'</td>" //new
file write `book' "</tr>"_n //new
file write `book' "</table>" _n //new
} //new
} //new
qui use `file', clear
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
display format for summary statistics
I am not quite happy with the way the minimum is displayed.
Can't we copy the display format of that variable?
The display format of a variable can be extracted with the `: format varname' extended macro function.
You can apply that format with the `: display ' extended macro function.
Change the program to achieve this goal.
htmlcodebook15.do
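As a hint, the two extended macro functions can be tried outside the program first. This sketch assumes the auto dataset shipped with Stata and its variable price; neither is part of the course material.

```stata
* Illustration only: extract a display format and apply it to a returned result
sysuse auto, clear
local fmt : format price              // display format of price
quietly summarize price
local min : display `fmt' r(min)      // the minimum, formatted like price
display "format: `fmt'   minimum: `min'"
```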
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
Add other summary statistics
Expand our program to also include the 25th, 50th, and 75th percentiles and the maximum.
htmlcodebook16.do
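As a hint, summarize with the detail option leaves all of these statistics behind as returned results. The sketch below assumes the auto dataset shipped with Stata and its variable mpg.

```stata
* Illustration only: percentiles are only returned when detail is specified
sysuse auto, clear
local fmt : format mpg
quietly summarize mpg, detail
foreach stat in min p25 p50 p75 max {
    display "`stat'" _col(8) `fmt' r(`stat')
}
```

Looping over the names of the returned results avoids repeating the same lines for every statistic.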
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
String variables
We also want to show something for string variables.
Let's show the first 10 distinct values as an example.
Change the program to achieve that goal.
htmlcodebook17.do
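As a hint, here is one way to list the first distinct values of a string variable. It is a sketch that assumes the auto dataset shipped with Stata and its string variable make.

```stata
* Illustration only: show at most 10 distinct values of a string variable
sysuse auto, clear
capture confirm string variable make   // _rc is 0 when make is a string variable
if !_rc {
    contract make                      // one observation per distinct value
    forvalues i = 1/`=min(10, _N)' {
        display make[`i']
    }
}
```

The min(10, _N) guard keeps the loop from running past the last observation when there are fewer than 10 distinct values.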
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
Add links between the variable list and the value label list
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' `"<li><a href="#l`var'">`var'</a>"' //new
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
local N = _N
foreach var of varlist * {
file write `book' `"<h4 id="l`var'">`var'"' //new
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq
contract `var', freq(`freq')
local vallab : val label `var'
if "`vallab'" != "" {
tempfile freqtable
rename `var' value
qui save `freqtable'
qui uselabel `vallab'
qui merge 1:1 value using `freqtable'
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>label</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else {
capture confirm numeric variable `var' // checked before the if, so that else if directly follows the if block
if _N <= 10 {
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=`var'[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else if !_rc {
file write `book' "<table>" _n
qui count if !missing(`var')
local distinct = r(N)
local fmt : format `var'
qui sum `var' [fw=`freq'], detail
file write `book' "<tr>"_n
file write `book' `"<th>valid/missing obs.</th>"'
file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>distinct values</th>"
file write `book' "<td>`distinct'</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>minimum</th>"
file write `book' "<td>`:display `fmt' `r(min)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>25th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p25)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>50th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p50)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>75th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p75)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>maximum</th>"
file write `book' "<td>`:display `fmt' `r(max)''</td>"
file write `book' "</tr>"_n
file write `book' "</table>" _n
}
else {
file write `book' "<p>Variable `var' is a string variable, "
file write `book' "example values are:</p>"_n
file write `book' "<ul>"_n
forvalues i = 1/10 {
file write `book' `"<li>`=`var'[`i']'</li>"'_n
}
file write `book' "</ul>"_n
}
}
qui use `file', clear
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
Descvalues creates markers (the id attributes), and Descvars adds links to those markers.
That way users can jump from the list of variables to the list of value labels.
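To see the HTML that these two lines produce, the sketch below writes them to a temporary file and shows the result. The variable name foreign is only a hypothetical example.

```stata
* Illustration only: a link and the marker it points to
tempname demo
tempfile out
local var foreign    // hypothetical variable name
file open `demo' using `"`out'"', write text replace
file write `demo' `"<li><a href="#l`var'">`var'</a></li>"' _n   // link in the variable list
file write `demo' `"<h4 id="l`var'">`var'</h4>"' _n             // marker in the value label list
file close `demo'
type `"`out'"'
```

The href has to repeat the id exactly, preceded by #, which is why both are built from the same local macro.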
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' `"<li id="v`var'"><a href="#l`var'">`var'</a>"' //new
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
local N = _N
foreach var of varlist * {
file write `book' `"<h4 id="l`var'"><a href="#v`var'">`var'</a>"' //new
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq
contract `var', freq(`freq')
local vallab : val label `var'
if "`vallab'" != "" {
tempfile freqtable
rename `var' value
qui save `freqtable'
qui uselabel `vallab'
qui merge 1:1 value using `freqtable'
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>label</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else {
capture confirm numeric variable `var' // checked before the if, so that else if directly follows the if block
if _N <= 10 {
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=`var'[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else if !_rc {
file write `book' "<table>" _n
qui count if !missing(`var')
local distinct = r(N)
local fmt : format `var'
qui sum `var' [fw=`freq'], detail
file write `book' "<tr>"_n
file write `book' `"<th>valid/missing obs.</th>"'
file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>distinct values</th>"
file write `book' "<td>`distinct'</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>minimum</th>"
file write `book' "<td>`:display `fmt' `r(min)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>25th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p25)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>50th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p50)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>75th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p75)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>maximum</th>"
file write `book' "<td>`:display `fmt' `r(max)''</td>"
file write `book' "</tr>"_n
file write `book' "</table>" _n
}
else {
file write `book' "<p>Variable `var' is a string variable, "
file write `book' "example values are:</p>"_n
file write `book' "<ul>"_n
forvalues i = 1/10 {
file write `book' `"<li>`=`var'[`i']'</li>"'_n
}
file write `book' "</ul>"_n
}
}
qui use `file', clear
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
Descvars creates markers, and Descvalues adds links to those markers.
That way users can also jump from the list of value labels back to the list of variables.
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' `"<li id="v`var'"><a href="#l`var'">`var'</a>"'
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
local N = _N
foreach var of varlist * {
file write `book' `"<h4 id="l`var'"><a href="#v`var'">`var'</a>"'
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq
contract `var', freq(`freq')
local vallab : val label `var'
if "`vallab'" != "" {
tempfile freqtable
rename `var' value
qui save `freqtable'
qui uselabel `vallab'
qui merge 1:1 value using `freqtable'
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>label</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else {
capture confirm numeric variable `var' // checked before the if, so that else if directly follows the if block
if _N <= 10 {
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=`var'[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else if !_rc {
file write `book' "<table>" _n
qui count if !missing(`var')
local distinct = r(N)
local fmt : format `var'
qui sum `var' [fw=`freq'], detail
file write `book' "<tr>"_n
file write `book' `"<th>valid/missing obs.</th>"'
file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>distinct values</th>"
file write `book' "<td>`distinct'</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>minimum</th>"
file write `book' "<td>`:display `fmt' `r(min)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>25<sup>th</sup> percentile</th>" //new
file write `book' "<td>`:display `fmt' `r(p25)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>50<sup>th</sup> percentile</th>" //new
file write `book' "<td>`:display `fmt' `r(p50)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>75<sup>th</sup> percentile</th>" //new
file write `book' "<td>`:display `fmt' `r(p75)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>maximum</th>"
file write `book' "<td>`:display `fmt' `r(max)''</td>"
file write `book' "</tr>"_n
file write `book' "</table>" _n
}
else {
file write `book' "<p>Variable `var' is a string variable, "
file write `book' "example values are:</p>"_n
file write `book' "<ul>"_n
forvalues i = 1/10 {
file write `book' `"<li>`=`var'[`i']'</li>"'_n
}
file write `book' "</ul>"_n
}
}
qui use `file', clear
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' " width: 650px;"_n
file write `book' " margin: auto;"_n
file write `book' "}"_n
file write `book' "table {"_n //new
file write `book' " border-collapse: collapse;"_n //new
file write `book' "}"_n //new
file write `book' "table, th, td {"_n //new
file write `book' " border: 1px solid black;"_n //new
file write `book' "}"_n //new
file write `book' "th, td {"_n //new
file write `book' " text-align: left;"_n //new
file write `book' " padding-right: 10px;"_n //new
file write `book' " padding-left: 10px;"_n //new
file write `book' "}"_n //new
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
Turn this into an .ado file
To turn this into an .ado file all we have to do is:
Remove the commands outside the programs.
Move the main program htmlcodebook to the top of the file.
Add a comment *! version 0.1.0 24Apr2018 MLB; this is displayed when you type which htmlcodebook.
Store this as htmlcodebook.ado; the name of the file has to correspond with the name of the first program.
All other programs in the file are local to the main program, so programs outside the file cannot see them.
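The layout of such a file can be sketched with a toy command. The names mycmd and Helper are hypothetical and only stand in for htmlcodebook and its subroutines; the file would be stored as mycmd.ado somewhere on the adopath.

```stata
*! version 0.1.0 24Apr2018 MLB
* Illustration only: contents of a hypothetical file mycmd.ado
program define mycmd
    version 14
    Helper                  // subroutine defined further down in the same file
end

program define Helper
    display as txt "Helper was called from mycmd"
end
```

Typing mycmd makes Stata load the whole file, so Helper is available to mycmd, while which mycmd shows the *! line.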
-------------------------------------------------------------------------------
Application: writing your own program for creating a codebook in .html
-------------------------------------------------------------------------------
What is still left?
Some datasets consist of multiple files. It would be nice to be able to specify a directory and let htmlcodebook create a codebook for all files in that directory.
No program is complete without a help file. We can use help examplehelpfile and viewsource examplehelpfile.sthlp as a template.
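One possible starting point for the directory idea is a loop around the existing command rather than a new option. The sketch below assumes that htmlcodebook has already been defined or installed, that the .dta files are in the current working directory, and that their names contain no spaces.

```stata
* Sketch only: one codebook per .dta file in the current directory
local files : dir "." files "*.dta"
foreach f of local files {
    local stub = subinstr("`f'", ".dta", "", 1)   // file name without extension
    htmlcodebook using `f', saving(`stub'.html) replace
}
```

Building this into htmlcodebook itself would additionally need an option for the directory and probably an index page linking the separate codebooks.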
-------------------------------------------------------------------------------
digression
-------------------------------------------------------------------------------
c_local
With c_local you can create a local macro in the caller, that is, in the program or .do file that called the one currently running.
So if bar.do calls foo.do, and foo.do contains the line c_local blup 1, then that will create a local macro in bar.do called `blup' containing 1.
If you later read bar.do, you have no indication that foo.do creates that local macro, so it can be very hard to figure out where `blup' came from. This is why I labeled it as dangerous.
This is why it is not documented, and not even undocumented.
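The same mechanism can be sketched with two programs instead of two .do files; the names foo, bar, and blup follow the example above. Because c_local is not documented, treat this as an illustration of the behaviour described here and not as a supported interface.

```stata
* Illustration only: foo leaves a local macro behind in its caller bar
capture program drop foo
capture program drop bar
program define foo
    c_local blup 1                      // creates local blup in whoever called foo
end
program define bar
    foo
    display "blup contains: `blup'"     // nothing in bar itself defines blup
end
bar
```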
To quote Nick Cox (https://www.stata.com/statalist/archive/2005-11/msg00405.html):
-c_local- is not documented; it is not even "undocumented" (-help undocumented-). So, how does anyone outside StataCorp know about it?
What happens is this: after a long period of Stata use in which you have done well, Stata will speak to you:
"Greetings! You have reached the seventh level of Stata, and I name you Statafriend.
You will now be initiated into seven Stata secrets. The first is -c_local-."
and so forth, but the rest of it is probably not of interest.
Since this is only a second level Stata course you still have a way to go...
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook01.do
clear all
program define htmlcodebook
version 14
syntax , SAVing(string)
tempname book
file open `book' using `saving', write
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n //new
file write `book' "body {"_n //new
file write `book' "width: 650px;"_n //new
file write `book' "margin: auto;"_n //new
file write `book' "}"_n //new
file write `book' "</style>"_n //new
file write `book' "<body>"
file write `book' "<h1>title</h1>"_n
file write `book' "</body>" _n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
end
cd h:\stata_l2
htmlcodebook, saving(test.html)
type test.html
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook02.do
clear all
program define htmlcodebook
version 14
syntax , SAVing(string) [replace] //new
tempname book
file open `book' using `saving', write `replace' //new
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
file write `book' "<h1>title</h1>"_n
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
end
cd h:\stata_l2
htmlcodebook, saving(test.html) replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook03.do
clear all
program define htmlcodebook
version 14
syntax , SAVing(string) [replace title(string)] //new
tempname book
file open `book' using `saving', write `replace'
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" { //new
file write `book' "<h1>`title'</h1>"_n //new
} //new
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
end
cd h:\stata_l2
htmlcodebook, saving(test.html) title("foo") replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook04.do
clear all
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book data
file open `book' using `saving', write `replace'
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else { //new
file write `book' "<h1>Codebook for `using'</h1>"_n //new
} //new
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook06.do
clear all
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
qui use "`using'", clear
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'" //new
if `"`: var label `var''"' != "`var'" { //new
file write `book' `": `: var label `var''"' //new
} //new
file write `book' "</li>"_n //new
}
file write `book' "</ul>"
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook07.do
clear all
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
qui use "`using'", clear
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
if `"`: data label'"' != "" { //new
file write `book' `"<h2>`:data label'</h2>"' _n //new
} //new
file write `book' "<h3>Variable list</h3>"_n
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
}
file write `book' "</ul>"
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook08.do
clear all
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
qui use "`using'", clear
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" { //new
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n //new
forvalues i = 1/`: char _dta[note0]' { //new
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n //new
} //new
file write `book' "</ul>"_n //new
} //new
file write `book' "<h3>Variable list</h3>"_n
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
}
file write `book' "</ul>"
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
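-------------------------------------------------------------------------------
* Sketch, not one of the course .do files: where the notes that the listing
* above writes to the codebook are stored. Assumes the auto data shipped with
* Stata; the note text added here is made up for illustration.
sysuse auto, clear
* add one more note to the dataset (stored under _dta)
note: illustrative note added for this sketch
* _dta[note0] holds the number of notes, _dta[note1], _dta[note2], ... the text
local n : char _dta[note0]
forvalues i = 1/`n' {
di as txt "note `i': " as res `"`: char _dta[note`i']'"'
}
-------------------------------------------------------------------------------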
-------------------------------------------------------------------------------
htmlcodebook10.do
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars //new
syntax, book(string) //new
//new
file write `book' "<h3>Variable list</h3>"_n //new
file write `book' "<ul>"_n //new
foreach var of varlist * { //new
file write `book' "<li>`var'" //new
if `"`: var label `var''"' != "`var'" { //new
file write `book' `": `: var label `var''"' //new
} //new
file write `book' "</li>"_n //new
} //new
file write `book' "</ul>" //new
end //new
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book') //new
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook11.do
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" { //new
file write `book' "<ul>" _n //new
forvalues i = 1/`: char `var'[note0]'{ //new
file write `book' `"<li>`: char `var'[note`i']'</li>"' //new
} //new
file write `book' "</ul>" _n //new
} //new
}
file write `book' "</ul>"
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
htmlcodebook15.do
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
local N = _N
foreach var of varlist * {
file write `book' `"<h4>`var'"'
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq
contract `var', freq(`freq')
local vallab : val label `var'
if "`vallab'" != "" {
tempfile freqtable
rename `var' value
qui save `freqtable'
qui uselabel `vallab'
qui merge 1:1 value using `freqtable'
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>label</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else {
capture confirm numeric variable `var' // _rc is 0 when `var' is numeric
if _N <= 10 {
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=`var'[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else if !_rc {
file write `book' "<table>" _n
qui count if !missing(`var')
local distinct = r(N)
local fmt : format `var' //new
qui sum `var' [fw=`freq'], detail
file write `book' "<tr>"_n
file write `book' `"<th>valid/missing obs.</th>"'
file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>distinct values</th>"
file write `book' "<td>`distinct'</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>minimum</th>"
file write `book' "<td>`:display `fmt' `r(min)''</td>" //new
file write `book' "</tr>"_n
file write `book' "</table>" _n
}
}
qui use `file', clear
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
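-------------------------------------------------------------------------------
* Sketch, not one of the course .do files: the contract / uselabel / merge
* steps used in Descvalues above, run on their own so the intermediate data
* can be inspected. Assumes the auto data shipped with Stata; the name n for
* the frequency variable is chosen here for illustration.
sysuse auto, clear
* name of the value label attached to foreign
local vallab : value label foreign
* one observation per distinct value, with its frequency in n
contract foreign, freq(n)
rename foreign value
tempfile freqtable
qui save `freqtable'
* replace the data in memory by one observation per labelled value
qui uselabel `vallab'
* add the frequencies to the labels
qui merge 1:1 value using `freqtable'
list value label n
-------------------------------------------------------------------------------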
-------------------------------------------------------------------------------
htmlcodebook16.do
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
local N = _N
foreach var of varlist * {
file write `book' `"<h4>`var'"'
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq
contract `var', freq(`freq')
local vallab : val label `var'
if "`vallab'" != "" {
tempfile freqtable
rename `var' value
qui save `freqtable'
qui uselabel `vallab'
qui merge 1:1 value using `freqtable'
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>label</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else {
capture confirm numeric variable `var' // _rc is 0 when `var' is numeric
if _N <= 10 {
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=`var'[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else if !_rc {
file write `book' "<table>" _n
qui count if !missing(`var')
local distinct = r(N)
local fmt : format `var'
qui sum `var' [fw=`freq'], detail
file write `book' "<tr>"_n
file write `book' `"<th>valid/missing obs.</th>"'
file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>distinct values</th>"
file write `book' "<td>`distinct'</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>minimum</th>"
file write `book' "<td>`:display `fmt' `r(min)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>" //new
file write `book' "<th>25th percentile</th>" //new
file write `book' "<td>`:display `fmt' `r(p25)''</td>" //new
file write `book' "</tr>"_n //new
file write `book' "<tr>" //new
file write `book' "<th>50th percentile</th>" //new
file write `book' "<td>`:display `fmt' `r(p50)''</td>" //new
file write `book' "</tr>"_n //new
file write `book' "<tr>" //new
file write `book' "<th>75th percentile</th>" //new
file write `book' "<td>`:display `fmt' `r(p75)''</td>" //new
file write `book' "</tr>"_n //new
file write `book' "<tr>" //new
file write `book' "<th>maximum</th>" //new
file write `book' "<td>`:display `fmt' `r(max)''</td>" //new
file write `book' "</tr>"_n //new
file write `book' "</table>" _n
}
}
qui use `file', clear
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
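-------------------------------------------------------------------------------
* Sketch, not one of the course .do files: what the `: display' extended
* macro function in the listing above does to a returned result. Assumes the
* auto data shipped with Stata.
sysuse auto, clear
* the display format attached to the variable
local fmt : format price
qui sum price, detail
* the median as stored in r(p50)
di as txt "unformatted: " as res r(p50)
* the same number, shown with the variable's own display format
di as txt "formatted:   " as res "`: display `fmt' r(p50)'"
-------------------------------------------------------------------------------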
-------------------------------------------------------------------------------
htmlcodebook17.do
clear all
program define Descfile
syntax using/, book(string)
qui use "`using'", clear
if `"`: data label'"' != "" {
file write `book' `"<h2>`:data label'</h2>"' _n
}
if "`: char _dta[note0]'" != "" {
file write `book' "<h3>Notes:</h3>"_n"<ul>"_n
forvalues i = 1/`: char _dta[note0]' {
file write `book' `"<li>`: char _dta[note`i']'</li>"' _n
}
file write `book' "</ul>"_n
}
end
program define Descvars
syntax, book(string)
file write `book' "<h3>Variable list</h3>"
file write `book' "<ul>"_n
foreach var of varlist * {
file write `book' "<li>`var'"
if `"`: var label `var''"' != "`var'" {
file write `book' `": `: var label `var''"'
}
file write `book' "</li>"_n
if "`: char `var'[note0]'" != "" {
file write `book' "<ul>" _n
forvalues i = 1/`: char `var'[note0]'{
file write `book' `"<li>`: char `var'[note`i']'</li>"'
}
file write `book' "</ul>" _n
}
}
file write `book' "</ul>"
end
program define Descvalues
syntax, book(string) file(string)
local N = _N
foreach var of varlist * {
file write `book' `"<h4>`var'"'
local varlab : var label `var'
if `"`varlab'"' != "`var'" {
file write `book' `": `varlab'"'
}
file write `book' "</h4>"_n
tempvar freq
contract `var', freq(`freq')
local vallab : val label `var'
if "`vallab'" != "" {
tempfile freqtable
rename `var' value
qui save `freqtable'
qui uselabel `vallab'
qui merge 1:1 value using `freqtable'
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>label</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=value[`i']'</td>"'
file write `book' `"<td>`=label[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else {
capture confirm numeric variable `var' // _rc is 0 when `var' is numeric
if _N <= 10 {
file write `book' "<table>"_n
file write `book' "<tr>"
file write `book' "<th>value</th>"
file write `book' "<th>frequency</th>"
file write `book' "</tr>"_n
forvalues i = 1/`=_N' {
file write `book' "<tr>"
file write `book' `"<td>`=`var'[`i']'</td>"'
file write `book' `"<td>`=`freq'[`i']'</td>"'
file write `book' "</tr>"_n
}
file write `book' "</table>"_n
}
else if !_rc {
file write `book' "<table>" _n
qui count if !missing(`var')
local distinct = r(N)
local fmt : format `var'
qui sum `var' [fw=`freq'], detail
file write `book' "<tr>"_n
file write `book' `"<th>valid/missing obs.</th>"'
file write `book' `"<td>`r(N)'/`=`N'-`r(N)'' </td>"'
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>distinct values</th>"
file write `book' "<td>`distinct'</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>minimum</th>"
file write `book' "<td>`:display `fmt' `r(min)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>25th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p25)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>50th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p50)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>75th percentile</th>"
file write `book' "<td>`:display `fmt' `r(p75)''</td>"
file write `book' "</tr>"_n
file write `book' "<tr>"
file write `book' "<th>maximum</th>"
file write `book' "<td>`:display `fmt' `r(max)''</td>"
file write `book' "</tr>"_n
file write `book' "</table>" _n
}
else { //new
file write `book' "<p>Variable `var' is a string variable," //new
file write `book' "example values are:</p>"_n //new
file write `book' "<ul>"_n //new
forvalues i = 1/10 { //new
file write `book' `"<li>`=`var'[`i']'</li>"'_n //new
} //new
file write `book' "</ul>"_n //new
} //new
}
qui use `file', clear
}
end
program define htmlcodebook
version 14
syntax using/ , SAVing(string) [replace title(string)]
tempname book
file open `book' using `saving', write `replace'
preserve
file write `book' "<!DOCTYPE html>"_n"<html>"_n
file write `book' "<style>"_n
file write `book' "body {"_n
file write `book' "width: 650px;"_n
file write `book' "margin: auto;"_n
file write `book' "}"_n
file write `book' "</style>"_n
file write `book' "<body>"_n
if `"`title'"' != "" {
file write `book' "<h1>`title'</h1>"_n
}
else {
file write `book' "<h1>Codebook for `using'</h1>"_n
}
Descfile using "`using'", book(`book')
Descvars, book(`book')
Descvalues, book(`book') file(`using')
file write `book' "</body>"_n
file write `book' "</html>"_n
file close `book'
di as txt "Output written to " `"{browse "`saving'"}"'
restore
end
cd h:\ss18\stata_l2
htmlcodebook using arc06.dta, saving(test.html) replace
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------