version 0.10.2

esplot: event study plots

Syntax

esplot outcome event_time [if] [in] [weight] [, options ]

or

esplot outcome [if] [in] [weight], event(varname [, suboptions]) [options ]

Documentation

Core Syntax

esplot can be called in one of two ways:

  1. with an event-time variable:

    . esplot <outcome> <event_time> [, options]

  2. or with an event indicator (on panel data):

    . esplot <outcome>, event(<event_indicator> [, suboptions]) [options]
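The two call styles carry the same information: an event indicator on panel data implies an event-time variable. The following is a minimal Python sketch of that relationship, not esplot's own code. It assumes the indicator equals 1 in the period in which the event occurs and that only each unit's first event matters; the function name `event_time` and the toy panel are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, int, int]  # (unit, period, event_indicator)


def event_time(panel: List[Row]) -> Dict[Tuple[str, int], Optional[int]]:
    """Map (unit, period) to the period minus the unit's first event period.

    Units that never experience the event get None (no relative time).
    Using the *first* event per unit is an assumption of this sketch.
    """
    first_event: Dict[str, int] = {}
    for unit, period, event in panel:
        if event == 1:
            first_event[unit] = min(period, first_event.get(unit, period))
    return {
        (unit, period): (period - first_event[unit]) if unit in first_event else None
        for unit, period, _ in panel
    }


# Toy panel: unit "a" has an event in 2002, unit "b" never does.
panel = [
    ("a", 2001, 0), ("a", 2002, 1), ("a", 2003, 0),
    ("b", 2001, 0), ("b", 2002, 0), ("b", 2003, 0),
]
rel = event_time(panel)
print(rel[("a", 2001)], rel[("a", 2002)], rel[("a", 2003)])  # -1 0 1
print(rel[("b", 2002)])  # None
```

With the first call style the user supplies a variable like `rel` directly; with the second, esplot derives the relative-time indicators from the event indicator itself.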

Options

General Options

compare(<event_indicator> [, options]) Only available when using the event(<event_indicator>) syntax. Plot the difference between the event-time coefficients associated with the event in event and the event given in compare. For example, esplot infected, event(treatment) compare(placebo) will estimate event-time coefficients around the treatment event and the placebo event and will plot the difference between the treatment and placebo arms.

event_indicator suboptions (for compare and event)

save causes the vector of relative-time indicators that esplot creates around the event given in event or compare to be kept in the data in memory. This is useful when running many specifications of the same event study: combined with nogen on later calls, the vector is created only once.

nogen tells esplot that the vector of relative-time indicators around this particular event already exists (typically because an earlier call to esplot used the save option).

replace allows esplot to overwrite the existing vector of relative-time indicators (rarely used).

window(start end [, options]) display dynamic effect estimates (event-time coefficients) ranging from start to end. start should be less than zero; end should be greater than zero.

window recognizes four suboptions that control how endpoints (i.e. periods outside the window) are treated. By default, esplot fully saturates the model with relative-time indicators for every possible period except the omitted period (t = -1). The bin, bin_pre, and bin_post suboptions cause endpoints to be binned; see below for more information.

endpoint suboptions (for window)

saturate default option, equivalent to typing nothing. esplot will find the maximum and minimum relative time periods supported in the data (i.e. the last period in the data minus the earliest event, and the first period in the data minus the latest event). Then esplot will fully saturate the model with all possible relative time periods. Some of these coefficients may not be well identified (some may even drop out).

bin Define an indicator for j < start and an indicator for j > end, where j is relative time. Rather than including all possible event-time indicators, we "bin" all event-time indicators before/after the window starts/ends. Thus, rather than estimating the full set of dynamic effects, we estimate dynamic effects only within the specified window, and estimate (but do not plot) constant long-run effects before and after the window.

bin_pre Define an indicator for j < start, but use all possible post-event relative time indicators for estimation.

bin_post Define an indicator for j > end, but use all possible pre-event relative time indicators for estimation.
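The four endpoint suboptions differ only in which relative-time indicators enter the model. The Python sketch below illustrates the rule as described above; it is not esplot's implementation. The labels (`pre_bin`, `post_bin`, `t-2`, ...) and both function names are hypothetical.

```python
from typing import Iterable, List


def indicator_label(j: int, start: int, end: int, mode: str = "saturate") -> str:
    """Name of the relative-time indicator that period j falls into."""
    if mode not in {"saturate", "bin", "bin_pre", "bin_post"}:
        raise ValueError(f"unknown endpoint mode: {mode}")
    if j < start and mode in {"bin", "bin_pre"}:
        return "pre_bin"   # one indicator for every period before the window
    if j > end and mode in {"bin", "bin_post"}:
        return "post_bin"  # one indicator for every period after the window
    return f"t{j:+d}"      # otherwise each period keeps its own indicator


def model_indicators(j_values: Iterable[int], start: int, end: int,
                     mode: str = "saturate", omitted: int = -1) -> List[str]:
    """Distinct indicators in the model, leaving out the omitted period."""
    labels = [indicator_label(j, start, end, mode)
              for j in sorted(j_values) if j != omitted]
    return list(dict.fromkeys(labels))  # de-duplicate, keep order


# Relative time runs from -4 to 4 in the data; the window is (-2 2).
for mode in ("saturate", "bin", "bin_pre", "bin_post"):
    print(mode, model_indicators(range(-4, 5), -2, 2, mode))
```

Under `saturate` the toy model has eight indicators; under `bin` it has six, because the periods outside the window collapse into two.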

by(varname) estimate coefficients separately for each level of by. For example, esplot wage years_since_policy, by(education) will estimate the event-time coefficients for the relative time given in years_since_policy separately for each level of education and plot as many series as there are levels of education.

difference estimate coefficients relative to the base level of by. For example, if education has k levels, then typing esplot wage years_since_policy, by(education) difference will estimate the event-time coefficients for the relative time given in years_since_policy separately for each level of education and plot the difference between each of the k-1 sets of event-time coefficients and the base level of the by variable.

Using difference in combination with compare allows for the estimation of difference-in-difference-in-difference coefficients.

estimate_reference by default, esplot includes indicator variables for all relative time periods except for -1. If the estimate_reference option is specified, the indicator for -1 is included and explicitly differenced out of the rest of the coefficients. When used with by, it recenters each series independently, so that each series is mechanically 0 at time -1.
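The recentering that estimate_reference performs can be pictured as subtracting each series' own value at -1. The Python sketch below shows only that arithmetic on made-up point estimates; esplot does this inside the estimation (so standard errors are handled too), and the function name `recenter` is hypothetical.

```python
from typing import Dict


def recenter(series: Dict[int, float], reference: int = -1) -> Dict[int, float]:
    """Subtract the coefficient at the reference period from every coefficient."""
    base = series[reference]
    return {j: b - base for j, b in series.items()}


# Made-up coefficients for two levels of a by variable, keyed by relative time.
by_series = {
    "men":   {-2: 1.0, -1: 1.5, 0: 2.5},
    "women": {-2: 0.25, -1: 0.5, 0: 2.0},
}

# Each series is recentered on its own value at -1, independently of the other.
recentered = {level: recenter(s) for level, s in by_series.items()}
print(recentered["men"])    # {-2: -0.5, -1: 0.0, 0: 1.0}
print(recentered["women"])  # {-2: -0.25, -1: 0.0, 0: 1.5}
```

Both series now pass through zero at -1, so level differences between groups no longer show up in the plot.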

savedata(filename [,replace]) in addition to plotting directly, esplot will save the estimated coefficients to filename. This allows for the greatest flexibility in plotting the estimates. Coefficients are saved after applying all operations, like differencing (difference or compare), or pooling (period_length). Can be abbreviated save(...).

save_sample(varname) store the output of e(sample) in varname following the internal regression call.

Regression Options

controls(varlist) additional control variables to be included in the internal regression call.

absorb(varlist) a vector of fixed effects to be absorbed and not estimated in the internal regression call. See help reghdfe##absvar for more information.

vce(vcetype, subopt) specify the types of standard errors computed. See help reghdfe##opt_vce for more information. Not compatible with quantile.

quantile(0 < k < 100) if this option is specified, esplot will use a quantile regression, rather than OLS. quantile(50) and quantile(.5) are synonyms, and will cause esplot to estimate a median regression.

weights are allowed when using OLS (default), but not when quantile is specified.

Display Options

window(start end) display dynamic effect estimates (event-time coefficients) ranging from start to end. start should be less than zero; end should be greater than zero.

period_length(integer) pool dynamic effect coefficients in groups of period_length before plotting.
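The description of period_length does not say how periods are grouped or combined, so the Python sketch below is only one plausible reading: consecutive periods are grouped so that period 0 starts a group, and each group is summarized by an unweighted mean. Both choices, and the function name `pool`, are assumptions of this sketch rather than documented esplot behavior.

```python
from collections import defaultdict
from typing import Dict


def pool(coefs: Dict[int, float], period_length: int) -> Dict[int, float]:
    """Average coefficients within groups of `period_length` consecutive periods.

    Assumption: groups are aligned with floor division, so with
    period_length = 3 the periods 0, 1, 2 form group 0 and the periods
    -3, -2, -1 form group -1. The unweighted mean is also an assumption.
    """
    groups = defaultdict(list)
    for j, b in coefs.items():
        groups[j // period_length].append(b)
    return {g: sum(v) / len(v) for g, v in sorted(groups.items())}


# Made-up monthly coefficients pooled into quarters.
monthly = {-3: 0.0, -2: 1.0, -1: 2.0, 0: 3.0, 1: 4.0, 2: 5.0}
quarterly = pool(monthly, 3)
print(quarterly)  # {-1: 1.0, 0: 4.0}
```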

colors(colorstylelist) ordered list of colors; used for point estimates and confidence intervals.

Additional twoway options can be specified and will be passed through to the internal twoway call. See {cmd:help twoway_options}.

More complicated options are discussed below.

Relative Event-Study Coefficients with difference and compare

esplot has two options to estimate and "difference out" reference coefficients: difference and compare.

difference plots all series relative to the base level of by. An example helps: let by be a dummy variable equal to 1 for women and 0 for men. By default, passing this variable to by causes two series to be estimated: one set of coefficients for men and one for women. If we are mainly interested in the difference in response across genders, we can add the difference option. esplot will then estimate the male and female coefficients but plot their difference (female - male) in every period.

NB: when using a factor variable with more than two levels with by, Stata treats the lowest value as the base level. At this time, to change the reference category, it is necessary to create a new variable in which the desired reference category has the lowest value. In the above example, to make women the reference category we would use an indicator variable for "is male" rather than one for "is female".

compare takes an additional event dummy, and estimates the main event coefficients relative to this event. Here, it is also helpful to consider an example from Cullen & Perez-Truglia, 2019. Cullen & Perez-Truglia use the quasi-random rotation of managers across units to identify the effects of manager gender on the career progression of male and female employees. For example, they consider the effect of switching from a female manager to male manager relative to switching from a female manager to another female manager. This would be coded as ... event(to_male_manager) compare(to_female_manager).... By including the comparison event, the authors adjust for the effects of switching managers per se and isolate the differences associated with the gender of the manager.

compare and difference can be used together. See Cullen & Perez-Truglia, 2019 for an in-depth discussion, examples, and the econometric specification.
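As a rough illustration of what is being plotted, the Python sketch below applies the two kinds of differencing to made-up point estimates. It shows only the arithmetic on coefficients; esplot computes these contrasts within the regression, which this sketch does not attempt. The sign conventions follow the text (female - male, event - comparison event), and the function name `subtract` and all numbers are hypothetical.

```python
from typing import Dict


def subtract(a: Dict[int, float], b: Dict[int, float]) -> Dict[int, float]:
    """Period-by-period difference a - b over the periods both series share."""
    return {j: a[j] - b[j] for j in sorted(a.keys() & b.keys())}


# Made-up event-time coefficients, keyed by relative time.
female_event = {-1: 0.0, 0: 1.0, 1: 2.0}     # women, event(...)
male_event = {-1: 0.0, 0: 0.5, 1: 0.5}       # men, event(...)
female_compare = {-1: 0.0, 0: 0.25, 1: 0.5}  # women, compare(...)
male_compare = {-1: 0.0, 0: 0.25, 1: 0.25}   # men, compare(...)

# difference: each series relative to the base level of by (female - male).
gap_event = subtract(female_event, male_event)
gap_compare = subtract(female_compare, male_compare)

# compare: main event relative to the comparison event, within one group.
female_net = subtract(female_event, female_compare)

# difference and compare together: a triple difference.
triple = subtract(gap_event, gap_compare)
print(gap_event)   # {-1: 0.0, 0: 0.5, 1: 1.5}
print(female_net)  # {-1: 0.0, 0: 0.75, 1: 1.5}
print(triple)      # {-1: 0.0, 0: 0.5, 1: 1.25}
```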

Efficiently Estimating Many Event-Study Plots: save, replace, nogen

event and compare have the sub-options save, nogen, and replace, which are primarily useful when estimating multiple specifications or multiple outcomes. These sub-options save event "lags and leads" to the data in memory and subsequently read them back. The replace sub-option is provided for completeness, but should be used with caution, as it overwrites "lags and leads" saved in memory. By default, esplot does not change the data in memory.

save saves event lags L_ (or L) and leads F (or F_). This sub-option can be selected for either or both of event and compare.

nogen tells esplot that lags and leads of the above form are present in the data in memory (often as a result of selecting save on an earlier run) and that it should use the lags and leads in memory. If lags and leads are present and neither nogen nor replace is selected, esplot will throw an error.

replace tells esplot that lags and leads are present in the data and that it should overwrite them. This option should be used with caution, especially when lags and leads are user defined. There are two primary use-cases for replace, most often used with save.

  • if an earlier esplot call used save, and the window is adjusted. This is because esplot calculates lags and leads only up to the endpoints given in window. (Example 2)
  • if an earlier esplot call used save, and you now wish to use estimate_reference (or vice versa). Since esplot only keeps the lags and leads that it needs, if save is used without estimate_reference, then the necessary leads for the omitted periods will not be saved. (Example 3)
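The rules for save, nogen, and replace can be summarized as a small decision table. The Python sketch below is a model of the rules as stated in this section, not esplot's code; the function name `plan_indicators`, the returned labels, and the behavior when nogen is given but nothing is in memory are assumptions.

```python
def plan_indicators(in_memory: bool, save: bool = False,
                    nogen: bool = False, replace: bool = False) -> str:
    """Decide what happens to the lags and leads for one event.

    `in_memory` says whether lags and leads for this event already exist
    in the data. Returns a label describing the action taken.
    """
    if nogen:
        if not in_memory:
            # Assumption: the text does not cover this case; treat it as an error.
            raise ValueError("nogen given, but no lags/leads found in memory")
        return "reuse"
    if in_memory and not replace:
        # Stated rule: existing lags/leads without nogen or replace is an error.
        raise ValueError("lags/leads already in memory; use nogen or replace")
    if in_memory:
        return "overwrite"
    # Nothing in memory yet: by default the data in memory is left unchanged.
    return "generate_and_keep" if save else "generate_temporary"


print(plan_indicators(in_memory=False, save=True))    # generate_and_keep
print(plan_indicators(in_memory=True, nogen=True))    # reuse
print(plan_indicators(in_memory=True, replace=True))  # overwrite
```

Calling `plan_indicators(in_memory=True, save=True)` raises an error in this model, which mirrors the rule that a second save on the same event needs replace.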

Further reading on binned vs. fully saturated models

There is a very active applied econometric literature concerning the correct specification of event-study estimates.

Baker, Larcker, & Wang, 2021 show that binned and saturated models can lead to substantively different estimates, especially in the presence of pre-trends. esplot therefore makes both options available to users.

It uses the fully saturated model as the default since this enforces the least structure on the research design. Borusyak & Jaravel, 2018 argue that the fully saturated model is most robust to long run pre- and post- trends, since it does not impose a parametric assumption on dynamic effects before/after a given period.

Researchers are then, of course, free to impose that structure as a design choice with any of the three variants of the window sub-options. Schmidheiny & Siegloch, 2020 show that imposing the structure implied by binning (i.e. that effects are constant before/after some periods) can improve identification of time fixed effects.

Examples with save, replace, and nogen

Example 1

    /* event lags and leads are saved */
    . esplot paygrade, by(male) event(to_male_mgr, save) window(-20 30) estimate_reference
    /* esplot saves time by simply using the lags/leads from the previous call */
    . esplot ln_sal, by(male) event(to_male_mgr, nogen) window(-20 30) period_length(3)

Example 2

    /* event lags and leads are saved */
    . esplot paygrade, by(male) event(to_male_mgr, save) window(-20 30) period_length(3) estimate_reference
    /* esplot saves time by simply using the lags/leads from the previous call */
    . esplot ln_sal, by(male) event(to_male_mgr, nogen) window(-20 30) period_length(3)
    /* we wish to expand the window of the first plot. We tell esplot that it will find lags and leads in memory, but it can ignore and overwrite them */
    . esplot paygrade, by(male) event(to_male_mgr, replace) window(-40 60) period_length(6) estimate_reference

Example 3

    /* event lags and leads are saved */
    . esplot paygrade, by(male) event(to_male_mgr, save) window(-20 30) period_length(3)
    /* esplot saves time by simply using the lags/leads from the previous call */
    . esplot ln_sal, by(male) event(to_male_mgr, nogen) window(-20 30) period_length(3)
    /* There are differences in levels between males and females; we would like both series to go through the origin at t = -1. We now want to use estimate_reference, so we tell esplot that it will find lags and leads in memory, but it can ignore and overwrite them */
    . esplot paygrade, by(male) event(to_male_mgr, save replace) window(-20 30) period_length(3) estimate_reference

Remarks

See website for further discussion and for examples.

Acknowledgements

Katherine Fang and Jenna Anders made extensive contributions to early versions of the underlying code, which this package extends. Any remaining errors are mine.

Author

Dylan Balla-Elliott
Research Associate, Harvard Business School
dballaelliott@gmail.com
github | twitter

Additional Features

Bug-fixes, feature requests, and general comments are welcome via email, or directly as issues on github.

I currently plan on adding support for:
- additional plot options

Extensions via forks/pull requests by github users are welcomed.


This help file was dynamically produced by MarkDoc Literate Programming package