ASREG : Rolling Window Regressions and Rolling Beta in Stata

To estimate rolling window regressions in Stata, the conventional method is to use Stata's official rolling command. However, that command is slow, especially on large data sets. I recently posted asreg on the SSC. asreg is a Stata program that fits a model of depvar on indepvars using linear regression in a user-defined rolling window or by a grouping variable. asreg is an order of magnitude faster than conventional methods such as Stata loops or the official rolling command, and it has the same speed efficiency as asrol. All the rolling window calculations, the estimation of regression parameters, and the writing of results to Stata variables are done in the Mata language.
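
For reference, the conventional approach mentioned above can be sketched with the official rolling prefix. This is only an illustration on the grunfeld data used in the examples further down; the 10-year window is an arbitrary choice, and the explicit xtset is added as a precaution in case the panel declaration is not already stored with the data.

```stata
* Conventional approach: Stata's official rolling prefix, shown for comparison.
webuse grunfeld, clear
xtset company year

* Collect the coefficients of a regression in every 10-year window.
* The clear option replaces the data in memory with the results,
* so they must be merged back if the original data are still needed.
rolling _b, window(10) clear: regress invest mvalue kstock

list in 1/5
```

Because rolling replaces (or saves) a separate results data set, any further calculation needs an extra merge step, which asreg avoids by writing its results to the data in memory.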

To install asreg, type the following in the Stata command window:

ssc install asreg


Syntax

        asreg depvar indepvars [if] [in] [, window([rangevar] #) recursive minimum(#) by(varlist) statistics_options]


Speed Optimization

    Rolling window calculations require a lot of looping over observations. The problem is compounded by different data structures such as
    unbalanced panel data, data with many duplicates, and data with many missing values. Some data sets even have both time series gaps
    and many duplicate observations across groups. asreg does not use one static routine for all types of data structures. Instead, asreg
    identifies the data structure and matches one of its rolling window routines to the data characteristics. Therefore, the rolling
    window regressions are fast even in larger data sets.

    asreg writes all regression outputs to the data in memory as separate variables. This eliminates the need for writing the results to a
    separate file and then merging them back into the data for any further calculations. New variables from the regression results follow
    the naming conventions below.

Naming New Variables

       observations        variable containing the number of observations is named obs_N
       regression slopes   a prefix of _b_ is added to the name of each independent variable
       constant            variable containing the constant of the regression is named _b_cons
       r-squared           r-squared and adj. r-squared are named R2 and AdjR2, respectively
       standard errors     a prefix of _se_ is added to the name of each independent variable
       residuals           variable containing residuals is named _residuals
       fitted              variable containing fitted values is named _fitted


    asreg has the following options.

    1. window:

    The option window specifies the length of the rolling window. It accepts up to two arguments. If we have already declared our data
    as panel or time series data, asreg will automatically pick the time variable. In such cases, option window can have one argument, that
    is, the length of the window, e.g., window(5). If our data is not declared as time series or panel data, then we have to specify the
    time variable as the first argument of the option window. For example, if our time variable is year and we want a rolling window of 24,
    then option window will look like:
        window(year 24)

    2. recursive: 

    The option recursive specifies that a recursive window be used. In time series analysis, a recursive window refers to a window
    where the starting period is held fixed, the ending period advances, and the window size grows (see, for example, rolling). asreg allows
    a recursive window either by invoking the option recursive or by setting the length of the window greater than or equal to the sample
    size per group. For example, if the sample size of our data set is 1000 observations per group, we can use a recursive analysis by
    setting the window length to 1000 or more.

    3. by:  

    asreg is byable. Hence, it can be run on groups as specified by the option by(varlist) or the bysort varlist: prefix. An example of
    such a regression is the Fama and MacBeth (1973) second-stage regression, which is estimated cross-sectionally in each time period.
    Therefore, the grouping variable in this case would be the time variable. Assume that our dependent variable is named stock_returns,
    our independent variable stock_betas, and our time variable month_id. Then, to estimate the cross-sectional regression for each month,
    the asreg command will look like:

    . bys month_id: asreg stock_returns stock_betas

    4. minimum: 

    asreg estimates a regression only where the number of observations is greater than the number of regressors. However, there is a way
    to limit the regression estimates to a desired minimum number of observations. The option minimum can be used for this purpose. If
    option min is used, asreg finds the required number of observations for each regression such that:
    obs = max(number of regressors (including the intercept), minimum observations as specified by the option min).
    For example, if we have 4 explanatory variables, then the number of regressors will be equal to 4 plus 1, i.e., 5. Therefore, if asreg
    receives the value of 8 from the option min, the required number of observations will be: max(5,8) = 8. If a specific rolling window
    does not have that many observations, values of the new variables will be replaced with missing values.


    5. statistics_options:

       fitted    reports residuals and fitted values for the last observation in the rolling window. If option window is not
                 specified, then the residuals are calculated within each group as specified by the option by(varlist) or the
                 bysort varlist: prefix.
       serror    reports standard errors for each explanatory variable.
       other     The most commonly used regression statistics, such as number of observations, slope coefficients, r-squared, and
                 adjusted r-squared, are written to new variables by default. Therefore, if these statistics are not needed, they
                 can be dropped once asreg has finished.


 Example 1: Regression for each company in a rolling window of 10 years

    . webuse grunfeld
    . bys company: asreg invest mvalue kstock, wind(year 10)
    The grunfeld data set is already declared as panel data, so we can omit the word year from the option window. Therefore, the command
    can also be written as shown below:
    . bys company: asreg invest mvalue kstock, wind(10)
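
    One way to check what a single window contains is to reproduce it by hand with regress. The sketch below assumes that the window is backward-looking, i.e. that the estimate stored in 1944 uses the years 1935 to 1944, and that the slope on mvalue is stored as _b_mvalue, following the naming rules described earlier. The choice of company 1 and year 1944 is arbitrary.

```stata
webuse grunfeld, clear
bys company: asreg invest mvalue kstock, wind(year 10)

* Reproduce one window by hand: company 1, window ending in 1944.
* Assumption: the 10-year window covers 1935-1944 (current year and 9 before).
regress invest mvalue kstock if company == 1 & inrange(year, 1935, 1944)
display "slope on mvalue from regress: " _b[mvalue]

* The asreg estimate stored on the last observation of that window.
list year _b_mvalue if company == 1 & year == 1944
```

    If the assumption about the window holds, the two slopes should agree up to storage precision.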

   Example 2: Regression for each company in a recursive window

    . webuse grunfeld
    . bys company: asreg invest mvalue kstock, wind(year 10) rec

    Alternatively, a recursive window can be requested by setting the window length to at least the number of observations per group:

    . bys company: asreg invest mvalue kstock, wind(year 1000)
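
    To see what a recursive window means in practice, the following sketch compares one recursive estimate with a plain regress on all observations up to that year. It assumes that the recursive window for a company starts at its first year (1935 in grunfeld) and that the slope is stored as _b_mvalue; company 1 and year 1950 are arbitrary picks.

```stata
webuse grunfeld, clear
bys company: asreg invest mvalue kstock, wind(year 10) rec

* Recursive window: the start is fixed, the end advances.
* Assumption: the estimate stored in 1950 uses all years from 1935 to 1950.
regress invest mvalue kstock if company == 1 & year <= 1950
display "slope on mvalue from regress: " _b[mvalue]

list year _b_mvalue if company == 1 & year == 1950
```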


 Example 3: Using option minimum

    . webuse grunfeld
    . bys company: asreg invest mvalue kstock, wind(10) min(5)
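
    The rule obs = max(number of regressors including the intercept, min) can be worked out for this example directly. The local macro names below are only illustrative.

```stata
* Required observations for: asreg invest mvalue kstock, wind(10) min(5)
local k = 2 + 1                  // mvalue and kstock, plus the constant
local required = max(`k', 5)     // 5 is the value passed to min()
display "required observations per window: `required'"
```

    Here min(5) is binding, because 5 exceeds the 3 regressors. Windows with fewer than 5 observations, which would typically be the first few years of each company, are therefore expected to hold missing values.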


 Example 4: Reporting standard errors 

    . webuse grunfeld
    . bys company: asreg invest mvalue kstock, wind(10) se

 Example 5: Reporting standard errors, fitted values and residuals 

    . webuse grunfeld
    . bys company: asreg invest mvalue kstock, wind(10) se fit


 Example 6: No window - by groups regressions 

    . webuse grunfeld
    . bys company: asreg invest mvalue kstock


 Example 7: Yearly cross-sectional regressions 

    . webuse grunfeld
    . bys year: asreg invest mvalue kstock
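
    The yearly cross-sectional slopes can then be averaged over time, in the spirit of the Fama and MacBeth (1973) procedure mentioned under option by. The sketch below uses only official Stata commands after asreg; the variable name first is hypothetical, and the slope names _b_mvalue and _b_kstock follow the naming rules described earlier.

```stata
webuse grunfeld, clear
bys year: asreg invest mvalue kstock

* Every observation in a year carries the same coefficients,
* so flag one observation per year before averaging.
egen byte first = tag(year)
summarize _b_mvalue _b_kstock if first

* t-statistic of the time-series mean of the slope on mvalue:
* mean / (sd / sqrt(T)), where T is the number of years.
quietly summarize _b_mvalue if first
display "t-statistic: " r(mean) / (r(sd) / sqrt(r(N)))
```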


Author

    Dr. Attaullah Shah
    Institute of Management Sciences, Peshawar, Pakistan
    www.OpenDoors.Pk

Also see

    astile, ascol, asrol, searchfor