Clustered Standard Error Sas
or clustered standard errors, or Fama-MacBeth regressions in SAS. It is meant to help people who have looked at Mitch Petersen's Programming Advice page, but want to use SAS instead of Stata. Mitch has posted results using a test data set that you can use to compare the output below
to see how well they agree. You can generate the test data set in SAS format using this code. SAS now reports
heteroscedasticity-consistent standard errors and t-statistics with the hcc option:

   proc reg data=ds;
      model y=x / hcc;
   run; quit;

You can use the option acov instead of hcc if you want to see the covariance matrix
of the parameter estimates. Thanks to Guan Yang at NYU for making me aware of this. Until version 9.2, you had to use ODS to capture these statistics, which always seemed silly to me. SAS finally caught up though.

A regression with fixed effects using the absorption technique can be done as follows. (Note that, unlike with Stata, we need to suppress the intercept to avoid a dummy variable trap.)

   proc glm;
      absorb identifier;
      model depvar = indvars / solution noint;
   run; quit;

Absorption is computationally fast, but the individual fixed effects estimates will not be displayed. If you want to see the fixed effects estimates, use:

   proc glm;
      class identifier;
      model depvar = indvars identifier / solution;
   run; quit;

This will automatically generate a set of dummy variables for each level of the variable "identifier".

Clustered standard errors may be estimated as follows:

   proc genmod;
      class identifier;
      model depvar = indvars;
      repeated subject=identifier / type=ind;
   run; quit;

This method is quite general, and allows alternative regression specifications using different link functions. The online SAS documentation for the genmod procedure provides detail.

Alternatively, you may use surveyreg to do clustering:

   proc surveyreg data=ds;
      cluster cluster_variable;
      model depvar = indvars;
   run; quit;

Note that genmod does not report finite-sample adjusted statistics, so to make the results of these two methods consistent, you need to multiply the genmod variance estimates by (N-1)/(N-k)*M/(M-1) (equivalently, multiply the standard errors by the square root of that factor), where N=number of observations, M=number of clusters, and k=number of regressors. More detail is provided here.

Clustering in two dimensions can be done using the method described by Thompson (2011) and others. SAS code to do this is here and here.

Running a Fama-MacBeth regression in SAS is quite easy, and doesn't require any sp
and Wretman (1992, p. 652). A total of 284 Swedish municipalities are grouped into 50 clusters of neighboring municipalities. Five clusters with a total of 32 municipalities are randomly selected. The results from the regression analysis in which clusters are used in the sample design are compared to the results of a regression analysis that ignores the clusters. The linear relationship between the population in 1975 and in 1985 is investigated.

The 32 selected municipalities in the sample are saved in the data set Municipalities:

   data Municipalities;
      input Municipality Cluster Population85 Population75;
      datalines;
   205   37    5    5
   206   37   11   11
   207   37   13   13
   208   37    8    8
   209   37   17   19
     6    2   16   15
     7    2   70   62
     8    2   66   54
     9    2   12   12
    10    2   60   50
    94   17    7    7
    95   17   16   16
    96   17   13   11
    97   17   12   11
    98   17   70   67
    99   17   20   20
   100   17   31   28
   101   17   49   48
   276   50    6    7
   277   50    9   10
   278   50   24   26
   279   50   10    9
   280   50   67   64
   281   50   39   35
   282   50   29   27
   283   50   10    9
   284   50   27   31
    52   10    7    6
    53   10    9    8
    54   10   28   27
    55   10   12   11
    56   10  107  108
   ;

The variable Municipality identifies the municipalities in the sample; the variable Cluster indicates the cluster to which a municipality belongs; and the variables Population85 and Population75 contain the municipality populations in 1985 and in 1975 (in thousands), respectively.

A regression analysis is performed by PROC SURVEYREG with a CLUSTER statement:

   title1 'Regression Analysis for Swedish Municipalities';
   title2 'Cluster Sampling';
   proc surveyreg data=Municipalities total=50;
      cluster Cluster;
      model Population85=Population75;
   run;

The TOTAL=50 option specifies the total number of clusters in the sampling frame. Output 88.2.1 displays the data and design summary.
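For the comparison mentioned at the start of this example (a regression analysis that ignores the clusters), one possible sketch is to rerun PROC SURVEYREG on the same Municipalities data set without the CLUSTER statement. The TOTAL=284 value is an assumption: it treats the 284 municipalities as the frame of a simple random sample. The second title line is illustrative only.

```sas
/* Sketch: same model as the cluster analysis, but ignoring the clusters. */
/* Assumption: TOTAL=284 treats the 284 municipalities as the frame of a  */
/* simple random sample of 32 municipalities.                             */
title1 'Regression Analysis for Swedish Municipalities';
title2 'Simple Random Sampling (clusters ignored)';
proc surveyreg data=Municipalities total=284;
   model Population85=Population75;   /* no CLUSTER statement */
run;
```

The coefficient estimates should match those of the cluster analysis; only the standard errors, and the tests based on them, would be expected to differ.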
Since the sample design includes clusters, the procedure displays the total number of clusters in the sample in the "Design Summary" table.

Output 88.2.1 Regression Analysis for Cluster Sampling

   Regression Analysis for Swedish Municipalities
   Cluster Sampling

   The SURVEYREG Procedure
   Regression Analysis for Dependent Variable Population85

   Data Summary
   Number of Observations          32
   Mean of Population85      27.50000
   Sum of Population85      880.00000

   Design Summary
   Number of Clusters               5

Output 88.2.2 displays the fit statistics and regression coefficient estimates. In the "Estimated Regression Coefficients" table, the estimated slope for the linear relationship is 1.05, which is significant a
Two-Way Cluster-Robust Standard Errors and SAS code, by Mark (Shuai) Ma

This page includes:

1) SAS code (Download) that you can use to calculate finite-sample estimates of standard errors robust to two-way clustering for OLS regressions. My code allows you to obtain two-way clustered standard errors based on the formulas in Petersen (2008), Cameron and Miller (2011) and Thompson (2011). If you use this code, please add a footnote: "To obtain unbiased estimates in finite samples, the clustered standard errors are adjusted by (N-1)/(N-P) × G/(G-1), where N is the sample size, P is the number of independent variables, and G is the number of clusters." For details, please see my note. To obtain the test results, you need to run the macro code first, and then run the command

   %REG2DSE(y=DV, x=INDV, firm=firmid, time=timeid, multi=0, dataset=A.data, output=A.results);

See the code for details.

2) A research note (Download) on finite-sample estimates of two-way cluster-robust standard errors. The note explains the estimates you can get from SAS and Stata. Petersen (2009) and Thompson (2011) provide formulas for asymptotic estimates of two-way cluster-robust standard errors. But, to obtain unbiased estimates, two-way clustered standard errors need to be adjusted in finite samples (Cameron and Miller 2011). Finite-sample estimates of two-way cluster-robust standard errors could result in very different significance levels than the unadjusted asymptotic estimates do. However, researchers rarely explain which estimate of two-way clustered standard errors they use, though they may all call their standard errors “two-way clustered standard errors”.
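To give a feel for the size of the adjustment (N-1)/(N-P) × G/(G-1) quoted in the footnote above, here is a minimal DATA step sketch. Every input value is hypothetical, and the step assumes the factor scales the variance, so the standard error is multiplied by its square root. That is the usual convention for one-way clustering, but check the note for how the macro applies it.

```sas
/* Sketch: size of the finite-sample adjustment (N-1)/(N-P) * G/(G-1).  */
/* All input values are hypothetical and chosen only for illustration.  */
data _null_;
   N = 5000;          /* sample size (hypothetical)                     */
   P = 3;             /* number of independent variables (hypothetical) */
   G = 20;            /* number of clusters (hypothetical)              */
   se_asym = 0.0286;  /* unadjusted clustered SE (hypothetical)         */

   /* Assumption: the factor is applied to the variance estimate. */
   c = ((N - 1) / (N - P)) * (G / (G - 1));
   se_adj = se_asym * sqrt(c);   /* so the SE scales by sqrt(c) */

   put 'Adjustment factor: ' c 8.5;
   put 'Adjusted SE:       ' se_adj 8.5;
run;
```

With few clusters the G/(G-1) term dominates, which is why adjusted and unadjusted estimates can lead to different significance levels.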
My note explains the finite-sample adjustment provided in SAS and Stata and discusses several common mistakes a user can easily make.

3) Answers to a few questions I have received about cluster-robust standard errors.

Q i) How to obtain R-square using your SAS code?
A: I have updated my code. The current July 2014 version automatically reports R-square in the output.

Q ii) How
standard errors. The macro allows a single observation for each firm-period (e.g. firm-year) as well as multiple observations (e.g. four quarters in a firm-year). See also the author's home page.

/* April, 2014 */
/* This SAS macro code is modified by Mark (Shuai) Ma based on the two-way clustered SE code from Professor John McInnis: */
/* According to Petersen (2008) and Thompson (2011), */
/* there are three steps to estimate two-way clustered SEs: */
/* 1. estimate firm-clustered VARIANCE-COVARIANCE matrix V firm, */
/* 2. estimate time-clustered VARIANCE-COVARIANCE matrix V time, */
/* 3. estimate heteroskedasticity-robust White VARIANCE-COVARIANCE matrix (V white) when there is only one observation in each firm-time intersection, */
/* or, estimate firm-time intersection clustered VARIANCE-COVARIANCE matrix (V firm-time) when there is more than one observation in each firm-time intersection. */
/* This code allows the user to closely follow the formula given by Petersen (2008) and Thompson (2011). */
/* If you use this code, please add a footnote: */
/* To obtain unbiased estimates in finite samples, the clustered standard error is adjusted by (N-1)/(N-P)× G/(G-1), */
/* where N is the sample size, P is the number of independent variables, and G is the number of clusters. */
/* For details, please see my note on two-way clustered standard errors available on SSRN and */
/* my website https://sites.google.com/site/markshuaima/home. */
/* Lastly, I post this code for communication purposes without */
/* any warranty or guaranty of accuracy or support. */
/* I tried my best to ensure the accuracy of the code, */
/* but I could not exclude the possibility that there might still be errors. If any error is found, please let me know immediately. */
/*********************************************************************/
/* input explanations */
/*you