Data

TERMS OF USE: Code is provided “as is” and without any warranty or guaranty of accuracy or support. If you use any of my code or data in your research, you agree to cite the respective paper. Please email me if you are interested in a specific piece of code that is not provided on the website.

All of my research is available open access on my SSRN page.

Causality Redux

Journal of Accounting & Economics, forthcoming (with C. Armstrong, J. Kepler, and D. Samuels)

Code to calculate dozens of common measures of discretionary accruals, real earnings management, management forecasts, voluntary 8-Ks, restatements, litigation, etc. [Variable List] [Code]

Undisclosed SEC Investigations

Management Science, forthcoming (with T. Blackburne, J. Kepler, and P. Quinn)

Excel sheet on all closed SEC investigations from 2000 to 2016, listing investigation number, entity being investigated, SEC office conducting the investigation, and open and close dates of the investigation. Obtained and compiled from multiple distinct FOIA requests. Email me for data and terms of use.

Economics of Managerial Taxes and Corporate Risk-Taking

The Accounting Review, forthcoming (with C. Armstrong, S. Glaeser, and S. Huang)

Excel sheet on managerial tax rates used in the paper, by state and year [data]. ManagerRate is the highest combined federal and state statutory marginal income tax rate on wages, assuming the individual is in top brackets at both the federal and state levels, married filing jointly with $150,000 in deductible property taxes, and allowing for deductibility of state income taxes in states where applicable. By using this data, you agree to cite the paper in your manuscript and acknowledge the source of the data. Raw data were provided courtesy of NBER TaxSim. Updated TaxSim data can be found here.

Linguistic Complexity in Firm Disclosures: Obfuscation or Information

Journal of Accounting Research, Mar 2018 (with B. Bushee and I. Gow)

We have an updated SAS and CSV file with text processing of all StreetEvents conference call transcripts through 2015 (the paper used data through 2011). The updated data file reflects changes and expansions in the StreetEvents data. For each of the three sections of the call, i.e. manager presentation, analyst questions, and manager response, we compute the following variables:

Fog: Fog_comp_pres, fog_anal_qa, fog_comp_qa
Number of Complex words: Num_complex_words_comp_pres, Num_complex_words_anal_qa, Num_complex_words_comp_qa
Number of Jargon words: Num_jargon_words_comp_pres, Num_jargon_words_anal_qa, Num_jargon_words_comp_qa
Number of words: Num_words_comp_pres, Num_words_anal_qa, Num_words_comp_qa
Number of sentences: Num_sentences_comp_pres, Num_sentences_anal_qa, Num_sentences_comp_qa
Proportion of forward looking sentences: Prop_fl_sents_comp_pres, Prop_fl_sents_anal_qa, Prop_fl_sents_comp_qa
Loughran-McDonald Litigious Word Count: litigious_comp_pres, litigious_anal_qa, litigious_comp_qa
Loughran-McDonald Modal Strong Word Count: modal_strong_comp_pres, modal_strong_anal_qa, modal_strong_comp_qa
Loughran-McDonald Modal Weak Word Count: modal_weak_comp_pres, modal_weak_anal_qa, modal_weak_comp_qa
Loughran-McDonald Negative Word Count: negative_comp_pres, negative_anal_qa, negative_comp_qa
Loughran-McDonald Positive Word Count: positive_comp_pres, positive_anal_qa, positive_comp_qa
Loughran-McDonald Uncertainty Word Count: uncertainty_comp_pres, uncertainty_anal_qa, uncertainty_comp_qa

Variable definitions appear in Appendix C and Table 4 of the paper. Fog calculations correct for the technical issue in the Lingua::En::Fathom Perl routine documented in Appendix B. Observations are indexed by the following firm-quarter identifiers: StreetEvents file name (file_name), CRSP security ID (Permno), and Compustat earnings announcement date (Rdq). Additional documentation on text parsing algorithms is available here. By using this data, you agree to cite the paper and acknowledge the source of the data. Please send me an email if you are interested in this data. [data]
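As a rough illustration of how a Fog score relates to the word, sentence, and complex-word counts listed above, here is a minimal Python sketch of the standard Gunning Fog formula. It assumes the textbook definition (0.4 times the sum of average sentence length and the percentage of complex words); the function name and arguments are hypothetical, and the paper's own calculation (see Appendices B and C) may differ in how words, sentences, and complex words are counted.

```python
def gunning_fog(num_words, num_sentences, num_complex_words):
    """Standard Gunning Fog index from three counts (illustrative only).

    Assumes the textbook formula; the counts would correspond loosely to
    variables such as Num_words_*, Num_sentences_*, and Num_complex_words_*.
    """
    if num_words <= 0 or num_sentences <= 0:
        raise ValueError("num_words and num_sentences must be positive")
    avg_sentence_length = num_words / num_sentences
    pct_complex = 100.0 * num_complex_words / num_words
    return 0.4 * (avg_sentence_length + pct_complex)


# Example: 100 words in 5 sentences with 10 complex words,
# i.e. 0.4 * (20 words per sentence + 10 percent complex words).
example_fog = gunning_fog(100, 5, 10)
```

A score computed this way from the count variables will not necessarily match the Fog variables in the data file, since the file reflects the correction described in Appendix B.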

The Relation Between Equity Incentives and Misreporting: The Role of Risk-Taking Incentives

Journal of Financial Economics, Aug 2013 (with C. Armstrong, D. Larcker, and G. Ormazabal)

SAS code to calculate discretionary accruals [code]
SAS code to classify “intentional restatements” based on Audit Analytics data [code]

Corporate Governance and the Information Content of Insider Trades

Journal of Accounting Research, Dec 2011 (with A. Jagolinzer and D. Larcker)

SAS code to estimate trade-specific insider trading profits [code]

Correcting for Cross-Sectional and Time-Series Dependence in Accounting Research

The Accounting Review, March 2010 (with I. Gow and G. Ormazabal)

Bootstrapped standard errors. Methods with asymptotic foundations generally tend to perform poorly in small samples. A straightforward way to correct for this is to use bootstrapping. One can compute one-way or two-way cluster robust standard errors using cluster bootstrapping techniques. An advantage of cluster bootstrapping techniques is that they can be applied to regression commands that do not otherwise have a cluster option available. STATA code to calculate two-way cluster robust bootstrapped standard errors: OLS (REG), median regression (QREG), and robust regression (RREG). The greater the number of bootstrap iterations specified, the longer this code will take to run.
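To make the idea concrete, here is a minimal Python sketch of one common way to build two-way cluster bootstrapped standard errors for OLS. It is not the Stata code offered above. It assumes a pairs cluster bootstrap (whole clusters are resampled with replacement) and combines three bootstrap covariance matrices as firm plus time minus the firm-time intersection. All function and variable names are hypothetical.

```python
import numpy as np


def ols(X, y):
    # Least-squares coefficients; lstsq also copes with rank-deficient resamples.
    return np.linalg.lstsq(X, y, rcond=None)[0]


def cluster_boot_var(X, y, cluster_ids, n_boot, rng):
    """Bootstrap covariance of OLS coefficients, resampling whole clusters."""
    cluster_ids = np.asarray(cluster_ids)
    ids = np.unique(cluster_ids)
    rows_by_cluster = [np.flatnonzero(cluster_ids == c) for c in ids]
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # Draw as many clusters as exist, with replacement, and stack their rows.
        picked = rng.integers(0, len(ids), size=len(ids))
        rows = np.concatenate([rows_by_cluster[i] for i in picked])
        draws[b] = ols(X[rows], y[rows])
    return np.atleast_2d(np.cov(draws, rowvar=False))


def twoway_boot_se(X, y, firm, time, n_boot=200, seed=0):
    """Two-way SEs from V_firm + V_time - V_(firm-time intersection)."""
    rng = np.random.default_rng(seed)
    # Intersection clusters: one cluster per distinct firm-time pair.
    pair = np.array([f"{f}_{t}" for f, t in zip(firm, time)])
    v = (cluster_boot_var(X, y, firm, n_boot, rng)
         + cluster_boot_var(X, y, time, n_boot, rng)
         - cluster_boot_var(X, y, pair, n_boot, rng))
    # The combined matrix need not be positive semi-definite in finite
    # samples, so a diagonal entry can be negative and yield NaN here.
    return np.sqrt(np.diag(v))
```

Run time grows roughly linearly in n_boot, because each iteration re-estimates the regression three times (once per clustering dimension). Swapping the ols helper for a median or robust regression estimator would extend the same resampling scheme to those models.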