Relocating shrcd and exchcd filters from SignalMasterTable.do to SignalDoc.csv #133

Open
chenandrewy opened this issue Aug 28, 2023 · 2 comments
Labels: enhancement, help wanted

Comments

@chenandrewy
Collaborator

chenandrewy commented Aug 28, 2023

Many thanks to @junkyungauh for pointing this out.

SignalMasterTable.do has:

image

These filters should not be applied until the portfolio code runs, where they can be set through the "Filter" column of SignalDoc.csv. As described in the paper, we try to put off filtering until the portfolio generation step so that users of the data have the most flexibility.
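For illustration, here is a minimal sketch of deferring the screen to the portfolio step, in Python rather than the repo's Stata. The screen definition (share codes 10 or 11, exchange codes 1 to 3), the function names, and the toy rows are all assumptions, not the project's actual code or the contents of the screenshot above.

```python
# Hypothetical sketch: the signal table stays unfiltered, and the
# share-code / exchange-code screen is applied only at portfolio formation.

STANDARD_SHRCD = {10, 11}    # assumption: common-stock share codes
STANDARD_EXCHCD = {1, 2, 3}  # assumption: NYSE, AMEX, NASDAQ

def is_standard(row):
    """True if a row passes the assumed standard screen."""
    return row["shrcd"] in STANDARD_SHRCD and row["exchcd"] in STANDARD_EXCHCD

def portfolio_universe(signal_rows, apply_filter=True):
    """Rows eligible for portfolio formation; the screen is optional."""
    if not apply_filter:
        return list(signal_rows)
    return [r for r in signal_rows if is_standard(r)]

# Toy data, invented for this example.
signal_rows = [
    {"permno": 1, "shrcd": 11, "exchcd": 1, "signal": 0.12},
    {"permno": 2, "shrcd": 73, "exchcd": 4, "signal": 0.05},  # toy nonstandard codes
    {"permno": 3, "shrcd": 10, "exchcd": 3, "signal": -0.03},
]

filtered = portfolio_universe(signal_rows)
unfiltered = portfolio_universe(signal_rows, apply_filter=False)
```

With this layout, users who want the nonstandard securities can pass `apply_filter=False` and still get a signal value for every permno.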

Should this standard filter be applied everywhere in the portfolio generation step? I'm not sure of the answer to this. We should at least review a few of the original papers before we decide.

Currently, we're applying these filters inconsistently because we sometimes use SignalMasterTable.dta as the "backbone" of the signal (e.g. Mom6m.do), and other times use dailyCRSP.dta (e.g. MaxRet.do) or some other basic dataset. As a result, Mom6m will have more missing values than MaxRet.
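A toy sketch of why the choice of backbone matters, again in Python and with made-up data. It assumes a signal is simply missing for any permno that is absent from its backbone, and it reuses an assumed screen of share codes 10 or 11 and exchange codes 1 to 3.

```python
# Hypothetical sketch: one universe, two backbones.
# permno -> (shrcd, exchcd); values are invented for this example.
universe = {1: (11, 1), 2: (73, 4), 3: (10, 3), 4: (11, 31)}

def passes_screen(shrcd, exchcd):
    """Assumed standard screen, not necessarily the one in the repo."""
    return shrcd in {10, 11} and exchcd in {1, 2, 3}

# A backbone that was screened up front vs. one built from raw data.
filtered_backbone = {p for p, (s, e) in universe.items() if passes_screen(s, e)}
unfiltered_backbone = set(universe)

def count_missing(backbone):
    """Number of permnos in the universe with no signal value."""
    return sum(1 for p in universe if p not in backbone)

missing_filtered = count_missing(filtered_backbone)
missing_unfiltered = count_missing(unfiltered_backbone)
```

Here the signal on the pre-screened backbone is missing for two of the four permnos, while the one built on raw data is missing for none, even though neither signal's own logic involves the codes.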

I don't think this change will have a huge effect. Most of the stocks with weird exchcd-shrcd combinations are missing data for everything but historical market prices. For example, about 80% of these weird stocks are missing ceq:

image
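The kind of check behind a number like this can be sketched as follows. This is Python with invented rows, so the resulting share is not the figure quoted above; the screen definition and the use of `None` for a missing `ceq` are assumptions.

```python
# Hypothetical sketch: share of nonstandard-code rows with missing ceq.

def passes_screen(row):
    """Assumed standard screen (share codes 10/11, exchange codes 1-3)."""
    return row["shrcd"] in {10, 11} and row["exchcd"] in {1, 2, 3}

def share_missing(rows, field):
    """Fraction of rows whose `field` is None; NaN if there are no rows."""
    if not rows:
        return float("nan")
    return sum(r[field] is None for r in rows) / len(rows)

# Toy data, invented for this example.
rows = [
    {"permno": 1, "shrcd": 11, "exchcd": 1, "ceq": 250.0},
    {"permno": 2, "shrcd": 73, "exchcd": 4, "ceq": None},
    {"permno": 3, "shrcd": 73, "exchcd": 4, "ceq": None},
    {"permno": 4, "shrcd": 31, "exchcd": 3, "ceq": 40.0},
    {"permno": 5, "shrcd": 11, "exchcd": 31, "ceq": None},
]

odd_rows = [r for r in rows if not passes_screen(r)]
odd_missing = share_missing(odd_rows, "ceq")
```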

@chenandrewy
Collaborator Author

Some more info on the odd exchcds and shrcds (which might make it painfully obvious that I don't study ETFs). Here is the share of CRSP permnos with odd codes over time:

image

Odd codes were rare until the 90s, but now account for 45% of permnos! The bulk of the odd codes are ETFs. Others are ADRs and SBIs. Thankfully, the "when-issued trading" codes for the standard exchanges (31-33) are rare.
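A rough sketch of how a share-over-time series like this could be tabulated, in Python with invented permno-year rows. The definition of "odd" here (anything outside share codes 10 or 11 and exchange codes 1 to 3) is an assumption, and the output is not the data plotted above.

```python
from collections import defaultdict

def odd_share_by_year(rows):
    """Map year -> fraction of permno-year rows failing the assumed screen.

    Each row is a (year, shrcd, exchcd) tuple for one permno.
    """
    totals = defaultdict(int)
    odd = defaultdict(int)
    for year, shrcd, exchcd in rows:
        totals[year] += 1
        if not (shrcd in {10, 11} and exchcd in {1, 2, 3}):
            odd[year] += 1
    return {year: odd[year] / totals[year] for year in sorted(totals)}

# Toy data, invented for this example.
rows = [
    (1980, 11, 1), (1980, 10, 2), (1980, 11, 3), (1980, 10, 1),
    (2020, 11, 1), (2020, 73, 4), (2020, 10, 3), (2020, 73, 4),
]

shares = odd_share_by_year(rows)
```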

Given how these odd codes do not correspond to "stocks" in the way most asset pricing people think of "stocks," we should probably apply the standard filter by default in the portfolios code. Not sure about keeping when-issued-trading, but thankfully that's rare.

@tomz23 tomz23 added enhancement New feature or request help wanted Extra attention is needed labels Aug 5, 2024
@chenandrewy
Collaborator Author

Here's a review of how Jegadeesh and Titman 1993; Ang, Hodrick, Xing, Zhang 2006; and Hou and Moskowitz 2005 handle it. These are all papers that use only price data and span distinct types of predictors, as well as distinct teams.

Jegadeesh and Titman seem to only mention these codes in passing. "Our analysis of NYSE and AMEX stocks documents significant profits in the 1965 to 1989 sample period." This is consistent with imposing the standard code screens.

AHXZ, for the VIX beta portfolios, say they "run the regression for all stocks on AMEX, NASDAQ, and the NYSE." For the idiovol portfolios they don't say this, but it seems implicit based on their other discussions of excluding AMEX and NASDAQ.

For Hou and Moskowitz: "From 1963 to 1973, the CRSP sample includes NYSE and AMEX firms only, and post-1973 NASDAQ firms are added to the sample."

Bali, Engle, and Murray's textbook also focuses on standard codes: "The sample used in Part II of this book as well as in a large number of empirical asset pricing studies is a monthly sample that contains all U.S.-based common stocks in the CRSP database. Therefore, for each month t, the sample is constructed by taking all U.S.-based common stocks in the CRSP database as of the end of the given month.... ....U.S.-based common stocks are identified as the subset of these securities that have a share code (SHRCD field in the msenames file) value of either 10 or 11. We refer to this sample as the CRSP U.S.-based common stock sample, or simply the CRSP sample."

Long story short, I think imposing standard codes everywhere will have very little effect on replications.
