The Alpha Architect white paper calls for the trading strategy to run on the universe of NYSE stocks, excluding financials, REITs, and ADRs. Thus our first step is to create universes that define these different groups of securities.
First, download a CSV of all NYSE securities from the securities master. We use fields="sharadar*" to include all Sharadar master fields in the output, and vendors="sharadar" to limit the output to securities that are available from Sharadar.
from quantrocket.master import download_master_file
download_master_file("sharadar_nyse_securities.csv", exchanges="NYSE", fields="sharadar*", vendors="sharadar")
We can use the file to create the universe of all NYSE securities:
from quantrocket.master import create_universe
create_universe("nyse-stk", "sharadar_nyse_securities.csv")
{'code': 'nyse-stk', 'provided': 6777, 'inserted': 6777, 'total_after_insert': 6777}
Next we create a universe of financials. We'll exclude this universe (along with REITs and ADRs) when it comes time to run our backtest.
First, load the securities into pandas and list the sectors:
import pandas as pd
nyse_securities = pd.read_csv("sharadar_nyse_securities.csv")
nyse_securities.sharadar_Sector.unique()
array(['Financial Services', 'Real Estate', 'Utilities', nan, 'Industrials', 'Healthcare', 'Basic Materials', 'Consumer Cyclical', 'Energy', 'Communication Services', 'Consumer Defensive', 'Technology'], dtype=object)
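Before filtering, it can help to see how many securities fall into each sector, including those with no sector at all. The following is a minimal sketch on a hypothetical toy DataFrame (the tickers and rows are invented for illustration, not taken from the master file); the same `value_counts` call could be applied to `nyse_securities.sharadar_Sector`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows standing in for the downloaded master file
toy = pd.DataFrame({
    "sharadar_Ticker": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "sharadar_Sector": [
        "Financial Services", "Energy", "Financial Services", np.nan, "Real Estate",
    ],
})

# dropna=False keeps a bucket for securities with a missing sector,
# which unique() reports as nan
sector_counts = toy["sharadar_Sector"].value_counts(dropna=False)
print(sector_counts)
```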
In the Sharadar data, the financial sector is called "Financial Services". We filter the DataFrame to stocks in this sector, write them to a CSV file, and upload the file to create the universe of financial stocks:
nyse_securities[nyse_securities.sharadar_Sector == "Financial Services"].to_csv("sharadar_nyse_financials.csv")
create_universe("nyse-financials", "sharadar_nyse_financials.csv")
{'code': 'nyse-financials', 'provided': 872, 'inserted': 872, 'total_after_insert': 872}
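If you would rather not leave intermediate CSV files on disk, the filtered rows can be written to an in-memory buffer instead. The sketch below shows only the pandas side, using a hypothetical toy DataFrame with an assumed "Sid" column; whether create_universe accepts a file-like object in place of a path is an assumption here and should be confirmed against the QuantRocket documentation.

```python
import io

import pandas as pd

# Hypothetical toy rows; "Sid" is an assumed identifier column
toy = pd.DataFrame({
    "Sid": ["FI1", "FI2", "FI3"],
    "sharadar_Sector": ["Financial Services", "Energy", "Financial Services"],
})
financials = toy[toy["sharadar_Sector"] == "Financial Services"]

# Write the CSV to an in-memory text buffer rather than to a file
buffer = io.StringIO()
financials.to_csv(buffer, index=False)
buffer.seek(0)  # rewind so the next reader starts at the beginning

# Read the buffer back to confirm it holds the filtered rows
round_trip = pd.read_csv(buffer)
print(round_trip)
```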
Next we create a universe of REITs. From inspecting the master file we know that REITs are identified in the "sharadar_Industry" column:
nyse_securities[nyse_securities.sharadar_Industry.fillna("").str.contains("REIT")].to_csv("sharadar_nyse_reits.csv")
create_universe("nyse-reits", "sharadar_nyse_reits.csv")
{'code': 'nyse-reits', 'provided': 637, 'inserted': 637, 'total_after_insert': 637}
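The `fillna("")` call in the filter above matters because some securities have no industry value, and a missing value should count as "not a REIT" rather than leak into the boolean mask. A minimal sketch on hypothetical toy data (the industry strings are illustrative) shows the pattern, along with the `na=False` argument of `str.contains` as an equivalent alternative:

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows; the second security has no industry
toy = pd.DataFrame({
    "sharadar_Ticker": ["AAA", "BBB", "CCC"],
    "sharadar_Industry": ["REIT - Retail", np.nan, "Oil & Gas E&P"],
})

# Approach used in the text: replace missing values with "" before matching
filled_mask = toy["sharadar_Industry"].fillna("").str.contains("REIT")

# Equivalent alternative: tell str.contains to treat missing values as False
na_false_mask = toy["sharadar_Industry"].str.contains("REIT", na=False)

reits = toy[filled_mask]
print(reits)
```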
To create a universe of ADRs, we can take advantage of the "sharadar_Category" field in the Sharadar data, which contains this information. First have a peek:
nyse_securities.sharadar_Category.unique()
array(['Domestic Preferred', 'ETD', 'ADR Preferred', 'Domestic', nan, 'ETN', 'CEF', 'ETF', 'Domestic Primary', 'ADR', 'Canadian', 'Domestic Secondary', 'ADR Primary', 'ADR Secondary', 'Canadian Primary', 'Domestic Warrant', 'Canadian Preferred', 'ADR Warrant', 'Canadian Warrant'], dtype=object)
nyse_securities[nyse_securities.sharadar_Category.fillna("").str.startswith("ADR")][["sharadar_Ticker","sharadar_Name","sharadar_Category"]].head()
| | sharadar_Ticker | sharadar_Name | sharadar_Category |
|---|---|---|---|
| 6 | BCS-PD | Barclays Plc | ADR Preferred |
| 12 | HSEA | Hsbc Holdings Plc | ADR Preferred |
| 14 | BCS-PA | Barclays Plc | ADR Preferred |
| 25 | NBG-PA | National Bank Of Greece Sa | ADR Preferred |
| 26 | AHL-PA | Aspen Insurance Holdings Ltd | ADR Preferred |
Then create the ADR universe:
nyse_securities[nyse_securities.sharadar_Category.fillna("").str.startswith("ADR")].to_csv("sharadar_nyse_adrs.csv")
create_universe("nyse-adrs", "sharadar_nyse_adrs.csv")
{'code': 'nyse-adrs', 'provided': 656, 'inserted': 656, 'total_after_insert': 656}
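As a local sanity check, the three filters can be combined in pandas to preview which securities would remain once financials, REITs, and ADRs are removed. This is only a sketch on a hypothetical toy DataFrame (the Sids and classifications are invented), not the mechanism the backtest itself uses to exclude universes.

```python
import pandas as pd

# Hypothetical toy rows; the last security has no classification data
toy = pd.DataFrame({
    "Sid": ["S1", "S2", "S3", "S4", "S5"],
    "sharadar_Sector": ["Financial Services", "Real Estate", "Energy", "Technology", None],
    "sharadar_Industry": ["Banks", "REIT - Office", "Oil & Gas", "Software", None],
    "sharadar_Category": ["Domestic", "Domestic", "ADR", "Domestic", None],
})

# The same three filters used to build the universes above
is_financial = toy["sharadar_Sector"] == "Financial Services"
is_reit = toy["sharadar_Industry"].fillna("").str.contains("REIT")
is_adr = toy["sharadar_Category"].fillna("").str.startswith("ADR")

# A security is excluded if it matches any one of the filters
excluded = is_financial | is_reit | is_adr
eligible = toy[~excluded]
print(eligible["Sid"].tolist())
```

Note that in this sketch a security with missing sector, industry, and category values matches none of the filters and so stays in the eligible set.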