August, 2012
Description of Supplementary
Files for “An Alternative Theory of the Plant Size Distribution, with Geography
and Intra- and International Trade”
by
Thomas J. Holmes (University of
Minnesota, Federal Reserve Bank of Minneapolis, and NBER, holmes@umn.edu)
and
John J. Stevens (Board of
Governors of Federal Reserve System)
Note: The statistics
reported in this paper that were derived from Census Bureau micro data were
screened to ensure that they do not disclose confidential information. The views expressed herein are those of the
authors and not necessarily those of the Federal Reserve Bank of Minneapolis,
the Federal Reserve Board, the Federal Reserve System, or the U.S. Bureau of
the Census.
FILE GROUP 1: Raw Data Files and Results of Stage 1
Estimation
File |
Description |
Link to Files |
Link to
Contents/documentation |
stage1_estimates_NAICS1997 stage1_estimates_SIC1992 |
Estimates of parameters of
distance adjustment from first stage estimation |
SAS Files CSV Files ascii file for input into Gauss program |
|
mandat_naics2 |
NAICS-level data set for
the 473 NAICS manufacturing industries for 1997. Contains various industry statistics, including
imports and plant counts. |
SAS File CSV File |
|
diffuse_naics |
NAICS level data set for
the 473 NAICS manufacturing industries for 1997. The key variable is "diffuse"
which equals 1 if classified as a diffuse demand industry as explained in the
text, and otherwise is 0. |
SAS File CSV File |
|
ea_pop_long_lat97 |
Economic Area (EA) Level information
for 177 EAs including population and geographic
coordinates |
SAS File ascii file for input into Gauss |
|
ea_pop97 |
EA population for 2007 |
SAS File ascii file for input into Gauss |
|
ea_CM97 |
EA×NAICS level data (177 EAs times 172 diffuse demand NAICS
industries). For each industry and
location the file contains the data needed to estimate the model for 1997.
The “CM” stands for Census of Manufactures.
The source of this data is the publicly available Location of
Manufactures data |
SAS File: ea_cm97.sas7bdat (31.2 MB file)
ascii file for input into Gauss: ea_CM97.asc (5.4 MB file) |
|
ea_CBP07 |
EA×NAICS level data (177 EAs times 172 diffuse demand NAICS
industries). For each industry and
location the file contains the data needed to estimate the model for 2007.
The “CBP” stands for County Business Patterns. The source of this data is the publicly
available CBP data |
SAS File: ea_cbp07.sas7bdat (25.8 MB file)
ascii file for input into Gauss: ea_cbp07.asc (5.9 MB file) |
|
sal97_scalefactor |
473×2 ascii
file, NAICS in first column, sal97_scalefactor in second, where
sal97_scalefactor=salUS_geo/snormUS_LM (see documentation for
ea_CM97 for definitions) |
ascii file |
|
ea_dist_within_tract |
For each EA, contains an
estimate of internal distance within the EA, based on tract-level
population data |
ascii file |
|
tradedat_forgauss |
Estimates of new China
share for each industry |
|
|
china_ea_con_share |
Estimate of the share of China
manufacturing imports by EA for 2007 |
|
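As an illustration of the sal97_scalefactor layout described above (NAICS in the first column, the scale factor in the second), the following Python sketch reads such a two-column ascii table into a dictionary. It assumes whitespace-delimited columns, which the description does not state. The function names and the sample rows are hypothetical placeholders, not values or code from the actual files.

```python
import io

import numpy as np


def scale_factor(salUS_geo, snormUS_LM):
    """Ratio given in the file description: salUS_geo / snormUS_LM."""
    return salUS_geo / snormUS_LM


def load_scalefactor(handle):
    """Read a two-column ascii table: NAICS code, then sal97_scalefactor.

    Assumption: columns are separated by whitespace.
    """
    table = np.loadtxt(handle, ndmin=2)
    return {int(naics): float(value) for naics, value in table}


# Placeholder rows for illustration only; not taken from the real file.
sample = io.StringIO("311111 1.25\n311119 0.80\n")
scale = load_scalefactor(sample)
```

With the real file, an open file handle or a path would replace the in-memory sample, and the resulting dictionary should hold 473 entries.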
FILE GROUP 2: Programs to Run Stage 2 and to
Calculate Tables
File |
Description |
Link to Files |
Link to
Contents/documentation |
base97.prg |
Gauss program that runs stage
2 for 1997. It runs 10 iterations and
saves the results for each iteration.
For each iteration, it produces the count coefficients λP and λS
and an estimate of Γ=(γ1, γ2, ..., γ177). It simulates the impact of a China Surge for
2007, using the 1997 estimates.
Note a change in notation: the count
coefficients referred to in the text as λP
and λS are referred to as nuT and nuN in the programs and
output. |
program ascii output files |
|
base07.prg |
Same as above, but
estimates the model for 2007, using the 2007 data |
ascii output files |
|
Table_6_7_process_model.sas |
SAS program to process the
results of the 1997 estimates. Creates
the tables used in the paper. Note again the
change in notation:
the count coefficients referred to in the text as λP
and λS are referred to as nuT and nuN in the programs and
output. |
Results |
|
Table_8_New_China_Share_ Descriptive_statistics |
Produces Table 8 |
SAS input file: program |
|
Table10_data_statistics |
Constructs Table 10, some
statistics from the data |
|
|
setup_mean_reversion_iter1 setup_mean_reversion_iter10 |
This program takes the gam estimates for 1997 and 2007 and builds an 11-point grid
(gam=0 at the bottom, then 10 categories for ln(gam)). It then
estimates a transition matrix for gam, for the case
where New China Share=0 (88 industries). Note that we have two sets of
gam estimates. The first imposes the
constraint that all sales are primary (this is the iter=1
case). The second takes out specialty segment sales
(this is the iter=10 case; we assume convergence by
this point). For the
primary-only segment model, we use iter=1. For the full model, we use iter=10.
|
Inputs: the files mentioned above
Programs: setup_mean_reversion_iter1.sas, setup_mean_reversion_iter10.sas
Outputs:
● Inputs for the Gauss program (original gam, rescaled, and fitted values, as well as values with China share with no regression to the mean)
● The transition matrix of the 11 states, averaged over industries with New China Share=0
● The grid points for gam>0 (10 values)
● Basic information about the industry
● Values for iter=10 |
|
Tab11_prediction_primary_only.prg Tab11_prediction_specialty.prg |
Set of programs for main
prediction exercise |
Programs: Tab11_prediction_primary_only.prg, Tab11_prediction_specialty.prg
Summary output: Tab11_prediction_primary_only.out, Tab11_prediction_specialty.out
Results of individual simulations
Final wrap-up SAS program: Tab11_prediction_last_step.sas, Tab11_prediction_last_step.html |
|
Table12_1997_2007_sim.sas |
SAS program to compare
1997 and 2007 estimates of the primary count share. |
|
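The grid and transition-matrix step described for setup_mean_reversion_iter1 and setup_mean_reversion_iter10 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' SAS code: the description does not say how the 10 ln(gam) categories are defined, so the sketch assumes equal-width bins over the pooled range of ln(gam), and all function and variable names are hypothetical. The caller is expected to pass only the industries with New China Share=0.

```python
import numpy as np


def assign_states(gam, edges):
    """Map gam to states 0..10: state 0 if gam == 0, else a ln(gam) bin in 1..10."""
    gam = np.asarray(gam, dtype=float)
    states = np.zeros(gam.shape, dtype=int)
    pos = gam > 0
    # Digitizing against the interior edges gives 0..9; shift to 1..10.
    states[pos] = np.digitize(np.log(gam[pos]), edges[1:-1]) + 1
    return states


def transition_matrix(gam_start, gam_end, n_bins=10):
    """Estimate transition shares between states from start-year to end-year gam."""
    gam_start = np.asarray(gam_start, dtype=float)
    gam_end = np.asarray(gam_end, dtype=float)
    pooled = np.concatenate([gam_start, gam_end])
    logs = np.log(pooled[pooled > 0])
    # Assumption: equal-width bins over the pooled range of ln(gam).
    edges = np.linspace(logs.min(), logs.max(), n_bins + 1)
    s0 = assign_states(gam_start, edges)
    s1 = assign_states(gam_end, edges)
    counts = np.zeros((n_bins + 1, n_bins + 1))
    np.add.at(counts, (s0, s1), 1)  # count start-state -> end-state moves
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observations stay all zero instead of dividing by zero.
    probs = np.divide(counts, row_sums,
                      out=np.zeros_like(counts), where=row_sums > 0)
    return probs, edges
```

Calling transition_matrix(gam97, gam07) on two equal-length arrays returns an 11×11 matrix of row shares together with the bin edges. Quantile bins would be an equally plausible reading of "10 categories" and would only change how edges is built.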