Layout of ea_CM97 file
(Note: there is a SAS version of the file and an ASCII version, with different layouts.)
Aug 1, 2012
Description of Supplementary Files for “An Alternative Theory of the Plant Size Distribution, with Geography and Intra- and International Trade”
by
Thomas J. Holmes (University of Minnesota, Federal Reserve Bank of Minneapolis, and NBER)
and
John J. Stevens (Board of Governors of the Federal Reserve System)
Note: The statistics reported in this file are all derived from public Census data.
Description of SAS File (ea_CM97.sas7bdat)
Data at the ea×NAICS level (177 economic areas times 172 diffuse demand NAICS industries). For each industry and location, the file contains the data needed to estimate the model for 1997. The “CM” stands for Census of Manufactures. The source of these data is the publicly available Location of Manufactures (LM) data. (This is file E9731e2 from the 1997 Economic Census CD; see U.S. Bureau of Census (2001).)
The LM data report plant counts for each county×NAICS cell in the following employment size categories. The table also reports the variables avgemp and s_norm for each size class. The construction of avgemp is described in the table. The construction of s_norm is described in the appendix to the paper.
Employment Size Category Number | Employment Range | AVGEMP (Average Employment in Size Class, Across all Manufacturing Plants in 1997 CM) | S_NORM (Norm sales in size category, with s_norm=1 for smallest category)
1 | 1-4 | 2.1 | 1.0
2 | 5-9 | 6.7 | 2.3
3 | 10-19 | 13.7 | 4.8
4 | 20-49 | 31.3 | 12.1
5 | 50-99 | 70.1 | 30.3
6 | 100-249 | 154.1 | 76.8
7 | 250-499 | 345.9 | 194.1
8 | 500-999 | 679 | 391.6
9 | 1000-2499 | 1448.1 | 928.9
10 | 2500 and above | 4795.3 | 3050.3
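The weighting by size class can be illustrated with a minimal Python sketch. It shows how plant counts by size category might be combined with AVGEMP and S_NORM to produce quantities of the kind described for emphat_LM and the snorm variables in the variable list that follows. The plant counts are hypothetical, and the function name weighted_sum is our own illustration; it is not taken from the original SAS code.

```python
# AVGEMP and S_NORM by employment size category, copied from the table above.
AVGEMP = {1: 2.1, 2: 6.7, 3: 13.7, 4: 31.3, 5: 70.1,
          6: 154.1, 7: 345.9, 8: 679.0, 9: 1448.1, 10: 4795.3}
S_NORM = {1: 1.0, 2: 2.3, 3: 4.8, 4: 12.1, 5: 30.3,
          6: 76.8, 7: 194.1, 8: 391.6, 9: 928.9, 10: 3050.3}


def weighted_sum(plant_counts, weights, categories=range(1, 11)):
    """Sum count * weight over the given size categories.

    plant_counts maps size category -> number of plants; categories
    that are missing from the mapping are treated as zero plants.
    """
    return sum(plant_counts.get(c, 0) * weights[c] for c in categories)


# Hypothetical plant counts for one EA and one industry (not real data).
counts = {1: 10, 2: 4, 3: 2}

emphat = weighted_sum(counts, AVGEMP)                    # all size classes
snorm = weighted_sum(counts, S_NORM)                     # all size classes
snorm1 = weighted_sum(counts, S_NORM, categories=[1])    # category 1 only
snorm12 = weighted_sum(counts, S_NORM, categories=[1, 2])
est12 = sum(counts.get(c, 0) for c in (1, 2))            # plain plant count

print(emphat, snorm, snorm1, snorm12, est12)
```

The LM counts are published by county, so in practice the counts would first be aggregated from counties to economic areas before a step like this is applied.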
Variable | Description
NAICS | 6 digit NAICS
ea | Economic Area code used by BEA (1 to 179 basic)
ea_index | ea code used by us, numbered 1 to 177 (as we drop Alaska and Hawaii)
eatext | text description of economic area
emphatUS_LM | Sum of emphat_LM across all 177 contiguous EAs
emphat_LM | Estimated employment in the EA for the 1997 CM, using the LM plant counts and weighting them by the variable AVGEMP defined above
est12_LM | Plant counts in size categories 1 and 2 combined
est13_LM | Plant counts in size categories 1, 2, and 3 combined
est1_LM | Plant counts in size category 1
estUS_LM | Plant counts summed over all the 177 contiguous EAs in the United States in the 1997 LM
est_LM | Count of plants in the EA, in the 1997 Location of Manufacturing plants
naicsindex | index from 1 to 473 of the 473 manufacturing NAICS industries (sorted by NAICS)
naicstext | descriptive text of industry
pop97 | 1997 population of EA
pop97_US | sum of population across 177 EAs (this is U.S. population less Alaska and Hawaii)
salUS_geo | total US sales revenue of industry in 1997 CM published tabulations
salhat | =(snorm_LM/snormUS_LM)*salUS_geo;
salhat1 | =(snorm1_LM/snormUS_LM)*salUS_geo;
salhat12 | =(snorm12_LM/snormUS_LM)*salUS_geo;
salhat13 | =(snorm13_LM/snormUS_LM)*salUS_geo;
snorm12US_LM | Sum of snorm12_LM across all EAs
snorm12_LM | Sum of S_NORM across all plants in the EA in size categories {1,2}
snorm13US_LM | Sum of snorm13_LM across all EAs
snorm13_LM | Sum of S_NORM across all plants in the EA in size categories {1,2,3}
snorm1_LM | Sum of S_NORM across all plants in the EA in size category {1}
snorm1US_LM | Sum of snorm1_LM across all EAs
snormUS_LM | Sum of snorm_LM across all EAs
snorm_LM | Sum of S_NORM across all plants in the EA across all size classes
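The salhat formulas in the variable list allocate published U.S. industry sales to each EA in proportion to its normalized sales. The following Python sketch illustrates the arithmetic only. The EA labels and all numbers are hypothetical, and the function name allocate_sales is our own, not part of the original SAS code.

```python
def allocate_sales(snorm_part, snorm_us, sal_us):
    """Return (snorm_part / snorm_us) * sal_us.

    This is the common form of the salhat, salhat1, salhat12, and
    salhat13 formulas; only the numerator changes between them.
    """
    if snorm_us <= 0:
        raise ValueError("snormUS_LM must be positive")
    return (snorm_part / snorm_us) * sal_us


# Hypothetical industry observed in two EAs (illustrative numbers only).
snorm_LM = {"EA_A": 28.8, "EA_B": 71.2}    # all size classes
snorm1_LM = {"EA_A": 10.0, "EA_B": 25.0}   # size category 1 only
snormUS_LM = sum(snorm_LM.values())        # sum of snorm_LM across EAs
salUS_geo = 5000.0                         # hypothetical U.S. industry sales

salhat = {ea: allocate_sales(s, snormUS_LM, salUS_geo)
          for ea, s in snorm_LM.items()}

# As in the formulas above, salhat1 is also divided by snormUS_LM
# (the all-size-class total), not by snorm1US_LM.
salhat1 = {ea: allocate_sales(s, snormUS_LM, salUS_geo)
           for ea, s in snorm1_LM.items()}

print(salhat, salhat1)
```

Because every formula uses the same denominator, salhat summed across EAs reproduces salUS_geo, while salhat1, salhat12, and salhat13 give the part of each EA's estimated sales attributed to the smaller size categories.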
Description of ASCII File (ea_CM97.asc)
This is like the SAS file above, except that it is in ASCII format for reading into Gauss. In particular, there is no header row, and there are 13 columns of data, where the columns are the following:
Column | Variable from Above
1 | ea_index
2 | ea
3 | naicsindex
4 | naics
5 | emphat_LM
6 | salhat
7 | salhat1
8 | salhat12
9 | salhat13
10 | est_LM
11 | est1_LM
12 | est12_LM
13 | est13_LM
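For readers not using Gauss, the column layout above can be applied with a short Python sketch like the one below. It assumes the fields are separated by whitespace and that every field is numeric; neither is stated in this description, so check the file before relying on it. The function name parse_ea_cm97 and the sample rows are hypothetical.

```python
import io

# Column order of ea_CM97.asc, taken from the table above.
COLUMNS = ["ea_index", "ea", "naicsindex", "naics", "emphat_LM",
           "salhat", "salhat1", "salhat12", "salhat13",
           "est_LM", "est1_LM", "est12_LM", "est13_LM"]


def parse_ea_cm97(lines):
    """Parse header-less rows into a list of dicts keyed by COLUMNS.

    Assumption: fields are whitespace-delimited and all numeric.
    Blank lines are skipped; rows with the wrong field count raise.
    """
    rows = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(COLUMNS)} fields, "
                f"got {len(fields)}")
        rows.append(dict(zip(COLUMNS, (float(f) for f in fields))))
    return rows


# Two made-up rows standing in for the real file (not actual data).
sample = io.StringIO(
    "1 1 1 311111 75.2 1440.0 500.0 960.0 1440.0 16 10 14 16\n"
    "2 2 1 311111 0 0 0 0 0 0 0 0 0\n"
)
rows = parse_ea_cm97(sample)
print(len(rows), rows[0]["naics"], rows[0]["salhat12"])
```

To read the actual file, an open file handle can be passed in place of the sample, for example `with open("ea_CM97.asc") as f: rows = parse_ea_cm97(f)`.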