Layout of ea_CBP07 file

(Note: there is a SAS version of the file and an ASCII version, with different layouts.)

Aug 1, 2012

 

Description of Supplementary Files for “An Alternative Theory of the Plant Size Distribution, with Geography and Intra- and International Trade”

by

Thomas J. Holmes (University of Minnesota, Federal Reserve Bank of Minneapolis, and NBER)

and

John J. Stevens (Board of Governors of Federal Reserve System)

 

Note: The statistics reported in this file are all derived from public Census data

 

Description of SAS File (ea_CBP07.sas7bdat)

ea×NAICS-level data (177 economic areas times 172 diffuse-demand NAICS industries).  For each industry and location, the file contains the data needed to estimate the model for 2007.  “CBP” stands for County Business Patterns; the source is the publicly available County Business Patterns (CBP) data.

 

One point to explain: the CBP data report establishment counts for each county×NAICS cell by employment size category.  The CBP size categories are more disaggregated than the LM categories used for ea_CM97, so we aggregate the CBP counts to the level of the LM categories (this combines 1000-1499 and 1500-2499 into 1000-2499, and 2500-4999 and 5000 plus into 2500 plus).

 

Thus we use the same size categories as described in ea_CM97.html.  We use the same values for AVGEMP and S_NORM.
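The aggregation of CBP size categories described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the string labels for the size classes are assumptions made for the example, not the codes used in the CBP files or in the authors' own programs. Only the four merges named in the text are encoded; every other class is passed through unchanged.

```python
# Sketch: collapse CBP employment-size classes into the coarser LM classes.
# The string labels below are illustrative assumptions, not actual CBP codes.
CBP_TO_LM = {
    "1000-1499": "1000-2499",
    "1500-2499": "1000-2499",
    "2500-4999": "2500+",
    "5000+": "2500+",
}


def aggregate_to_lm(cbp_counts):
    """Sum establishment counts by LM size class.

    cbp_counts maps a CBP size-class label to an establishment count.
    Classes not listed in CBP_TO_LM keep their own label.
    """
    lm_counts = {}
    for size_class, count in cbp_counts.items():
        lm_class = CBP_TO_LM.get(size_class, size_class)
        lm_counts[lm_class] = lm_counts.get(lm_class, 0) + count
    return lm_counts


# Made-up counts for one county x NAICS cell.
example = {"500-999": 4, "1000-1499": 3, "1500-2499": 2,
           "2500-4999": 1, "5000+": 1}
print(aggregate_to_lm(example))
# {'500-999': 4, '1000-2499': 5, '2500+': 2}
```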

 

Variable            Description
NAICS               6-digit NAICS code
ea                  Economic Area code used by BEA (1 to 179 basic)
ea_index            EA code used by us, numbered 1 to 177 (as we drop Alaska and Hawaii)
eatext              Text description of economic area
emphatUS_cbp        Sum of emphat_cbp over all 177 EAs in contiguous U.S.
emphat_cbp          Take AVGEMP of each plant in EA for this industry and sum
est12_cbp           Establishment counts in sizecat in {1,2} in EA
est13_cbp           Establishment counts in sizecat in {1,2,3} in EA
est1_cbp            Establishment counts in sizecat in {1} in EA
estUS_cbp           Sum of est_cbp over all 177 EAs in contiguous U.S.
est_cbp             Establishment counts across all plants in this NAICS in EA
naicsindex          Index from 1 to 473 of the 473 manufacturing NAICS industries (sorted by NAICS)
naicstext           Descriptive text of industry
pop2000             2000 population of EA
pop2007             2007 population of EA
pop2000_US          Sum of pop2000 over 177 contiguous U.S. EAs
pop2007_US          Sum of pop2007 over 177 contiguous U.S. EAs
sal97_scalefactor   = salUS_geo/snormUS_LM (see documentation for ea_CM97 for definitions)
salhat              = snorm_cbp*sal97_scalefactor (i.e., use the 2007 plant counts, but set them to the 1997 S_NORM levels and rescale to put the result in 1997 terms)
salhat1             = snorm1_cbp*sal97_scalefactor
salhat12            = snorm12_cbp*sal97_scalefactor
salhat13            = snorm13_cbp*sal97_scalefactor
snorm12US_cbp       Sum of snorm12_LM across all EAs
snorm12_cbp         Take the sum of S_NORM across all plants for size_category in {1,2}
snorm13US_cbp       Sum of snorm13_LM across all EAs
snorm13_cbp         Take the sum of S_NORM across all plants for size_category in {1,2,3}
snorm1US_cbp        Sum of snorm1_LM across all EAs
snorm1_cbp          Take the sum of S_NORM across all plants for size_category in {1}
snormUS_cbp         Sum of snorm_LM across all EAs
snorm_cbp           Take the sum of S_NORM across all plants for all size categories
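As a rough illustration of how the per-cell variables relate to one another, the following Python sketch builds emphat_cbp, the est*, snorm*, and salhat* values for a single EA×NAICS cell from establishment counts by size category. It assumes that AVGEMP and S_NORM are single values per size category (as the ea_CM97 documentation is understood to define them), so that "summing over plants" amounts to count times the category value. The function name and all numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not the values used in the paper.

```python
def cell_statistics(counts, avgemp, s_norm, sal97_scalefactor):
    """Build the per-cell variables from establishment counts.

    counts, avgemp, s_norm are dicts keyed by size category (1, 2, 3, ...).
    avgemp and s_norm must contain at least categories 1, 2 and 3.
    """
    # Variable-name suffix -> size categories included in that total.
    groups = {"": sorted(counts), "1": [1], "12": [1, 2], "13": [1, 2, 3]}

    # emphat_cbp: AVGEMP summed over every plant in the cell.
    out = {"emphat_cbp": sum(n * avgemp[c] for c, n in counts.items())}

    for suffix, cats in groups.items():
        est = sum(counts.get(c, 0) for c in cats)
        snorm = sum(counts.get(c, 0) * s_norm[c] for c in cats)
        out[f"est{suffix}_cbp"] = est
        out[f"snorm{suffix}_cbp"] = snorm
        # salhat = snorm * scale factor, putting the result in 1997 terms.
        out[f"salhat{suffix}"] = snorm * sal97_scalefactor
    return out


# Hypothetical inputs for one cell with four size categories.
counts = {1: 10, 2: 4, 3: 1, 4: 1}
avgemp = {1: 2.0, 2: 7.0, 3: 15.0, 4: 40.0}
s_norm = {1: 1.0, 2: 4.0, 3: 10.0, 4: 30.0}

stats = cell_statistics(counts, avgemp, s_norm, sal97_scalefactor=2.0)
print(stats["emphat_cbp"], stats["est12_cbp"], stats["salhat13"])
# 103.0 14 72.0
```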

 

 

Description of ASCII File (ea_CM97.asc)

This is just like the file above, except in ASCII format for reading into Gauss.  In particular, there is no header row; there are 13 columns of data, as follows:

 

Column   Variable from Above
1        ea_index
2        ea
3        naicsindex
4        NAICS
5        emphat_cbp
6        salhat
7        salhat1
8        salhat12
9        salhat13
10       est_cbp
11       est1_cbp
12       est12_cbp
13       est13_cbp
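For readers not using Gauss, a minimal Python sketch for reading the 13-column layout is given below. It assumes the fields are separated by whitespace, which the description above does not state explicitly; the function name and the sample row are made up for illustration. Every field is read as a float, so integer codes such as the NAICS code come back as floating-point numbers.

```python
# Column order as listed in the table above (no header row in the file).
COLUMNS = [
    "ea_index", "ea", "naicsindex", "naics", "emphat_cbp",
    "salhat", "salhat1", "salhat12", "salhat13",
    "est_cbp", "est1_cbp", "est12_cbp", "est13_cbp",
]


def parse_rows(lines):
    """Parse headerless, whitespace-delimited rows into dicts of floats."""
    rows = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(COLUMNS)} fields, "
                f"got {len(fields)}"
            )
        rows.append(dict(zip(COLUMNS, map(float, fields))))
    return rows


# A made-up row, not taken from the actual data file.
sample = ["1 1 1 311111 120.5 3000.0 100.0 400.0 900.0 12 5 8 10"]
rows = parse_rows(sample)
print(rows[0]["naics"], rows[0]["est13_cbp"])
# 311111.0 10.0
```

To read the file itself, pass an open file object, for example `parse_rows(open(path))`, where `path` points to the ASCII file.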