Layout of stage1_estimates files (SAS and CSV formats)

 

stage1_estimates_naics1997  (for 466 6-digit NAICS industries for 1997)

stage1_estimates_sic1992  (for 457 SIC industries for 1992)

stage1_estimatesNAICS97_for_gauss.asc (this is an ascii version of the above for input into Gauss)

 

August 1, 2012

 

Files used in paper "An Alternative Theory of the Plant Size Distribution, with Geography and Intra- and International Trade"

 

by

Thomas J. Holmes (University of Minnesota, Federal Reserve Bank of Minneapolis, and NBER)

and

John J. Stevens (Board of Governors of Federal Reserve System)

 

Note: The statistics reported in this paper derived from Census micro data were screened to ensure that they do not disclose confidential information.

 

Description of File:

These files contain the first stage estimates of the parameters of the distance adjustment function for each industry.  The cost efficiency parameter γi for each location i, i = 1,...,177, is not reported here.  (In other files, we report the γi that arise in the second stage iterative procedure.)

 

For the estimation, we define a cutoff distbar and use only shipment observations with distance (in miles) greater than or equal to distbar.  In most cases we use distbar=100.  As explained in the paper, if, after excluding shipments below 100 miles, the implied value of a(100) in the semilog case satisfied a(100)≤.2, we reestimated the model with all the shipments and used this estimate instead.  For the five industries affected this way, we constrained η2=0 and allowed only the linear term η1.
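The sample selection and the a(100) check described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and constants are hypothetical, the estimation step itself is not shown, and a(100) is computed from the semilog formula given later in this file.

```python
import math

DISTBAR_DEFAULT = 100  # miles; the cutoff used for most industries
A100_THRESHOLD = 0.2   # reestimate on all shipments when a(100) <= .2


def shipments_used(distances_miles, distbar=DISTBAR_DEFAULT):
    """Keep shipment distances (in miles) at or above the cutoff distbar."""
    return [d for d in distances_miles if d >= distbar]


def needs_reestimation(eta1, eta2):
    """Return True when the implied semilog a(100) is at or below .2.

    eta1 and eta2 are scaled for distance in 1,000s of miles, as in the
    semilog adjustment formula stated in this file.
    """
    a_100 = math.exp(-0.001 * eta1 * 100 - 0.000001 * eta2 * 100 ** 2)
    return a_100 <= A100_THRESHOLD
```

An industry for which needs_reestimation returns True would correspond to the flag_spec2 = 3 case: distbar set to 0, eta2 constrained to 0, and eta1 estimated on all shipments.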

 

eta1 and eta2 below are scaled for distance measured in 1,000s of miles.  If distance is in miles, the formula for the adjustment is

            a(distance) = exp(-.001*eta1*distance - .000001*eta2*distance^2)
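As a minimal sketch of how the reported coefficients might be applied, the function below evaluates the semilog adjustment for a distance in miles. The function name and the example coefficient values are hypothetical and are not taken from the data files.

```python
import math


def distance_adjustment(distance_miles, eta1, eta2=0.0):
    """Semilog distance adjustment a(distance), with distance in miles.

    eta1 and eta2 are scaled for distance in 1,000s of miles, hence the
    factors .001 and .000001 on the linear and squared terms.
    """
    return math.exp(-0.001 * eta1 * distance_miles
                    - 0.000001 * eta2 * distance_miles ** 2)


# Hypothetical coefficient values, chosen only for illustration.
example_eta1, example_eta2 = 2.0, 0.5
a_100 = distance_adjustment(100, example_eta1, example_eta2)
a_1000 = distance_adjustment(1000, example_eta1, example_eta2)
```

For industries where eta2 was constrained to 0 (flag_spec2 equal to 1, 2, or 3), passing only eta1 gives the one-parameter form.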

 

Layout of both files (see below for layout of stage1_estimatesNAICS97_for_gauss.asc)

 

Variable

Description

naicstext (or sictext)

text description of industry

naics (or sic)

industry code

obs_shipments_total

Number of shipment observations for that industry, before restricting to shipments with distance>=distbar.

obs_shipments_used

Number of shipment observations from the Commodity Flow Survey used in the estimation (the number with distance>=distbar).

flag_spec2

= . means standard estimation worked fine with two parameters, eta1 and eta2.

 

 = 1 means the 2-parameter transport cost function gave nonsense results, so we opted for the 1-parameter function (i.e., constrained eta2=0).

 

 = 2 means the standard error on eta2 was problematic, so we used the 1-parameter function.

 

 = 3 means we imposed eta2=0 after first running the estimates and finding a(100)<=.2, as discussed above.  We then set distbar=0 and estimated the model with eta2=0 and eta1 free.  This happens for 5 cases for both NAICS1997 and SIC1992.

eta1 (η1)

semilog coefficient on miles (in 1,000 mile units)

eta2 (η2)

semilog coefficient on miles^2 (in 1,000 mile units)

se_eta1

standard error of eta1 (semilog)

se_eta2

standard error of eta2 (semilog)

LogLike

log likelihood (semilog)

distbar

distance cutoff (in miles) used for this analysis

eta_lnln

coefficient on lnln specification

se_eta_lnln

standard error of eta_lnln

LogLike_lnln

log likelihood (lnln)
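The variable layout above can be read with a few lines of Python. The sketch below assumes the CSV version has a header row with the variable names listed above and that a SAS missing value for flag_spec2 is exported either as an empty field or as a period; both are assumptions to check against the actual file. The sample rows are placeholders, not values from the data.

```python
import csv
import io

# Meaning of flag_spec2, following the descriptions in this layout.
FLAG_SPEC2_MEANING = {
    "": "two-parameter estimate (eta1 and eta2)",
    ".": "two-parameter estimate (eta1 and eta2)",
    "1": "eta2=0 imposed: two-parameter results were nonsense",
    "2": "eta2=0 imposed: standard error on eta2 was problematic",
    "3": "eta2=0 imposed and distbar=0: a(100)<=.2 in first run",
}


def read_stage1(handle):
    """Read stage-1 rows and attach a text label for flag_spec2."""
    rows = []
    for row in csv.DictReader(handle):
        flag = (row.get("flag_spec2") or "").strip()
        row["flag_meaning"] = FLAG_SPEC2_MEANING.get(flag, "unknown flag")
        rows.append(row)
    return rows


# Placeholder rows for illustration only (not taken from the data files).
sample = io.StringIO(
    "naics,eta1,eta2,flag_spec2,distbar\n"
    "111111,1.0,0.1,,100\n"
    "222222,3.0,0,3,0\n"
)
rows = read_stage1(sample)
```

With the real file, the StringIO object would be replaced by an open file handle, for example open("stage1_estimates_naics1997.csv", newline=""), where the .csv file name is assumed.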

 

 

Layout of stage1_estimatesNAICS97_for_gauss.asc

Has no header row.

Column 1

naicsindex (this is an index from 1 to 473 of the 1997 NAICS manufacturing industries, with industries sorted by NAICS).  See file mandat_naics2 for a link from naicsindex to NAICS.

Column 2

eta1

Column 3

eta2

Column 4

Loglike (semilog)

Column 5

eta_lnln

Column 6

Loglike (lnln)
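Because the .asc file has no header row, the column names have to be supplied by the reader. The sketch below attaches the six names from the layout above, assuming the columns are separated by whitespace and every field is numeric; the file itself does not state either, and the sample text is a placeholder rather than real data.

```python
from typing import List, NamedTuple


class GaussRow(NamedTuple):
    naicsindex: int
    eta1: float
    eta2: float
    loglike_semilog: float
    eta_lnln: float
    loglike_lnln: float


def parse_gauss_asc(text: str) -> List[GaussRow]:
    """Parse headerless, whitespace-separated rows with six columns each."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}")
        # The index may be written as 1 or 1.0, so go through float first.
        index = int(float(fields[0]))
        rows.append(GaussRow(index, *(float(f) for f in fields[1:])))
    return rows


# Placeholder text for illustration only (not taken from the data file).
sample_text = "1 1.0 0.1 -100.0 0.5 -110.0\n2 3.0 0.0 -90.0 0.7 -95.0\n"
gauss_rows = parse_gauss_asc(sample_text)
```

The naicsindex in each row can then be linked to a NAICS code through the file mandat_naics2.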