Patent Data Page for Quid Pro Quo: Technology Capital Transfers for Market Access in China

Thomas J. Holmes, Ellen R. McGrattan, Edward C. Prescott

July 2013

 

This page provides the data files used for all the patent results presented in our paper.  It consists of files we constructed, including those linking China published patents for years 2005-2010 to U.S. and WIPO published applications.  It also includes our raw patent data files.  For the potential benefit of other researchers, we post data for all years we have collected so far, not just the sample years of our study.  For China published patents, the raw data posted are for years 1985-2010 (3.6 million published patents).  For U.S. published applications, the years are 2000-2012 (3.3 million published applications).  For WIPO published applications, the years are 1999-2012 (1.8 million published applications).  The data files are all provided both as flat ASCII text files (UTF-8 encoding), compressed into zip files, and in SAS format.  Each file has a corresponding html page describing the file layout.  A number of the files are one gigabyte or more when uncompressed.
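Because some of the files exceed one gigabyte when uncompressed, it can help to stream records straight out of the zip archive rather than extracting it first. The following is a minimal Python sketch of that idea, not part of the posted programs. The member name, the tab delimiter, the header row, and the column pub_year are all assumptions made for illustration; the html layout page for each file gives the actual structure.

```python
import csv
import io
import zipfile


def iter_rows(zip_source, member, delimiter="\t"):
    """Yield one dict per record from a UTF-8 text file inside a zip archive.

    Rows are streamed, so a very large file never has to fit in memory.
    The delimiter and the presence of a header row are assumptions; check
    the html layout page of the real file before relying on them.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text, delimiter=delimiter)


# Tiny in-memory archive standing in for a downloaded file (made-up values).
demo_zip = io.BytesIO()
with zipfile.ZipFile(demo_zip, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("demo.txt", "app_numfix\tpub_year\nA1\t2005\nA2\t2006\n")

demo_rows = list(iter_rows(demo_zip, "demo.txt"))
```

For a real download, zip_source would be the path to the zip file on disk, and the loop over iter_rows(...) would process one record at a time.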

 

Contents

·        Patent Data Appendix This document explains our procedures and provides a general discussion of the data.

·        Programs for Table 1 SAS programs for Table 1.

·        Data Group 1 (Program Input) These are the files directly read by the above-mentioned programs, excluding the link files.

·        Data Group 2 (Link Files)  These files link China published patents 2005-2010 to U.S. and WIPO published applications.

·        Data Group 3 (Raw Patent Files)  These are the raw files for China published patents, and U.S. and WIPO published applications.

·        Data Group 4 (Foreign Affiliate Sales Data)  Affiliate sales data for 2006 and 2007 provided by the China Ministry of Commerce, which we use to construct our sample of large foreign multinationals.

 

 

PROGRAMS FOR TABLE 1

Link to Program (in SAS language) and html Output File

Description

table1_foreign_multinational.sas, html

Constructs Table 1 statistics for foreign multinational patents (for all industries, and separately for the auto industry)

table1_Chinese_firms.sas, html

Constructs Table 1 statistics for Chinese firm patents

table1_Chinese_automakers.sas, html

Constructs Table 1 statistics for Chinese auto firm patents

 

 

DATA GROUP 1 (PROGRAM INPUT)

File

Description

Number of Observations

Link to Files (Compressed)

Link to Contents/documentation

Multinat_patent

China patents for which at least one applicant is a large foreign multinational (see the separate web appendix)

329,256

ascii 4.3 MB

sas 4.9 MB

Multinat_patent.html

Chinese_firm_patent

China invention patents, 2005-2010, that we identify as having Chinese firm applicants.  To be included in this set, a patent must be first filed in China, and we exclude all patents in the file multinat_patent.  For the largest applicants (top 100), we manually review the list and eliminate any remaining foreign firms (including Taiwan firms), as well as universities and applicants that are individuals rather than firms.

585,650

ascii 12 MB

sas 13.7 MB

Chinese_firm_patent.html

BYD_auto_patents

Auxiliary file used by the program table1_Chinese_automakers.sas.  It accounts for the fact that the Chinese automaker BYD also has a cell phone battery line of business.

1,294

ascii 4 K

sas 4 K

BYD_auto_patents.html
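The selection rule described for Chinese_firm_patent (first filed in China, not in multinat_patent, not a manually excluded applicant) can be illustrated with a short Python sketch. This is not the SAS code used for the paper. The field names first_filing_country and applicant, and the use of app_numfix as the key for multinat_patent, are hypothetical stand-ins chosen for the example.

```python
def select_chinese_firm_patents(patents, multinat_ids, excluded_applicants):
    """Apply the three filters described for Chinese_firm_patent.

    patents: iterable of dicts (field names here are hypothetical).
    multinat_ids: ids of patents that appear in multinat_patent.
    excluded_applicants: applicants removed by manual review
        (foreign firms, universities, individuals).
    """
    kept = []
    for patent in patents:
        if patent["first_filing_country"] != "CN":
            continue  # must be first filed in China
        if patent["app_numfix"] in multinat_ids:
            continue  # already counted as a foreign multinational patent
        if patent["applicant"] in excluded_applicants:
            continue  # removed in the manual review of large applicants
        kept.append(patent)
    return kept


# Made-up records for illustration only.
demo_patents = [
    {"app_numfix": "CN1", "first_filing_country": "CN", "applicant": "Firm X"},
    {"app_numfix": "CN2", "first_filing_country": "JP", "applicant": "Firm Y"},
    {"app_numfix": "CN3", "first_filing_country": "CN", "applicant": "Firm Z"},
    {"app_numfix": "CN4", "first_filing_country": "CN", "applicant": "Some University"},
]
demo_selected = select_chinese_firm_patents(
    demo_patents, multinat_ids={"CN3"}, excluded_applicants={"Some University"}
)
```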

 

 

 

DATA GROUP 2 (LINK FILES)

File

Description

Number of Observations

Link to Files (Compressed)

Link to Contents/documentation

link_USapp_forpri_CNpub_forpri

Links foreign priority records for the US patent applications (file USapp_for_priority) to foreign priority records in the published China patents (CNpub_for_priority).  Typically, the country where foreign priority is being claimed is neither the US nor China.  (That is why the claim shows up as a foreign priority claim in both the US and China patent applications.)

749,636

ascii 10 MB

sas 10.2 MB

link_USapp_forpri_CNpub_forpri.html

link_USapp_appnum_CNpub_forpri

Links records in the China patents foreign priority file (CNpub_for_priority) that claim US foreign priority to the corresponding applications in the US file (USapp_basedat).

286,158

ascii 1.4 MB

sas 1.5 MB

link_USapp_appnum_CNpub_forpri.html

link_USapp_forpri_CNpub_appnum

Links records in the US patent applications foreign priority file (USapp_for_priority) that claim CN foreign priority to the corresponding published patents in China (CNpub_basedat).

33,569

ascii 0.4 MB

sas 0.4 MB

link_USapp_forpri_CNpub_appnum.html

link_WIPOapp_forpri_CNpub_forpri

Links records in WIPOapp_for_priority to records in CNpub_for_priority

399,066

ascii 4.6 MB

sas 4.7 MB

link_WIPOapp_forpri_CNpub_forpri.html

link_WIPOapp_forpri_CNpub_appnum

Links records in WIPOapp_for_priority, claiming CN foreign priority, to the corresponding published applications in China.

35,412

ascii 0.4 MB

sas 0.4 MB

link_WIPOapp_forpri_CNpub_appnum.html
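The idea behind the forpri-to-forpri link files is that a US application and a China patent claiming the same third-country priority refer to the same underlying filing. A minimal Python sketch of that matching step follows. It is an illustration only, not the procedure used to build the posted link files: the field names pri_country and pri_number are hypothetical, and matching on exact strings ignores any cleaning of priority number formats that real data would need.

```python
from collections import defaultdict


def link_on_shared_priority(us_claims, cn_claims):
    """Pair US and China records that claim the same foreign priority.

    Returns tuples (app_publication_num, app_publication_kind, app_numfix).
    Field names pri_country and pri_number are hypothetical.
    """
    # Index the China claims by (priority country, priority number).
    cn_by_priority = defaultdict(list)
    for claim in cn_claims:
        key = (claim["pri_country"], claim["pri_number"])
        cn_by_priority[key].append(claim["app_numfix"])

    # Look up each US claim in that index.
    links = []
    for claim in us_claims:
        key = (claim["pri_country"], claim["pri_number"])
        for cn_id in cn_by_priority.get(key, []):
            links.append(
                (claim["app_publication_num"], claim["app_publication_kind"], cn_id)
            )
    return links


# Made-up claims for illustration only.
demo_us = [
    {"app_publication_num": "20060000001", "app_publication_kind": "A1",
     "pri_country": "JP", "pri_number": "2004-123456"},
    {"app_publication_num": "20060000002", "app_publication_kind": "A1",
     "pri_country": "DE", "pri_number": "102004001"},
]
demo_cn = [
    {"app_numfix": "CN100", "pri_country": "JP", "pri_number": "2004-123456"},
    {"app_numfix": "CN200", "pri_country": "KR", "pri_number": "10-2004-1"},
]
demo_links = link_on_shared_priority(demo_us, demo_cn)
```

Only the first US record shares a priority claim (a Japanese filing) with a China record, so only that pair is linked.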

 

 

DATA GROUP 3 (RAW PATENT FILES)

File

Description

Number of Observations

Link to Files (Compressed)

Link to Contents/documentation

China Published Patents

Chinese published invention and utility patents for publication years 1985-2012 (1985 is the first year of China’s patent system).  Patent records were obtained from the State Intellectual Property Office of China (SIPO) through web searches at www.sipo.cn.  It is important to note that the patents reported in this data set have not necessarily been granted yet.  When we refer to these publications as “patents,” we are following the terminology of SIPO at its web site.  These data were also used in a study of Chinese inventors by Fang (2013).

 

CNpub_basedat

Contains one record for each published patent, with basic information about the patent.  Variable app_numfix is a unique index for each record and should be used for merging other patent information.

3,587,849

ascii 161 MB

sas 169 MB

CNpub_basedat.html

CNpub_inventor

List of inventors for each patent (one record per inventor).  Indexed by app_numfix*inventor_index

7,900,345

ascii 97 MB

sas 104 MB

CNpub_inventor.html

CNpub_for_priority

List of foreign priority claims for each patent (one record per foreign priority claim). Indexed by app_numfix*for_pri_index

1,080,629

ascii 21 MB

sas 23 MB

CNpub_for_priority.html

CNpub_ipc_class

List of IPC class indicators for each patent (one record per IPC class listed on patent).  Indexed by app_numfix*ipc_index

7,415,286

ascii 62 MB

sas 72 MB

CNpub_ipc_class.html

US Published Patent Applications

US patent applications, for publication years 2000 to 2012.  Files obtained by processing bulk text files in xml format posted at the Google Bulk Download patent page. 

 

USapp_basedat

Contains one record for each published patent application, with basic information about the application.  Sorted by the two variables app_publication_num*app_publication_kind, which together provide a unique index.

3,280,028

ascii 133 MB

sas 138 MB

USapp_basedat.html

USapp_inventor

List of inventors for each patent (one record per inventor).  Indexed by app_publication_num*app_publication_kind*inventor_index

8,568,499

ascii 174 MB

sas 191 MB

USapp_inventor.html

USapp_assignee

List of assignees for each patent (one record per assignee).  Indexed by app_publication_num*app_publication_kind*assignee_index

1,763,053

ascii 36 MB

sas 38 MB

USapp_assignee.html

USapp_for_priority

List of foreign priority claims for each patent (one record per foreign priority claim). Indexed by app_publication_num*app_publication_kind*for_pri_index

1,748,151

ascii 19 MB

sas 22 MB

USapp_for_priority.html

USapp_provisional

For published application years 2005-2012, this file contains a list of provisional application numbers held by the patent before publication of the application.  Indexed by app_publication_num*app_publication_kind*provisional_index

907,906

ascii 9 MB

sas 12 MB

USapp_provisional.html

WIPO Published Patent Applications

WIPO patent applications, for publication years 1999 to 2012.  Patent records obtained from WIPO, from web searches at http://patentscope.wipo.int/

 

WIPOapp_basedat

Contains one record for each WIPO patent application, with basic information about the patent.  Variable Pub_no is the unique index.

1,798,289

ascii 61 MB

sas 138 MB

WIPOapp_basedat.html

WIPOapp_inventor

List of inventors for each patent (one record per inventor).  Indexed by Pub_no*inventor_index

4,828,548

ascii 74 MB

sas 76 MB

WIPOapp_inventor.html

WIPOapp_applicant

List of applicants for each patent (one record per applicant).  Indexed by Pub_no*applicant_index

6,024,297

ascii 193 MB

sas 209 MB

WIPOapp_applicant.html

WIPOapp_for_priority

List of foreign priority claims for each patent (one record per foreign priority claim). Indexed by Pub_no*for_pri_index

2,183,748

ascii 19 MB

sas 22 MB

WIPOapp_for_priority.html

WIPOapp_ipc_class

List of IPC class indicators for each patent (one record per IPC class listed on patent).  Indexed by Pub_no*ipc_index

4,669,915

ascii 30 MB

sas 36 MB

WIPOapp_ipc_class.html
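The raw files follow a common pattern: one base file per office, plus satellite files (inventors, priority claims, IPC classes) that share the base file's index. The Python sketch below illustrates a one-to-many merge on such an index, using the single key app_numfix for the China files and the two-variable key app_publication_num*app_publication_kind for the US files. It is only an illustration under stated assumptions: the column inventor_name and all values are hypothetical, and for the full files a database or a SAS merge is likely a better fit than in-memory dictionaries.

```python
def attach(base_rows, child_rows, key_fields, child_name):
    """One-to-many merge: add to each base row the list of child rows
    that share its key (one or more fields)."""
    children = {}
    for row in child_rows:
        key = tuple(row[field] for field in key_fields)
        children.setdefault(key, []).append(row)

    merged = []
    for row in base_rows:
        key = tuple(row[field] for field in key_fields)
        merged.append({**row, child_name: children.get(key, [])})
    return merged


# China files: single key app_numfix (values are made up).
cn_base = [{"app_numfix": "CN1"}, {"app_numfix": "CN2"}]
cn_inventors = [
    {"app_numfix": "CN1", "inventor_index": "1", "inventor_name": "Inventor A"},
    {"app_numfix": "CN1", "inventor_index": "2", "inventor_name": "Inventor B"},
]
cn_merged = attach(cn_base, cn_inventors, ["app_numfix"], "inventors")

# US files: the publication number alone is not the index; the kind code
# is part of the key as well.
us_key = ["app_publication_num", "app_publication_kind"]
us_base = [
    {"app_publication_num": "20050000001", "app_publication_kind": "A1"},
    {"app_publication_num": "20050000001", "app_publication_kind": "A9"},
]
us_inventors = [
    {"app_publication_num": "20050000001", "app_publication_kind": "A9",
     "inventor_index": "1", "inventor_name": "Inventor C"},
]
us_merged = attach(us_base, us_inventors, us_key, "inventors")
```

In the US example, merging on the publication number alone would wrongly attach the inventor to both records; using both key variables keeps the A1 and A9 records separate.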

 

 

DATA GROUP 4 (FOREIGN AFFILIATE SALES DATA)

File

Description

Number of Observations

Link to Files (Compressed)

Link to Contents/documentation

Top500_foreign_affiliate_sales

Raw data from China’s Ministry of Commerce, downloaded at http://www.investinchina.gov.cn.  Google Translate was used to translate company names and addresses.

500 (each year)

Excel files

2006, 2007

 

affiliate_dat

This file contains the affiliates from 2006 and 2007. Each affiliate in the data is assigned to a foreign multinational (the variable multinat_text).  Affiliates listed in the raw table that are not actually foreign are excluded.

528

ascii 20 K

sas 25 K

affiliate_dat.html
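Since each affiliate in affiliate_dat is assigned to a foreign multinational through the variable multinat_text, affiliate-level figures can be rolled up to the multinational level. The Python sketch below shows one way to do that. It is an assumption-laden illustration: the columns sales and year, and all values, are hypothetical, and the actual variables are listed in affiliate_dat.html.

```python
from collections import defaultdict


def sales_by_multinational(affiliates, sales_field="sales"):
    """Sum affiliate sales within each multinational (multinat_text).

    sales_field is a hypothetical column name; see affiliate_dat.html
    for the variables the file actually contains.
    """
    totals = defaultdict(float)
    for row in affiliates:
        totals[row["multinat_text"]] += float(row[sales_field])
    return dict(totals)


# Made-up affiliate records for illustration only.
demo_affiliates = [
    {"multinat_text": "Firm A", "year": "2006", "sales": "10.5"},
    {"multinat_text": "Firm A", "year": "2007", "sales": "4.5"},
    {"multinat_text": "Firm B", "year": "2006", "sales": "3.0"},
]
demo_totals = sales_by_multinational(demo_affiliates)
```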