Patent Data Page for Quid Pro Quo: Technology Capital Transfers for Market Access in China
Thomas J. Holmes, Ellen R. McGrattan, Edward C. Prescott
July 2013
This page provides the data files used for all the patent results presented in our paper. It consists of files we constructed, including those linking China published patents for years 2005-2010 to U.S. and WIPO published applications. It also includes our raw patent file data. For the potential benefit of other researchers, we post data for all years we have so far collected, not just the sample years of our study. For China published patents, the raw data posted is for years 1985-2010 (3.6 million published patents). For U.S. published applications, the years are 2000-2012 (3.3 million published applications). For WIPO published applications, the years are 1999-2012 (1.8 million published applications). The data files are all provided in flat-ascii file format (encoding=UTF-8), compressed into zip files, as well as SAS format. Each file has a corresponding html page with the file layout. A number of the files are one gigabyte or more when uncompressed.
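Because several files exceed one gigabyte when uncompressed, it can help to stream records straight out of the zip archive rather than unpacking everything first. The following is a minimal sketch, not part of the posted programs. It assumes a delimited layout with a header row; the actual layout of each file is given on its html page and may differ (for example, fixed-width columns), in which case the parsing step would need to change. The function name `iter_records`, the `|` delimiter, and the archive member name are all hypothetical.

```python
import csv
import io
import zipfile


def iter_records(zip_path, member, delimiter="|"):
    """Yield one dict per record from a delimited UTF-8 text file inside a zip.

    Assumption: the file has a header row and uses `delimiter` between fields.
    Check the file's html layout page before relying on this.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # Decode lazily so the whole file is never held in memory.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text, delimiter=delimiter)
```

A caller would loop over `iter_records("some_archive.zip", "some_member.txt")` and process one record at a time; both names here are placeholders.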
Contents
· Patent Data Appendix This document explains our procedures and provides a general discussion of the data.
· Programs for Table 1 SAS programs for Table 1.
· Data Group 1 (Program Input) These are the files directly read by the above-mentioned programs, excluding the link files.
· Data Group 2 (Link Files) These files link China published patents 2005-2010 to U.S. and WIPO published applications.
· Data Group 3 (Raw Patent Files) These are the raw files for China published patents, and U.S. and WIPO published applications.
· Data Group 4 (Foreign Affiliate Sales Data) Affiliate sales data for 2006 and 2007, provided by the China Ministry of Commerce, which we use to construct our sample of large foreign multinationals.
PROGRAMS FOR TABLE 1
Link to Program (in SAS language) and html Output File | Description
| Constructs Table 1 statistics for foreign multinational patents (for all industries, and separately for the auto industry)
| Constructs Table 1 statistics for Chinese firm patents
| Constructs Table 1 statistics for Chinese auto firm patents
DATA GROUP 1 (PROGRAM INPUT)
File | Description | Number of Observations | Link to Files (Compressed) | Link to Contents/documentation
Multinat_patent | China patents for which at least one applicant is a large foreign multinational (see the separate web appendix). | 329,256 | ascii 4.3 MB, sas 4.9 MB |
Chinese_firm_patent | China invention patents, 2005-2010, that we identify as having Chinese firm applicants. To be included in this set, a patent must be first filed in China and must not appear in the file multinat_patent. For the largest applicants (top 100), we manually eliminate any remaining foreign firms (including Taiwan firms), as well as universities and applicants that are individuals rather than firms. | 585,650 | ascii 12 MB, sas 13.7 MB |
BYD_auto_patents | Auxiliary file used by the program Chinese_automakers.sas. It accounts for the fact that the Chinese automaker BYD also has a cell phone battery line of business. | 1,294 | ascii 4 K, sas 4 K |
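The automated part of the selection rule for Chinese_firm_patent (first filed in China, and not in multinat_patent) amounts to a filter followed by a set difference. The sketch below illustrates that logic only; it is not the authors' SAS code, and it does not cover the manual review of the top 100 applicants. It assumes both files can be keyed on `app_numfix`, and the flag `first_filed_in_china` is a hypothetical field name introduced for illustration.

```python
def select_chinese_firm_patents(patents, multinational_ids):
    """Keep patents first filed in China that are absent from the multinational set.

    Assumptions: each patent is a dict with an `app_numfix` key and a
    hypothetical boolean `first_filed_in_china`; `multinational_ids` holds the
    `app_numfix` values found in multinat_patent.
    """
    excluded = set(multinational_ids)
    return [
        p
        for p in patents
        if p["first_filed_in_china"] and p["app_numfix"] not in excluded
    ]


# Toy records with made-up identifiers, for illustration only.
toy_patents = [
    {"app_numfix": "A1", "first_filed_in_china": True},
    {"app_numfix": "A2", "first_filed_in_china": True},
    {"app_numfix": "A3", "first_filed_in_china": False},
]
kept = select_chinese_firm_patents(toy_patents, multinational_ids=["A2"])
```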
DATA GROUP 2 (LINK FILES)
File | Description | Number of Observations | Link to Files (Compressed) | Link to Contents/documentation
link_USapp_forpri_CNpub_forpri | Links foreign priority records for the US patent applications (file USapp_for_priority) to foreign priority records in the published China patents (file CNpub_for_priority). Typically, the country where foreign priority is claimed is neither the US nor China, which is why the claim shows up as a foreign priority claim in both the US and the China patent application. | 749,636 | ascii 10 MB, sas 10.2 MB |
link_USapp_appnum_CNpub_forpri | Links records in the China patents foreign priority file (CNpub_for_priority) that claim US foreign priority to the corresponding applications in the US file (USapp_basedat). | 286,158 | ascii 1.4 MB, sas 1.5 MB |
link_USapp_forpri_CNpub_appnum | Links records in the US patent applications foreign priority file (USapp_for_priority) that claim CN foreign priority to the corresponding published patents in China (CNpub_basedat). | 33,569 | ascii 0.4 MB, sas 0.4 MB |
link_WIPOapp_forpri_CNpub_forpri | Links records in WIPOapp_for_priority to records in CNpub_for_priority. | 399,066 | ascii 4.6 MB, sas 4.7 MB |
link_WIPOapp_forpri_CNpub_appnum | Links records in WIPOapp_for_priority that claim CN foreign priority to the corresponding published applications in China. | 35,412 | ascii 0.4 MB, sas 0.4 MB |
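The idea behind a file such as link_USapp_forpri_CNpub_forpri is that a US application and a China patent claiming the same third-country priority filing can be matched on that shared claim. The sketch below shows one way such a match could be computed; it is an illustration, not the procedure documented in the Patent Data Appendix. The field names `pri_country` and `pri_number` and the `normalize` helper are hypothetical, and the normalization shown (dropping punctuation) is an assumption about how priority numbers might differ between sources.

```python
from collections import defaultdict


def normalize(number):
    """Hypothetical cleanup: uppercase and keep only letters and digits."""
    return "".join(ch for ch in number.upper() if ch.isalnum())


def link_on_priority(us_claims, cn_claims):
    """Pair US and CN records that claim the same foreign priority filing."""
    # Index the CN priority claims by (country, normalized number).
    cn_index = defaultdict(list)
    for claim in cn_claims:
        key = (claim["pri_country"], normalize(claim["pri_number"]))
        cn_index[key].append(claim["app_numfix"])

    links = []
    for claim in us_claims:
        key = (claim["pri_country"], normalize(claim["pri_number"]))
        for app_numfix in cn_index.get(key, []):
            links.append(
                (claim["app_publication_num"], claim["app_publication_kind"], app_numfix)
            )
    return links


# Toy records with made-up numbers, for illustration only.
toy_us = [
    {"app_publication_num": "20060000001", "app_publication_kind": "A1",
     "pri_country": "JP", "pri_number": "2005-123456"},
]
toy_cn = [
    {"app_numfix": "CN1", "pri_country": "JP", "pri_number": "2005123456"},
    {"app_numfix": "CN2", "pri_country": "DE", "pri_number": "2005123456"},
]
toy_links = link_on_priority(toy_us, toy_cn)
```

Indexing one side first keeps the match to a single pass over each file, which matters at the scale of roughly a million priority records per file.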
DATA GROUP 3 (RAW PATENT FILES)
File | Description | Number of Observations | Link to Files (Compressed) | Link to Contents/documentation
China Published Patents | Chinese published invention and utility patents for publication years 1985-2012 (1985 is the first year of China’s patent system). Patent records were obtained from the State Intellectual Property Office of China (SIPO) through web searches at www.sipo.cn. It is important to note that the patents reported in this data set have not necessarily been granted yet. When we refer to these publications as “patents,” we are following the terminology of SIPO at its web site. These data were also used in a study of Chinese inventors by Fang (2013). | | |
CNpub_basedat | Contains one record for each published patent, with basic information about the patent. Variable app_numfix is a unique index for each record and should be used for merging other patent information. | 3,587,849 | ascii 161 MB, sas 169 MB |
CNpub_inventor | List of inventors for each patent (one record per inventor). Indexed by app_numfix*inventor_index. | 7,900,345 | ascii 97 MB, sas 104 MB |
CNpub_for_priority | List of foreign priority claims for each patent (one record per foreign priority claim). Indexed by app_numfix*for_pri_index. | 1,080,629 | ascii 21 MB, sas 23 MB |
CNpub_ipc_class | List of IPC class indicators for each patent (one record per IPC class listed on patent). Indexed by app_numfix*ipc_index. | 7,415,286 | ascii 62 MB, sas 72 MB |
US Published Patent Applications | US patent applications, for publication years 2000 to 2012. Files were obtained by processing the bulk XML text files posted at the Google Bulk Download patent page. | | |
USapp_basedat | Contains one record for each published patent application, with basic information about the patent. Sorted by the two variables app_publication_num*app_publication_kind, which together provide a unique index. | 3,280,028 | ascii 133 MB, sas 138 MB |
USapp_inventor | List of inventors for each patent (one record per inventor). Indexed by app_publication_num*app_publication_kind*inventor_index. | 8,568,499 | ascii 174 MB, sas 191 MB |
USapp_assignee | List of assignees for each patent (one record per assignee). Indexed by app_publication_num*app_publication_kind*assignee_index. | 1,763,053 | ascii 36 MB, sas 38 MB |
USapp_for_priority | List of foreign priority claims for each patent (one record per foreign priority claim). Indexed by app_publication_num*app_publication_kind*for_pri_index. | 1,748,151 | ascii 19 MB, sas 22 MB |
USapp_provisional | For published application years 2005-2012, this file contains a list of provisional application numbers held by the patent before publication of the application. Indexed by app_publication_num*app_publication_kind*provisional_index. | 907,906 | ascii 9 MB, sas 12 MB |
WIPO Published Patent Applications | WIPO patent applications, for publication years 1999 to 2012. Patent records were obtained from WIPO through web searches at http://patentscope.wipo.int/ | | |
WIPOapp_basedat | Contains one record for each WIPO patent application, with basic information about the patent. Variable Pub_no is the unique index. | 1,798,289 | ascii 61 MB, sas 138 MB |
WIPOapp_inventor | List of inventors for each patent (one record per inventor). Indexed by Pub_no*inventor_index. | 4,828,548 | ascii 74 MB, sas 76 MB |
WIPOapp_applicant | List of applicants for each patent (one record per applicant). Indexed by Pub_no*applicant_index. | 6,024,297 | ascii 193 MB, sas 209 MB |
WIPOapp_for_priority | List of foreign priority claims for each patent (one record per foreign priority claim). Indexed by Pub_no*for_pri_index. | 2,183,748 | ascii 19 MB, sas 22 MB |
WIPOapp_ipc_class | List of IPC class indicators for each patent (one record per IPC class listed on patent). Indexed by Pub_no*ipc_index. | 4,669,915 | ascii 30 MB, sas 36 MB |
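The raw files follow a one-to-many layout: each base file has one record per patent, and the inventor, priority, and IPC files hold several records per patent that share the base file's index. The sketch below shows how child records could be attached to their base record using that index. It is illustrative only; the helper `attach_children` is hypothetical, and the toy records use made-up values. The key fields match the index variables named in the table (`app_numfix` for the China files, `app_publication_num` plus `app_publication_kind` for the US files).

```python
from collections import defaultdict


def attach_children(base_records, child_records, key_fields, child_name):
    """Attach each child record to the base record sharing the same key."""
    # Group child records by the (possibly composite) key.
    children = defaultdict(list)
    for child in child_records:
        children[tuple(child[f] for f in key_fields)].append(child)

    merged = []
    for base in base_records:
        key = tuple(base[f] for f in key_fields)
        # Base records with no children get an empty list.
        merged.append({**base, child_name: children.get(key, [])})
    return merged


# Toy China records keyed on app_numfix (made-up values).
cn_base = [{"app_numfix": "X1"}, {"app_numfix": "X2"}]
cn_inventors = [
    {"app_numfix": "X1", "inventor_index": 1},
    {"app_numfix": "X1", "inventor_index": 2},
]
cn_merged = attach_children(cn_base, cn_inventors, ["app_numfix"], "inventors")

# Toy US records keyed on the two-variable index (made-up values).
us_base = [{"app_publication_num": "N1", "app_publication_kind": "A1"}]
us_assignees = [
    {"app_publication_num": "N1", "app_publication_kind": "A1", "assignee_index": 1},
    {"app_publication_num": "N1", "app_publication_kind": "A9", "assignee_index": 1},
]
us_merged = attach_children(
    us_base, us_assignees, ["app_publication_num", "app_publication_kind"], "assignees"
)
```

For the full files, which run to millions of records, the same grouping would more likely be done with a sorted merge or a database join than with in-memory dictionaries.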
DATA GROUP 4 (FOREIGN AFFILIATE SALES DATA)
File | Description | Number of Observations | Link to Files (Compressed) | Link to Contents/documentation
Top500_foreign_affiliate_sales | Raw data from China’s Ministry of Commerce, downloaded from http://www.investinchina.gov.cn. Google Translate was used to translate company names and addresses. | 500 (each year) | Excel files |
affiliate_dat | This file contains the affiliates from 2006 and 2007. Each affiliate in the data is assigned to a foreign multinational (the variable multinat_text). Affiliates listed in the raw table that are not actually foreign are excluded. | 528 | ascii 20 K, sas 25 K |