File Name: link_USapp_forpri_CNpub_forpri

Observations: 749,636

 

 

Description: This file links records in file USapp_for_priority to records in file CNpub_for_priority. That is, we take the foreign priority claims in the US applications, each of which lists a claim date, a claim country, and a patent application number, and we take the foreign priority claims in the Chinese published patents, each of which also lists a date, a country, and an application number.

                         We first attempt to match on (1) priority date, (2) country, and (3) application number (after some cleaning of the application number text field in both files). Where this yields no match, we also look at the title of the patent and the list of inventors. Specifically, we also include cases where there is a match on (1) priority date, (2) country, (3) title (the first 40 characters match except for minor discrepancies), and (4) name of the first inventor.

                         Note that a given Chinese patent may match multiple US applications if the Chinese patent and those US applications cite the same foreign priority claim (or overlapping foreign priority claims). Analogously, a given US application may link to multiple Chinese patents.
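The two-stage matching rule described above can be sketched as follows. This is a minimal illustration, not the code used to build the file: the record keys (date, country, app_num, title, first_inventor) and the cleaning rules in clean_app_num and title_key are assumptions, since this documentation does not spell out the actual cleaning steps or what counts as a minor discrepancy.

```python
import re

def clean_app_num(raw):
    # Assumed cleaning rule: uppercase and keep only letters and digits.
    return re.sub(r"[^A-Z0-9]", "", raw.upper())

def title_key(title, n=40):
    # Assumed tolerance for "minor discrepancies": ignore case, spacing,
    # and punctuation, then compare the first n characters.
    return re.sub(r"[^a-z0-9]", "", title.lower())[:n]

def is_link(us, cn):
    # Both stages require the same priority date and country.
    if (us["date"], us["country"]) != (cn["date"], cn["country"]):
        return False
    # Stage 1: the cleaned application numbers agree (and are not empty).
    us_num, cn_num = clean_app_num(us["app_num"]), clean_app_num(cn["app_num"])
    if us_num and us_num == cn_num:
        return True
    # Stage 2 (fallback): title prefix and first inventor agree.
    same_title = title_key(us["title"]) == title_key(cn["title"])
    same_inventor = (us["first_inventor"].strip().lower()
                     == cn["first_inventor"].strip().lower())
    return same_title and same_inventor
```

Because is_link is evaluated pair by pair, one Chinese claim can link to several US claims and vice versa, which is the many-to-many behavior noted above.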

                         The table at the bottom of this file shows that, for Chinese patents claiming foreign priority (outside the US) with publication years 2005 and after, we obtain a match to a patent application in the US file with the same foreign priority claim in about 80 percent of cases. In the paper we focus only on Chinese patents published in 2005 and after. Not every patent filed in China claiming foreign priority outside the US will be filed in the US as well, so we expect the match rate to be less than 100 percent, even apart from measurement error where we miss true matches.

                         Note that the match rates in the early 2000s are low, beginning at about 10 percent in 2000. One reason is that the USPTO only began publishing patent applications in 2000, and in the early years not every patent application was published.
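As a quick arithmetic check of the two rates quoted above, the counts from the table at the bottom of this file can be pooled directly. The sketch below copies the (not matched, matched) counts for the relevant years; the function name match_rate is only illustrative.

```python
# (not matched, matched) counts by Chinese publication year,
# copied from the table at the bottom of this file.
COUNTS = {
    2000: (14985, 1743),
    2005: (11389, 49287),
    2006: (11815, 51140),
    2007: (13157, 56385),
    2008: (12887, 50876),
    2009: (14073, 54740),
    2010: (12257, 46935),
}

def match_rate(years):
    # Percent of Chinese patents with at least one US match, pooled over years.
    not_matched = sum(COUNTS[y][0] for y in years)
    matched = sum(COUNTS[y][1] for y in years)
    return 100 * matched / (matched + not_matched)

pooled_2005_on = match_rate(range(2005, 2011))
rate_2000 = match_rate([2000])
print(f"{pooled_2005_on:.2f} {rate_2000:.2f}")  # prints "80.37 10.42"
```

The pooled 2005 to 2010 figure is 309,363 matched out of 384,941 patents, which is the roughly 80 percent cited in the description.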

 

Variables

 

Variable              Type  Len  Columns in Ascii File  Description
app_numfix            Char  12   1-12                   link variable to Chinese patent (unique index of CNpub_basedat)
Country2              Char  2    13-14                  country code of foreign priority claim
for_pri_rec_trun      Char  40   15-54                  truncated record of foreign priority claim in Chinese patent data, used for matching
for_pri_index_CN      Num   8    55-62                  corresponds to for_pri_index in file CNpub_for_priority
app_publication_num   Char  20   63-82                  link to US application (along with app_publication_kind); the pair app_publication_num, app_publication_kind is the unique index of USapp_basedate
app_publication_kind  Char  3    83-85                  see app_publication_num
priority_app_num      Char  30   86-115                 priority application number as listed in the US file
priority_index        Num   8    116-123                priority_index variable in USapp_for_priority
for_priority_date     Char  10   124-133                date of foreign priority claim
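The column positions above are enough to read the ASCII version of the file with plain string slicing. The following is a minimal sketch using only the Python standard library. It assumes one record per line, 1-based inclusive column positions as listed, and that the two Num fields hold integers; none of this is confirmed beyond the layout table itself.

```python
# (name, first column, last column), 1-based and inclusive, from the table above.
LAYOUT = [
    ("app_numfix", 1, 12),
    ("Country2", 13, 14),
    ("for_pri_rec_trun", 15, 54),
    ("for_pri_index_CN", 55, 62),
    ("app_publication_num", 63, 82),
    ("app_publication_kind", 83, 85),
    ("priority_app_num", 86, 115),
    ("priority_index", 116, 123),
    ("for_priority_date", 124, 133),
]
NUMERIC = {"for_pri_index_CN", "priority_index"}  # assumed to be integers

def parse_line(line):
    # Pad short lines so trailing blank fields still slice cleanly.
    line = line.rstrip("\n").ljust(133)
    record = {}
    for name, first, last in LAYOUT:
        value = line[first - 1:last].strip()
        if name in NUMERIC and value:
            value = int(value)
        record[name] = value
    return record
```

To read a whole file, call parse_line on each line of whatever path holds the ASCII version, for example inside a list comprehension over an open file handle.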

 

Table Showing Distribution of Matches by Year of Chinese Patent Publication

We start with the 1,080,629 foreign priority claims in CNpub_for_priority and keep the 757,902 cases where the foreign priority claim is not from the US. We do this because we expect matches to the US application file for patents claiming US priority to be found in file link_USapp_appnum_CNpub_forpri (i.e., they link to the application number of the US application rather than to the foreign priority file). We next select the 608,180 unique values of app_numfix. For each app_numfix, we ask whether it has at least one match in this file (link_USapp_forpri_CNpub_forpri), and the table below reports the results.

 

 

 

                       Counts                    Percent
pub_year               Not Matched    Matched    Not Matched    Matched
All                        221,198    386,982          36.37      63.63
1999 and earlier            88,671      3,084          96.64       3.36
2000                        14,985      1,743          89.58      10.42
2001                        15,483      5,797          72.76      27.24
2002                        10,501     13,116          44.46      55.54
2003                         7,695     23,625          24.57      75.43
2004                         8,285     30,254          21.50      78.50
2005                        11,389     49,287          18.77      81.23
2006                        11,815     51,140          18.77      81.23
2007                        13,157     56,385          18.92      81.08
2008                        12,887     50,876          20.21      79.79
2009                        14,073     54,740          20.45      79.55
2010                        12,257     46,935          20.71      79.29
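The tabulation procedure described above the table (drop US priority claims, collapse to unique app_numfix, flag whether each has at least one link) can be sketched as follows. This is an illustration under stated assumptions rather than the original code: the input shapes are hypothetical, and it assumes the publication year for each app_numfix has already been attached from the Chinese patent data.

```python
from collections import Counter

def match_rates(cn_claims, linked_app_numfix):
    """cn_claims: iterable of (app_numfix, country, pub_year) tuples, one per
    foreign priority claim (hypothetical input shape).
    linked_app_numfix: set of app_numfix values that appear in the link file."""
    # Keep non-US claims and collapse to one entry per app_numfix.
    year_of = {}
    for app_numfix, country, pub_year in cn_claims:
        if country != "US":
            year_of[app_numfix] = pub_year

    total, matched = Counter(), Counter()
    for app_numfix, year in year_of.items():
        total[year] += 1
        if app_numfix in linked_app_numfix:
            matched[year] += 1

    # year -> (not matched, matched, percent matched)
    return {
        year: (total[year] - matched[year], matched[year],
               round(100 * matched[year] / total[year], 2))
        for year in sorted(total)
    }
```

A patent with several non-US priority claims is counted once, and it counts as matched if any of its claims appears in the link file.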