File Name: USapp_basedat

Observations: 3,280,028

Unique Index: app_publication_num*app_publication_kind

 

Description: The file contains basic information for each US patent application published from 2000 to 2012.  Note that before 2000 the USPTO did not publish applications; only grants were published.  The data were obtained from the bulk download center at Google Patents.  Google has an arrangement with the USPTO to make these data publicly available.  This file was constructed from the XML files posted by Google.

            The data set is sorted by app_publication_num*app_publication_kind.  Use these two variables to merge in the other data files.
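The two-variable merge described above can be sketched in Python as follows.  This is only an illustration: the record values, the extra_field variable, and the helper names (merge_key, index_by_key, left_merge) are hypothetical and are not part of the data files.  The sketch also assumes that the companion file carries at most one record per key.

```python
# All values below are made up for illustration; they are not taken from the data.
base_records = [
    {"app_publication_num": "20010000001", "app_publication_kind": "A1", "title": "WIDGET"},
    {"app_publication_num": "20010000001", "app_publication_kind": "A9", "title": "WIDGET"},
]
other_records = [
    {"app_publication_num": "20010000001", "app_publication_kind": "A9", "extra_field": "x"},
]


def merge_key(record):
    """Build the two-part key: publication number plus publication kind."""
    return (
        record["app_publication_num"].strip(),
        record["app_publication_kind"].strip(),
    )


def index_by_key(records):
    """Index records by the two-part key, failing loudly if a key repeats."""
    index = {}
    for record in records:
        key = merge_key(record)
        if key in index:
            raise ValueError(f"duplicate key: {key}")
        index[key] = record
    return index


def left_merge(base, other):
    """Keep every base record and attach the matching other record, if any."""
    index_by_key(base)  # confirms the two variables uniquely identify base records
    other_index = index_by_key(other)
    merged = []
    for record in base:
        match = other_index.get(merge_key(record), {})
        merged.append({**match, **record})
    return merged


merged = left_merge(base_records, other_records)
```

Keying on the publication number alone would treat the two records above as duplicates, which is why both variables are needed.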

 

Variables

 

Variable              Type  Len  Columns in ASCII File  Description

id_weekdata           Char  9    1-9                    The week of the year (indexed from 1 to 52, sometimes 53) in which the application was published.  Google posts files by week, so this variable can be used to determine the original XML file in which Google posted the application data.

id_weekpat            Num   8    10-17                  The order in which the published application appears in the weekly data set.

app_publication_num   Char  20   18-37                  The application publication number.  In a few cases, a different version of an application is published under the same publication number.  Therefore, to obtain a unique index, it is necessary to use app_publication_kind as well as app_publication_num.  That is, merge the other files into this file by app_publication_num*app_publication_kind.

app_publication_kind  Char  3    38-40                  Publication kind code.  See app_publication_num above.

app_publication_date  Char  8    41-48                  Date the application was published (4-digit year, 2-digit month, 2-digit day).

app_number            Char  6    49-54                  Six-digit version of the application number.

app_ref_type          Char  15   55-69                  A code used by the USPTO.

app_filing_date       Char  8    70-77                  Date the application was filed (4-digit year, 2-digit month, 2-digit day).

series_code           Char  2    78-79                  Series code.

title                 Char  320  80-399                 Title of the application.
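
The column positions listed above can be used to read the ASCII file one record at a time.  The following Python sketch is a minimal illustration, not a definitive reader.  It assumes that each record occupies one line, that fields are padded with spaces, and that dates are always valid 8-digit values when present.  The helper names (LAYOUT, parse_yyyymmdd, parse_record) and every value in the sample line are hypothetical.

```python
from datetime import datetime

# (variable, first column, last column), 1-based and inclusive, as listed above.
LAYOUT = [
    ("id_weekdata", 1, 9),
    ("id_weekpat", 10, 17),
    ("app_publication_num", 18, 37),
    ("app_publication_kind", 38, 40),
    ("app_publication_date", 41, 48),
    ("app_number", 49, 54),
    ("app_ref_type", 55, 69),
    ("app_filing_date", 70, 77),
    ("series_code", 78, 79),
    ("title", 80, 399),
]


def parse_yyyymmdd(text):
    """Convert a YYYYMMDD string to a date; return None for a blank field."""
    text = text.strip()
    if not text:
        return None
    # Raises ValueError if the field is not a valid calendar date.
    return datetime.strptime(text, "%Y%m%d").date()


def parse_record(line):
    """Slice one fixed-width line into a dictionary keyed by variable name."""
    line = line.rstrip("\r\n")
    # Convert 1-based inclusive columns to Python's 0-based half-open slices.
    record = {name: line[start - 1:end].strip() for name, start, end in LAYOUT}
    # id_weekpat is the only numeric variable in the layout.
    record["id_weekpat"] = int(record["id_weekpat"]) if record["id_weekpat"] else None
    for field in ("app_publication_date", "app_filing_date"):
        record[field] = parse_yyyymmdd(record[field])
    return record


# A made-up 399-character line, built field by field to match the layout.
sample = (
    "WEEK11".ljust(9)
    + "1".rjust(8)
    + "20010000001".ljust(20)
    + "A1".ljust(3)
    + "20010315"
    + "123456"
    + "utility".ljust(15)
    + "20001201"
    + "09"
    + "WIDGET".ljust(320)
)
record = parse_record(sample)
```

The character variables are left as strings so that leading zeros in app_number and series_code are preserved.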