File Name: USapp_basedat
Observations: 3,280,028
Unique Index: app_publication_num*app_publication_kind
Description:
The file contains basic information for each US patent application published over the years 2000 to 2012.
Note that before 2000 the USPTO did not publish applications; only grants were published.
The data was obtained from the bulk download center at Google Patents. Google has an arrangement with the USPTO to make this data publicly available. The file described below was constructed from the XML files posted at Google.
The data set is sorted by app_publication_num*app_publication_kind. Use these two variables to merge in the other data files.
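The two-variable merge can be sketched in Python as follows. This is only an illustration, assuming both files have already been read into lists of dictionaries keyed by variable name. The second file, its `inventor_name` field, the function name `merge_on_publication_key`, and all record values are hypothetical and do not come from the data.

```python
def merge_on_publication_key(base_rows, other_rows):
    """Left-join other_rows onto base_rows using the two-variable key."""
    # Index the other file by (publication number, publication kind).
    index = {}
    for row in other_rows:
        key = (row["app_publication_num"].strip(),
               row["app_publication_kind"].strip())
        index.setdefault(key, []).append(row)

    merged = []
    for base in base_rows:
        key = (base["app_publication_num"].strip(),
               base["app_publication_kind"].strip())
        # Keep the base record even when the other file has no match.
        for match in index.get(key, [{}]):
            merged.append({**match, **base})
    return merged


# Made-up records: same publication number, two different kinds.
base = [
    {"app_publication_num": "20010000001", "app_publication_kind": "A1",
     "title": "EXAMPLE ONE"},
    {"app_publication_num": "20010000001", "app_publication_kind": "A9",
     "title": "EXAMPLE ONE (SECOND VERSION)"},
]
other = [
    {"app_publication_num": "20010000001", "app_publication_kind": "A9",
     "inventor_name": "DOE"},
]
merged = merge_on_publication_key(base, other)
```

Matching on app_publication_num alone would attach the hypothetical `inventor_name` to both records here; using both variables attaches it only to the second one.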
Variables
Variable | Type | Len | Columns in Ascii File | Description |
id_weekdata | Char | 9 | 1-9 | The week of the year (indexed from 1 to 52, sometimes 53) in which the application was published. Google posts files by week, and this variable can be used to determine the original XML file where the application data was posted by Google. |
id_weekpat | Num | 8 | 10-17 | The order in which the published application appears in the weekly data set. |
app_publication_num | Char | 20 | 18-37 | The application publication number. In a few cases, a different version of an application is published with the same publication number. Therefore, to obtain a unique index, it is necessary to use app_publication_kind as well as app_publication_num. That is, merge the other files into this file by app_publication_num*app_publication_kind. |
app_publication_kind | Char | 3 | 38-40 | See app_publication_num above. |
app_publication_date | Char | 8 | 41-48 | Date the application was published (4-digit year, 2-digit month, 2-digit day). |
app_number | Char | 6 | 49-54 | Six-digit version of the application number. |
app_ref_type | Char | 15 | 55-69 | A code used by the USPTO. |
app_filing_date | Char | 8 | 70-77 | Date the application was filed (4-digit year, 2-digit month, 2-digit day). |
series_code | Char | 2 | 78-79 | Series code. |
title | Char | 320 | 80-399 | Title of patent. |
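The column positions above are enough to read one record of the Ascii file. The following Python sketch rests on some assumptions that the documentation does not state: fields are padded with blanks, id_weekpat is written as a plain integer, and the positions are 1-based and inclusive. The names `LAYOUT`, `parse_record` and `parse_yyyymmdd` and the sample line are illustrative only.

```python
from datetime import datetime

# (variable, first column, last column), taken from the table above.
# Positions are assumed to be 1-based and inclusive.
LAYOUT = [
    ("id_weekdata", 1, 9),
    ("id_weekpat", 10, 17),
    ("app_publication_num", 18, 37),
    ("app_publication_kind", 38, 40),
    ("app_publication_date", 41, 48),
    ("app_number", 49, 54),
    ("app_ref_type", 55, 69),
    ("app_filing_date", 70, 77),
    ("series_code", 78, 79),
    ("title", 80, 399),
]


def parse_yyyymmdd(text):
    """Return a date for an 8-character YYYYMMDD string, or None if invalid."""
    try:
        return datetime.strptime(text, "%Y%m%d").date()
    except ValueError:
        return None


def parse_record(line):
    """Slice one fixed-width line into a dict keyed by variable name."""
    record = {name: line[start - 1:end].strip() for name, start, end in LAYOUT}
    # Assumption: the numeric field is stored as integer text.
    record["id_weekpat"] = int(record["id_weekpat"]) if record["id_weekpat"] else None
    for field in ("app_publication_date", "app_filing_date"):
        record[field] = parse_yyyymmdd(record[field])
    return record


# A made-up 399-character line; none of these values come from the data.
sample = (
    "WEEK00001".ljust(9) + "1".rjust(8) + "20010000001".ljust(20)
    + "A1".ljust(3) + "20010315" + "123456" + "EXAMPLE".ljust(15)
    + "20001201" + "09" + "EXAMPLE TITLE".ljust(320)
)
rec = parse_record(sample)
```

Dates that are blank or malformed come back as None rather than raising an error, so a full pass over the file does not stop on a single bad record.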