Database Guide
Document Collections
The History Lab’s database groups its texts by the collections they originally belonged to. The following collections have been digitized and can be accessed through the API:
| Collection Code | Collection Name | Document Count |
|---|---|---|
| frus | Foreign Relations of the United States | 209,046 |
| cia | Records from the CIA Freedom of Information Act Reading Room | 935,716 |
| clinton | Hillary Clinton Emails | 54,149 |
| pdb | President’s Daily Briefs | 5,011 |
| cfpf | Central Foreign Policy Files | 3,214,293 |
| kissinger | Henry Kissinger Telephone Conversations | 4,552 |
| nato | NATO Digital Archives | 46,002 |
This information can also be viewed directly through the API using the list_collections() function.
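For working offline, the collection codes and counts listed above can be kept in a small lookup table. The sketch below is only an illustration: it does not call the API, the counts are copied from this guide and may lag behind the live database, and the names `COLLECTIONS` and `describe_collection` are hypothetical helpers rather than part of any History Lab library.

```python
# Collection codes, names, and document counts copied from this guide.
# These are a static snapshot, not a live query against the API.
COLLECTIONS = {
    "frus": ("Foreign Relations of the United States", 209_046),
    "cia": ("Records from the CIA Freedom of Information Act Reading Room", 935_716),
    "clinton": ("Hillary Clinton Emails", 54_149),
    "pdb": ("President's Daily Briefs", 5_011),
    "cfpf": ("Central Foreign Policy Files", 3_214_293),
    "kissinger": ("Henry Kissinger Telephone Conversations", 4_552),
    "nato": ("NATO Digital Archives", 46_002),
}


def describe_collection(code: str) -> str:
    """Return a one-line summary for a collection code (hypothetical helper)."""
    key = code.strip().lower()
    if key not in COLLECTIONS:
        valid = ", ".join(sorted(COLLECTIONS))
        raise ValueError(f"Unknown collection code {code!r}; expected one of: {valid}")
    name, count = COLLECTIONS[key]
    return f"{key}: {name} ({count:,} documents)"


# Total number of documents across all collections listed in this guide.
total_documents = sum(count for _, count in COLLECTIONS.values())

print(describe_collection("kissinger"))
print(f"Total documents: {total_documents:,}")
```

Validating a code against a local table like this catches typos before a request is sent, though the API itself remains the authoritative source for the current list.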
Document Fields
Each record in the History Lab’s database represents a single text, and each field in a record holds one piece of information about that text that users can retrieve through the API. Some of the common fields are:
| Field Name | Field Description |
|---|---|
| authored | Date and time when the document was authored |
| body | The full text body of the document |
| countries | A list of all the countries mentioned in the document |
| persons | A list of all the persons mentioned in the document |
| persons_id | A list of the unique IDs for all the persons that appear in the document |
| topics | A list of all the topics mentioned in the document |
| title | The text title of the document |
| doc_id | The document’s unique ID, which also indicates which collection the document belongs to |
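One way to picture a record with these fields is as a small typed structure. The sketch below is an assumption-laden illustration, not the API's actual response format: the Python types (for example, `authored` as a `datetime` and `persons_id` as a list of strings), the `Document` class, the `mentioning_country` helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Document:
    """Hypothetical container mirroring the common fields listed in this guide."""
    doc_id: str
    title: str
    body: str
    authored: Optional[datetime] = None          # assumed type; may be missing
    countries: List[str] = field(default_factory=list)
    persons: List[str] = field(default_factory=list)
    persons_id: List[str] = field(default_factory=list)  # assumed to be strings
    topics: List[str] = field(default_factory=list)


def mentioning_country(docs: List[Document], country: str) -> List[Document]:
    """Return the documents whose countries list contains the given country."""
    wanted = country.casefold()
    return [d for d in docs if any(c.casefold() == wanted for c in d.countries)]


# Made-up sample records for illustration only; not real database entries.
sample = [
    Document(
        doc_id="example-1",
        title="Sample memo",
        body="Placeholder text.",
        authored=datetime(1973, 1, 15),
        countries=["France", "Chile"],
    ),
    Document(
        doc_id="example-2",
        title="Sample cable",
        body="Placeholder text.",
        countries=["Japan"],
    ),
]

matches = mentioning_country(sample, "france")
print([d.doc_id for d in matches])
```

Giving the list fields empty defaults means a record with no tagged countries, persons, or topics can still be filtered without special-casing missing values.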