The annual National Fire Incident Reporting System (NFIRS) Public Data Release (PDR) files are provided by the U.S. Fire Administration’s (USFA) National Fire Data Center (NFDC). The NFIRS is a reporting standard that fire departments use to uniformly report on the full range of their activities, from fire to emergency medical services (EMS) to equipment involved in the response.
NFIRS is the nation’s largest annual database of fire incident information. NFIRS is a voluntary tool with two objectives: to help State and local governments develop fire reporting and analysis capability for their own use, and to obtain data that can be used to more accurately assess, and subsequently combat, the fire problem at a national level.
The PDR files represent annual archives of NFIRS data. Downloadable files are available from the Annual NFIRS Public Data download page.
The NFIRS PDR is essentially a large relational database whose tables have been exported as delimited text files. Each year of data is released as a separate archive. Uncompressed, each year contains more than 65 million records and requires about 12 to 15 GB of storage, making the data challenging to use. It cannot simply be downloaded and viewed in a spreadsheet. While it may be possible to query the text files directly through code, a more useful approach is to rebuild, or reconstitute, the relational database.
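As a rough sketch of what reconstitution involves, the following Python example loads a small delimited extract into a SQLite table using only the standard library. The table layout, column names, and caret delimiter here are illustrative assumptions; the actual schemas and file formats are defined by the scripts in this repository and by the PDR documentation.

```python
import csv
import sqlite3

# Hypothetical example: load delimited NFIRS-style text into SQLite.
# Column names and the '^' delimiter are assumptions for illustration only.
conn = sqlite3.connect(":memory:")  # use a file path to persist the database
conn.execute("CREATE TABLE basicincident (state TEXT, fdid TEXT, inc_no TEXT)")

# In practice these lines would come from a file in the extracted archive,
# e.g. open("rawdata/basicincident.txt", newline="")
sample_lines = ["STATE^FDID^INC_NO", "OH^12345^0000071", "TX^67890^0000002"]
reader = csv.reader(sample_lines, delimiter="^")
next(reader)  # skip the header row
conn.executemany("INSERT INTO basicincident VALUES (?, ?, ?)", reader)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM basicincident").fetchone()[0]
print(count)  # 2 rows loaded
```

The real import scripts use each database's bulk-load facility instead of row-by-row inserts, which matters at 65+ million records.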
This repository contains a series of scripts to partially automate the reconstitution of the NFIRS database for a given year. Three options are provided:
- ANSI SQL - Generic scripts that should work with the most common RDBMS software (PostgreSQL, MySQL, MS SQL Server, Oracle) with minimal modifications. The data import scripts will most likely require minor modifications as different databases use different tools and different command syntax for bulk imports.
- Sqlite3 - This database was chosen for its simplicity. It is free, small, easy to install, and requires almost no configuration to enable. If you are not sure which option to choose, Sqlite is recommended.
- DuckDB - A fast, in-process database system that supports many programming languages. Some file formats can be queried directly with standard SQL syntax.
Important
These reconstitution scripts have been created for PDR submission years 2012 through 2024, where data is provided as a series of delimited text files. Prior to 2012, data was released in the dBase (.dbf) file format.
Important
The NFIRS PDR data is subject to a certain degree of human error. It is recommended that you always review the "release notes readme" and the "data quality notes" documents for each submission that you choose to work with. These documents will explain known data inconsistencies.
Important
Many of the yearly submissions suffer from referential integrity issues. Enforcing primary and foreign key constraints helps identify these issues. Some RDBMSs, such as SQLite, can skip problematic and duplicate records; others will not. If your RDBMS gets stuck on constraint violations, or if you want all the records, remove the constraints.
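As an illustration of the skipping behavior, SQLite's INSERT OR IGNORE keeps loading past rows that violate a primary key constraint, whereas a plain INSERT would abort. A minimal sketch with a toy table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incident (incident_key TEXT PRIMARY KEY)")

# A duplicate key appears in the source data.
rows = [("KEY-1",), ("KEY-2",), ("KEY-1",)]

# Plain INSERT would raise sqlite3.IntegrityError on the duplicate;
# INSERT OR IGNORE silently skips it and keeps loading.
conn.executemany("INSERT OR IGNORE INTO incident VALUES (?)", rows)

count = conn.execute("SELECT COUNT(*) FROM incident").fetchone()[0]
print(count)  # 2 -- the duplicate KEY-1 was skipped
```

This is convenient for getting a load to complete, but note that skipped records are silently dropped; remove the constraints instead if you need every record.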
Note
FEMA does not endorse any of the tools or products mentioned in this guide. FEMA makes no claim as to the assurance of their efficacy, security, or reliability. They are used here for illustrative purposes only.
Three scripts are provided for each of the options mentioned above, in two vintages (pre-2019 and all others). They are organized in folders and differentiated by a file suffix. Each script contains database-specific notes, prerequisites, run instructions, and possible issues. It is important to review the notes in these scripts before running them, and to ensure each script has run successfully before proceeding to the next step.
- NFIRS_Database_Creation-??.sql - Contains the data definition language (DDL) needed to create the NFIRS database and tables. No data is loaded at this step.
- NFIRS_Data_Import-??.sql - Imports the raw data into the created database and builds indexes to ensure decent performance. Each year of data will consume 12 to 15 GB - make sure you have enough storage space available before running.
- NFIRS_Database_Cleanup-??.sql - Performs minimal post-processing on the loaded data; this will vary depending on the database option chosen. Additionally, a useful view is created containing the most commonly sought-after fields - often referred to as the NFIRS PDR Lite.
In 2018, the database schema changed such that the "INCIDENT_KEY" value did not exist - only its constituent parts did. The series of scripts identified as "Pre2019" should be used to process NFIRS files prior to 2019. The "Database Cleanup" scripts will re-create the key for compatibility (if desired) with later releases.
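The general idea of rebuilding a composite key can be sketched as follows. The column names, ordering, and separator below are illustrative assumptions only; the authoritative definition is in the "Database Cleanup" scripts themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE basicincident (
    state TEXT, fdid TEXT, inc_date TEXT, inc_no TEXT, exp_no TEXT,
    incident_key TEXT)""")
conn.execute(
    "INSERT INTO basicincident VALUES ('OH','12345','01012018','0000071','0', NULL)")

# Rebuild the key by concatenating its constituent parts
# (field order and '_' separator are assumptions for illustration).
conn.execute("""UPDATE basicincident
    SET incident_key = state || '_' || fdid || '_' || inc_date
                       || '_' || inc_no || '_' || exp_no""")

key = conn.execute("SELECT incident_key FROM basicincident").fetchone()[0]
print(key)  # OH_12345_01012018_0000071_0
```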
- NFIRS_Tutorial.ipynb - A Jupyter notebook illustrating the use of Sqlite and DuckDB reconstituted NFIRS databases. While the examples are in Python, they should be easily translatable to other languages. Note that this is a tutorial on the use of the database - it does not demonstrate the reconstitution itself.
Although the PDR release archive contains a relationship diagram, a version is presented here for convenience. This diagram represents the cardinality between tables more accurately than the diagram included in the PDR.
erDiagram
%% direction LR -- works in VS Code but not GitHub yet
direction LR
FDHeader ||--|{ BasicIncident : "state, fdid"
BasicIncident ||--o| Arson : incident_key
Arson ||--o{ ArsonJuvSub : incident_key
Arson ||--o| ArsonAgencyReferal : incident_key
BasicIncident ||--o| FireIncident : incident_key
BasicIncident ||--o{ EMS : incident_key
BasicIncident ||--o| Wildlands : incident_key
BasicIncident ||--o| HazMat : incident_key
BasicIncident ||--|| IncidentAddress : incident_key
BasicIncident ||--o| BasicAid : incident_key
BasicIncident ||--o{ FFCasualty : incident_key
FFCasualty ||--o{ FFEquipFail : "incident_key, cas_seq_no"
BasicIncident ||--o| Causes : incident_key
BasicIncident ||--o{ CivilianCasualty : incident_key
HazMat ||--o{ HazChem : incident_key
HazMat ||--|| HazMobProp : incident_key
HazMat ||--|| HazMatEquipInvolved : incident_key
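These cardinalities matter when writing queries. For example, because not every BasicIncident row has a matching FireIncident row, a LEFT JOIN is needed to keep non-fire incidents in the result. A toy sketch (schemas abbreviated for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE basicincident (incident_key TEXT PRIMARY KEY, inc_type TEXT);
CREATE TABLE fireincident  (incident_key TEXT PRIMARY KEY, area_orig TEXT);
INSERT INTO basicincident VALUES ('K1', '111'), ('K2', '321');
-- Only the fire incident (K1) has a FireIncident module record.
INSERT INTO fireincident VALUES ('K1', '21');
""")

# LEFT JOIN keeps K2 (no fire module) in the result; an INNER JOIN
# would silently drop it.
rows = conn.execute("""
    SELECT b.incident_key, f.area_orig
    FROM basicincident b
    LEFT JOIN fireincident f ON f.incident_key = b.incident_key
    ORDER BY b.incident_key""").fetchall()
print(rows)  # [('K1', '21'), ('K2', None)]
```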
Prepare for NFIRS file download and script execution. Make sure you have enough space available. The files downloaded and extracted will require about 12 GB. The creation of the database will require another 12 to 15 GB.
- Create a folder in which to save the NFIRS files and database
- Navigate to the NFIRS download page and download an archive for a desired year
- Extract the NFIRS files to the folder you created
- Download and save the Database Reconstitution scripts for your chosen option (sqlite3, ANSI SQL, DuckDb) and vintage (pre2019 or after) to the folder you created
- IMPORTANT: Because each year of data is stored in a unique folder, create a common location to store the raw data. Otherwise, it would be necessary to modify the import scripts every time a different year is processed.
- Create another folder within your NFIRS folder to hold the raw data. Call this folder rawdata.
- When the NFIRS data is extracted, it is placed in a folder called nfirs_all_incident_pdr_[year of download]/NFIRS_[year of download]_[date archive produced]. Navigate to this folder and move the text files containing the NFIRS data to the new rawdata folder.
- Likewise, the structure fire cause data is located in a subfolder called StructureFireCauses[year of download]. Navigate to this folder and move its text files to the rawdata folder as well.
- Follow the instructions below and execute the scripts according to the option you have chosen.
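The folder preparation above can also be scripted. The sketch below mirrors those steps with Python's standard library; the year and date placeholders in the folder names are examples only, and a temporary directory stands in for your NFIRS folder:

```python
import shutil
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp()) / "NFIRS"  # stand-in for the folder you created
rawdata = base / "rawdata"                 # common location for the raw data
rawdata.mkdir(parents=True)

# Example extracted-archive folder; the actual name embeds the download
# year and the date the archive was produced.
extracted = base / "nfirs_all_incident_pdr_2023" / "NFIRS_2023_20240101"
extracted.mkdir(parents=True)
(extracted / "basicincident.txt").write_text("demo")  # stand-in data file

# Move every delimited text file into the common rawdata folder.
for txt in extracted.glob("*.txt"):
    shutil.move(str(txt), str(rawdata / txt.name))

moved = sorted(p.name for p in rawdata.iterdir())
print(moved)  # ['basicincident.txt']
```

Repeat the move for the StructureFireCauses subfolder so all text files for the year end up in rawdata.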
To install Sqlite3, download the pre-compiled binaries, extract the files, and add the directory containing the SQLite3 executable to your system's PATH environment variable. If you need more detailed instructions, plenty of written and video tutorials are available online illustrating sqlite3 installation.
- Download the pre-compiled binaries for your operating system from the SQLite download page
- Unzip/uncompress the downloaded file into a directory of your choice. Inside the extracted folder, you'll find the SQLite3 executable and other related tools.
- The path to the executable must be added to the System PATH. Instructions on how to do this can be found online for your specific operating system.
- Verify the installation by typing the following line in a command prompt or terminal window:
sqlite3 --version
The SQLite3 version should be displayed.
Important
For the Windows version of Sqlite3, both the DLL and the Tools binaries must be downloaded. The Sqlite3 executable will be in the tools folder.
While you can execute queries from a command line, it is easier to access the database using a GUI tool such as DB Browser for SQLite. Many such tools exist; this one is open source, free, and lightweight. Go to the DB Browser for SQLite download page and follow the instructions for your operating system.
The scripts must be executed in order and each should only be run once. If an error occurs at any step requiring changes to the scripts, it is easiest to delete the database file (if created) and start over. The scripts will 1) create an empty database, 2) import the NFIRS data into the database, and 3) perform some post-import cleanup actions on the data. The scripts could be combined, but it is easier to verify results and troubleshoot problems following smaller steps.
In tests (on a Windows 10 computer with an i7 processor, 32 GB RAM, and an SSD), the database creation script completed immediately, the data import took about 7 minutes, and the cleanup script took about 8 minutes.
It is strongly recommended that you review the notes within each script file. NOTE: Modify the script file names in the commands below if running the Pre-2019 scripts.
- Open a command window or terminal
- Navigate to your database folder that will eventually contain the database (the one that contains the scripts and the rawdata folder)
- Execute the database creation script (in this version, we are redirecting output and errors to a file for later review):
  sqlite3 nfirs.db < NFIRS_Database_Creation-Sqlite3.sql > database_creation_sqlite.log 2>&1
- Verify the creation of the database by reviewing the log output (database_creation_sqlite.log)
- Verify the creation of the database by connecting to it (a connection should occur and the tables should be listed):
  command prompt> sqlite3
  sqlite> .open nfirs.db
  sqlite> .tables
  sqlite> .exit
- Execute the data import script:
  sqlite3 nfirs.db < NFIRS_Data_Import-Sqlite3.sql > data_import_sqlite.log 2>&1
- Verify the data import by reviewing the log output (data_import_sqlite.log)
- Compare the table record counts from the log with the record counts presented in the "About NFIRS & Release Notes" document included in the NFIRS download
- Verify the import by randomly querying data (you can use a third-party tool to query the database rather than the command line utility):
  command prompt> sqlite3
  sqlite> .open nfirs.db
  sqlite> select * from fdheader limit 10;
  sqlite> .exit
- Execute the data cleanup script (OPTIONAL):
  sqlite3 nfirs.db < NFIRS_Database_Cleanup-Sqlite3.sql > cleanup_sqlite.log 2>&1
- Verify the cleanup by reviewing the log output (cleanup_sqlite.log)
- Verify the cleanup script by querying the PDR Lite view that should have been created (you can use a third-party tool rather than the command line utility). Look at one of the dates to see that it was converted to ISO 8601 format:
  command prompt> sqlite3
  sqlite> .open nfirs.db
  sqlite> select * from vPDRLite limit 10;
  sqlite> .exit
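Once the database is built, it can be queried from code as well as from the CLI. The sketch below shows the general pattern with Python's built-in sqlite3 module; an in-memory stand-in for vPDRLite (with assumed columns) is used so the example is self-contained:

```python
import sqlite3

# For the real database, connect with sqlite3.connect("nfirs.db");
# an in-memory stand-in is used here so the example runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vpdrlite (incident_key TEXT, inc_date TEXT)")
conn.execute("INSERT INTO vpdrlite VALUES ('K1', '2023-07-04')")

# The same shape of query you would run against the real vPDRLite view.
row = conn.execute("SELECT * FROM vpdrlite LIMIT 10").fetchone()
print(row)

# ISO 8601 dates compare and sort correctly as plain text in SQLite,
# which is one benefit of the cleanup script's date conversion.
recent = conn.execute(
    "SELECT COUNT(*) FROM vpdrlite WHERE inc_date >= '2023-01-01'"
).fetchone()[0]
print(recent)  # 1
```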
Navigate to the DuckDB Download page and follow the instructions. For purposes of reconstituting the NFIRS database, we will install the command line environment. In the future, you can install libraries for a specific language if you choose to perform analysis on the database through code. If installing on Mac or Windows operating systems using a package manager, the path will be adjusted automatically. If you need more detailed instructions, plenty of written and video tutorials are available online illustrating DuckDB installation.
To install DuckDB, download the pre-compiled binaries, extract the files, and add the directory containing the DuckDB executable to your system's PATH environment variable.
If not using a package manager:
- Download the pre-compiled binaries for your operating system from the DuckDB Download page
- Unzip/uncompress the downloaded file into a directory of your choice. Inside the extracted folder, you'll find the DuckDB executable
- The path to the executable must be added to the System PATH. Instructions on how to do this can be found online for your specific operating system.
- Verify the installation by typing the following line in a command prompt or terminal window:
duckdb --version
While you can execute queries from a command line, it is easier to access the database using a GUI tool. Not many such tools exist. DuckDB contains a built-in, web-based user interface that can be launched by starting the DuckDB command line interface with the -ui switch: duckdb -ui. If the UI does not open automatically in a browser, click the link that appears when the CLI starts.
By default, DuckDB operates in memory and won't persist any data; a database created in the CLI will disappear when the CLI is closed. To persist data, we will provide a path to a file where our database structure and data will be stored.
The scripts must be executed in order and each should only be run once. If an error occurs at any step requiring changes to the scripts, it is easiest to delete the database file (if created) and start over. The scripts will 1) create an empty database, 2) import the NFIRS data into the database, and 3) perform some post-import cleanup actions on the data. The scripts could be combined, but it is easier to verify results and troubleshoot problems following smaller steps.
In tests (on a Windows 10 computer with an i7 processor, 32 GB RAM, and an SSD), the database creation script completed immediately, the data import took about 11 minutes, and the creation and cleanup scripts took only seconds.
It is strongly recommended that you review the notes within each script file.
- Open a command window or terminal
- Navigate to your database folder that will eventually contain the database
- Execute the database creation script (in this version, we are redirecting output and errors to a file for later review):
  duckdb nfirs.duckdb < NFIRS_Database_Creation-DuckDB.sql > database_creation_duck.log 2>&1
- Verify the creation of the database by reviewing the log output (database_creation_duck.log)
- Verify the creation of the database by connecting to it (a connection should occur and the tables should be listed):
  command prompt> duckdb
  D .open nfirs.duckdb
  D .tables
  D .exit
- Execute the data import script:
  duckdb nfirs.duckdb < NFIRS_Data_Import-DuckDB.sql > data_import_duck.log 2>&1
- Verify the data import by reviewing the log output (data_import_duck.log)
- Compare the table record counts from the log with the record counts presented in the "About NFIRS & Release Notes" document included in the NFIRS download
- Verify the import by randomly querying data (you can use the DuckDB UI to query the database rather than the command line utility):
  command prompt> duckdb
  D .open nfirs.duckdb
  D select * from fdheader limit 10;
  D .exit
- Execute the data cleanup script (OPTIONAL). This script only creates the PDR Lite view. In tests, its performance was poor; Sqlite offers much better performance for this view:
  duckdb nfirs.duckdb < NFIRS_Database_Cleanup-DuckDB.sql > cleanup_duck.log 2>&1
- Verify the cleanup by reviewing the log output (cleanup_duck.log)
- Verify the cleanup script by listing the DuckDB views, the creation of which is the only activity in the DuckDB cleanup script. You could also query the view itself but, as mentioned above, its performance is unusually slow:
  command prompt> duckdb
  D .open nfirs.duckdb
  D select * from duckdb_views() where database_name = 'nfirs';
  D .exit
Provided are generic scripts that should work with the most common RDBMS software (PostgreSQL, MySQL, MS SQL Server, Oracle) with minimal modifications. The data import scripts will most likely require minor modifications as different databases use different tools and different command syntax for bulk imports.
Note
If you modify these scripts to work with a specific RDBMS and they have been tested, please consider sharing with the community by issuing a pull request or by sending an email to openfema@fema.dhs.gov.
Read the documentation within each script; they cannot be run without modification. Modify the scripts as appropriate to work with your specific RDBMS and execute them according to your RDBMS's documentation. Run the Database Creation script first, followed by the Data Import script, and finally the optional Database Cleanup script if necessary.
Regardless of the database flavor chosen, if the resulting database was successfully created, you can delete the files in the rawdata folder to save space.
We encourage your feedback and contributions to this repository. Creating and testing scripts for specific databases (MySQL, MS SQL Server, Oracle, etc.) would be a great addition to this project.
If you modify these scripts to work with a specific RDBMS and they have been tested, please consider sharing with the community by issuing a pull request or by sending an email to openfema@fema.dhs.gov.
Content suggestions and discussions (specific to NFIRS or OpenFEMA) can be communicated in the following ways:
- GitHub “issues.” Each issue is a conversation about specific project work initiated by a member of the public.
- GitHub "discussions". Each discussion is a project communication forum. Discussions are not specific to elements of work like a pull request. We encourage you to browse and join in on discussions or start a new conversation by creating a new discussion.
- Direct changes and line edits to the content may be submitted through a "pull request" by clicking "Edit this page" on any site page in the repository. You do not need to install any software to suggest a change. You can use GitHub's in-browser editor to edit files and submit a pull request for your changes to be merged into the document. Directions on how to submit a pull request can be found on GitHub.
- Send your content suggestions or proposed revisions to the OpenFEMA team via email to openfema@fema.dhs.gov. Please note that all comments received may or may not be posted publicly on this page.
We encourage you to read this project's Code of Conduct, CONTRIBUTING policy, and its LICENSE.
This project is in the worldwide public domain. As stated in CONTRIBUTING:
This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.
All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.