Skip to main content

Data Documentation for National Incident-Based Reporting System (NIBRS)

This document outlines the process for acquiring crime data from the U.S. Federal Bureau of Investigation’s (FBI) Uniform Crime Reporting (UCR) program’s National Incident-Based Reporting System (NIBRS) dataset. It details the approach of the Southern Regional Drug Data Research Centers (SR-DDRC, or just DDRC) to extracting, transforming, and loading (ETL) the data into Microsoft SQL Server, as well as the creation of the dataset’s data dictionary. This guide is intended for data analysts and researchers interested in replicating the SR-DDRC’s process or using the data acquired by the SR-DDRC.

1. Overview of the National Incident-Based Reporting System (NIBRS)

The National Incident-Based Reporting System (NIBRS) became required in 2021 by the Uniform Crime Reporting (UCR) program within the Federal Bureau of Investigation (FBI), aiming to enhance the accuracy of crime statistics by collecting detailed data from local, state, and federal partners.

NIBRS collects incident-based reports that provide specific data with fuller context, replacing the Summary Reporting System (SRS), which collected more limited information. A major difference between the two systems is that NIBRS includes details regarding offense types, property damage, the demographics data when applicable, and weapons used. This enhancement allows NIBRS to offer more useful data on crime compared to the SRS program. More information the two UCR programs can be found the FBI’s UCR page.

Despite its value, the transition from the SRS to NIBRS has presented several challenges to use of the data. While all 50 states utilize the NIBRS program to some extent, some agencies have not yet begun submitting NIBRS data. Additionally, the variations in the timing of agencies' transitions from SRS to NIBRS—some of which began before 2021—have resulted in incomplete yearly datasets. This inconsistency complicates comparisons between data collected before and after 2021. These limitations should be considered when viewing and conducting analyses using this data.

1.1 FBI’s Crime Data Explorer (CDE)

Users can explore the following datasets from the FBI’s Crime Data Explorer (CDE) website, including:

  • Crime Data
    • Crime
    • NIBRS Estimates
    • Hate Crime
    • Expanded Homicide
    • Expanded Property
    • Arrest
    • Quarterly Uniform Crime Report
    • Special Reports
  • Law Enforcement
    • Law Enforcement Employees
    • Law Enforcement Officers Killed and Assaulted
    • National Use-of-Force Data Collection
    • Law Enforcement Public Contact
    • Law Enforcement Suicide Data Collection

Similarly, users can download the following datasets and data tables, along with corresponding data dictionaries and documentation, from the FBI’s CDE's Documents & Downloads page:

  • National Incident-Based Reporting System (NIBRS) Tables
  • Law Enforcement Officers Killed and Assaulted Annual Reports
  • Crime in the United States Annual Reports
  • Crime Incident-Based Data by State
  • Hate Crime Statistics Annual Reports
  • NIBRS Estimation Tables
  • Master File Downloads
  • Additional Datasets
    • Summary Reporting System (SRS)
    • Assaults on Law Enforcement Officers
    • Law Enforcement Employees Data
    • Hate Crime
    • Human Trafficking
    • Uniform Crime Reporting Program Participation Data
    • Cargo Theft
    • U.S. Territory Data
    • Arrest Data - Reported Number of Arrests by Crime
    • Arrest Data - Reported Number of Adult Arrests by Crime
    • Arrest Data - Reported Number of Juvenile Arrests by Crime
    • Arrest Data - Reported Number of Drug Arrests

2. NIBRS Crime Incident-Based Data by State Dataset

The NIBRS Crime Incident-Based Data by State Dataset provides incident-level data per state, with details on arrestees, offenses, offense types, property damage, demographic data when applicable, and weapons used.

3. NIBRS Data Acquisition

This section briefly describes CDE's Documents & Downloads page and SR-DDRC's process for acquiring the data.

3.1 Crime Data Explorer Documents & Downloads

The NIBRS Crime Incident-Based Data by State data files are published on the CDE's Documents & Downloads page as zipped files for each year and state. Datasets can be selected using two dropdown menus, one for "Location" and one for "State." This section also includes options to download the NIBRS data dictionary and data diagram.

Data for the 17 SR-DDRC states was downloaded to a secure terminal server where it subsequently underwent the ETL process.

4. ETL Process

The Extract, Transform, Load (ETL) process is essential for preparing and integrating data from NIBRS into a relational database. This section outlines the steps used by the SR-DDRC. The following instructions assume a basic proficiency with the syntax and structure of the programming language(s) used.

4.1 Data Transformation and Integration

This section outlines the systematic approach employed to transform and integrate raw data files into a structured format suitable for analysis within the SR-DDRC data repository. It details the steps taken to ensure data quality, including the removal of unnecessary elements, the standardization of data formats, and the loading of processed files into staging tables. The procedures described herein facilitate the accurate and efficient transfer of data from staging to production tables, thereby supporting subsequent analytical efforts.

4.1.1 File Inspection and Initial Modifications

The data transformation and integration process begins with inspecting the downloaded files from NIBRS using Notepad++. This inspection aims to identify necessary changes to standardize the files for loading into the SR-DDRC data repository.

One of the key differences in this particular process compared to the process for other datasets is that NIBRS contains multiple files for each state/year combination. There is no guarantee that the file formats will remain consistent from one year to the next, making inspection a critical part of the import process.

After downloading, the files were saved to the shared network drive with the following directory structure:

  • YYYY (year in four digit format)
    • State Zip File named in format ST-YYYY.zip where ST is state abbreviation
      • State File unzipped (named ST where ST is state abbreviation)
        • All CSV files for the state

Opening the file in Notepad++ with the "Show All Characters" option (see Image 1 below) shows it is comma-separated, with text fields enclosed in double quotes.

Image 1: Screenshot displaying the NIBRS data file in Notepad++ with the "Show All Characters" option enabled, revealing a comma-separated values (CSV) format with text fields enclosed in double quotes.

NIBRS also includes a file in each downloaded zip file named “postgres_setup.sql.” This file is invaluable as its contents can be pasted into a generative AI tool, which can convert the code from PostgreSQL to Transact SQL (or another database platform). This conversion saves considerable time, by eliminating the need to manually type the CREATE TABLE commands.

Once the staging tables are created, the SQL Server’s BULK INSERT statement can be used to populate the tables directly from the text file. An example statement is as follows:

BULK INSERT <Your Staging Table Name>
FROM '<Full path to your file>' 
WITH (FORMAT='CSV', FIRSTROW = 2, ROWTERMINATOR = '0x0a');

A statement similar to this should be executed for each state/year combination that needs to be loaded into the database.

Finally, the production tables can be populated from the staging tables, converting each field’s values as necessary. A sample script for accomplishing this with the agencies table is provided below; this can be replicated for all other files/tables. The script performs a TRY_CONVERT on each field to convert it into the production table format and adds two additional fields. The first additional field is "Source," which holds text describing the source file, and the second field captures the results of the GETDATE() function, which provides a timestamp of when the row was inserted into the table.

INSERT INTO [YourProductionDatabaseName].[dbo].[YourProductionTableName] 
	([yearly_agency_id]
	,[agency_id]
	,[data_year]
	,[ori]
	,[legacy_ori]
	,[covered_by_legacy_ori]
	,[direct_contributor_flag]
	,[dormant_flag]
	,[dormant_year]
	,[reporting_type]
	,[ucr_agency_name]
	,[ncic_agency_name]
	,[pub_agency_name]
	,[pub_agency_unit]
	,[agency_status]
	,[state_id]
	,[state_name]
	,[state_abbr]
	,[state_postal_abbr]
	,[division_code]
	,[division_name]
	,[region_code]
	,[region_name]
	,[region_desc]
	,[agency_type_name]
	,[population]
	,[submitting_agency_id]
	,[sai]
	,[submitting_agency_name]
	,[suburban_area_flag]
	,[population_group_id]
	,[population_group_code]
	,[population_group_desc]
	,[parent_pop_group_code]
	,[parent_pop_group_desc]
	,[mip_flag]
	,[pop_sort_order]
	,[summary_rape_def]
	,[pe_reported_flag]
	,[male_officer]
	,[male_civilian]
	,[male_total]
	,[female_officer]
	,[female_civilian]
	,[female_total]
	,[officer_rate]
	,[employee_rate]
	,[nibrs_cert_date]
	,[nibrs_start_date]
	,[nibrs_leoka_start_date]
	,[nibrs_ct_start_date]
	,[nibrs_multi_bias_start_date]
	,[nibrs_off_eth_start_date]
	,[covered_flag]
	,[county_name]
	,[msa_name]
	,[publishable_flag]
	,[participated]
	,[nibrs_participated]
	,[source]
	,[record_created_date]
	)
SELECT
	try_convert(int,yearly_agency_id) AS 'yearly_agency_id'
	,try_convert(int, agency_id) AS 'agency_id'
	,try_convert(int, data_year) AS 'data_year'
	,try_convert(varchar(25),ori) AS 'oir'
	,try_convert(varchar(25),legacy_ori) AS 'legacy_ori'
	,try_convert(varchar(25),covered_by_legacy_ori) AS 'covered_by_legacy_ori'
	,try_convert(varchar(1),direct_contributor_flag) AS 'direct_contributor_flag'
	,try_convert(varchar(1),dormant_flag) AS 'dormant flag'
	,try_convert(int,dormant_year) AS 'dormant_year'
	,try_convert(varchar(1),reporting_type) AS 'reporting_type'
	,try_convert(varchar(100),ucr_agency_name) AS 'ucr_agency_name'
	,try_convert(varchar(100),ncic_agency_name) AS 'ncic_agency_name'
	,try_convert(varchar(100),pub_agency_name) AS 'pub_agency_name'
	,try_convert(varchar(100),pub_agency_unit) AS 'pub_agency_unit'
	,try_convert(varchar(1),agency_status) AS 'agency_status'
	,try_convert(int,state_id) AS 'state_id'
	,try_convert(varchar(100),state_name) AS 'state_name'
	,try_convert(varchar(2),state_abbr) AS 'state_abbr'
	,try_convert(varchar(2),state_postal_abbr) AS 'state_postal_abbr'
	,try_convert(int,division_code) AS 'division_code'
	,try_convert(varchar(100),division_name) AS 'division_name'
	,try_convert(int,region_code) AS 'region_code'
	,try_convert(varchar(100),region_name) AS 'region_name'
	,try_convert(varchar(100),region_desc) AS 'region_desc'
	,try_convert(varchar(100),agency_type_name) AS 'agency_type_name'
	,try_convert(int,population) AS 'population'
	,try_convert(int,submitting_agency_id) AS 'submitting_agency_id'
	,try_convert(varchar(25),sai) AS 'sai'
	,try_convert(varchar(200),submitting_agency_name) AS 'submitting_agency_name'
	,try_convert(varchar(1),suburban_area_flag) AS 'suburban_area_flag'
	,try_convert(int,population_group_id) AS 'population_group_id'
	,try_convert(varchar(2),population_group_code) AS 'population_group_code'
	,try_convert(varchar(100),population_group_desc) AS 'population_group_desc'
	,try_convert(int,parent_pop_group_code) AS 'parent_pop_group_code'
	,try_convert(varchar(100), parent_pop_group_desc) AS 'parent_pop_group_desc'
	,try_convert(varchar(1),mip_flag) AS 'mip_flag'
	,try_convert(int,pop_sort_order) AS 'pop_sort_order'
	,try_convert(varchar(1),summary_rape_def) AS 'summary_rape_def'
	,try_convert(varchar(1),pe_reported_flag) AS 'pe_reported_flag'
	,try_convert(int,male_officer) AS 'male_officer'
	,try_convert(int,male_civilian) AS 'male_civilian'
	,try_convert(int,male_total) AS 'male_total'
	,try_convert(int,female_officer) AS 'female_officer'
	,try_convert(int,female_civilian) AS 'female_civilian'
	,try_convert(int,female_total) AS 'female_total'
	,try_convert(decimal(18,0),officer_rate) AS 'officer_rate'
	,try_convert(decimal(18,0),employee_rate) AS 'employee_rate'
	,try_convert(date,nibrs_cert_date) AS 'nibrs_cert_date'
	,try_convert(date,nibrs_start_date) AS 'nibrs_start_date'
	,try_convert(date,nibrs_leoka_start_date) AS 'nibrs_leoka_start_date'
	,try_convert(date,nibrs_ct_start_date) AS 'nibrs_ct_start_date'
	,try_convert(date,nibrs_multi_biAS_start_date) AS 'nibrs_multi_biAS_start_date'
	,try_convert(date,nibrs_off_eth_start_date) AS 'nibrs_off_eth_start_date'
	,try_convert(varchar(1),covered_flag) AS 'covered_flag'
	,try_convert(varchar(100),county_name) AS 'county_name'
	,try_convert(varchar(100),msa_name) AS 'msa_name'
	,try_convert(varchar(1),publishable_flag) AS 'publishable_flag'
	,try_convert(varchar(1),participated) AS 'participated'
	,try_convert(varchar(1),nibrs_participated) AS 'nibrs_participated'
	,'' AS 'source'
	,GETDATE() AS 'record_created_date'
FROM [YourStagingDatabaseName].[dbo].[YourStagingTableName]

5. DDRC Data Dictionary

The FBI’s UCR program provides a data dictionary and data diagram for NIBRS on the CDE's Documents & Downloads page. To assist users, SR-DDRC has uploaded a copy of the NIBRS Data Dictionary PDF downloaded from the CDE. However, users are encouraged to review the most current and detailed documentation directly on the CDE's Documents & Downloads page, as the provided copy reflects information as of the time it was downloaded by the DDRC.

If you have additional questions about this dataset, please contact us at info@srddrc.com