Skip to main content

Data Documentation for CDC NCHS Vital Statistics Mortality Data

This document outlines the process for acquiring United States (US) mortality data from the Centers for Disease Control and Prevention's (CDC) Vital Statistics Online Data Portal’s “Mortality Multiple Cause File” dataset. It details the approach of the Southern Regional Drug Data Research Centers (SR-DDRC, or just DDRC) to extracting, transforming, and loading this data into Microsoft SQL Server, as well as the creation of the dataset’s data dictionary. This guide is intended for data analysts and researchers interested in replicating the SR-DDRC’s process or utilizing the data acquired by the SR-DDRC.

1. Overview of the Vital Statistics Data and Online Data Portal

The Vital Statistics Online Data Portal provides data from the National Vital Statistics System (NVSS), managed by the National Center for Health Statistics (NCHS) within the Centers for Disease Control and Prevention (CDC). The Vital Statistics data portal provides annual national birth and death information, including details on age, gender, race, and causes of death, to support informed public health decisions and improve life expectancy for the nation and U.S. territories.

1.1 Vital Statistics Data Available on the Online Data Portal

Users can download the following datasets, along with their corresponding data dictionaries. Datasets are available for the United States, as well as its territories

  • Birth Data Files
  • Period/Cohort Linked Birth-Infant Death Data Files
  • Period Linked Birth-Infant Death Data Files
  • Birth Cohort Linked Birth – Infant Death Data Files
  • Matched Multiple Birth and Fetal Death Data Files
  • Mortality Multiple Cause Files
  • Fetal Death Data Files

2. Vital Statistics Mortality Multiple Cause Files

The Vital Statistics Online Data Portal provides mortality data annually for the United States and its territories. The data features variables such as date of death, age, gender, race, ethnicity, education, and causes of death.

2.1 Data Use Guidelines and Confidentiality Standards

Data from the Vital Statistics Online Data Portal is subject to the CDC and NCHS data use restrictions as outlined in their Data User Agreement. Restrictions include that the dataset is to be used exclusively for statistical reporting and analysis. Users are prohibited from linking it to other datasets to identify individuals or establishments included in the dataset. Any inadvertent identification must be reported immediately to the SR-DDRC and promptly reported to the CDC.

To protect confidentiality, NCHS removes direct identifiers and any characteristics that could lead to identification, including geographic location.

We comply with these terms and expect that any third parties accessing or using data from the SR-DDRC will do the same.

2.2 Understanding Mortality Causes and ICD-10

Mortality causes are defined using the International Classification of Diseases 10th Revision (ICD-10). For more information on the ICD-10 CM system, please refer to the CDC’s ICD-10 CM page.

3. Vital Statistics Mortality Data Acquisition

This section describes the Vital Statistics Online Data Portal and the DDRC's process for acquiring the data.

3.1 Vital Statistics Online Data Portal

Data files are published on the portal as zipped files for each year, with separate files available for the US and its territories. “Public Use Data File Documentation,” which includes a data dictionary, is published for each year of data as PDFs by CDC NCHS.

Mortality data for the US, spanning the years 2010-2022, was downloaded to a terminal server, where it underwent the extract-transform-load (ETL) process.

4. ETL Process

The Extract, Transform, Load (ETL) process is essential for preparing and integrating data from Vital Statistics Online Data Portal into a relational database. This section outlines the steps used by the SR-DDRC. The following instructions assume a basic proficiency with the syntax and structure of the programming language(s) used.

4.1 Vital Statistics Mortality Multiple Cause Files

Each raw data file downloaded from the Vital Statistics Data Portal has a separate accompanying supplemental file that serves as a data dictionary. The data files are formatted as fixed-width files, with the widths for each dataset defined in the corresponding “Public Use Data File Documentation” file published by CDC NCHS.

4.2 Data Transformation and Integration

This section outlines the systematic approach employed to transform and integrate raw data files into a structured format suitable for analysis within the SR-DDRC data repository. It details the steps taken to ensure data quality, including the removal of unnecessary elements, the standardization of data formats, and the loading of processed files into staging tables. The procedures described herein facilitate the accurate and efficient transfer of data from staging to production tables, thereby supporting subsequent analytical efforts.

4.2.1 File Inspection and Initial Modifications

The data transformation and integration process begins by inspecting the CDC Vital Statistics mortality data .zip files obtained from the data portal. First, the .zip files are extracted using 7-Zip or a similar file extraction utility. For reference, the 2021 zip file is approximately 156 MB in size (see Image 1).

Image 1: Screenshot displaying the size of the compressed file for the CDC Vital Statistics Mortality data (2021) for user evaluation.

Upon extraction, the uncompressed text file for 2021 is roughly 2.64 GB (see Image 2).

Image 2: Screenshot displaying the size of the uncompressed text file for the CDC Vital Statistics Mortality data (2021) for user evaluation.

When opening this file in Notepad++, a file size warning appears due to the large file volume (see Image 3).

Image 3: Screenshot showing the file size warning that appears in Notepad++ due to the large volume of the uncompressed CDC Vital Statistics Mortality data (2021) text file.

With the “Show All Characters” option enabled in Notepad++, the file structure reveals a fixed-width format (see Image 4), rather than a comma-separated or tab-delimited format. Each field begins and ends at specific positions, as defined in the CDC Vital Statistics data dictionary for each data year, with the end-of-row character positioned at character 818.

Image 4: Screenshot showing the file structure of the CDC Vital Statistics Mortality data (2021) in Notepad++ with the “Show All Characters” option enabled, revealing a fixed-width format.

Following inspection, the fixed-width file is loaded into a staging table within the database. While various methods are available for this process, a Python program is used here to streamline data handling. First, staging tables are created in the database; due to the size of the files, one staging table per year was created. Given that maximum field lengths are predefined, this information is incorporated into the table schema rather than defaulting to a generic length of 1000 characters, as commonly applied in other staging tables. Note: The following codes examples feature the 2021 Vital Statistics Mortality file structure.

CREATE TABLE [YourStagingDatabaseName].[dbo].[YourStagingTableName](  
[RecordType] [nvarchar](1) NULL,  
[ResidentStatus] [nvarchar](1) NULL,  
[Education1989] [nvarchar](2) NULL,  
[Education2003] [nvarchar](1) NULL,  
[EducationReportingFlag] [nvarchar](11) NULL,  
[MonthOfDeath] [nvarchar](2) NULL,  
[Sex] [nvarchar](11) NULL,  
[DetailAge] [nvarchar](4) NULL,  
[AgeSubstitutionFlag] [nvarchar](1) NULL,  
[AgeRecode52] [nvarchar](2) NULL,  
[AgeRecode27] [nvarchar](2) NULL,  
[AgeRecode12] [nvarchar](2) NULL,  
[InfantAgeRecode22] [nvarchar](2) NULL,  
[PlaceofDeathandDecendentsStatus] [nvarchar](1) NULL,  
[MaritalStatus] [nvarchar](1) NULL,  
[DayofWeekofDeath] [nvarchar](1) NULL,  
[CurrentDataYear] [nvarchar](4) NULL,  
[InjuryatWork] [nvarchar](1) NULL,  
[MannerofDeath] [nvarchar](1) NULL,  
[MethodofDisposition] [nvarchar](1) NULL,  
[Autopsy] [nvarchar](1) NULL,  
[ActivityCode] [nvarchar](1) NULL,  
[PlaceofInjuryforCauses] [nvarchar](1) NULL,  
[ICDCode] [nvarchar](4) NULL,  
[358CauseRecode] [nvarchar](3) NULL,  
[113CauseRecode] [nvarchar](3) NULL,  
[130InfantCauseRecode] [nvarchar](3) NULL,  
[39CauseRecode] [nvarchar](2) NULL,  
[NumberofEntityAxisConditions] [nvarchar](2) NULL,  
[ConditionPart01] [nvarchar](1) NULL,  
[ConditionSequence01] [nvarchar](1) NULL,  
[ConditionCode01] [nvarchar](5) NULL,  
[ConditionPart02] [nvarchar](1) NULL,  
[ConditionSequence02] [nvarchar](1) NULL,  
[ConditionCode02] [nvarchar](5) NULL,  
[ConditionPart03] [nvarchar](1) NULL,  
[ConditionSequence03] [nvarchar](1) NULL,  
[ConditionCode03] [nvarchar](5) NULL,  
[ConditionPart04] [nvarchar](1) NULL,  
[ConditionSequence04] [nvarchar](1) NULL,  
[ConditionCode04] [nvarchar](5) NULL,  
[ConditionPart05] [nvarchar](1) NULL,  
[ConditionSequence05] [nvarchar](1) NULL,  
[ConditionCode05] [nvarchar](5) NULL,  
[ConditionPart06] [nvarchar](1) NULL,  
[ConditionSequence06] [nvarchar](1) NULL,  
[ConditionCode06] [nvarchar](5) NULL,  
[ConditionPart07] [nvarchar](1) NULL,  
[ConditionSequence07] [nvarchar](1) NULL,  
[ConditionCode07] [nvarchar](5) NULL,  
[ConditionPart08] [nvarchar](1) NULL,  
[ConditionSequence08] [nvarchar](1) NULL,  
[ConditionCode08] [nvarchar](5) NULL,  
[ConditionPart09] [nvarchar](1) NULL,  
[ConditionSequence09] [nvarchar](1) NULL,  
[ConditionCode09] [nvarchar](5) NULL,  
[ConditionPart10] [nvarchar](1) NULL,  
[ConditionSequence10] [nvarchar](1) NULL,  
[ConditionCode10] [nvarchar](5) NULL,  
[ConditionPart11] [nvarchar](1) NULL,  
[ConditionSequence11] [nvarchar](1) NULL,  
[ConditionCode11] [nvarchar](5) NULL,  
[ConditionPart12] [nvarchar](1) NULL,  
[ConditionSequence12] [nvarchar](1) NULL,  
[ConditionCode12] [nvarchar](5) NULL,  
[ConditionPart13] [nvarchar](1) NULL,  
[ConditionSequence13] [nvarchar](1) NULL, 
[ConditionCode13] [nvarchar](5) NULL,  
[ConditionPart14] [nvarchar](1) NULL,  
[ConditionSequence14] [nvarchar](1) NULL,  
[ConditionCode14] [nvarchar](5) NULL,  
[ConditionPart15] [nvarchar](1) NULL,  
[ConditionSequence15] [nvarchar](1) NULL,  
[ConditionCode15] [nvarchar](5) NULL,  
[ConditionPart16] [nvarchar](1) NULL,  
[ConditionSequence16] [nvarchar](1) NULL,  
[ConditionCode16] [nvarchar](5) NULL,  
[ConditionPart17] [nvarchar](1) NULL,  
[ConditionSequence17] [nvarchar](1) NULL,  
[ConditionCode17] [nvarchar](5) NULL,  
[ConditionPart18] [nvarchar](1) NULL,  
[ConditionSequence18] [nvarchar](1) NULL,  
[ConditionCode18] [nvarchar](5) NULL,  
[ConditionPart19] [nvarchar](1) NULL,  
[ConditionSequence19] [nvarchar](1) NULL,  
[ConditionCode19] [nvarchar](5) NULL,  
[ConditionPart20] [nvarchar](1) NULL,  
[ConditionSequence20] [nvarchar](1) NULL,  
[ConditionCode20] [nvarchar](5) NULL,  
[NumberofRecordAxisConditions] [nvarchar](2) NULL,  
[RecordAxisCondition01] [nvarchar](5) NULL,  
[RecordAxisCondition02] [nvarchar](5) NULL,  
[RecordAxisCondition03] [nvarchar](5) NULL,  
[RecordAxisCondition04] [nvarchar](5) NULL,  
[RecordAxisCondition05] [nvarchar](5) NULL,  
[RecordAxisCondition06] [nvarchar](5) NULL,  
[RecordAxisCondition07] [nvarchar](5) NULL,  
[RecordAxisCondition08] [nvarchar](5) NULL,  
[RecordAxisCondition09] [nvarchar](5) NULL,  
[RecordAxisCondition10] [nvarchar](5) NULL,  
[RecordAxisCondition11] [nvarchar](5) NULL,  
[RecordAxisCondition12] [nvarchar](5) NULL,  
[RecordAxisCondition13] [nvarchar](5) NULL,  
[RecordAxisCondition14] [nvarchar](5) NULL,  
[RecordAxisCondition15] [nvarchar](5) NULL,  
[RecordAxisCondition16] [nvarchar](5) NULL,  
[RecordAxisCondition17] [nvarchar](5) NULL,  
[RecordAxisCondition18] [nvarchar](5) NULL,  
[RecordAxisCondition19] [nvarchar](5) NULL,  
[RecordAxisCondition20] [nvarchar](5) NULL,  
[Race] [nvarchar](2) NULL,  
[BridgedRaceFlag] [nvarchar](1) NULL,  
[RaceImputationFlag] [nvarchar](1) NULL,  
[RaceRecode3] [nvarchar](1) NULL,  
[RaceRecode5] [nvarchar](1) NULL,  
[HispanicOrigin] [nvarchar](3) NULL,  
[HispanicOriginRaceRecode] [nvarchar](3) NULL,  
[RaceRecode40] [nvarchar](2) NULL,  
[OccupationCode] [nvarchar](4) NULL,  
[OccupationRecode] [nvarchar](2) NULL,  
[IndustryCode] [nvarchar](4) NULL,  
[IndustryRecode] [nvarchar](2) NULL  )

The fixed-width files are then loaded into the previously created staging table using a Python program. This program leverages the pandas library to prepare the data for import and the sqlalchemy library to establish a connection to the database, facilitating seamless data handling and loading.

import pandas as pd  
import sqlalchemy

To prepare the data, a list of field names is defined in Python. Each of these field names corresponds to a column in the staging table.

#Lists for parsing fixed width files col_names = ["_1", "RecordType", "ResidentStatus", "_2", "Education1989", "Education2003", "EducationReportingFlag", "MonthOfDeath", "_3", "Sex", "DetailAge","AgeSubstitutionFlag", "AgeRecode52", "AgeRecode27", "AgeRecode12", "InfantAgeRecode22","PlaceofDeathandDecendentsStatus", "MaritalStatus", "DayofWeekofDeath","_4", "CurrentDataYear", "InjuryatWork", "MannerofDeath","MethodofDisposition", "Autopsy", "_5", "ActivityCode","PlaceofInjuryforCauses", "ICDCode", "358CauseRecode", "_6", "113CauseRecode","130InfantCauseRecode", "39CauseRecode", "_7", "NumberofEntityAxisConditions","ConditionPart01", "ConditionSequence01", "ConditionCode01","ConditionPart02", "ConditionSequence02", "ConditionCode02","ConditionPart03", "ConditionSequence03", "ConditionCode03","ConditionPart04", "ConditionSequence04", "ConditionCode04","ConditionPart05", "ConditionSequence05", "ConditionCode05","ConditionPart06", "ConditionSequence06", "ConditionCode06","ConditionPart07", "ConditionSequence07", "ConditionCode07","ConditionPart08", "ConditionSequence08", "ConditionCode08","ConditionPart09", "ConditionSequence09", "ConditionCode09","ConditionPart10", "ConditionSequence10", "ConditionCode10","ConditionPart11", "ConditionSequence11", "ConditionCode11","ConditionPart12", "ConditionSequence12", "ConditionCode12","ConditionPart13", "ConditionSequence13", "ConditionCode13","ConditionPart14", "ConditionSequence14", "ConditionCode14","ConditionPart15", "ConditionSequence15", "ConditionCode15","ConditionPart16", "ConditionSequence16", "ConditionCode16","ConditionPart17", "ConditionSequence17", "ConditionCode17","ConditionPart18", "ConditionSequence18", "ConditionCode18","ConditionPart19", "ConditionSequence19", "ConditionCode19","ConditionPart20", "ConditionSequence20", "ConditionCode20","_8", "NumberofRecordAxisConditions", "_9","RecordAxisCondition01", "RecordAxisCondition02", "RecordAxisCondition03", "RecordAxisCondition04","RecordAxisCondition05", "RecordAxisCondition06", "RecordAxisCondition07", "RecordAxisCondition08", "RecordAxisCondition09", "RecordAxisCondition10", "RecordAxisCondition11", "RecordAxisCondition12","RecordAxisCondition13", "RecordAxisCondition14", "RecordAxisCondition15", "RecordAxisCondition16","RecordAxisCondition17", "RecordAxisCondition18", "RecordAxisCondition19", "RecordAxisCondition20","_10", "Race", "BridgedRaceFlag", "RaceImputationFlag", "RaceRecode3", "RaceRecode5","_11", "HispanicOrigin", "_12", "HispanicOriginRaceRecode", "RaceRecode40","_13", "OccupationCode", "OccupationRecode", "IndustryCode", "IndustryRecode" ]

Field widths for each column in the fixed-width file are then defined in Python:

col_widths = [18, 1, 1, 40, 2, 1, 1, 2, 2, 1, 4, 1, 2, 2, 2, 2, 1, 1, 1, 16, 4, 1, 1, 1, 1, 34, 1, 1, 4, 3, 1, 3, 3, 2, 1, 2, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5,1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 1, 1, 5, 36, 2, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 2, 1, 1, 1, 1, 33, 3, 1, 1, 2, 315, 4, 2, 4, 2 ]

A database connection is subsequently established to enable the data import:

#Database connection variables  server = <Server name in single quotes>
database = <Staging Database Name in single quotes>
username = <User name in single quotes>
password = <password in single quotes>
port = '1433' 
connection_string = 'mssql+pyodbc://'+username+':'+password+'@@'+server+'
	:'+port+'/'+database+'?driver=ODBC+Driver+17+for+SQL+Server'  
#print(connection_string)  
engine = sqlalchemy.create_engine(connection_string)

The pandas’ read_fwf function is utilized to read the fixed-width file into a pandas DataFrame. The col_widths and col_names lists are passed to read_fwf, and any columns beginning with an underscore are excluded.

df=pd.read_fwf("File Path\File Name", widths=col_widths, names=col_names)  
df=df.head(1100)  
df=df.drop(['_1', '_2' , '_3' , '_4' , '_5' , '_6' , '_7' 
	, '_8' , '_9' , '_10' , '_11' , '_12' , '_13' ], axis=1)  
#print(df.head)  
table_name='StagingCDCVitalStatisticsMCOD'

Finally, data from the DataFrame is written to the staging table using the to_sql function.

df.to_sql(name=table_name, con=engine , if_exists='append', index=False)  
print("Program Complete")

With data loaded into the staging table, the next step involves creating a production table to house the finalized dataset and copy the data from the staging table to it. While one staging table is used per data year, only a single production table is created, with the "Source" column distinguishing records by year. Note that the CREATE statement for the production table mirrors that of the staging table, except for addition of Source and RecordCreatedDate fields.

CREATE TABLE [] [dbo].[YourProductionTableName](
[RecordType] [nvarchar](1) NULL,
[ResidentStatus] [nvarchar](1) NULL,
[Education1989] [nvarchar](2) NULL,
[Education2003] [nvarchar](1) NULL,
[EducationReportingFlag] [nvarchar](11) NULL,
[MonthOfDeath] [nvarchar](2) NULL,
[Sex] [nvarchar](11) NULL,
[DetailAge] [nvarchar](4) NULL,
[AgeSubstitutionFlag] [nvarchar](1) NULL,
[AgeRecode52] [nvarchar](2) NULL,
[AgeRecode27] [nvarchar](2) NULL,
[AgeRecode12] [nvarchar](2) NULL,
[InfantAgeRecode22] [nvarchar](2) NULL,
[PlaceofDeathandDecendentsStatus] [nvarchar](1) NULL,
[MaritalStatus] [nvarchar](1) NULL,
[DayofWeekofDeath] [nvarchar](1) NULL,
[CurrentDataYear] [nvarchar](4) NULL,
[InjuryatWork] [nvarchar](1) NULL,
[MannerofDeath] [nvarchar](1) NULL,
[MethodofDisposition] [nvarchar](1) NULL,
[Autopsy] [nvarchar](1) NULL,
[ActivityCode] [nvarchar](1) NULL,
[PlaceofInjuryforCauses] [nvarchar](1) NULL,
[ICDCode] [nvarchar](4) NULL,
[358CauseRecode] [nvarchar](3) NULL,
[113CauseRecode] [nvarchar](3) NULL,
[130InfantCauseRecode] [nvarchar](3) NULL,
[39CauseRecode] [nvarchar](2) NULL,
[NumberofEntityAxisConditions] [nvarchar](2) NULL,
[ConditionPart01] [nvarchar](1) NULL,
[ConditionSequence01] [nvarchar](1) NULL,
[ConditionCode01] [nvarchar](5) NULL,
[ConditionPart02] [nvarchar](1) NULL,
[ConditionSequence02] [nvarchar](1) NULL,
[ConditionCode02] [nvarchar](5) NULL,
[ConditionPart03] [nvarchar](1) NULL,
[ConditionSequence03] [nvarchar](1) NULL,
[ConditionCode03] [nvarchar](5) NULL,
[ConditionPart04] [nvarchar](1) NULL,
[ConditionSequence04] [nvarchar](1) NULL,
[ConditionCode04] [nvarchar](5) NULL,
[ConditionPart05] [nvarchar](1) NULL,
[ConditionSequence05] [nvarchar](1) NULL,
[ConditionCode05] [nvarchar](5) NULL,
[ConditionPart06] [nvarchar](1) NULL,
[ConditionSequence06] [nvarchar](1) NULL,
[ConditionCode06] [nvarchar](5) NULL,
[ConditionPart07] [nvarchar](1) NULL,
[ConditionSequence07] [nvarchar](1) NULL,
[ConditionCode07] [nvarchar](5) NULL,
[ConditionPart08] [nvarchar](1) NULL,
[ConditionSequence08] [nvarchar](1) NULL,
[ConditionCode08] [nvarchar](5) NULL,
[ConditionPart09] [nvarchar](1) NULL,
[ConditionSequence09] [nvarchar](1) NULL,
[ConditionCode09] [nvarchar](5) NULL,
[ConditionPart10] [nvarchar](1) NULL,
[ConditionSequence10] [nvarchar](1) NULL,
[ConditionCode10] [nvarchar](5) NULL,
[ConditionPart11] [nvarchar](1) NULL,
[ConditionSequence11] [nvarchar](1) NULL,
[ConditionCode11] [nvarchar](5) NULL,
[ConditionPart12] [nvarchar](1) NULL,
[ConditionSequence12] [nvarchar](1) NULL,
[ConditionCode12] [nvarchar](5) NULL,
[ConditionPart13] [nvarchar](1) NULL,
[ConditionSequence13] [nvarchar](1) NULL,
[ConditionCode13] [nvarchar](5) NULL,
[ConditionPart14] [nvarchar](1) NULL,
[ConditionSequence14] [nvarchar](1) NULL,
[ConditionCode14] [nvarchar](5) NULL,
[ConditionPart15] [nvarchar](1) NULL,
[ConditionSequence15] [nvarchar](1) NULL,
[ConditionCode15] [nvarchar](5) NULL,
[ConditionPart16] [nvarchar](1) NULL,
[ConditionSequence16] [nvarchar](1) NULL,
[ConditionCode16] [nvarchar](5) NULL,
[ConditionPart17] [nvarchar](1) NULL,
[ConditionSequence17] [nvarchar](1) NULL,
[ConditionCode17] [nvarchar](5) NULL,
[ConditionPart18] [nvarchar](1) NULL,
[ConditionSequence18] [nvarchar](1) NULL,
[ConditionCode18] [nvarchar](5) NULL,
[ConditionPart19] [nvarchar](1) NULL,
[ConditionSequence19] [nvarchar](1) NULL,
[ConditionCode19] [nvarchar](5) NULL,
[ConditionPart20] [nvarchar](1) NULL,
[ConditionSequence20] [nvarchar](1) NULL,
[ConditionCode20] [nvarchar](5) NULL,
[NumberofRecordAxisConditions] [nvarchar](2) NULL,
[RecordAxisCondition01] [nvarchar](5) NULL,
[RecordAxisCondition02] [nvarchar](5) NULL,
[RecordAxisCondition03] [nvarchar](5) NULL,
[RecordAxisCondition04] [nvarchar](5) NULL,
[RecordAxisCondition05] [nvarchar](5) NULL,
[RecordAxisCondition06] [nvarchar](5) NULL,
[RecordAxisCondition07] [nvarchar](5) NULL,
[RecordAxisCondition08] [nvarchar](5) NULL,
[RecordAxisCondition09] [nvarchar](5) NULL,
[RecordAxisCondition10] [nvarchar](5) NULL,
[RecordAxisCondition11] [nvarchar](5) NULL,
[RecordAxisCondition12] [nvarchar](5) NULL,
[RecordAxisCondition13] [nvarchar](5) NULL,
[RecordAxisCondition14] [nvarchar](5) NULL,
[RecordAxisCondition15] [nvarchar](5) NULL,
[RecordAxisCondition16] [nvarchar](5) NULL,
[RecordAxisCondition17] [nvarchar](5) NULL,
[RecordAxisCondition18] [nvarchar](5) NULL,
[RecordAxisCondition19] [nvarchar](5) NULL,
[RecordAxisCondition20] [nvarchar](5) NULL,
[Race] [nvarchar](2) NULL,
[BridgedRaceFlag] [nvarchar](1) NULL,
[RaceImputationFlag] [nvarchar](1) NULL,
[RaceRecode3] [nvarchar](1) NULL,
[RaceRecode5] [nvarchar](1) NULL,
[HispanicOrigin] [nvarchar](3) NULL,
[HispanicOriginRaceRecode] [nvarchar](3) NULL,
[RaceRecode40] [nvarchar](2) NULL,
[OccupationCode] [nvarchar](4) NULL,
[OccupationRecode] [nvarchar](2) NULL,
[IndustryCode] [nvarchar](4) NULL,
[IndustryRecode] [nvarchar](2) NULL,
[Source] [nvarchar](100) NULL,
[RecordCreatedDate] [datetime] NULL
)	

Data is then transferred from the staging table to the production table using the following statement:

INSERT INTO [YourProductionDatabaseName].[dbo].[YourProductionTableName]  
SELECT TRIM(FIELDNAME)…..<list all field names and TRIM to remove any spaces>
FROM [YourStagingDatabaseName].[dbo].[YourStagingTableName]

Upon successful completion of this process, a final consolidated table of CDC Vital Statistics data is established within the relational database, ready for use with data visualization tools or other software applications.

5. DDRC Data Dictionary

The SR-DDRC used information from the Vital Statistics Online Data Portal’s “Public Use Data File Documentation” to create a comprehensive data dictionary for the downloaded and combined files. These “Public Use Data File Documentation” are available for each data year, and include variable definitions and codes values, as well as explain the location of each variable in the fixed width file.

To protect confidentiality, NCHS removes direct identifiers and any characteristics that could lead to identification, including geographic location. Thus, variables listed in “Public Use Data File Documentation” published by CDC NCHS may not be included in the dataset published by CDC NCHS on Vital Statistics Online Data Portal.

For that reason, the SR-DDRC’s data dictionary for the Vital Statistics Mortality Multiple Cause data only features variables that exist in the published dataset. Additionally, due to variations in the datasets, the SR-DDRC’s data dictionary for the Vital Statistics Mortality Multiple Cause files, includes separate sheets for 2010–2017, 2018-2019, 2020-2021, and 2022 to clarify differences in available variables, definitions, and possible values.

We recommend that users still review and reference documentation provided by the CDC NCHS for the Multiple Cause-of-Death Mortality data published on the Vital Statistics Online Data Portal datasets for the most detailed and up-to-date information.

Please refer to the data dictionary for variable definitions and possible values.

If you have additional questions about this dataset, please contact us at info@srddrc.com