Accidental drug deaths have become a growing public health concern in recent years. Effective prevention and control measures require comprehensive data analysis. This case study analyzes data on accidental drug deaths in Connecticut from 2012 to 2021, using the R programming language and its libraries, including dplyr and ggplot2, to extract insights through data manipulation techniques.
The study aims to tidy the raw data obtained from open data sources and to manipulate it so that it is fit for analysis. The insights from the study aim to help healthcare professionals identify trends and patterns in drug deaths across demographics, particularly the heightened vulnerability of certain genders and age groups and the specific drugs most commonly involved. The insights are also useful for identifying geographic hot spots for drug deaths in Connecticut, which emphasizes the importance of targeted interventions to prevent accidental drug deaths. Through this study, we focus on the organization and preparation of data on accidental drug-related deaths in the state of Connecticut to demonstrate the importance of data transformations in data analysis and to illustrate how they can lead to more meaningful insights and conclusions.
Keywords: Accidental drug deaths, Prevention, Data analysis, Data manipulation
Drug overdose deaths have been a growing problem in the United States in recent years, and accurate data analysis is essential for developing effective prevention and control measures. This literature review provides an overview of the significance of data manipulation methods in deriving meaningful insights from intricate and large data sets, with a particular focus on accidental drug deaths.
Data manipulation and tidying are essential parts of the data analysis process, as they help transform raw data into a format that is more easily understandable and useful for decision-making. This process involves tasks such as cleaning, transforming, and restructuring data to identify trends and patterns that can provide valuable insights into drug overdose deaths. Several studies have examined drug deaths data in Connecticut and the insights that can be derived from manipulating and tidying this data. One study analyzed data from the Centers for Disease Control and Prevention (CDC) on opioid overdose deaths between 2000 and 2015. This study aimed to investigate the racial and ethnic differences in opioid overdose deaths in the United States. The researchers found that while all racial and ethnic groups experienced an increase in opioid overdose deaths during this period, the rate of increase was highest among non-Hispanic whites. The study also found that the rate of opioid overdose deaths was significantly higher among non-Hispanic whites and American Indian/Alaska Natives compared to other racial and ethnic groups. The findings suggest that targeted interventions are needed to address the racial and ethnic disparities in opioid overdose deaths in the United States.
The insights derived from data manipulation and tidying in these studies have important policy implications. Policymakers and public health authorities can use this information to develop targeted prevention and control measures that address the specific needs of high-risk groups. These measures can include initiatives such as public education campaigns, improved access to addiction treatment services, and stricter regulations on prescription drug use.
The data set, Accidental Drug Related Deaths 2012-2021, was obtained from the Connecticut Open Data repository (Connecticut Data, 2021). Drug overdose is one of the leading causes of injury-related deaths in the U.S. (Hedegaard, Minino, & Warner, 2020). According to the CDC (2021), an estimated 100,000 people died of drug overdoses in the 12 months ending April 2021, an increase of 28.5% from the previous year.
The data set was collected by the Office of the Chief Medical Examiner in Connecticut through an investigation process that includes a toxicity report, a death certificate, and a scene investigation. It contains 48 columns for 9202 individuals, covering demographic information about those who died (such as age, race, and gender), the location of the death, and the substances detected in the overdose.
library(tidyverse)  # provides read_csv(), the %>% pipe, and pivot_wider()
# Read data set and check number of rows and columns
data <- read_csv('Accidental_Drug_Related_Deaths_2012-2021.csv', show_col_types = FALSE)
dim(data)
The data set contains the following columns that describe the demographics of the patients such as their age, race, ethnicity, location and the kind of drug toxicity:
# column names
names(data)
## [1] "Date" "Date Type"
## [3] "Age" "Sex"
## [5] "Race" "Ethnicity"
## [7] "Residence City" "Residence County"
## [9] "Residence State" "Injury City"
## [11] "Injury County" "Injury State"
## [13] "Injury Place" "Description of Injury"
## [15] "Death City" "Death County"
## [17] "Death State" "Location"
## [19] "Location if Other" "Cause of Death"
## [21] "Manner of Death" "Other Significant Conditions"
## [23] "Heroin" "Heroin death certificate (DC)"
## [25] "Cocaine" "Fentanyl"
## [27] "Fentanyl Analogue" "Oxycodone"
## [29] "Oxymorphone" "Ethanol"
## [31] "Hydrocodone" "Benzodiazepine"
## [33] "Methadone" "Meth/Amphetamine"
## [35] "Amphet" "Tramad"
## [37] "Hydromorphone" "Morphine (Not Heroin)"
## [39] "Xylazine" "Gabapentin"
## [41] "Opiate NOS" "Heroin/Morph/Codeine"
## [43] "Other Opioid" "Any Opioid"
## [45] "Other" "ResidenceCityGeo"
## [47] "InjuryCityGeo" "DeathCityGeo"
Here are the first six rows of the data:
# sample of the data set
head(data)
Table 1.2 : Sample of the data
Accidental drug deaths data is essential for health professionals, particularly those working in public health, epidemiology, substance abuse treatment, and healthcare delivery. This data can provide valuable insights into drug overdose trends, help identify patterns and high-risk populations, and inform public health policies and interventions. The power of data in healthcare cannot be overstated, especially when it comes to understanding the extent of drug abuse and overdose within a specific population. With the Accidental Drug Related Deaths data, healthcare professionals can gain insight into drug-related death trends and patterns and use it to develop targeted prevention and intervention strategies.
By using data transformation and visualization methods, healthcare professionals can more effectively identify high-risk groups and develop preventative measures such as education programs and harm reduction strategies. The visualizations can help them analyze the data and understand the types of drugs involved, the demographics of the victims, and the geographical distribution of incidents. This information can be leveraged to monitor the prevalence and incidence of drug overdoses in specific populations or geographical areas and inform public health policy and intervention strategies.
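As a minimal sketch of the kind of summary described above, the following uses dplyr to count deaths by sex and age group. The tibble `toy`, its values, and the age bands are hypothetical and chosen only for illustration; they are not taken from the Connecticut data set, and the grouping is not necessarily the one used in this study.

```r
library(dplyr)

# Hypothetical toy records; values are illustrative only, not from the data set
toy <- tibble(
  Age = c(23, 37, 41, 58, 62, 35),
  Sex = c("Male", "Female", "Male", "Male", "Female", "Male")
)

# Assumed age bands, chosen for illustration
summary_tbl <- toy %>%
  mutate(AgeGroup = cut(Age,
                        breaks = c(0, 29, 49, Inf),
                        labels = c("Under 30", "30-49", "50 and over"))) %>%
  count(Sex, AgeGroup, name = "Deaths") %>%  # one row per observed Sex/AgeGroup pair
  arrange(desc(Deaths))                      # largest groups first

print(summary_tbl)
```

Applied to the real data, the same pattern could be pointed at the `Age` and `Sex` columns listed earlier, and the resulting table passed to ggplot2 for plotting.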
The data set obtained is raw, meaning that it may have inconsistent formatting and poorly arranged columns. We now illustrate a series of commands that make this data set “tidy”. A tidy data set supports robust analysis and yields insights that are less prone to bias and error.
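Before tidying, it can help to look for such issues directly. The sketch below is one possible check, not the procedure used in this study: it counts the categories stored in a `Date Type` column and the missing values in each column. The tibble `raw_toy` and its values are hypothetical stand-ins for the raw data.

```r
library(dplyr)

# Hypothetical toy records mimicking the raw layout; values are illustrative only
raw_toy <- tibble(
  Date = c("01/05/2015", "03/12/2016", "07/30/2018", NA),
  `Date Type` = c("Date of death", "Date reported", "Date of death", "Date reported"),
  Age = c(34, NA, 51, 45)
)

# How many rows fall under each `Date Type` category?
type_counts <- raw_toy %>% count(`Date Type`)

# How many missing values does each column contain?
na_counts <- raw_toy %>% summarise(across(everything(), ~ sum(is.na(.x))))

print(type_counts)
print(na_counts)
```

A column such as `Date Type` that holds labels for another column is a common sign that the data needs reshaping.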
The Date column contains values for both Date of death and Date reported, which makes the data set unclear and untidy. Hence, the Date and Date Type columns need to be replaced by separate Date of death and Date reported columns using a pivoting technique. The pivot_wider() method is used to add the two columns, Date of death and Date reported, to a new data frame, new_data_reorder.
# Changing date of death and date reported columns using pivoting
new_data_reorder <- data %>%
  pivot_wider(
    names_from = `Date Type`,
    values_from = c(Date)
  )
# Preview the two new date columns
head(new_data_reorder[, c('Date of death','Date reported')])
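To make the effect of the pivot easier to see, here is a small self-contained sketch of pivot_wider() on toy data. The tibble `toy_dates`, its `id` column, and all values are hypothetical and are not drawn from the Connecticut data set.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy records; the id column and all values are illustrative only
toy_dates <- tibble(
  id = c(1, 2, 3),
  Date = c("01/05/2015", "03/12/2016", "07/30/2018"),
  `Date Type` = c("Date of death", "Date reported", "Date of death")
)

# Each distinct value of `Date Type` becomes its own column, filled from `Date`;
# rows with no value for a given type get NA in that column
toy_wide <- toy_dates %>%
  pivot_wider(names_from = `Date Type`, values_from = Date)

print(toy_wide)
```

Because each toy record carries only one kind of date, the other date column is NA for that record, which is the same pattern to expect when a raw row stores either a date of death or a date reported but not both.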