How Coal Mining Employment Has Impacted on Opioid Epidemic in America

Kan Nishida
learn data science
Published in
9 min readAug 21, 2018

--

Last week, I visualized how the number of mothers who are addicted to opioid while pregnancy is increasing in many US states based on the data recently published by CDC.

For example, here is a chart showing the dramatic increases of the addicted mothers in some states such as Maine, Vermont, West Virginia, and Kentucky, compared to other states.

And here is the map showing the same data.

Now, the obvious question is why?

Why does the situation is getting dramatically worsen especially in these states? What are the unique things happening in these states?

One of the readers told me that this had something to do with Coal Mining. He had actually visited several locations interviewing dozens of people in West Virginia and Kentucky and reported this article for a prominent Japanese news media almost a year ago. And I couldn’t finish reading it without a tear. (sorry, it’s available only in Japanese.)

According to the article, the coal mining work, especially the type of mining work that was common in these states, can damage your back very badly. And the doctors, who were incentivized by the pharmaceutical companies, in those areas prescribed opioid drugs like Oxycodone to the body-aching workers, a lot.

And the situation got worse when the coal mining companies shutdown or downsized their operations in these areas due to the increasing regulatory pressures that were created to promote more use of the clean energy like natural gas rather than coal for generating electricity.

What did the miners, who were addicted to the prescribed opioid drugs by this time, do when they found themselves having lost their jobs and the healthcare benefits? They found the underground opioid drugs that were cheaper, Heroin.

The situations there are just horrible. One school had a third of the children were raised by non-biological parents. Why? Because their real parents were too addicted that they were in no situation they could take care of their kids.

But, there are many states that had (or still have) coal mining operations. Are West Virginia and Kentucky so different from other states? Did they get more impacted by the downsizing of the operations compared to other states?

To answer this question, I found this data set from Mine Safety and Health Administration under the United States Department of Labor. It has the coal mining operator employment numbers by state and by year starting from 2009 to 2015.

Trend of Coal Mining Employment

Here is what I have visualized the trend after a few minutes of data wrangling works.

The decline of the coal mining jobs in West Virginia (top green) and Kentucky (purple) are obvious.

Note that the numbers in the data set represent only the operator employment. This means that they don’t include the contractor employment or any other type of jobs that are related to the coal mining operations. So the impact could be more significant.

Decline of Coal Mining Employment

To understand the decline of the employment better, I have visualized how much the employment has declined comparing to the first year in this data set, that is 2009.

The declines for Kentucky (purple) and West Virginia (Green) are outstanding. They are -9,389 and -6,419 respectively.

Impact of the decline of coal mining employment

Well, each state has a different population. So just comparing the absolute numbers of the employment decline might not be giving us a clear picture.

So I weighed the difference by dividing it by the population in each state to see the impact.

The decreased job numbers in 2015 compared to 2009 represent 0.2% (West Virginia) and 0.3% (Kentucky) of their statewide populations. And the impact can be felt much more in West Virginia and Kentucky.

Again, the data includes only the operator employment, not the contractor employment. Also, it is only the coal miners and doesn’t have any coal mining related industries that provide services to the mining companies or the workers. So the employment impact could be much higher than these numbers.

Where are the coal mining unemployment impacted states?

Here is the map that shows the decreased coal mining jobs ratio.

Those two dark red-ish colored states are West Virginia and Kentucky.

Let’s take a look at the opioid addicted mother chart again.

Now I feel like I know the answer on why West Virginia and Kentucky have increased the numbers so much. I can’t stop thinking that the coal mining employment and its decline had something to do with this horrible opioid epidemic in these states.

But I don’t want to simplify this problem too easily. Correlation doesn’t mean Causation, right?

For example, is the coal mining employment really the direct cause of the rise of the opioid epidemic? Maybe something else had impacted on both the coal mining employment and the opioid epidemic. This is called ‘confounding’ factor.

Also, even if the coal mining employment and its decline had something to do with the opioid epidemic, how does that explain the increases of the mothers who are addicted to opioid during pregnancy?

And, this doesn’t explain the most dramatic increase of the addicted mothers in Vermont and Maine which don’t have any noticeable coal mining operations.

At this point, all I know is this.

These two states are impacted by the decline of the coal mining employment much more than other coal mining states. And these two states had the dramatic increases of the mothers who are addicted to opioids. And many families including the children in these two states (not just these states though) have been suffering.

Data Wrangling with R and Exploratory

I want to share how I prepared the data and visualized it. If you are interested in data wrangling techniques with R or Exploratory please read through.

First, I found this PDF data by Google Search. It is published by Mine Safety and Health Administration under United States Department of Labor, so I thought it was reasonable.

But the problem is the data is inside the PDF.

There are a few ways to extract the data from PDF files. But for this one, since the data I wanted to extract was small and inside one single table I just copied the data and pasted it into Excel.

I had to select ‘Match Destination Formatting’ to paste the data so that one state becomes one row.

There are a bunch of the empty rows. But we can fix this inside Exploratory with R as part of the data wrangling steps.

Data Wrangling

Once the data is in Excel format, then we can import the data into Exploratory.

Remove Empty Rows

There are many rows with NAs, which we want to get rid of. We can simply use Filter step to do this job.

This is equivalent to the below in R.

data %>%
filter(!is.na(State))

Now it looks better.

Convert from Wide to Long Data — Gather / Unpivot

Here is another problem. The year data are presented in the columns. This is called wide data, which spreads out horizontally as more data comes in.

We want to make the data tidy or long format, which makes it easier to visualize and analyze the data in general.

We can use Gather step to bring those columns into a pair of key and value columns.

Select all the year columns with Shift key, then select Gather -> Selected Columns.

In the dialog, we can set the newly created columns’ names.

Once you run it, you will get the data with the two new columns created. One is for the year values and another is for the employment numbers.

This is equivalent to the below in R.

data %>%
filter(!is.na(State)) %>%
gather(year, employment, `CY 2009`, `CY 2010`, `CY 2011`, `CY 2012`, `CY 2013`, `CY 2014`, `CY 2015`, na.rm = TRUE, convert = TRUE)

Extract Numeric Data and Convert to Numeric Data Type

In order to visualize the trend of the employment over the years, ‘year’ column needs to be Numeric data type, not Character.

Typically, I would have used a function like ‘str_remove’ from stringr package like the below to get rid of ‘CY’ letters from the data first.

str_remove(year, "CY ")

But, conveniently enough, there is a function called ‘parse_number’ from the reader package that extracts the numeric part of the data and converts to numeric data type.

Select ‘Change Data Type’ -> ‘Convert to Numeric’ from the column header menu of ‘year’ column.

Now we have ‘year’ column as Numeric data type.

This is equivalent to the below in R.

data %>%
filter(!is.na(State)) %>%
gather(year, employment, `CY 2009`, `CY 2010`, `CY 2011`, `CY 2012`, `CY 2013`, `CY 2014`, `CY 2015`, na.rm = TRUE, convert = TRUE)
mutate(year = parse_number(year))

Convert State Code to State Name

We’re almost ready to visualize the employment trend.

But one last thing.

The State column has the state abbreviation, but it’s hard to understand what it is for most people, especially if you are like me who didn’t grow up in the United States. What is AK or AL?

There is a function called ‘statecode’ from the exploratory package that can convert the state codes to the state names.

Select ‘Replace / Fill / Convert Data’ -> ‘Convert US State’ from the column header menu.

Select ‘name’ for the ‘output’ argument.

This is equivalen to the below in R.

data %>%
filter(!is.na(State)) %>%
gather(year, employment, `CY 2009`, `CY 2010`, `CY 2011`, `CY 2012`, `CY 2013`, `CY 2014`, `CY 2015`, na.rm = TRUE, convert = TRUE)
mutate(year = parse_number(year), State_statecode = statecode(State, output_type = "name"))

Phew, now the data is ready, it’s time to visualize!

Visualize it in Exploratory

Under Viz (Chart) view, select Line chart, assign Year to X-Axis, employment to Y-Axis, and the State name column to Color.

It shows the trend of the number of the operator employment of the coal plants in each state.

Use Window Calculation — Difference From

To show the employment changes better, we can use one of the window calculation methods called ’Difference From. This will simply calculate the difference between the values in the first year, in this case, that is 2009, and the values of each year for each state.

Select ‘Window Calculation’

Select ‘Difference From’ under Calculation Type.

This will get you a chart like the below.

--

--

CEO / Founder at Exploratory(https://exploratory.io/). Having fun analyzing interesting data and learning something new everyday.