Understanding What Makes Californians United or Divided Based on Election Data

Kan Nishida
Published in learn data science
9 min read · Feb 24, 2017

In the previous post, I calculated similarities among California counties based on how they voted in the last election, clustered them into 3 groups, and visualized them on a Scatter chart in Exploratory, like below.

The counties in the Orange cluster are mostly located near the Pacific Ocean and are generally considered ‘liberal’ counties. The county pulling this cluster toward the left is San Francisco.

The counties in the Blue cluster are mostly in the Sierra mountains area on the northeast side of California and are generally considered ‘conservative’ counties. Lassen and Modoc counties are the ones pulling this cluster toward the right.

The counties in the Green cluster are mostly in the central and southern parts of California, and the county pulling this cluster toward the top is Imperial.

Now the question is, what makes the counties within the same cluster similar to one another? Or, what makes the clusters different from one another? That’s what I’m going to try to find out today using R and Exploratory.

I’m going to use mostly visualization to explore the data and find the answers, but I will also use the ‘loess (Local Polynomial Regression Fitting)’ algorithm in R to draw trend lines and investigate whether there are correlations between some of the ballot measures, for example, Legalizing Marijuana vs. Tougher Gun Control.

State Ballot Measures Data

Here is what the first 10 rows of the California Ballot Measures election result data look like.

We can see there are 59 counties and 17 ballot measures under the Summary view.

If you are interested, you can check this page on the California Secretary of State website for details on all 17 State Ballot Measures.

Compare Counties by Lining Them up on 17 Measures

To find what makes the counties similar to or different from one another, I’ve mapped the measure names to the X-axis and the ‘Yes Ratio’ (the ratio of voters who voted Yes) to the Y-axis on a Scatter chart like below. I’m trying to show each county name next to its dot.

But this is hard to read, especially in the middle, so I’ve kept the text labels only for the counties that fall below the 5th percentile or above the 95th percentile in each measure by running the following command.

group_by(BALLOT_MEASURE_TITLE) %>%
mutate(extreme_county_name = if_else(between(yes_ratio,quantile(yes_ratio,0.05),quantile(yes_ratio,0.95)),"", COUNTY_NAME))
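For readers who want to try the labeling step outside Exploratory, here is a minimal, self-contained sketch of the same idea. The data frame name `ballots`, the measure titles, the county names, and all the values are made up for illustration; only the column names and the `group_by()` / `mutate()` logic come from the command above.

```r
library(dplyr)

# Toy data: hypothetical yes ratios, NOT the real election results.
ballots <- tibble(
  BALLOT_MEASURE_TITLE = rep(c("Measure A", "Measure B"), each = 21),
  COUNTY_NAME = rep(paste("County", 1:21), times = 2),
  yes_ratio = c(seq(0.30, 0.70, length.out = 21),
                seq(0.80, 0.20, length.out = 21))
)

labeled <- ballots %>%
  group_by(BALLOT_MEASURE_TITLE) %>%   # percentiles are computed per measure
  mutate(
    extreme_county_name = if_else(
      between(yes_ratio, quantile(yes_ratio, 0.05), quantile(yes_ratio, 0.95)),
      "",            # inside the 5th-95th percentile band: no label
      COUNTY_NAME    # outside the band: keep the county name as the label
    )
  ) %>%
  ungroup()

# Show only the rows that kept a label.
print(filter(labeled, extreme_county_name != ""))
```

Because the percentiles are computed within each measure, a county can be labeled on one measure and unlabeled on another.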

Or, in Exploratory it looks like this.

This would give us something like this.

By the way, the color indicates whether the counties supported a given measure (Green) or not (Red).

Looking at this carefully, we notice that San Francisco and Lassen / Modoc counties are almost always at opposite ends of the spectrum on almost every measure.

Can they ever agree on anything? Sure, there is one measure on which they landed on the same side: “Adult Film Condom Requirement”.

Interestingly enough, this is a measure that Imperial county supports almost single-handedly. I guess there are many adult film people living in that county?

Is San Francisco a Monolithic City?

And did you notice something else? If you look through all the measures, San Francisco almost always shows up at the very end of the spectrum, one way or the other (top or bottom). This is ironic, because if there is any county (or city) we think of as a model for diversity, it would be San Francisco (maybe Los Angeles and others, too). But when it comes to voting on the ballot measures, it’s not really diverse at all. There is almost no middle ground for this city, which means one side of the views is far more influential than the other.

What Makes Republican and Democratic Counties Similar or Different?

We can assign the party name (Republican or Democratic) to Color to see whether it has any correlation with how the counties voted on each measure.

As expected, we can see the divide between the Blue (Democratic) counties and the Red (Republican) counties on almost every measure. However, there is no clear divide on some of the measures. We can actually see this better with a Boxplot chart like below.

We can see that parts of the Blue (Democratic) and Red (Republican) boxes fall in the same range on the Y-axis (Yes Ratio) for some measures, like Adult Film Condom Requirement and Marijuana Legalization. But when it comes to Firearms and Ammunition Sales and Cigarette Tax, there is a clear divide along party lines.

Compare Counties by Two Ballot Measures

Now, let’s take an even closer look to find how the counties are similar or different on those dividing measures like Firearms and Ammunition Sales, Cigarette Tax, etc. To do this, we want to transform the data by making each measure a column so that we can assign the measures to the X and Y axes of a Scatter chart separately.

We can select the BALLOT_MEASURE_TITLE and yes_ratio columns while holding the Command key (Mac) or Control key (Windows) and select the Spread operation from the menu.

This will generate an R command like below.

spread(BALLOT_MEASURE_TITLE, yes_ratio)

After running the command, we see 17 new columns, each of which represents one measure and holds the Yes Ratio values for each county.
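To make the reshaping concrete, here is a small sketch of what `spread()` does to long-format data. The data frame `long`, the county names, the measure titles, and the values are all hypothetical; only the two column names passed to `spread()` come from the command above.

```r
library(tidyr)

# Toy long-format data: hypothetical values for illustration only.
long <- data.frame(
  COUNTY_NAME = rep(c("County A", "County B"), each = 2),
  BALLOT_MEASURE_TITLE = rep(c("Measure X", "Measure Y"), times = 2),
  yes_ratio = c(0.61, 0.48, 0.35, 0.52),
  stringsAsFactors = FALSE
)

# One row per county, one column per measure, filled with yes_ratio.
wide <- spread(long, BALLOT_MEASURE_TITLE, yes_ratio)
print(wide)

# In newer versions of tidyr, pivot_wider() does the same reshaping.
wide2 <- pivot_wider(long,
                     names_from = BALLOT_MEASURE_TITLE,
                     values_from = yes_ratio)
```

After this step, each measure can be referenced as its own column, which is what lets us put one measure on the X-axis and another on the Y-axis.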

I’ll pick the following pairs of measures and see how the counties voted on them.

  • Firearms and Ammunition Sales vs. Cigarette Tax
  • Adult Film Condom Requirements vs. Firearms and Ammunition Sales
  • Ban on Single-use Plastic Bags vs. Charge on Carryout Bag
  • Adult Film Condom Requirements vs. Marijuana Legalization

Firearms and Ammunition Sales vs. Cigarette Tax

We can start with these two measures, which clearly separated Republican and Democratic counties as we saw above.

As you can see, there is an almost linear relationship between these two measures, which means the more a county supports tougher controls on owning guns, the harder it wants to make it to buy cigarettes. And as expected, San Francisco is at one end of the spectrum, supporting both measures, and Lassen / Modoc are at the opposite end.

The circles (Red and Blue) on the chart indicate the area covered by the counties of each party. A bigger circle means that the counties within a given party are more diverse on a given pair of measures. So on this chart, we can see that Democratic counties (Blue) are more diverse on these two measures. Some of the counties supported both measures with about a 70% to 80% Yes Ratio, while others had a 40% to 50% Yes Ratio, which means they actually opposed them.
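An almost linear relationship like the one described above can also be checked numerically. The sketch below uses simulated values (not the real election results) and made-up column names `firearms` and `cigarette_tax`; it computes a Pearson correlation and fits a loess trend line with base R, which is roughly what a loess trend line on a Scatter chart is doing behind the scenes.

```r
# Toy data: simulated yes ratios for two measures, NOT the real results.
set.seed(1)
firearms <- seq(0.30, 0.85, length.out = 30)
cigarette_tax <- 0.10 + 0.9 * firearms + rnorm(30, sd = 0.03)
votes <- data.frame(firearms, cigarette_tax)

# Pearson correlation: values near 1 suggest a roughly linear relationship.
r <- cor(votes$firearms, votes$cigarette_tax)

# loess (Local Polynomial Regression Fitting) produces the smooth trend line.
fit <- loess(cigarette_tax ~ firearms, data = votes)

# Read the fitted trend at a few points along the X-axis.
trend <- predict(fit, newdata = data.frame(firearms = c(0.4, 0.6, 0.8)))

print(r)
print(trend)
```

A high correlation together with a loess curve that stays close to a straight line is what "almost linear" looks like in numbers. A curve that bends or flattens would suggest the relationship only holds for part of the range.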

Adult Film Condom Requirements vs. Firearms and Ammunition Sales

Now let’s bring in the Adult Film Condom Requirement measure, which puts San Francisco and Lassen / Modoc on the same side as we saw before, and compare it against the Firearms and Ammunition Sales measure.

There is no linear relationship between the two measures. It seems to me that the counties that mildly support tougher gun control (in the 50% to 60% range) tend to care more about the porn stars. Anyway, I guess the topic of adult films is not politically divisive. 😉

Ban on Single-use Plastic Bags or Charge on Carryout Bags?

These two measures sound similar; both of them are about the bags you get at the cash register in grocery stores. So you would think there should be a linear correlation between the two measures, which means the people who want to ban single-use plastic bags should also want to charge for carryout bags to encourage people to bring their own bags, right?

Wrong! 😱 Here’s the chart.

When you look at the Democratic counties in particular, they are widely spread out on the ‘Ban on Single-use Plastic Bags’ measure, ranging from 40% to 80%, while the spread is much narrower on the ‘Carryout Bag Charges’ measure, ranging only from 40% to 50% Yes Ratio, which by the way means they opposed it.

But what’s more interesting here is that if we exclude ‘very liberal’ counties like San Francisco, Santa Cruz, Marin, Alameda, Sonoma, Mendocino, and Humboldt, the relationship could look more linear, indicating that the more a county supports ‘Ban on Single-use Plastic Bags’, the more it supports ‘Charge on Carryout Bags’.

So what is going on here? Why were these ‘very liberal’ counties and many other counties against the carryout bag charge, while some Silicon Valley counties like San Mateo and Santa Clara actually supported it?
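One way to test the idea of excluding a group of counties is to compare the correlation with and without them. This is only a sketch: the data frame `bags`, the `very_liberal` flag, the county names, and every value in it are hypothetical and chosen to mimic the shape of the chart, not taken from the real results.

```r
library(dplyr)

# Toy data: hypothetical yes ratios, NOT the real election results.
bags <- tibble(
  COUNTY_NAME = paste("County", 1:8),
  ban_plastic_bags = c(0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.75, 0.80),
  carryout_bag_charge = c(0.36, 0.40, 0.43, 0.47, 0.50, 0.54, 0.42, 0.40),
  very_liberal = c(rep(FALSE, 6), TRUE, TRUE)  # hypothetical flag
)

# Correlation across all counties.
cor_all <- cor(bags$ban_plastic_bags, bags$carryout_bag_charge)

# Correlation after excluding the flagged counties.
rest <- filter(bags, !very_liberal)
cor_rest <- cor(rest$ban_plastic_bags, rest$carryout_bag_charge)

print(c(all = cor_all, without_very_liberal = cor_rest))
```

If the correlation jumps once the flagged counties are removed, as it does in this toy example, it supports the reading that a small group of counties is bending an otherwise linear trend.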

I looked up the details of the ‘Carryout Bag Charges’ measure on the Proposition 65 page of the Secretary of State website. Here is what the summary says.

Redirects money collected by grocery and certain other retail stores through mandated sale of carryout bags. Requires stores to deposit bag sale proceeds into a special fund to support specified environmental projects. Fiscal Impact: Potential state revenue of several tens of millions of dollars annually under certain circumstances, with the monies used to support certain environmental programs.

It sounds good, right? But then there is an ‘argument’ section that tells us about the pros and cons of this measure.

Pro

YES ON 65 — PROTECT THE ENVIRONMENT. In a deal brokered by special interest lobbyists, the Legislature REQUIRED grocery stores to CHARGE and KEEP fees on certain bags at checkout. Grocers get $300 million richer, while shoppers lose $300 million. Prop. 65 redirects those fees to environmental projects, not grocer profits.

Con

Prop. 65 is sponsored by out–of–state plastic companies from South Carolina and Texas. They don’t care about California’s environment, they just want to confuse voters and distract from the real issue: the need to phase out plastic grocery bags. 65 is deceptive and doesn’t deserve your vote.

Yep, now it all makes sense why the ‘very liberal’ counties voted against it. They would rather eliminate plastic bags entirely than help the companies behind the plastic bags keep making money by selling a shitload of them!

I wonder how many of the people in counties like San Mateo and Santa Clara were aware of this when they voted… 🤔

Adult Film Condom Requirements vs. Marijuana Legalization

And here is the last one: two measures on which it’s easier for Blue and Red counties to come together. Here, we can see that the two circles are on top of each other, which means being Republican or Democratic doesn’t make much of a difference on these two measures. Well, unless you are from one of the 3 extreme counties that anchor the axes: San Francisco, Lassen / Modoc, and Imperial. 😉

That’s all. This is the last episode of this series. Thanks for reading!

I have a much better understanding of California counties after exploring the last election results, and I hope you do as well. It’s a big state: its size is almost the same as the whole country of Japan or the UK (including only the main islands), with a population of 39 million and a GDP of $2.5 trillion, which makes it the 6th largest economy in the world, after only the US, China, Japan, Germany, and the UK. Its huge diversity of culture, economy, politics, landscape, and nature makes this place unique, beautiful, and a land of opportunity.

Love California!

I have shared the data along with the steps to reproduce this analysis here. You can download the data and import it into your Exploratory Desktop to reproduce the visualizations or explore the data further.

If you don’t have Exploratory Desktop yet, you can sign up for a free trial. It’s free for students, teachers, journalists, and nonprofits.

CEO / Founder at Exploratory (https://exploratory.io/). Having fun analyzing interesting data and learning something new every day.