Understanding News Polarization

A journey through billions of words


Introduction


Democrats and Republicans: light and darkness for some, yin and yang for others.

The bipolarity that opposes Americans is not just a “ballot box thing”, but rather an attitude ingrained in their day-to-day lives. Even the channel they pick when switching on the TV, first thing in the morning, is part of this bigger picture. Does landing on Fox News, CNN, or any other news source, have something to do with all of this?

With the help of data, A LOT OF DATA, we hope to highlight that indeed it does and wish to provide quantitative figures to support this hypothesis.

Dear Data, save us from a biased view...


We, as human beings, need some help to put facts into perspective. Most of the time, our instincts make this task impossible, and the humans that update our worldview don't help, or do they?

We need some metrics to quantify whether the news sources that feed us information are impartial. To do this we define a couple of handy statistics.

Hey CNN, can you tell us something other than politics?

Politicization

Rate at which news sources report quotes by politicians compared to any other speaker.

Hey Fox News, are you able to say something good about Democrats?

Polarization

Rate at which news sources report quotes from a particular political party.
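To make the first statistic concrete, here is a minimal sketch of how politicization could be computed. It assumes the rate is simply the share of a source's quotes whose speaker is a politician; the exact formula used in the formal analysis may differ. The records and the names `quotes` and `politicization` are made up for illustration.

```python
from collections import Counter

# Hypothetical toy records: (news_source, speaker_is_politician).
# These are illustrative values, not data from the study.
quotes = [
    ("Source A", True),
    ("Source A", False),
    ("Source A", True),
    ("Source A", False),
    ("Source B", False),
    ("Source B", True),
    ("Source B", False),
    ("Source B", False),
]


def politicization(records):
    """Share of each source's quotes attributed to politicians.

    Assumed definition: politician quotes / all quotes, per source.
    """
    total = Counter()
    political = Counter()
    for source, is_politician in records:
        total[source] += 1
        if is_politician:
            political[source] += 1
    return {source: political[source] / total[source] for source in total}


scores = politicization(quotes)
for source, score in scores.items():
    print(f"{source}: {score:.2f}")
```

A value close to 1 would mean that almost every quote in the source comes from a politician, while a value close to 0 would mean politicians are rarely quoted.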

Motivations

“Oh yeah I don’t read those newspapers because they are biased. My favorite news sources aren’t!” - Everyone, 2021

In recent years, we’ve often heard that news sources in the USA are becoming increasingly politicized and polarized, meaning that the news features politicians more often (one well-known politician in particular) and that articles often favour one side of the debate. During and after the 2016 election, there were many articles about how people backing Trump tended to read different news sources from people on the other side of the political spectrum, news sources that reportedly tended to publish pieces containing hoaxes and conspiracy theories - for instance, this article. We can still read that some news sources (like the Gateway Pundit) are gaining a huge number of readers by helping Trump spread his claims about the last election being “stolen”. Donald Trump himself has been claiming that many news sources are spreading “fake news” about him, with a disturbing similarity to the “Lügenpresse” accusations used by the Third Reich against foreign press. It would appear that Mr. Trump’s popularity relies largely on specific news sources.

Intuitively, we would expect news sources aligned with Trump to publish articles that cite him and people from his political circle - Republicans, that is - more than people that oppose him, i.e. Democrats.

Let’s have a round of applause for some of the members of the observed team!

CNN

A widely-watched news channel. Has been accused of false balance and being left-biased. May be known for reporting what some people would call a riot as “fiery but peaceful protests” after a police shooting.

Fox News

When people say that right-wingers consume biased news sources, they often think of Fox News. Primarily a news network, Fox News was, for instance, accused of excessively covering scandals concerning Hillary Clinton in order to distract from alleged Russian interference in the 2016 presidential elections in the USA.

Breitbart News

A news network founded by Andrew Breitbart. Wikipedia introduces it by saying that “its journalists are widely considered to be ideologically driven, and much of its content has been called misogynistic, xenophobic, and racist by liberals and traditional conservatives alike”. This is the website often thought of as giving biased coverage to Donald Trump during his presidential campaign, as we can see here.

The Huffington Post

Often mentioned as a left-biased news website. The mortal enemy of Breitbart News, which was created as “The Huffington Post of the right”. Also known as HuffPost.

New York Post

A conservative daily tabloid, known for headlines such as “Headless body in topless bar”. Maybe surprisingly, maybe unsurprisingly, it was reported to be one of the news sources Donald Trump prefers to read.

Our data


We conducted our experiment on 19 news sources, and the following data sources provided the fuel:

This is log-scale, Bob...

Twitter

19 profiles scraped

Pew Research

12,000 US adults interviewed

Wikidata

9,000,000 QIDs inspected

Quotebank

180,000,000 quotes filtered

Analysis


Note about the plots
All the graphs in this page are interactive. This means that points on the plot can be moused over to display precise information and sometimes, even additional information not shown otherwise (like the number of articles from a source in a particular year). You can also explore the plots and filter what is shown by clicking on the labels on the right-hand side of the plot.

News sources

Previously we introduced some of our news sources and said that we have some intuitions about their bias. It turns out that we’re not the only people with intuitions like that. Pew Research surveyed the American public about their opinions of various news sources, and this research shows that our intuitions are actually reflected by what people think.

In the plot below we can see how popular the websites of our news sources are. We can also see which people use each source for their political information, as well as whether the source is distrusted by people on the left or on the right.

The following plot is similar, but it displays Twitter followers instead of (non-unique) website views:

Based on this plot, we can already tell that people on the right read different sources than the people on the left. We see that the right-wingers distrust more news sources than the left-wingers; in fact, they distrust most of the sources with mixed readers!

Fun fact: there’s actually one news source which is distrusted by people on the left as well as people on the right. One needs to wonder who exactly reads the Washington Examiner.

Focus on specific politicians

We expected biased news sources to give more space to quotations from Trump. Does our data confirm our suspicions?

We cannot really see a pattern in which news sources quote Trump more. For instance, both Fox News and Breitbart News quoted Trump around 10% of the time in 2018 (which is still a lot, if you think about it!), but various news sources (like The Hill, Politico, ABC News, The Washington Examiner) quoted him up to twice as often.

Interestingly, we do see that many news sources started quoting Trump much more often from 2017 onwards. USA Today is an exception to this trend - they did quote Trump more, but the increase was much smaller than for the other news sources.

Out of curiosity, we also inspected how often Hillary Clinton, Trump’s opponent from the 2016 presidential election, was quoted by our sources:

Interestingly, a reverse trend appears in this plot. Clinton was quoted about as often as Trump in 2015 and 2016, but after the election year it would appear that our sources lost interest in her.

Politicization

The following plot displays how politicized our news sources were in past years:

This plot (and all of our plots) is interactive; in particular, years on the right can be clicked and double-clicked to display only some of them. Mousing over the scattered points will display precise politicization at that point, as well as the number of articles from that source in that particular year that our dataset contains.

If we explore this plot, we see that the dataset doesn’t really confirm our intuitions. We expected right-biased sources to be more politicized than others, but most sources appear to have about the same level of politicization. Both Fox News and Breitbart are commonly thought of as right-wing-biased. We may expect that they are publishing more articles quoting politicians (specifically, right-wing politicians) than other sources. However, this is not what we can see based on our dataset. The Huffington Post is often thought of as left-wing-biased (Breitbart News was conceived of as “The Huffington Post of the right”), yet it is also not significantly more politicized than other news sources.

The most politicized news sources are Politico, The Hill and Washington Examiner. The first two report on political news, so in hindsight it comes as no surprise that their quotations are mostly attributed to politicians. We do not have a clear reason for Washington Examiner being as politicized as our data indicates. It stands out as the only news source we examine that is distrusted by both the left and the right, but that alone does not explain the politicization.

The least politicized news sources are USA Today and New York Post. USA Today is a middle-market newspaper catering to readers that like both entertainment as well as coverage of important news events, while New York Post is a tabloid.

Polarization

The following plot displays how polarized our news sources were:

The polarization of a source is positive if it quoted more right-wing politicians than left-wing ones, negative in the reverse case.
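The sign convention above can be sketched in a few lines. This is only an illustration under stated assumptions: the score is taken to be the difference between right-wing and left-wing quote counts divided by their sum, so that it falls between -1 and 1. The story only specifies the sign, so the normalization, the function name `polarization` and the toy counts are all assumptions.

```python
def polarization(right_quotes, left_quotes):
    """Signed polarization in [-1, 1] (assumed normalization).

    Positive: more quotes from right-wing politicians.
    Negative: more quotes from left-wing politicians.
    """
    total = right_quotes + left_quotes
    if total == 0:
        return 0.0  # no political quotes at all: treat as neutral
    return (right_quotes - left_quotes) / total


# Hypothetical toy counts per source: (right-wing quotes, left-wing quotes).
counts = {
    "Source A": (60, 40),
    "Source B": (25, 75),
}

signed = {name: polarization(r, l) for name, (r, l) in counts.items()}
absolute = {name: abs(value) for name, value in signed.items()}

for name in counts:
    print(f"{name}: signed={signed[name]:+.2f}, absolute={absolute[name]:.2f}")
```

The absolute value drops the direction and keeps only how one-sided the quoting is, which is useful when comparing sources regardless of which side they lean towards.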

Again, at a glance it seems we do not find what we were looking for in this plot. Breitbart News is as right-polarized as Business Insider, a source with left-wing readership but trusted by both sides of the political spectrum. Similarly, Fox News is as right-polarized as Vox, a website often cited as left-biased and distrusted by the right. The Huffington Post is as left-polarized as ABC News, and The Wall Street Journal, with mixed readers and trusted by both the left and the right, is more left-polarized than either.

However, we have uncovered some strange occurrences in different years. In 2016, the year that Donald Trump was elected, all news sources were left-polarized, which wasn’t the case in 2015. On the other hand, in years 2017 and 2018 (but not 2017!) all news sources were right-polarized.

For completeness, the following plot displays the absolute value of polarization:

What a failure

How is it possible that we could not prove the obvious even with so much data at our fingertips? Some of our fellow academics might have the answer to this question. A study by the Department of Physics and the Institute for AI & Fundamental Interactions at MIT proposed a machine learning method to identify news sources based on the bias contained in a news piece. In their data pre-processing, they did something that surprised us: they removed quotations of politicians. The reason is that they wanted to extract the opinions of the journalists and, therefore, the media bias. Our approach was fundamentally different, since we wanted to compute polarization based on how often a certain political party is quoted. The two researchers clearly explained how the bias of a news source is most readily identifiable from the words the journalists use in their articles.

There is something else…


Correlation with Twitter followers
Politicization: -0.1726
Polarization sign: -0.1933
Polarization: -0.1803

Although our principal analysis did not lead to the expected results, we accidentally stumbled into something interesting. As shown in the table above, there seems to be a slight negative correlation between signed polarization and Twitter followers. This may mean that Twitter users are left-wing-biased.
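For readers who want to see how such a figure can be obtained, here is a minimal sketch of a correlation between follower counts and signed polarization. It assumes a Pearson correlation, which the story does not explicitly state, and it uses made-up toy values rather than our data; the function name `pearson` and both lists are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, one entry per news source (not the study's data).
followers_millions = [1.0, 2.0, 3.0, 4.0]
signed_polarization = [0.3, 0.1, -0.1, -0.3]

r = pearson(followers_millions, signed_polarization)
print(f"correlation: {r:.4f}")
```

In this toy example, sources with more followers are more left-polarized, so the coefficient is strongly negative. With only 19 sources and coefficients around -0.2, the real effect is much weaker and should be read with caution.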

This seems to be a debated topic, and it has also been shown in the literature. We did not have to look far beyond what we had already analyzed: Pew Research themselves conducted a study in 2018 which highlighted that Twitter users lean towards the Democratic Party more than the general public does. 36% identify themselves as Democrats, compared with the national, survey-based value of 30%: a relative increase of 20% (6 percentage points).

Conclusion

In this data story, we reported the results we obtained in trying to calculate the politicization and bias of the American media from the politicians who are most frequently quoted. As explained, we could not extract information that agreed with the findings of scientific studies on media bias. That makes sense: for instance, quotes may be used to discredit the other political party, which would undermine our analysis. Therefore, what matters is the context in which the quotations are reported. In fact, it would seem that opposite approaches, like the one carried out by the MIT researchers, lead to the desired results.

In conclusion, we believe we achieved a very interesting result. It seems very reasonable to think that just by looking at how often a news source cites a politician or a political party, it is possible to understand its polarization. We proved, with an analysis performed on millions of articles, that this belief is false.
This data story was intentionally informal in order to explain complex topics simply. The formal analysis with well-documented steps is available here.