Filtering Political Email at Three Email Mailbox Providers:

Is Incoming Political Email Treated Fairly and Equitably by Big Tech?

An Exploration of Potential Biases in Spam Filtering Algorithms (SFAs)

By Andrew Lutts, Founder & CEO, Net Atlantic, Inc.

Preface

Large technology companies like Alphabet (Gmail), Microsoft (Outlook), and Yahoo have created highly sophisticated Spam Filtering Algorithms (SFAs) that scan incoming email, deliver the messages their mailbox customers want, and filter out the messages they don’t want.

It’s a very difficult job. In fact, email administrators at these companies have told us personally that proper handling of incoming email is probably the single largest technical challenge for companies offering free email mailboxes.

Add to this the fact that internet users worldwide send each other over 300 billion emails every day, and you can see that this is no small task. (source: earthweb.com)

Email Marketer Experiences

In speaking with clients over many years through various political campaign seasons, we’ve often heard a common lament: “Gmail, Hotmail, Yahoo, etc. just don’t like my type of political email. They’re treating my email unfairly!”

Are they right? Is there truth to this? Are some places friendlier than others to certain kinds of political email?

According to a 10-page study published on March 31, 2022 by four students in the Computer Science Department at North Carolina State University, the simple answer is Yes! (1)



Viewing the summary data in chart format, it becomes readily apparent that Gmail has the most inequitable treatment, and thus the highest bias. Outlook has the most aggressive treatment, moving almost all political email (both left and right) to spam. Yahoo marks 32% of political email as spam on average, and is the least restrictive of the three mailbox providers.

Before we review the students’ findings in detail, we first want to discuss their methodology. Overall, we feel they did a very good job attempting to be neutral and free from prejudice in their process. They designed their study carefully, knowing that it is almost impossible to avoid every flaw when collecting and interpreting data. This was a fairly large study, and the students implemented many best practices to support statistically accurate data collection. See the link to the study at the end of this article for the full report with data. In summary, here is their process.

Methodology Used in the North Carolina State University Study

According to the study: “We used three email services, Gmail, Outlook, and Yahoo, and created 102 email accounts, 34 on each of the three services. To accurately estimate the political biases and mitigate the potential effects of demographic factors such as ethnicity and age, we created our email accounts with different combinations of these factors. As email services do not explicitly collect ethnicity information, we assigned a different name to each email account that we randomly picked from a database of common names associated with White, African-American, Hispanic, Asian, and South Asian ethnicities. For age, we assigned each email account to one of the three age groups of 18-40, 40-65, and 65+. Finally, we randomly assigned male and female genders to the email accounts.”

The study covered the 2020 US election over a period of 5 months, from July 1, 2020 to November 30, 2020. The students used the 102 email accounts to subscribe to the email lists of 2 Presidential, 78 Senate, and 156 House candidates. The team completed 24,072 automated email newsletter subscription requests. Over the 5-month study, they collected 318,108 emails across the three mailbox providers.
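As a quick sanity check on these figures, the short Python sketch below recomputes the subscription count from the numbers quoted above. The variable names are ours and purely illustrative; nothing here comes from the study's own code.

```python
# Figures as quoted from the study in the paragraph above.
accounts = 102                      # 34 accounts on each of 3 services
presidential, senate, house = 2, 78, 156
emails_collected = 318_108

# Each account subscribes to every candidate's list once.
campaigns = presidential + senate + house
subscriptions = accounts * campaigns

print(campaigns)        # 236
print(subscriptions)    # 24072, matching the 24,072 requests reported

# Rough average number of emails received per account over the study.
print(round(emails_collected / accounts))
```

The product of 102 accounts and 236 candidate lists matches the 24,072 subscription requests the study reports.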

The students performed a Baseline Experiment as part one of their study, in which there was no user action. The second part of the study was an Interaction Experiment, which tested whether a user could “train” the Spam Filtering Algorithm on their preferences by interacting with the email. Mailbox providers have claimed that users can train the service to learn which kinds of email they prefer to receive, and which kinds they do not.

Conclusions by the North Carolina State University Team

All Spam Filtering Algorithms exhibited political biases in the months leading up to the 2020 US elections.

(The) political affiliation of the sender plays a role in getting an email marked as spam.

In terms of favoritism, Gmail leaned towards the left (Democrats) whereas Outlook and Yahoo leaned towards the right (Republicans).

The percentage of emails marked by Gmail as spam from the right-wing candidates grew steadily as the election date approached.

Although user interactions (such as moving an email from the spam folder to the inbox, or marking a message as spam) did see some adaptation and learning by the services (Gmail was the best “learner”), this adaptation does not necessarily eliminate the political bias.

Demographic factors, including age, ethnicity, and gender, (of the email user) did not influence the political bias of Spam Filtering Algorithms.

Arguably, there is also this possibility that the Spam Filtering Algorithms (SFAs) of email services learnt from the choices of some subscribers (other subscribers on the network) marking certain campaign emails as spam and started marking those/similar campaign emails as spam for other subscribers.

While we have no reason to believe that there were deliberate attempts from these email services to create these biases to influence voters, the fact remains there that their SFAs have learnt to mark more emails from one political affiliation as spam compared to the other.

These biases may have an unignorable impact on the outcomes of an election, especially with the ability to influence undecided voters.

Discussion and Analysis from Net Atlantic

Of course, these three mailbox providers attract different customers. According to a poll cited by techjury, people aged 18 – 29 choose Gmail 61% of the time. People aged 65 and over choose Gmail 24% of the time. The average age of a Gmail user is 31, and 68% of its users are between the ages of 18 and 34. (2)

Put simply, Gmail attracts a younger, modern demographic, which can be seen as being more left-leaning and liberal. Younger generations like Millennials and Generation Xers often lean left. Older generations like Baby Boomers and the Silent Generation generally lean to the right. (Pew Research) (3)

Knowing this, it becomes a bit easier to see how the correlation between user age and political preference could shape the way Spam Filtering Algorithms filter email. That said, what about causation?

Interestingly, when the user interaction during the experiment moved all email from inbox to spam, Gmail learned from this interaction and showed somewhat less bias. For left-leaning email, the share marked as spam at Gmail went from 8.2% before the interaction to 54.2% after. For right-leaning email, the share marked as spam went from 67.6% before the interaction to 83.9% after. However, although the bias was reduced, it was still statistically significant.
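To make "somewhat less bias" concrete, here is a minimal Python sketch that compares the left/right spam rates quoted above. Measuring bias as a simple difference in percentage points is our own simplification for illustration; it is not the statistical test the study used, and the function name is hypothetical.

```python
def spam_gap(left_pct: float, right_pct: float) -> float:
    """Difference, in percentage points, between the share of
    right-leaning and left-leaning email marked as spam."""
    return round(right_pct - left_pct, 1)

# Gmail figures quoted above (inbox-to-spam interaction).
gap_before = spam_gap(left_pct=8.2, right_pct=67.6)
gap_after = spam_gap(left_pct=54.2, right_pct=83.9)

print(gap_before)  # 59.4
print(gap_after)   # 29.7
```

By this rough measure the gap roughly halves after the interaction but does not disappear, which is consistent with the study's finding that the remaining bias was still statistically significant.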

So for the most part, user interactions can help influence and train the Spam Filtering Algorithm. If you interact with your email by marking items as spam and not spam, the bias will be reduced to some degree.

What You Can Do

Although an email sender may interpret these biases and challenges as an excuse for thinking that things are simply stacked against them, it's just not true. In fact, senders of all kinds can and do meet with success and deliver well to the large mailbox providers.

The way to success involves a defined plan and checklist of best practices. Some of these are technical, others are more process-oriented. Additionally, take a strategic approach to your messaging content by being careful with your mix of content and calls to action.

Some common mistakes we've seen include working with dirty email lists, old lists, compiled lists, voter registration lists, friends' lists, partner lists, and other poor-quality lists. Infrequent and occasional sending is also problematic. Consistent email sending (daily, weekly, etc.) and steady email volumes are important for best results.

Generally speaking, what matters most with any email mailing list is the level of permission granted to you by the subscriber. In other words, did they ask to receive your email? Will they be surprised if they receive your email?

We've seen political senders assume that as long as the mailing list is on the same side of being liberal or conservative, it's okay to send to them. Unfortunately, that's just not the case. Again, it all comes down to the level of permission.

Much can be accomplished by optimizing those subscribers who are actively engaging with your email by opening it, clicking on it, and forwarding it. Because these mailbox providers host your email on their servers, they can and do track the way every reader interacts with your email. It tells them a lot.

One important metric used by all mailbox providers is the level of engagement. Put simply, highly engaged email gets preferential treatment, because the filters deem it far more interesting to readers. That's why savvy senders deploy tactics that encourage engagement (clicks, forwards, shares, and "read more" links).

Most free email mailbox providers scan and measure hundreds of data points in incoming emails to determine where to place each message (inbox, promotions, bulk sender, spam folder). A few easy things every email marketer can check are as follows. Make sure your mail-from address is on your own domain name (and not a Gmail, Hotmail, or Yahoo address). Make sure the domain name you are using in your email message has been around for at least one year, and preferably several years. Make sure your syntax, HTML code, spelling, and links are correct and not malformed. Make sure your email is sent in responsive format (so it can be read easily on all devices), and send it in multipart format (both HTML and text).
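Two of the checks above, the sender domain and the multipart format, can be sketched with Python's standard `email` library. This is a minimal illustration under our own assumptions: the example.org and example.com addresses are placeholders, the set of free-mailbox domains is a short illustrative list rather than a complete one, and the helper names are hypothetical.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# Illustrative, incomplete list of free-mailbox domains (an assumption).
FREE_MAILBOX_DOMAINS = {"gmail.com", "hotmail.com", "outlook.com", "yahoo.com"}

def uses_own_domain(from_header: str) -> bool:
    """Return True if the From address is not on a free-mailbox domain."""
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    return bool(domain) and domain not in FREE_MAILBOX_DOMAINS

def build_message(from_header, to_addr, subject, text_body, html_body):
    """Build a multipart/alternative message with text and HTML parts."""
    msg = EmailMessage()
    msg["From"] = from_header
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(text_body)                      # text/plain part
    msg.add_alternative(html_body, subtype="html")  # text/html part
    return msg

# Placeholder sender and recipient for illustration only.
sender = "Campaign News <news@example.org>"
msg = build_message(
    sender,
    "reader@example.com",
    "Weekly update",
    "Hello, here is this week's update.",
    "<html><body><p>Hello, here is this week's update.</p></body></html>",
)

print(uses_own_domain(sender))   # True
print(msg.get_content_type())    # multipart/alternative
```

A check like this only covers the basics. It says nothing about domain age, responsive layout, or sending reputation, which need to be verified separately.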

This is just the start. There are many, many more checks and processes that can be done. Five main areas to work on are your subscriber list quality, audience segmentation, messaging content, engagement and calls to action, and sending reputation. Start with the low-hanging fruit, then work up to more strategic and technical solutions. The improvements we've seen after changes are made can be dramatic.

Conclusions

We feel that the North Carolina State University students did an admirable job demonstrating what many of us have suspected for years: bias occurs in the handling of incoming email to these free email mailbox accounts.

And although the bias occurs, it can be harder to determine the causation of the bias. Does the bias occur because other mailbox customers on the same network influence what kind of email you get? It certainly seems so. Or are the biases created and influenced by those people who programmed the Spam Filtering Algorithms? Quite possibly.

It can be difficult to show causality with these figures, and there are many influencing factors which all contribute to how the algorithms execute their functions.

That said, an equitable email delivery process is critically important, as free nations rely on fair and open communications to inform, advocate, evaluate, and vote for elected officials in our democratic process.

Sources:

(1) A Peek into the Political Biases in Email Spam Filtering Algorithms During US Election 2020 by Hassan Iqbal, Usman Mahmood Khan, Hassan Ali Khan, Muhammad Shahzad. Department of Computer Science, North Carolina State University, Raleigh, North Carolina, USA
https://arxiv.org/pdf/2203.16743v1.pdf

(2) techjury: 52 Gmail Statistics https://techjury.net/blog/gmail-statistics

(3) Pew Research: https://www.pewresearch.org/fact-tank/2017/03/20/a-wider-partisan-and-ideological-gap-between-younger-older-generations/
https://www.statista.com/chart/9937/the-gmail-yahoo-mail-age-divide/


About Net Atlantic, Inc. - Net Atlantic is a pioneer and industry leader in the field of email marketing. Since 1995, Net Atlantic, Inc. has been helping companies, organizations, political parties, special interest groups, and candidates create, send, deliver, and track email campaigns. These efforts help clients fuel growth, generate response, advocate positions, and drive revenue. The company helps clients achieve high levels of success in the areas of politics, advocacy, fund-raising, publishing, messaging, membership development, lead generation, and much more. The company works with its clients to help them stand out from others by establishing leadership in messaging. Learn more at: https://www.netatlantic.com.

 
