Avoid Social Media Bias

In so many areas, the potential of the Internet seems subverted by the design decisions of those who have built businesses on top of it. My focus here is on the political division and animosity that now exists. Cable television created a similar problem: an amazing increase in the amount of content, but also the division of individuals into tribes that follow different “news” channels offering predictably slanted accounts of the day’s events, to the extent that loyal viewers are often completely unaware of important stories or of different interpretations of the events they do encounter.

The Internet might have seemed a remedy. Social media services already function as an alternative, with many people now relying on social media for a high proportion of the news they encounter. Unfortunately, social media services are designed in ways that make them as biased as, and perhaps more radicalizing than, cable TV news channels.

Social Media and Internet News Sources

Here is the root of the problem: both social media platforms and news sources can use your personal history to manipulate what you read. Social media platforms (e.g., Facebook, X, Instagram) use algorithms that analyze your past behavior, such as the posts you’ve liked, shared, or commented on, as well as the time you spend on certain types of content. They use this information to curate and prioritize content in your feed, including news articles, selecting whatever they predict will keep you engaged on their platform. Layer on top of this the reality that those we follow as “friends” are likely to share our values and beliefs, and what you read is unlikely to challenge the personal biases you hold. To reverse the Rolling Stones lyric, you always get what you want and not what you need.
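To make the mechanism concrete, here is a toy sketch of engagement-based ranking. The post fields, the weights, and the scoring rule are all invented for illustration; no platform’s actual algorithm is this simple, but the feedback loop is the same:

```python
# Toy sketch of engagement-based feed ranking.
# Fields and weights are invented; real platforms use far more signals.

def engagement_score(post, user_history):
    """Score a post higher the more it resembles what the user has
    already engaged with -- the mechanism that narrows what you see."""
    score = 0.0
    for topic in post["topics"]:
        # Reward topics the user has clicked or liked before.
        score += user_history.get(topic, 0)
    score += 0.1 * post["likes"]  # general popularity signal
    return score

def rank_feed(posts, user_history):
    # Most "engaging" (i.e., most familiar) content floats to the top.
    return sorted(posts,
                  key=lambda p: engagement_score(p, user_history),
                  reverse=True)

posts = [
    {"id": 1, "topics": ["politics"], "likes": 50},
    {"id": 2, "topics": ["science"], "likes": 5},
]
history = {"politics": 3}  # the user has engaged with political posts before
feed = rank_feed(posts, history)  # the political post is ranked first
```

Because past engagement feeds the score, every click makes the next feed look a little more like the last one.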

News sources differ from social media in that you do not identify friends and sources to follow. However, news sources can also tailor their content based on the data they gather from your interactions with their posts or websites. These practices are part of a broader strategy known as targeted or personalized content delivery, which is designed to increase user engagement and, for many platforms, advertising revenue.

Many major news organizations and digital platforms target stories based on user data to personalize the news experience. Here are some examples:

Google News: Google News uses algorithms to personalize news feeds based on the user’s search history, location, and past interactions with Google products. It curates stories that it thinks will be most relevant to you.

Apple News: By using artificial intelligence, Apple News+ offers a personalized user experience. Publishers can adapt content based on readers’ preferences and behavior, leading to stronger engagement and longer reading times.

The New York Times: The New York Times has a recommendation engine that suggests articles based on the user’s reading habits on their website. If you read a lot of technology-related articles, for example, the site will start to show you more content related to technology.

Are Federated Social Media Different?

Federated social media refers to a network of independently operated servers (instances) that communicate with each other, allowing users from different instances to interact. The most notable example of a federated social media platform is Mastodon, which operates on the ActivityPub protocol. On Mastodon, you can follow accounts from various instances, including those that post news updates. For example, if a news organization has an account on a Mastodon instance, you can follow that account from your instance, and updates from that news source will appear in your feed. This system allows for a wide range of interactions across different communities and servers, making it possible to follow and receive updates from diverse news sources globally.

Your Mastodon timeline is simply a reverse-chronological feed of the people you follow, or of the posts from people on your instance only (not across all of Mastodon). There’s no mysterious algorithm optimized for your attention. So, with Mastodon, a news source you follow may have a general bias, but you get the stories they share without prioritization by an algorithm based on your personal history. This should generate a broader perspective.
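The contrast with the engagement-driven approach is stark: a reverse-chronological timeline is nothing more than a sort by timestamp. A minimal sketch (the post structure here is invented for illustration):

```python
from datetime import datetime

# Mastodon-style timeline: newest first, no engagement weighting,
# no personal history involved.
def chronological_feed(posts):
    return sorted(posts, key=lambda p: p["created_at"], reverse=True)

posts = [
    {"id": "a", "created_at": datetime(2024, 1, 1, 9, 0)},
    {"id": "b", "created_at": datetime(2024, 1, 1, 12, 0)},
]
feed = chronological_feed(posts)  # post "b" (the newer one) comes first
```

Nothing about you appears in that function, which is the point: two users following the same accounts see the same posts in the same order.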

With Mastodon, you can join multiple instances, some of which have a focus. For example, I first joined Mastodon.Social, which at the time was the instance most users were joining. I have since joined a couple of other instances (twit.social & mastadon.education) that have a theme (technology and education), but participants post on all kinds of topics. An interesting characteristic of federated services is that you can follow individuals from other instances – e.g., you can follow me by adding @grabe@twit.social from another instance.

This brings me to a way to generate a news feed whose posts will not be ordered based on a record of your personal use of that instance. Many news organizations share content through Mastodon, and you can follow this content no matter which Mastodon instance you join. Some examples follow, but you can search for others through any Mastodon account. You follow these sources the same way you would follow an individual on another instance.

@npr@mstdn.social

@newyorktimes@press.coop

@cnn@press.coop

@wsj@press.coop

@bbc@mastodon.bot

@Reuters@press.coop

Full access may depend on subscriptions. For example, I have a subscription for the NYT.
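The handles above all share the @user@instance shape, which is what lets any instance route a follow request to the right server. A small parsing sketch (my own helper for illustration, not part of any Mastodon library):

```python
def parse_handle(handle):
    """Split a federated handle like '@npr@mstdn.social' into
    its (username, instance) parts."""
    parts = handle.lstrip("@").split("@")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"not a valid federated handle: {handle!r}")
    return parts[0], parts[1]

user, instance = parse_handle("@npr@mstdn.social")
# user is "npr"; instance is "mstdn.social"
```

Your home instance uses the instance part to locate the remote server and the username part to ask it for that account’s posts, which then arrive in your timeline like any local post.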

So, if a more balanced feed of news stories appeals to you, try joining a Mastodon instance and then following a couple of these news sources.


Is arguing against big data a convenient way to argue for an increased bottom line?

It seems to be Apple and Microsoft against Google. The “concern” expressed by this unlikely pairing is that Google collects, and (as I understand the concerns) shares for pay, personal data. Google, in response, argues it uses the information collected to improve the search experience.

While this sounds like a disagreement over principle, the positions taken align with business interests. Google makes money from advertising. Apple makes money from hardware. Microsoft makes money from software. The income foci of these companies have evolved, and this may have something to do with the positions now taken on privacy. Google offers software and services for free partly to increase use of the web, both as a way to offer more ads and as a way to collect more data. Google also offers services that decrease the importance of hardware; Chrome hurts the hardware sales of Apple.

What I think is important under these circumstances is clear public understanding of what data are being collected, how they are being used, and what the motives of the players involved are. It turns out we are all players as well, because blocking ads while still accepting services (a consequence of modifying browsers) involves personal decisions about what constitutes ethical behavior.

Into this business struggle, and the way it has been spun, comes a recent “study” from Tim Wu and colleagues. Evaluation of the study is complicated by the funding source – Yelp. Yelp has long argued its results should appear higher in Google searches and suggests Google elevates the results of Google services instead. Clearly, you or I could go directly to Yelp when searching for local information, completely ignoring Google (this is what I do when searching for restaurants), but Yelp wants more.

I have a very small stake in Google ads (making probably $3–4 a year), but I am more interested in the research methodology employed in this case. My background as an educational researcher involved reading and evaluating many research studies. That experience is relevant here because many educational studies are conducted in the field rather than the laboratory, and such work does not allow the tight controls required for simple interpretation. We are used to evaluating “methods” and the capacity of methods to rule out alternative explanations. Sometimes multiple interpretations are possible, and it is important to recognize these cases.

Take a look at the “methods” section from the study linked above. It is a little difficult to follow, but it seems the study contrasts two sets of search results.

The Method and the data:
The method involved a comparison of “search results” consisting of either a) Google organic search results, or b) Google organic search results plus Google local “OneBox” links (7 links for local services with additional information provided by Google). The “concern” here is that condition b contains results that benefit Google.

The results showed that condition b generated fewer clicks.
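The comparison the study makes amounts to contrasting click-through rates across the two conditions. With made-up numbers (the study’s actual counts are not reproduced here), the calculation looks like:

```python
# Hypothetical counts, purely to illustrate the form of the comparison;
# these are NOT the study's actual numbers.
def click_through_rate(clicks, impressions):
    return clicks / impressions

ctr_a = click_through_rate(450, 1000)  # condition a: organic results only
ctr_b = click_through_rate(380, 1000)  # condition b: organic + OneBox

# Condition b producing a lower rate is the pattern the study reports,
# but a click-rate difference alone does not say *why* clicks differ.
difference = ctr_a - ctr_b
```

The arithmetic is trivial; the hard part, taken up below, is deciding what a difference in click frequency actually tells you about result quality.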

Here is a local search showing both the OneBox results (red box) and organic results, from a Minneapolis search I conducted for pizza. What you see is what I could see on my Air; additional content could be scrolled into view.

[Image: screenshot of the Google search results described above]

The conclusion:

The results demonstrate that consumers vastly prefer the second version of universal search. Stated differently, consumers prefer, in effect, competitive results, as scored by Google’s own search engine, to results chosen by Google. This leads to the conclusion that Google is degrading its own search results by excluding its competitors at the expense of its users. The fact that Google’s own algorithm would provide better results suggests that Google is making a strategic choice to display its own content, rather than choosing results that consumers would prefer.

Issues I see:

The limited range of searches in the study. While the searches are relevant to the Yelp question – Yelp’s business model focuses on local services – do the findings generalize to other types of search?

What does the difference in click frequency mean? Does the difference indicate, as the conclusion claims, that the search results provide an inferior experience for the user? Are there other interpretations? For example, Google’s “I’m Feeling Lucky” button and the general logic of Google search suggest that many clicks indicate an inferior algorithm. Is it possible that the position of the OneBox, rather than the information it returns, is the issue? This might be a bias, but the quality of the organic search would not be the issue.

How would this method feed into resolution of the larger question (whether the collection of personal information should be avoided)? This connection is unclear to me. Google could base search on data points that are not personal (PageRank). A comparison of search results based on PageRank vs. PageRank plus personal search history would be more useful, but that is not what we have here.

How would you conduct a study to evaluate the “quality” concern?

Wired

Search Engine Land

Fortune

Time
