A Simple Thumbs Up or Down Eliminates Racial Bias in Online Ratings

On a scale of one to five, how was your omelet? Your car rental experience? Your dentist, your doctor? How was your water bottle purchase? Your shoe purchase? Your microwave purchase and your home purchase? ‌

“These ratings are not only ubiquitous but take many forms,” says Prof. Tristan Botelho. But he has doubts that evaluators use different rating scales in a consistent way. “You have to do a lot of work to convince me that if different people look at a given scale, such as five stars or one to ten, that there is a common understanding of what those different numbers represent.”‌

People grade with different levels of harshness. People focus on different aspects of an experience or product. And, most concerning, people may infuse subtle preferences into their ratings.‌

It’s this last possibility that Botelho, Sora Jun of Rice University, and Demetrius Humes and Katherine DeCelles of the University of Toronto examine in a study recently published in Nature. In partnership with an online gig-work platform, they found that a five-star rating system leads to systematic differences in ratings between non-White and White workers; in turn, this depresses the average income of non-White workers. When the platform shifted to a two-point scale (thumbs up or down), this bias disappeared.

The core of the study is a quasi-natural experiment. The online platform, which connects customers to small business entrepreneurs in the home services industry, abruptly changed the way customers evaluate workers from a five-star scale to thumbs up vs. thumbs down.‌

Botelho and his colleagues looked at nearly 70,000 customer ratings: 55,000 before the change and 15,000 after. They found that when customers used a five-star scale, racial minorities received an average of 4.72 stars and White workers received an average of 4.79 stars. The gap appears small, but it isn’t: because workers’ pay is tied to their ratings, non-White workers earned, on average, 91 cents for every dollar White workers earned. Shifting to a two-point scale eliminated the disparity in both ratings and earnings.

With a set of three complementary online survey experiments, the researchers found that subtle—and potentially subconscious—biases appear to be driving this effect. A five-point scale allows for inconspicuous downgrading: somebody might give a White worker five stars and a non-White worker of equal quality four stars without being aware of the bias at work, while two-point scales don’t easily allow for such hedging.‌
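The mechanism can be illustrated with a small sketch. The numbers below are hypothetical and are not data from the study; they assume two groups of workers of equal quality, with two of the second group's five-star ratings quietly downgraded to four. The `to_thumbs` helper and its threshold of four stars are also assumptions made for illustration: the platform changed the question customers were asked, it did not convert old star ratings.

```python
from statistics import mean


def to_thumbs(stars, threshold=4):
    """Collapse a 1-5 star rating to thumbs up (1) or thumbs down (0).

    The threshold is an illustrative assumption, not the study's method.
    """
    return 1 if stars >= threshold else 0


# Hypothetical ratings for two groups of equally good workers.
# In group_b, two five-star ratings have been downgraded to four stars.
group_a = [5, 5, 5, 5, 5, 5, 5, 5, 4, 2]
group_b = [5, 5, 5, 5, 5, 5, 4, 4, 4, 2]

# Gap in average stars: the downgrades show up here.
star_gap = mean(group_a) - mean(group_b)

# Gap in the share of thumbs up: a four and a five both count as "up".
thumbs_gap = mean([to_thumbs(s) for s in group_a]) - mean(
    [to_thumbs(s) for s in group_b]
)

print(f"gap in average stars:   {star_gap:.2f}")
print(f"gap in thumbs-up share: {thumbs_gap:.2f}")
```

Under these assumptions the star averages differ while the thumbs-up shares are identical, because a two-point scale leaves no room between "good" and "slightly less good" for a downgrade to hide in.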

“Using a finer gradation sometimes gives the illusion of precision,” Botelho says. “But a dichotomous rating, with thumbs up or down, means people need to focus on the question before them: Was the problem fixed? Was the food good? Was the quality good or bad?”‌

Could dropping to a two-point rating scale lead to rating inflation? Or would customers and companies lose some of the nuance that can be conveyed through a more expansive ratings range? ‌

Inflation, Botelho notes, is already a well-known problem. On the platform he studied, workers had average ratings above 4.7, and over 85% of ratings were five stars. Consumers can rationalize downgrading a five-star rating to a four as still giving a “good rating,” but the distinction between these two ratings gives space for bias to bleed in.

Botelho also suggests that concern about how much information a rating contains when a larger scale is used may be overstated. Most customers and managers simply want a yes or no answer when they are looking at a rating; they want to know if a job was done effectively, or if a product functions as advertised. ‌

For companies, a two-point system makes it easier to separate good from bad quality when assessing workers or products. It can be difficult to interpret what individuals mean when they give workers three- or four-star ratings. “But if a worker gets a downvote, then you know immediately which stakeholder to talk to and which worker to follow up with,” Botelho says.‌

That said, Botelho is aware that both customers and companies sometimes want more information than a single binary response provides, for example when considering employee reviews or hiring decisions. In these cases, companies could “use a dichotomous system but try involving more categories,” he says. A manager, for instance, could give a thumbs up or down on work quality, collaboration, timeliness, reliability, problem solving, and so on.‌

“In this context, the simple change to an up vote versus down vote was able to eliminate race-based differences in ratings and pay. And in the case of the company we were working with, they received no customer complaints, no functionality loss, no effect on business operations…and they created a fairer playing field,” Botelho says. “At the end of the day, we want evaluation processes to work, to be accurate and fair, and this simple fix helped make that happen.”‌

The Yale School of Management is the graduate business school of Yale University, a private research university in New Haven, Connecticut.
