Safety issues have dogged Uber since its early days as a black car-hailing service. Car accidents and physical altercations have persisted despite Uber’s attempts to monitor its cars and vet drivers; reports of sexual violence in its vehicles eventually led Uber to admit it was “not immune” to the problem.
Amid public backlash and calls to address rider safety, Uber rolled out a flashy “Safety First” initiative in 2018, adding features like 911 assistance to its app, tightening screening of drivers, and for the first time, issuing a safety report that outlined traffic fatalities, fatal physical assaults, and sexual assaults on its platform.
That was before the pandemic. Over the past year and a half, safety has taken on new significance: where it once meant drivers making riders feel taken care of, it now means drivers protecting riders from the virus while also keeping themselves healthy.
And Uber again found itself persuading riders not to abandon its platform over safety fears — requiring drivers to submit a selfie verifying they’re wearing a mask, offering limited amounts of cleaning supplies, and asking riders to complete a safety check before getting into a vehicle.
While Uber’s changes might ease some riders’ concerns, they don’t offer the same level of automation and scale that an algorithmic solution could. It’s a potential path hinted at by a series of Uber patents, granted from 2019 to last summer, which outline algorithmic scoring and risk prediction systems to help decide who is safe enough to drive for Uber.
Taken together, they point to a pattern of experimentation with algorithmic prediction and driver surveillance in the name of rider safety. Similar to widely criticized algorithms that help price insurance and make decisions on bail, sentencing, and parole, the systems described in the patents would make deeply consequential decisions using digital processes that are difficult or impossible to untangle. While Uber’s quest to make safety programmatic is tantalizing, experts expressed concern that the systems could undermine their stated purpose.
An Uber spokesperson wrote in an emailed statement that although the company is “always exploring ways that our technology can help improve the Uber experience,” it does not currently have products tied to the safety scoring and risk assessment patents.
The battle over drivers’ legal classification continues; in California, Uber and Lyft used their deep pockets to fund an election victory that let them keep drivers classified as contractors. Against that backdrop, there are urgent concerns that systems like these could become another means to remove drivers without due process, especially as the pandemic has laid bare the vulnerability of gig workers who lack the safety net employees can lean on.
One patent for scoring driver safety risk relies on machine learning and rider feedback and notably suggests a driver’s “heavy accent” corresponds to “low quality” service.
Another aims to predict safety incidents using machine-learning models that determine the likelihood that a driver will be involved in dangerous driving or interpersonal conflict, utilizing factors like psychometric tests to determine their “trustworthiness,” monitoring their social media networks, and using “official sources” like police reports to overcome biases in rider feedback.
“The mistake is that a driver loses their livelihood, not that someone gets shown the wrong ad.”
Jeremy Gillula, a former tech projects director at the Electronic Frontier Foundation who now works as a privacy engineer at Google, said using algorithms to predict a person’s behavior for the purpose of “deciding if they’re going to be a danger or a problem” is deeply concerning.
“Some brilliant engineers realized we can do machine learning based on people’s text, without realizing what we really want to get, and what it actually represents in a real-life application,” he said. “The mistake is that a driver loses their livelihood, not that someone gets shown the wrong ad.”
Surveilling drivers under the guise of safety is a common thread in Uber’s patents. Many evaluate drivers’ performance using information from their phones, including one that scores their driving ability and suggests tracking their eye and head movements with phone cameras, and another that detects their behavioral state (angry, intoxicated, or sleepy) and assigns them an “abnormality score.”
Additional patents aim to monitor drivers’ behavior using in-vehicle cameras and approximate “distraction level” with an activity log that tracks what else they’re doing on their phones; making a call, looking at a map, or even moving the phone around could indicate distraction.
Jamie Williams, a former staff attorney at EFF focused on civil liberties who now works as a product counselor, said drivers should be aware they’re “being watched at all times.”
The patents also mirror technologies recently implemented by Amazon in its delivery vans. The company announced plans in February to install video cameras that use AI to track drivers’ hand movements, driving abilities, and facial expressions. Data collected by the cameras determines a “safety score” and could result in a driver being terminated. Drivers have told Reuters: “The cameras are just another way to control us.”
The algorithm outlined in a 2019 safety risk scoring patent shows how dangerous these systems can be in real life, experts said, noting that it could mimic riders’ existing biases.
The system described in the patent uses a combination of rider feedback and phone metadata to assign drivers a safety score based on how carefully they drive (“vehicle operation”) and how they interact with passengers (“interpersonal behavior”). A driver’s safety score would be calculated once a rider submits a safety report to Uber.
After a report is submitted, according to the patent, it would be processed by algorithms along with any associated metadata, including the driver’s Uber profile, the trip duration, distance traveled, GPS location, and car speed. With that information, the report would be classified into topics like “physical altercation” or “aggressive driving.”
A driver’s overall safety score would be calculated using weighted risk assessment scores from the interpersonal behavior and vehicle operation categories. This overall score would determine if a driver has a low, medium, or high safety risk, and consequently, if they should face disciplinary action. Drivers with a high safety risk might receive a warning in the app, a temporary account suspension, or an unspecified “intervention” in real time.
Adding a further layer of automation, the patent also describes a system that automatically tweaks driver safety scores based on specific metadata. A driver who has completed a certain number of trips would be marked as safer, while one who has generated more safety incidents would be marked as less safe. According to the patent, a driver who works at night is considered less safe than a driver who works during the day.
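The patent describes this flow in prose rather than formulas, but its shape — score each category, combine the scores with weights, bucket the result into risk tiers, then adjust for metadata like trip count and time of day — can be sketched roughly as follows. All weights, thresholds, and adjustment factors below are illustrative assumptions, not values from the patent:

```python
# Rough sketch of the scoring flow described in the patent.
# Every numeric constant here is invented for illustration;
# the patent does not disclose concrete weights or thresholds.

INTERPERSONAL_WEIGHT = 0.6  # assumed weight for "interpersonal behavior"
VEHICLE_WEIGHT = 0.4        # assumed weight for "vehicle operation"

def overall_safety_score(interpersonal_risk: float, vehicle_risk: float,
                         trips_completed: int, prior_incidents: int,
                         drives_at_night: bool) -> float:
    """Combine weighted category risk scores (each in [0, 1]) into one score."""
    score = (INTERPERSONAL_WEIGHT * interpersonal_risk
             + VEHICLE_WEIGHT * vehicle_risk)
    # Metadata adjustments, per the patent's description: experience
    # lowers risk; prior incidents and night driving raise it.
    if trips_completed > 1000:          # assumed experience threshold
        score *= 0.9
    score *= 1.0 + 0.05 * prior_incidents
    if drives_at_night:
        score *= 1.1
    return min(score, 1.0)

def risk_tier(score: float) -> str:
    """Bucket the overall score into the patent's low/medium/high tiers."""
    if score < 0.33:
        return "low"
    if score < 0.66:
        return "medium"
    return "high"
```

Even in this toy form, the experts’ concern is visible: a single multiplier like the night-driving adjustment bakes an assumption about who drives at night directly into the disciplinary outcome.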
While this design may seem straightforward — shouldn’t a more experienced driver who has better road visibility be considered safer? — experts say any automated decision-making requires that developers make meaningful choices to avoid inserting bias into the entire system.
Gillula said Uber’s automated rules could make decisions based on flawed human assumptions. “Race may be correlated with what time of day you’re operating as an Uber driver. If it’s a second job because you have to work during the day, it seems ridiculous to penalize you for that,” he said. “This is exactly the sort of thing that worries me.”
If Uber wants to make its algorithmic scoring fair, it would need to be transparent about how drivers are being evaluated and give them a proper feedback channel, Williams said. “Machine learning algorithms can be wrong; users can be wrong,” she said. “It’s very important to have clear processes, transparency, and awareness about what’s going into the score.”
Risk assessment algorithms have long been used by insurance companies to set policyholder premiums based on indicators like age, occupation, geographical location, and hobbies. Algorithms are also utilized in the criminal justice system, where they’re applied at nearly every stage of the legal process to help judges and officials make decisions.
Proprietary algorithms like COMPAS, used in states like Florida and Wisconsin, determine an individual’s risk of recidivism on a scale of 1 to 10, with certain numbers corresponding to low, medium, and high risk — the same rubric Uber’s patent follows.
Though Uber aims to predict “safety” risk in its patents, it faces the same fundamental questions of fairness and accuracy leveled at criminal justice algorithms. (The bias inherent in those algorithms has been pointed out again and again.) If the design of an algorithm is flawed from the outset, its outcomes and predictions will be too. In the criminal justice context, rearrest is an imperfect proxy for recidivism because arrests are so closely tied to factors like where you live, whether you interact with the police, and what you look like, Gillula said.
Uber’s current rating system, which allows riders and drivers to rate one another on a five-star scale, is similar to the interpersonal behavior category described in Uber’s safety risk scoring patent: Both rely on subjective judgments that are the basis for doling out punishments. Under its current system, Uber “deactivates” or fires drivers whose ratings drop below a certain threshold. The policy has long infuriated drivers who say they have no real way of contesting unfair ratings: They’re funneled through a support system that prioritizes passengers and rarely provides a satisfactory resolution.
Bhairavi Desai, executive director of the New York Taxi Workers Alliance, said drivers are not protected from passengers’ racism, bias, or bigotry. “We’ve talked to drivers who feel like they’ve gotten a lower rating because they’re Muslim,” she said. “I know of African American drivers who stopped working for them because they felt that they would be rated lower.”
Former driver Thomas Liu sued Uber last October, proposing a class-action suit on behalf of nonwhite drivers who were fired based on racially “biased” ratings. Williams said the safety score would be subject to the same concerns: “People could put a safety report in just because they don’t like a driver. It could be racially biased, and there could be a lot of misuse of it.”
Varinder Kumar, a former New York City yellow cab driver, was permanently deactivated by Uber in 2019. He’d been driving for Uber every day for nearly five years, and the deactivation meant the sudden loss of $400 to $500 per week.
“You ask them what happened, they always say it’s a safety issue.”
“I went to the office five times, I emailed them, and they said it was because one customer complained,” Kumar said. “Whenever you go there, you ask them what happened, they always say it’s a safety issue. I’ve been driving in New York City since 1991 and had no accident, no ticket, so I don’t know what kind of safety they’re looking for.”
The kind of safety outlined in Uber’s safety risk scoring patent isn’t clear to Kumar either. He said the interpersonal behavior reporting would cause the same problems as Uber’s rating system: “Customers file a complaint even if they are not 100 percent right.” Meanwhile, the vehicle operation category could unfairly penalize New York City drivers who need to drive more aggressively.
Joshua Welter, an organizer with Teamsters 117 and the affiliated Drivers Union, said algorithmic discipline remains a top issue for drivers. “It’s no wonder Uber and Lyft drivers across the country are rising up and taking action for greater fairness and a voice on the job, like due process to appeal deactivations,” Welter said. “It’s about basic respect and being treated as a human being, not a data experiment.”
The basis for Uber’s safety experimentation is user data, and Daniel Kahn Gillmor, a senior staff technologist at the American Civil Liberties Union’s Speech, Privacy, and Technology Project, said Uber is “sitting on an ever-growing pile of information on people who have ever ridden on its platform.”
“This is a company that does massive experimentation and has shown little regard for data privacy,” he added.
In addition to a vast trove of data gathered from over 10 billion trips, Uber collects telematics data from drivers — such as their car’s speed, braking, and acceleration — using GPS data from their devices. In 2016, it launched a safety device called the Uber Beacon, a color-changing orb that mounts to a car’s windshield. It was announced as a device that assisted with rider pickups, with no mention that it contained sensors for collecting telematics data. In a now-deleted 2018 blog post, Uber engineers touted the Beacon as a device solely managed by Uber for testing algorithms and said it collected better data than drivers’ devices.
Brian Green, director of technology ethics at Santa Clara University’s Markkula Center for Applied Ethics, questioned the motives behind Uber’s data collection. “If the purpose of [Uber’s] surveillance system is to promote trust — if a corporation wants to be trustworthy — they have to allow the public to look at them,” he said. “A lot of tech companies are not transparent. They don’t want the light shone on them.”
Welter said that when companies like Uber experiment with worker discipline based on black box algorithms, “both workers and consumers alike should be deeply concerned whether the reach of big data into our daily lives has gone too far.”
In addition to providing a view into Uber’s safety vision, the patents demonstrate the scope of its machine learning ambitions.
“We’re dealing with so many people we don’t know that tech and surveillance steps in to build an artificial trust.”
Uber considers AI essential to its business and has made significant investments in it over the past few years. Its internal machine learning platform helps engineering teams apply AI to optimize trip routes, match drivers and riders, mine insights about drivers, and build more safety features. Uber already uses algorithms to process over 90 percent of its rider feedback (the company said that it receives an immense amount of feedback by design, the majority of which is not related to safety).
Algorithmic safety scoring and risk assessment also fit under Uber’s rider safety initiative and its efforts to ensure safe drop-offs for its growing delivery platform. Experts said the systems are not as far from reality as tech companies’ patents sometimes are. In a statement, Uber said that “patent applications are filed on many ideas, but not all of them actually become products or features.”
But some of Uber’s safety-related patents have close parallels with widely utilized features: a 2015 patent for “trip anomaly” detection bears similarities to Uber’s RideCheck feature; a patent application filed last year resembles its technology for anonymizing pickup and drop-off locations; and a 2015 application for verifying drivers’ identity with selfies matches its security selfie feature.
Green said Uber’s patents reflect a broader trend in which technology is used as a quick fix for deeper societal issues. “We’re dealing with so many people we don’t know that tech and surveillance steps in to build an artificial trust,” he said.
That trust can only extend so far during the pandemic, which has underscored the economic uncertainty drivers face and the limits of technology’s promise of safety. Rolling out such systems now would mean there’s even more at stake.