Cutting corners: Machine learning models can learn to cheat, too

Published: February 12, 2024

Author: Jaimie Patterson

Category: Research

Deep learning, a subset of machine learning based on artificial neural networks, has the potential to transform health care through advanced computer vision models, which are theoretically capable of helping human doctors detect and diagnose disease. However, for safe deployment in the real world, machine learning models must be resistant to fluctuations in data and provide reliable results, according to experts.

Recently introduced legislation would require machine-learning-based automated decision systems to undergo rigorous algorithmic auditing and impact assessment before going live. In support of this mandate, researchers from Johns Hopkins and the FDA propose a new approach: Instead of just examining the models, audit the data they’re trained on first to effectively eliminate hidden biases.

According to the researchers, ML models are notoriously good at creating shortcuts based on their training data, which may result in inaccurate conclusions. To combat this phenomenon, the team developed a novel technique for the rigorous screening of medical image datasets to identify features or characteristics that may result in so-called “shortcut learning.” They presented their new approach at the 26th International Conference on Medical Image Computing and Computer-Assisted Intervention last year.

“Shortcut learning occurs when a model tries to improve its performance by learning to rely on a feature that doesn’t align with the intentions of its human designers,” explains co-first author Mitchell Pavlak, a doctoral candidate in the Whiting School of Engineering’s Department of Computer Science and a member of the Advanced Robotics and Computationally AugmenteD Environments (ARCADE) Lab. He is advised by Mathias Unberath, an assistant professor of computer science and the principal investigator of the ARCADE Lab.

For example, because images of malignant skin lesions often include some measurement to indicate lesion size, an ML model may assume that the presence of a ruler in a picture of a patient’s skin means that the patient’s lesion is definitely malignant, not benign. The opposite goes for medical drains in chest X-rays—if there isn’t a drain in the image, a model may incorrectly assume that the patient in question doesn’t have a medical condition associated with the use of that drain, even if that condition is what they’re currently being screened for.

To anticipate—and hopefully avoid—these shortcuts, the JHU-FDA team audited the data used to train such models, allowing them to determine which features downstream models are most likely to use in shortcuts later on.

“By doing this, we can efficiently guide audits for any model trained on the same data and preemptively identify dataset-level flaws,” says Pavlak.

While previous studies have focused on identifying biases of individual ML models, the JHU-FDA team considered whether they could detect such biases in image datasets themselves. The researchers broke down the risk of bias from a particular attribute into two components: utility, or how useful that attribute would be for completing the task if the model knew it beforehand, and detectability, or how easy it is for a model to correctly extract the attribute from an image.
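
To make the two components concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the team's published implementation: it assumes utility can be approximated by the mutual information between an attribute and the diagnosis label, and detectability by the mutual information between the true attribute and a simple detector's guess. The function names, the threshold detector, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written in terms of raw counts
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi


def utility(attribute, labels):
    """Assumed proxy: how much knowing the attribute tells us about the label."""
    return mutual_information(attribute, labels)


def detectability(image_statistic, attribute, threshold):
    """Assumed proxy: how well a simple threshold detector recovers the attribute."""
    guesses = [int(value > threshold) for value in image_statistic]
    return mutual_information(guesses, attribute)


# Toy data, made up for illustration: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
ruler_present = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical per-image statistic that a detector might compute.
edge_score = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.2, 0.6]

u = utility(ruler_present, labels)
d = detectability(edge_score, ruler_present, threshold=0.5)
print(f"utility={u:.3f} bits, detectability={d:.3f} bits")
```

An attribute that scores high on both measures would be a natural first candidate for auditing, since it is both informative about the diagnosis and easy for a model to pick up.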

They found that these measurements of utility and detectability were highly predictive of which features downstream models end up relying on, allowing the team to better prioritize their auditing efforts. The resultant screening method can be used in conjunction with existing techniques to efficiently identify and minimize shortcut learning in preexisting models, they say.

To prove that their method reliably identifies nearly imperceptible bias-inducing artifacts, the team conducted a case study using an existing dermoscopic image dataset. By systematically introducing various synthetic biases to the dataset, they were able to evaluate their method’s sensitivity to extremely subtle changes in the data. For example, they found that something as simple as the JPEG quality a person chooses when saving an image can result in detectable shortcuts later on.
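
The idea of injecting a synthetic bias can be sketched in a few lines of Python. This is a hypothetical illustration rather than the team's actual procedure: the function name, the two quality levels, and the `bias_strength` parameter are assumptions. The sketch only decides which JPEG quality each image would be re-saved at; the re-encoding itself would be handled by an image library.

```python
import random


def assign_jpeg_quality(labels, bias_strength, low_q=75, high_q=95, seed=0):
    """Pick a JPEG quality per image so that quality correlates with the label.

    bias_strength = 0.0: quality is a coin flip, unrelated to the label.
    bias_strength = 1.0: malignant (1) always gets low_q, benign (0) always high_q.
    """
    rng = random.Random(seed)  # fixed seed keeps the synthetic bias reproducible
    qualities = []
    for label in labels:
        if label == 1:
            p_low = 0.5 + 0.5 * bias_strength
        else:
            p_low = 0.5 - 0.5 * bias_strength
        qualities.append(low_q if rng.random() < p_low else high_q)
    return qualities


# Toy labels, made up for illustration: 1 = malignant, 0 = benign.
toy_labels = [1, 0, 1, 0, 1, 0]
fully_biased = assign_jpeg_quality(toy_labels, bias_strength=1.0)
mildly_biased = assign_jpeg_quality(toy_labels, bias_strength=0.2)
print(fully_biased)
print(mildly_biased)
```

Sweeping `bias_strength` from 0 toward 1 gives a family of datasets with progressively stronger artifacts, which is one way to probe how subtle a bias a screening method can still pick up.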

“Our most interesting finding—as measured by our screening method—was that downstream deep learning models trained on this flawed dataset are very likely to become biased by the source of the data,” Pavlak says. “Specifically, knowing which camera was used to capture the image conveys substantial information as to whether or not a patient had a malignant skin lesion. This relationship is clearly problematic, as it can lead to inflated model performance metrics and is unlikely to hold true in the real world.”

By pinpointing the most likely causes of model failures before any model is trained on potentially biased data, the research team expects that their screening method will empower researchers to perform more systematic algorithmic audits. They also hope their work will guide future data collection efforts toward the development of safer and more reliable AI models.

“Our proposed method marks a positive step forward in anticipating and detecting unwanted bias in machine learning models,” says Pavlak. “By focusing on dataset screening, we aim to prevent downstream models from inheriting biases that are already present and exploitable in data.”

Additional authors of this work include co-first author Nathan Drenkow, a doctoral candidate in the department, a member of the ARCADE Lab, and a senior research scientist at the Johns Hopkins Applied Physics Laboratory; Nicholas Petrick, the deputy director of the Division of Imaging Diagnostics and Software Reliability in the Office of Science and Engineering Laboratories at the U.S. Food and Drug Administration’s Center for Devices and Radiological Health; and Mohammad Mehdi Farhangi, a staff fellow at the FDA CDRH.
