Table 2 Definition of key terms for quality and safety in medical AI research

From: A practical guide to the implementation of AI in orthopaedic research – part 1: opportunities in clinical application and overcoming existing challenges

Term

Definition

Multimodal

In health data, multimodality refers to an AI system's use of several distinct data sources, such as electronic health records, medical imaging, wearable sensors, and patient-reported outcome measures

Provenance

Thorough reporting of the origin and analysis of the data, model preparation, and model validation. Documentation is essential for error detection, reliability, and reproducibility in AI-based research

Black box decision-making

Certain AI algorithms use methods for decision-making or predictive tasks that are uninterpretable to human observers. Black box models compromise the reliability and transparency of AI systems, as well as the potential for researchers to gain clinically relevant insights from such algorithms

Explainability

The ability to trace how an AI system reached a conclusion in terms of its input variables. Explainability is a key feature for error detection, bias elimination, and building trust in AI systems. Explainability also facilitates the inclusion of clinically relevant variables for model development

Distributional shift

Changes in the characteristics or patterns of the test population relative to the training data, or biased training data, may decrease the accuracy of an AI prediction system because the model fails to adapt to unfamiliar data

Adversarial example

Data constructed to differ from the training examples, which may trick AI models into making incorrect predictions and jeopardize the safety of clinical prediction systems

Robustness

The proficiency of an AI system at handling real-world data with large variations or deviations from the assumptions held by the trained models (missing data, outliers, adversarial examples)

Generalizability

The ability of AI systems to adapt to and correctly interpret data they were not trained on, thereby ensuring the elimination of hidden biases in datasets. Generalizable AI systems deliver consistent performance with patient groups that are adequately represented, as well as those underrepresented in the training data

Reproducibility

The ability of AI systems to produce consistent results when repeatedly trained on the same dataset

Replicability

The ability of AI systems to produce consistent results when repeatedly trained on different datasets

Uncertainty quantification

The process of measuring and determining the magnitude of uncertainty in the results of a predictive model based on input parameters, model characteristics, and inherent biases in the modeled system

Data labelling (annotation)

The task of identifying instances of relevant variables in a given dataset, such as predictors and outcomes, necessary to train models for the assessment of unlabeled test data
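
The distinction between reproducibility and replicability defined above can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from the article: synthetic data drawn from y = 2x + 1 with Gaussian noise, a least-squares line as the "model", and bootstrap resampling as the only random step in training. The function names (make_dataset, fit_line, bootstrap_fit) are hypothetical and chosen for illustration only.

```python
import random
import statistics


def make_dataset(seed, n=200):
    """Simulate (x, y) pairs from y = 2x + 1 plus Gaussian noise (synthetic data)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line; returns (slope, intercept)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_fit(xs, ys, seed):
    """Fit on a bootstrap resample; the seed fixes the only random step in training."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(xs)) for _ in range(len(xs))]
    return fit_line([xs[i] for i in idx], [ys[i] for i in idx])


if __name__ == "__main__":
    xs_a, ys_a = make_dataset(seed=0)
    xs_b, ys_b = make_dataset(seed=1)  # a second, independent dataset

    # Reproducibility: same dataset, same seed, training repeated twice.
    run_1 = bootstrap_fit(xs_a, ys_a, seed=42)
    run_2 = bootstrap_fit(xs_a, ys_a, seed=42)
    print("reproducible:", run_1 == run_2)

    # Replicability: different datasets from the same process, results compared.
    slope_a, _ = bootstrap_fit(xs_a, ys_a, seed=42)
    slope_b, _ = bootstrap_fit(xs_b, ys_b, seed=42)
    print("slope difference across datasets:", abs(slope_a - slope_b))
```

In this sketch, reproducibility is checked as exact agreement between repeated training runs on one dataset, whereas replicability is judged by how close the fitted slopes are across independently drawn datasets. What counts as "consistent" in a clinical study would need a tolerance chosen for that application.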