A practical guide to the implementation of AI in orthopaedic research – part 1: opportunities in clinical application and overcoming existing challenges

Zsidai, Bálint; Hilkert, Ann-Sophie; Kaarre, Janina; Narup, Eric; Senorski, Eric Hamrin; Grassi, Alberto; Ley, Christophe; Longo, Umile Giuseppe; Herbst, Elmar; Hirschmann, Michael T.; Kopf, Sebastian; Seil, Romain; Tischer, Thomas; Samuelsson, Kristian; Feldt, Robert

doi:10.1186/s40634-023-00683-z

Journal of Experimental Orthopaedics

Table 2 Definition of key terms for quality and safety in medical AI research

Term	Definition
Multimodal	In terms of health data, multimodality refers to the many distinct sources of data used by an AI system, such as electronic health records, medical imaging, wearable sensors, patient-reported outcome measures, and others
Provenance	Thorough reporting of the origin and analysis of the analyzed data, model preparation, and model validation. The importance of documentation is paramount to ensure error detection, reliability, and reproducibility in AI-based research
Black box decision-making	Certain AI algorithms use methods for decision-making or predictive tasks that are uninterpretable to human observers. Black box models compromise the reliability and transparency of AI systems, as well as the potential for researchers to gain clinically relevant insights from such algorithms
Explainability	The possibility to trace how an AI system reached a conclusion in terms of input variables. Explainability is a key feature for error detection, bias elimination, and building trust in AI systems. Explainability also facilitates the inclusion of clinically relevant variables for model development
Distributional shift	Changes in the characteristics or patterns of the test population and biased training data may lead to decreased accuracy of an AI prediction system, as the model fails to adapt to unfamiliar data
Adversarial example	Data constructed to differ from the training examples, which may trick AI models into making incorrect predictions and jeopardize the safety of clinical prediction systems
Robustness	The proficiency of an AI system at handling real-world data with large variations or deviations from the assumptions held by the trained models (missing data, outliers, adversarial examples)
Generalizability	The ability of AI systems to adapt to and correctly interpret data they were not trained on, thereby ensuring the elimination of hidden biases in datasets. Generalizable AI systems deliver consistent performance with patient groups that are adequately represented, as well as those underrepresented in the training data
Reproducibility	The ability of AI systems to produce consistent results when repeatedly trained on the same dataset
Replicability	The ability of AI systems to produce consistent results when repeatedly trained on different datasets
Uncertainty quantification	The process of measuring and determining the magnitude of uncertainty in the results of a predictive model based on input parameters, model characteristics, and inherent biases in the modeled system
Data labelling (annotation)	The task of identifying instances of relevant variables in a given dataset, such as predictors and outcomes, necessary to train models for the assessment of unlabeled test data
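
As a hedged illustration of the "Distributional shift" entry above, the following minimal Python sketch trains a deliberately simple threshold classifier on one synthetic cohort and evaluates it on a second cohort whose feature values are shifted. Everything here is an assumption made for illustration and does not come from the article: the synthetic data, the cohort means, and the helper names `make_cohort`, `fit_threshold`, and `accuracy` are all hypothetical.

```python
import random
import statistics


def make_cohort(rng, n, mean_neg, mean_pos):
    """Draw n synthetic (feature, label) pairs (hypothetical data, not clinical)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        mean = mean_pos if label else mean_neg
        data.append((rng.gauss(mean, 1.0), label))
    return data


def fit_threshold(train):
    """Use the midpoint between the two class means as a very simple 'model'."""
    neg = [x for x, y in train if y == 0]
    pos = [x for x, y in train if y == 1]
    return (statistics.mean(neg) + statistics.mean(pos)) / 2


def accuracy(threshold, data):
    """Fraction of cases where 'feature above threshold' matches the label."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

# Training cohort and a test cohort drawn from the same distribution.
train = make_cohort(rng, 2000, mean_neg=0.0, mean_pos=3.0)
same_dist = make_cohort(rng, 2000, mean_neg=0.0, mean_pos=3.0)

# Test cohort whose feature values are shifted upward by 2 units in both classes.
shifted = make_cohort(rng, 2000, mean_neg=2.0, mean_pos=5.0)

threshold = fit_threshold(train)
acc_same = accuracy(threshold, same_dist)
acc_shifted = accuracy(threshold, shifted)

print(f"threshold learned on training cohort: {threshold:.2f}")
print(f"accuracy on unshifted test cohort:    {acc_same:.3f}")
print(f"accuracy on shifted test cohort:      {acc_shifted:.3f}")
```

Under these assumptions, the accuracy on the shifted cohort should fall well below the accuracy on the unshifted cohort, because the threshold learned from the training data no longer separates the two classes. The fixed seed also touches on the "Reproducibility" entry: rerunning the script on the same data yields the same threshold.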
