Reliability of ultrasonography measurement of the anterior talofibular ligament (ATFL) length in healthy subjects (in vivo), based on examiner experience and patient positioning

Background The most common cause of ankle injury is supination trauma, which causes a partial or complete rupture of the anterior talofibular ligament (ATFL). Among the diagnostic tools and procedures used for sports injuries, stress-ultrasonography is reportedly a promising method for examining injuries of the lateral ligaments of the ankle. Preceding studies predominantly compared stress-ultrasonography with other established diagnostic tools in terms of efficacy, viability and quality. The purpose of this study was to assess the reliability of stress-ultrasonography of the ATFL based on varying examiner experience and patient positioning. Method Sixteen healthy subjects were examined by four examiners with differing levels of skill and experience in ultrasonography, ranging from layman to specialist. Measurements were recorded in four positions, namely a neutral position (A), medial rotation (B), plantar flexion (C) and inversion of the foot (D), and the intraclass correlation coefficient (ICC) was calculated for each position. Results The length of the ATFL was 14.958 ± 2.145 mm in position A, 15.886 ± 1.994 mm in position B, 16.270 ± 1.858 mm in position C and 15.170 ± 1.781 mm in position D. The average length change was 0.928 ± 0.804 mm (6.656 ± 6.299%) in position B, 1.313 ± 1.266 mm (9.746 ± 9.484%) in position C and 0.213 ± 1.807 mm (2.604 ± 12.308%) in position D. The ICC of the combined results of all four investigators was 0.333 for position A, 0.386 for position B, 0.320 for position C and 0.517 for position D. The highest ICC (0.811) was recorded between the orthopedic specialist and the radiology specialist. The lowest ICC (0.299) was recorded between the layman and the radiology specialist. Conclusion The reliability of the ATFL examination seems to be highly dependent on the examiner's experience and skill in ultrasonographic (US) diagnostics. Moreover, the inversion positioning of the foot described by the European Society of Musculoskeletal Radiology (ESSR) yielded the highest measurement reliability.


Background
Injuries to the ligamentous structures of the ankle joint are among the most prevalent injuries in sports, with the anterior talofibular ligament (ATFL) affected in 65% of cases as a consequence of supination trauma (Roos et al., 2017). As a result of inadequate injury management and repeated injury, 10-20% of ankle injuries lead to chronic instability of the joint (Walther et al., 2013). To prevent long-term complications of ATFL injury, such as chronic instability and osteoarthritis, therapeutic planning is vital and ought to commence swiftly and precisely (Kerkhoffs et al., 2002; Delahunt et al., 2018).
Standard diagnostic methods for lateral ligament injury of the ankle described in the literature include clinical examination, X-ray, magnetic resonance imaging (MRI) and arthrometer stress testing. However, the drawbacks of these tools include mediocre reproducibility, financial and time expenditure and, in the case of X-ray, radiation exposure (Kerkhoffs et al., 2002). As various authors have indicated, the ultrasonographic (US) stress test of the ligament apparatus of the talocrural joint is a viable supplement to other imaging diagnostics and clinical examination because it is cost-efficient, noninvasive and time-saving (Friedrich et al., 1990). Furthermore, preliminary findings underline the accuracy of US stress-testing compared to the abovementioned diagnostic tools (Cheng et al., 2014). The available results on this topic stem from examiners with different levels of professional experience, the majority of whom were professional sonographers.
To date, no statement has been made as to whether, and to what extent, examiner experience affects the reliability of US examination of the ATFL. To take advantage of the effectiveness of this diagnostic tool and to administer adequate therapeutic management, evidence-based guidance for forecasting reliability based on the examining physician's skill would be advantageous. Therefore, we intend to assess the reliability of stress-ultrasonography measurement of the ATFL, based on varying examiner experience and skill levels in four patient positions, using the model of interrater reliability. We hypothesize that an increasing degree of experience and skill is positively correlated with good/excellent interrater reliability.

Patient selection
Between November 2017 and February 2018, a total of 16 subjects were selected for US examination of the ATFL, consisting of 11 males and 5 females with a median age of 25 years (range 19 to 55 years). Each subject was required to complete a written questionnaire covering age, height, athletic activity and injury history, including acute injury or chronic instability of the ankle. Subjects with at least one ankle free of acute injury and prior injury history were included in the study. If a subject reported acute injury or chronic instability of one ankle, the healthy ankle was examined. If the subject reported no history of ankle injuries or chronic instability in either ankle, the joint to be examined was selected at random.

Investigators
The US evaluation of the ATFL of each subject was conducted by four male investigators, each with a distinct level of skill and experience in US diagnostics (Table 1). Investigator #1 (layman) was a student of the University of Applied Sciences FH Technikum Vienna, with no prior experience or skill in the field of musculoskeletal US diagnostics. Prior to this investigation, he received a basic introduction to US diagnostics of the ATFL, consisting of 40 h of hands-on tutorial training. Investigator #2 (medical student) was a fourth-year medical student of the Medical University of Vienna with basic US knowledge and training imparted by the medical school curriculum. Prior to this investigation, he had 1.5 years of professional experience in US musculoskeletal diagnostics. Investigator #3 (orthopedic specialist) was a specialist in sports orthopedics and orthopedic surgery with 20 years of experience in US diagnostics of the musculoskeletal apparatus. Investigator #4 (radiology specialist) was a specialist in radiology with a focus on musculoskeletal imaging, who had 15 years of experience in US imaging of the musculoskeletal apparatus.

Hardware
All examinations were performed with a NextGen LOGIQ e ultrasound console operating a high-frequency linear array L8-18i-RS stick ultrasound probe by GE Healthcare (Wauwatosa, Wisconsin, United States of America) (General Electric Company, 2014). The transducer has a footprint of 11.1 × 34.8 mm and an imaging frequency bandwidth of 6.7-18.0 MHz (General Electric Company, 2014). All examinations were performed at a frequency of 18.0 MHz.

Transducer handling
To visualize the ATFL, the transducer was placed in the transverse plane, perpendicular to the subject's longitudinal axis. With the palpable lateral malleolus as the reference point, the transducer was placed 1 cm proximal to the most distal palpable bony part of the fibula, projecting the left-aligned, convex, echogenic outline of the lateral malleolus on the imaging unit. Subsequently, using this fixed point, the transducer was slightly rotated along the longitudinal axis of the foot towards the talus until the outline of the talus could be identified as an ascending echogenic line opposing the fibula. This resulted in the sectional image in which the lateral malleolus, talus and bridging ATFL could be distinguished (Figs. 1, 2, 3, 4).

Data collection
The length of the ATFL was measured with the measurement utility of the LOGIQ e ultrasound console. To ensure reproducibility of the measurements, the linear distance between the lateral malleolus and the foremost point of the lateral joint surface of the talus, corresponding to the bony attachment sites of the ligament, was selected as described by Croy et al. (Croy et al., 2012). After full visualization of the ATFL width, the protocol was to select the center of each ligament insertion point on the US cross-section (Fig. 1).

Patient positioning
The measurements were conducted with the subjects in four distinct positions. Each investigator independently visualized and measured the ATFL of all 16 subjects in each of the four widely used positions, three times per position, to produce an average length per examiner per position. Between measurements, the subjects relaxed their lower extremity to avoid distortion of the range of motion by developing muscle tension.
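The averaging of the three repeated readings can be sketched as follows. This is a minimal illustration in Python with NumPy, assuming the raw readings are stored in a four-dimensional array indexed by examiner, subject, position and repetition; the array layout, the function name `mean_length_per_position` and the simulated values are hypothetical and not taken from the study.

```python
import numpy as np

def mean_length_per_position(readings_mm):
    """Average repeated readings into one length per examiner, subject and position.

    readings_mm: array of shape (examiners, subjects, positions, repetitions).
    Returns an array of shape (examiners, subjects, positions).
    """
    readings = np.asarray(readings_mm, dtype=float)
    if readings.ndim != 4:
        raise ValueError("expected shape (examiners, subjects, positions, repetitions)")
    # Collapse the repetition axis so each examiner has one value per subject and position.
    return readings.mean(axis=-1)

# Hypothetical layout matching the protocol: 4 examiners, 16 subjects,
# 4 positions (A to D), 3 repetitions. Values are simulated, not study data.
rng = np.random.default_rng(0)
simulated = rng.normal(loc=15.0, scale=2.0, size=(4, 16, 4, 3))
averaged = mean_length_per_position(simulated)
print(averaged.shape)  # (4, 16, 4)
```

For a single position, the slice `averaged[:, :, p].T` gives a subjects-by-examiners table, which is the layout usually expected by interrater reliability routines.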
In the first position (position A), the subject was seated on the examination table, with the calf of the designated lower extremity resting across the examiner's knee and the ankle suspended in slight (10-20 degrees) plantarflexion. This position, also described by Cho et al. (2016), was taken as the neutral resting position and served as the baseline for each subsequent measurement of the ATFL (Fig. 2).
To assume position B, the subject's ankle was passively rotated medially by the investigator. The subject was seated on the examination table, with the calf of the designated lower extremity resting across the examiner's knee and the ankle suspended. The measurement of the ATFL was conducted at maximal internal rotation stress of the talocrural joint (Fig. 3) (Cho et al., 2016).
With the subject's calf remaining on the examiner's knee, position C was assumed through maximal plantarflexion of the ankle by the examiner (Fig. 4).
Positions B and C are also part of widely used clinical examination protocols (DeLee & Miller, 2018; Campbell et al., 2017; Thompson & Miller, 2015). As the standard patient positioning for the ATFL defined by the European Society of Musculoskeletal Radiology (ESSR), position D required the patient to sit on the examination table with the knee bent 45 degrees and the sole of the foot placed flat on the examination table (Beggs et al., 2010; Lee & Yun, 2017). Next, the foot was placed in maximal inversion so as to tense the lateral ligaments (Fig. 5).

Data processing
To examine the reproducibility, the interrater agreement of the investigators' results for each position (A, B, C, D) was analyzed by means of the intraclass correlation coefficient (ICC). As the investigators were deliberately selected beforehand, the two-way mixed model with average measures was applied. For the statistical analysis, the software SPSS Statistics 25 by IBM (Armonk, New York, United States of America) and Microsoft Excel (Microsoft Corporation, Redmond, Washington, United States of America) were utilized. Absolute agreement of the measurement results was examined, whereby the ICC for each investigator per subject and position was determined in order to establish the grand average. ICC values with a 95% confidence interval were interpreted according to the interrater-agreement categories of Cicchetti (Cicchetti, 1981). To compare the mean change in length of the ATFL, the length of the ATFL at rest (position A) was first subtracted from the results of the stress tests (positions B, C and D). Subsequently, mean values and standard deviations of the length changes (Δl) were calculated for each position and then compared.
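The two computations described above can be illustrated with a short sketch. It is written in Python with NumPy rather than SPSS, and it rests on several assumptions that are not confirmed by the text: the absolute-agreement ICC is derived from a two-way ANOVA decomposition (single and average measures, often written ICC(A,1) and ICC(A,k)), the standard deviation of Δl is the sample standard deviation, and the percentage change is taken relative to the resting length. The function names and all example values are hypothetical.

```python
import numpy as np

def icc_absolute_agreement(ratings):
    """Absolute-agreement ICC from an (n_subjects, k_raters) table.

    Returns (single_measures, average_measures). The point estimates are the
    same for two-way random and two-way mixed models.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), raters (columns) and residual error.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_error = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average

def length_change(rest_mm, stress_mm):
    """Mean and SD of the length change, absolute (mm) and relative (%).

    Assumes the sample SD (ddof=1) and percentages relative to the resting length.
    """
    rest = np.asarray(rest_mm, dtype=float)
    stress = np.asarray(stress_mm, dtype=float)
    delta = stress - rest
    percent = 100.0 * delta / rest
    return delta.mean(), delta.std(ddof=1), percent.mean(), percent.std(ddof=1)

# Hypothetical lengths in mm: 5 subjects (rows) measured by 4 examiners (columns).
example = np.array([
    [14.2, 15.1, 14.8, 14.9],
    [16.0, 15.2, 16.3, 16.1],
    [13.5, 14.9, 13.8, 13.9],
    [15.7, 14.1, 15.5, 15.8],
    [17.1, 16.0, 16.9, 17.0],
])
single, average = icc_absolute_agreement(example)
print(f"single measures ICC: {single:.3f}, average measures ICC: {average:.3f}")

mean_dl, sd_dl, mean_pct, sd_pct = length_change([14.0, 15.0], [15.0, 15.5])
print(f"mean length change: {mean_dl:.3f} mm ({mean_pct:.3f} %)")
```

The sketch returns point estimates only; the 95% confidence intervals mentioned above would need the corresponding F-distribution bounds, which are omitted here.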

Statistical analysis
For statistical computation and evaluation, we used the software SPSS 20 (IBM, USA). The ICC across the results of all four investigators was 0.333 for position A, 0.386 for position B, 0.320 for position C and 0.517 for position D. The highest ICC, independent of investigator, was 0.517 in position D.
Examining the ICC between investigators, regardless of position, the ICC was highest between investigator #3 and investigator #4 at 0.811, followed by 0.524 between investigator #2 and investigator #3. The lowest ICC, 0.299, was recorded between the measurement results of investigator #1 and investigator #4 (Table 2).

Discussion
The most important finding of this study is that the reliability of US measurement depends on examiner skill and experience. Cheng et al. (2014) investigated the quality of US examinations in the hands of an experienced examiner in the diagnosis of chronic ATFL injuries, compared to arthroscopic findings and the respective surgery reports, with significant results in sensitivity, specificity and accuracy (Cheng et al., 2014). Lee et al. (2012) and Cho et al. (2016) compared the US stress-test in patients with chronic instability to the clinical stress-test, the radiographic stress-test, MRI and arthroscopy, defining US stress-testing as a viable additional diagnostic tool (Lee et al., 2014). Gün et al. (2013) established that there is no significant difference in the diagnostic accuracy of diagnostic ultrasonography and MRI (Gun et al., 2013). Most recently, Lee (2017) and Cao (2018) concluded that point-of-care US, while as precise as MRI, may save costs in patient management and increase quality of care, and that it is also more accurate in chronic ankle instability (Lee & Yun, 2017; Cao et al., 2018; Radwan et al., 2016). Given this evidence, one can argue that US imaging is a valuable diagnostic tool for physicians visualizing the lateral ankle ligaments of patients with supination trauma or chronic ankle instability. To our knowledge, our study is the first investigation to scrutinize the relationship between measurement results and examiner skill and experience. Furthermore, we are the first group to identify one position as superior among four common patient positions.
Observing the ICC between the individual examiners, the reliability of the layman's results appears poor, with ICCs of 0.390, 0.355 and 0.299 compared to the medical student, orthopedic specialist and radiology specialist, respectively. Without prior knowledge and skill in US imaging and with merely a basic introduction to hardware handling and patient positioning, the layman failed to produce reliable measurements of the ATFL. The orthopedic specialist and radiology specialist presented an ICC of 0.811 (> 0.75), the sole excellent correlation as per Cicchetti (Cicchetti, 1981). One can argue that the closeness of their measurements reflects the reliability of their diagnostic imaging, owing to their long experience in US imaging. The ICCs between the medical student and both the layman and the radiology specialist indicated poor correlation. However, the medical student recorded sufficient correlation (0.524) with the orthopedic specialist. As the measurements of the medical student showed higher correlation with the orthopedic specialist's results than the measurements of the layman did, one can reason that the medical student's basic skill and 1.5 years of professional experience with US imaging of the musculoskeletal apparatus are reflected in the increasing reliability of his US measurements. Despite sufficient correlation with the orthopedic specialist, the results of the medical student were barely commensurable with the results of the radiology specialist, at an ICC of 0.390. This may demonstrate the persistent gap in knowledge, experience and skill between the two investigators. Due to the sensitivity of the ultrasound system, even approximate reliability presupposes abundant practice.
As the results of this study show, the medical student may have benefited from basic US knowledge and skill imparted by his medical school curriculum, as well as from 1.5 years of professional experience in US imaging, as suggested by his sufficient reliability in marked contrast to the layman. Given the indication of increased diagnostic reliability with training and experience, investment in ultrasonography training early in the medical career ought to be sustained and developed.
With regard to the examination position and ICC, poor overall reliability (< 0.4, by Cicchetti, 1994) of the examination results was found in positions A, B and C. One possible explanation for this could be the 10-20 degrees of plantarflexion in the resting position (position A) specified by Cho et al. (2016), which leaves a 10-degree margin for fluctuating levels of tension of the ATFL (Cho et al., 2016). The factors determining the accuracy of measurement in positions B and C amount to investigator-induced variation. In these positions of medial rotation and plantarflexion, the examiner had to passively stress the talocrural joint. Therefore, variation in joint loading due to the individual investigator's force and grip cannot be discounted. Sufficient reliability (0.40-0.59, by Cicchetti, 1994) was observed in position D, the standard position described by the European Society of Musculoskeletal Radiology (ESSR). As the subject remained seated on the examination table with the designated leg flexed at 45 degrees in the knee and the foot placed flat on the table, the investigator's task during this examination was reduced to accurate placement of the transducer, with little freedom to manipulate the joint angle. This slim margin for variation is reflected in the highest ICC (0.517) of all four positions.
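The interpretation bands used in this discussion can be written out as a small lookup. The sketch below is illustrative only: the function name `cicchetti_category` is hypothetical, the label "sufficient" follows the wording of this article for the 0.40-0.59 band, the label "good" for 0.60-0.74 and the treatment of exactly 0.75 as excellent are assumptions, and the ICC values are those reported in this article.

```python
def cicchetti_category(icc):
    """Map an ICC value to the agreement category used in this article.

    Assumed bands: < 0.40 poor, 0.40-0.59 sufficient, 0.60-0.74 good,
    0.75 and above excellent.
    """
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "sufficient"
    if icc < 0.75:
        return "good"
    return "excellent"

# ICC values as reported in this article.
reported = {
    "position A": 0.333,
    "position B": 0.386,
    "position C": 0.320,
    "position D": 0.517,
    "orthopedic vs radiology specialist": 0.811,
    "medical student vs orthopedic specialist": 0.524,
    "layman vs radiology specialist": 0.299,
}

for label, value in reported.items():
    print(f"{label}: ICC {value:.3f} -> {cicchetti_category(value)}")
```

Applied to the reported values, positions A, B and C fall into the poor band and position D into the sufficient band, while only the comparison between the two specialists reaches the excellent band.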
The greatest elongation among all positions, compared to the resting position, occurred in maximal plantarflexion. At an average length of 16.270 ± 1.858 mm and an absolute change in length of 1.313 ± 1.266 mm, this position also exhibited the largest change in these values for each investigator. As briefly discussed above, measurement in this position arguably yielded elevated variability, arising from differing investigator grip and force, in addition to ankle soft tissue composition and individual ankle range of motion (Nigg et al., 1990).
One of the limitations of this study is that the sample consisted of healthy subjects without injury history. The results of this investigation are therefore confined to the measurement reliability of US examination as influenced by investigator experience and skill, rather than diagnostic reliability. To substantiate the clinical relevance of this investigation, further studies should examine patients with acute and/or chronic ligament injury. As injury to the ATFL can present with varying injury patterns, an increased sample size with analogous injury classifications could be planned. Nevertheless, as the evidence suggests (Croy et al., 2013; Mizrahi et al., 2018; Bai et al., 2013), the length of the ATFL does positively correlate with ligament injury, which supports US length measurement as a practical diagnostic tool.

Conclusion
Ultrasonographic imaging is a valuable diagnostic tool for physicians visualizing the lateral ankle ligaments, and the reliability of its measurements increases with examiner experience and skill. This suggests that investment in ultrasonography training early in the medical career ought to be sustained and developed. The patient positioning described by the European Society of Musculoskeletal Radiology (ESSR) yielded the highest measurement reliability.

Authors' contributions
K-HK: has made substantial contributions to conception and design, acquisition of data, analysis and interpretation of data; has been revising the manuscript critically for important intellectual content; has given final approval of the version to be published; has participated sufficiently in the work to take public responsibility for appropriate portions of the content; has agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
JSA: has made substantial contributions to conception and design, acquisition of data, analysis and interpretation of data; has been involved in drafting the manuscript and revising it critically for important intellectual content; has given final approval of the version to be published; has participated sufficiently in the work to take public responsibility for appropriate portions of the content; has agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
JW: has made substantial contributions to conception and design, acquisition of data, analysis and interpretation of data; has been revising the manuscript critically for important intellectual content; has given final approval of the version to be published; has participated sufficiently in the work to take public responsibility for appropriate portions of the content; has agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
FH: has made substantial contributions to conception and design, acquisition of data, analysis and interpretation of data; has been revising the manuscript critically for important intellectual content; has given final approval of the version to be published; has participated sufficiently in the work to take public responsibility for appropriate portions of the content; has agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
HP: has made substantial contributions to conception and design, acquisition of data, analysis and interpretation of data; has been revising the manuscript critically for important intellectual content; has given final approval of the version to be published; has participated sufficiently in the work to take public responsibility for appropriate portions of the content; has agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.