Students and colleagues will certainly remember my harsh criticism of a not-anymore-so-novel ultrasonic scaler, PerioScan, whose manufacturer claims that it might be able to detect remaining subgingival calculus. According to the manufacturer, Sirona,
“PerioScan is an ultrasonic unit with treatment and diagnosis all-in-one, thereby offering new dimensions [sic] to periodontology. The unit detects and removes calculus by using a gentle method of treatment preventing the accumulation which causes periodontitis.
The tooth surfaces are being analysed by the touch of the ultrasonic tip (scaler) on the basis of the physical oscillation pattern. Indicator for the presence of calculus on the root surface is the blue LED lighting integrated into the tip of the handpiece. The LED colour changes to green when it contacts healthy surface. Two containers allow the use of different irrigating liquids during treatment.
The user interface is large and clear so that the user can have a glance at the data and the settings during the treatment at all times. The design impression and the colour concept of the interface are a reference for present and future projects of the entire product range of Sirona.”
The rationale for the development of a smart device for both detecting (based on “fuzzy-pattern recognition”) and removing calculus (by conventional ultrasound) has certainly been the recognition of a common disadvantage of any ultrasonic scaling device: the lack of tactile control over whether the entire subgingival root surface has actually been machined. I had used this device for years as a striking example of a lack of clinical evidence after 20 years of development. Apart from a 2008 pilot study (which actually tested the validity of calculus assessment after extraction of the teeth), there is no evidence that it leads to better clinical responses to subgingival scaling, which might justify the rather high acquisition costs. But there is more which should be used for teaching undergraduates.
A recent clinical study by Corraini and Lopez (2014) claims that “the reliability of subgingival calculus recordings using the ultrasound technology is reasonable [sic].” The authors performed calculus assessments at an impressive 22,584 sites in 147 adult periodontitis patients twice within, on average, 9 days. They report 65% agreement of assessments.
Well, when it comes to reliability, let’s look at its usual measure, Cohen’s kappa. Table 2 of the article (see below) provides results of calculus detection at baseline and retest. Thirty-five percent of sites were definitely misclassified at one or the other time of examination. That the validity of the measurement has not been and cannot be determined with certainty in a clinical setting (which likely adds considerably more misclassification) was not discussed in the paper. So, any therapist has to take false negative and false positive signals (green LED or acoustic alert) into account and consequently hardly knows anymore when scaling can be finished.
The observed agreement in the study by Corraini and Lopez, Po, was (8,010 + 6,619)/22,584 = 0.648. Based on the Table’s marginal proportions, however, one had to expect a chance agreement of Pe = (12,001 × 11,974 + 10,583 × 10,610)/22,584² = 0.502. Kappa (i.e. agreement beyond chance) is then (Po − Pe)/(1 − Pe) = (0.648 − 0.502)/(1 − 0.502) = 0.293. Whether that is “reasonable” (apparently a new category for assessing reliability) or suffices for a new and expensive device may lie in the eye of the beholder, though. Some would actually suggest “fair” reliability; others would find it awfully low. Reliability was tested in untreated patients. What would be of much greater importance is, of course, whether reliability would differ after treatment.
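The kappa arithmetic above can be reproduced in a few lines. A minimal sketch in Python, with the cell counts and marginal totals taken from the figures quoted here (the 2×2 layout is my reading of Table 2, not reproduced from the article):

```python
# Reproducing the Cohen's kappa computation for Table 2 of
# Corraini & Lopez (2014), using the counts quoted in the text.

n_pos_pos = 8010   # calculus detected at both baseline and retest
n_neg_neg = 6619   # calculus absent at both examinations
total = 22584      # all assessed sites

# marginal totals (baseline positives/negatives, retest positives/negatives)
row_pos, row_neg = 12001, 10583
col_pos, col_neg = 11974, 10610

p_o = (n_pos_pos + n_neg_neg) / total                      # observed agreement
p_e = (row_pos * col_pos + row_neg * col_neg) / total**2   # chance agreement
kappa = (p_o - p_e) / (1 - p_e)                            # agreement beyond chance

print(round(p_o, 3), round(p_e, 3), round(kappa, 3))  # 0.648 0.502 0.293
```

Nothing exotic: the expected (chance) agreement is simply the sum of the products of matching marginals, scaled by the squared total, exactly as in the formula above.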
At least, the authors do briefly mention the wide variation of kappa among patients, between 0.03 [sic] and 0.52. As if a kappa of 0.03 in a certain patient would not demand a further close analysis of striking unreliability in this particular case (and less than fair reliability in others). The authors do point out (Fig. 2 in the article) “a U shape [of the distribution of the proportion of observed agreement of patients] such that agreement was highest when the extent of subgingival calculus was either low or high, and lowest with a medium extent of subgingival calculus,” a common observation when sampling repeat assessments for conducting a reliability study. Exactly for that reason the expected agreement should be considered as well. But why has the overall low kappa been concealed, if not because of a conflict of interest? The authors concede that the “device, its handpieces and inserts were borrowed [sic] from Sirona Dental Systems during the study period. All results belong to the authors according to a transfer agreement between Aarhus University and Sirona Dental Systems.” That would not prevent reporting bias, of course.
Calculus is not the cause of periodontal disease (vital bacteria are) and may reasonably be considered a sequel of the disease process, a common misconception which has to be corrected in considerable detail when teaching undergraduate students. Removing just calculus would not resolve the problems with bacterial colonization of root surfaces free of calculus. And in the definition of one particular form of periodontitis it is even mentioned, as a secondary criterion, that “the amount of microbial deposits [including dental calculus] [may be] inconsistent with the severity of periodontal destruction.” Regardless of all that, the low reliability of calculus detection by the PerioScan device alone should discourage the general dental practitioner or dental hygienist from investing in the expensive smart device.
Table 1 in the article indicates that few sites had moderately deep pockets of 4-6 mm (11%) and very few were deeper (2%). It is in these sites that a therapist would expect subgingival calculus, not really in the vast majority of 87% of sites with probing depths of 0 [sic] to 3 mm, where calculus, if it occurred, would probably even be visible. If the authors had confined their analysis to moderately deep and deep pockets, would reliability have been higher? Well, it might be so. Table 3 of the article presents odds ratios derived from a multilevel logistic regression model of sites with “disagreement” of calculus assessment, which appear to provide some evidence that shallow sites account for more of the disagreement. In particular, as compared to shallow sites, the odds of calculus disagreement were about one-third for deep sites with probing depths of 7 mm or more. In moderately deep pockets of 4-6 mm, the odds ratio was 0.74. Since disagreement was found in 35% of sites, and 87% were shallow, 11% moderately deep and 2% deep, these lower odds may in fact not be so clinically relevant. (Similar conclusions might be drawn for furcation involvement, which is addressed in Table 4 of the article.)
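The point can be made quantitative with a back-of-the-envelope sketch (my own arithmetic, not the authors’): if one assumes the reported odds ratios translate directly into per-depth disagreement odds, one can solve for the shallow-site odds that reproduce the overall 35% disagreement and see what the implied per-depth disagreement rates look like:

```python
# Back-of-the-envelope sketch (an assumption-laden illustration, not from
# the article): combine the site-depth distribution (87% / 11% / 2%), the
# reported odds ratios for disagreement (1.0 / 0.74 / ~1/3 relative to
# shallow sites) and the overall 35% disagreement to infer the implied
# per-depth disagreement probabilities.

fractions = [0.87, 0.11, 0.02]          # shallow / moderate / deep sites
odds_ratios = [1.0, 0.74, 1.0 / 3.0]    # disagreement odds relative to shallow
overall = 0.35                          # overall proportion of disagreement

def weighted_disagreement(odds_shallow):
    """Overall disagreement implied by a given shallow-site odds value."""
    return sum(f * (r * odds_shallow) / (1 + r * odds_shallow)
               for f, r in zip(fractions, odds_ratios))

# bisection for the shallow-site odds that reproduce the overall 35%
lo, hi = 0.01, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    if weighted_disagreement(mid) < overall:
        lo = mid
    else:
        hi = mid
odds_shallow = (lo + hi) / 2

# implied per-depth disagreement probabilities (shallow, moderate, deep)
probs = [(r * odds_shallow) / (1 + r * odds_shallow) for r in odds_ratios]
print([round(p, 2) for p in probs])
```

Under these assumptions, disagreement would run at roughly 36% in shallow sites against roughly 16% in deep sites; but since shallow sites make up 87% of the sample, almost all of the observed disagreement still comes from them.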
In conclusion, the reliability of calculus assessment as indicated by kappa, be it overall or in certain subjects, was poor, or fair at best. “Reasonable,” as the authors describe the reliability of PerioScan, is not a category, in particular when taking the far-reaching claims by the manufacturer and the high costs into account. The authors have probably already tested clinical responses to scaling with the smart device as compared to hand instrumentation. It would be interesting to ultimately learn more about the performance of the device and its efficacy in a clinical experiment.
29 October 2014 @ 6:16 am.