l 1.0, indicating no statistically significant difference between the CNNCF results and the expert evaluations.

Fig. 5 Boxplots of the F1 score, kappa score, and specificity for the CNNCF and ex