Results

The results of the nine individual readers are shown in Table 2, which also lists the results obtained by independently combining reader scores with CAD. The performance measure is the mean correct localization fraction of a reader in the false-positive fraction interval ranging from 0 to 0.1 (TPF10). In this study, the radiologists did not perform better than the non-radiologists (average TPF10 without CAD: 24.9% vs 25.2%). We computed average LROC curves for all readers, for the non-radiologists, and for the radiologists. These are shown in Figs. 2, 3, and 4, respectively.

Table 2 Reader detection performance in the false-positive fraction interval ranging from 0 to 0.1

| Reader | Without CAD TPF10 (%) | With CAD TPF10 (%) | Independent combination TPF10 (%) |
|---|---|---|---|
| Non-radiologists | | | |
| 1 | 41.1 | 51.3 | 43.3 |
| 2 | 35.3 | 51.5 | 41.7 |
| 3 | 16.0 | 25.9 | 26.3 |
| 4 | 15.4 | 25.2 | 27.4 |
| 5 | 18.3 | 41.9 | 26.7 |
| Average | 25.2 | 39.2 | 33.0 |
| Radiologists | | | |
| 6 | 24.3 | 32.3 | 33.6 |
| 7 | 24.8 | 28.8 | 30.2 |
| 8 | 30.2 | 25.7 | 37.0 |
| 9 | 20.2 | 30.4 | 30.0 |
| Average | 24.9 | 29.3 | 32.7 |
| Reader average | 25.1 | 34.8 | 32.9 |

Fig. 2 Average LROC curves obtained from the nine readers for the detection of cancers with and without using CAD. The false-positive fraction interval ranging from 0 to 0.1, where the mean correct localization fraction is computed, is highlighted in light yellow

Fig. 3 Average LROC curves obtained from the five non-radiologists

Fig. 4 Average LROC curves obtained from the four radiologists

With CAD, the performance of the average reader at low false-positive rates increased from 25.1% to 34.8%. Every reader except reader 8 improved their performance using CAD. The difference between reading with and without CAD for the average reader, measured by TPF10, was statistically significant (p = 0.012). The results confirm that performance may also be increased by independent combination with CAD scores, although the increase for the average reader (from 25.1% to 32.9%) was smaller than that obtained with interactive use of CAD.
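
As a quick consistency check, the averages and gains reported in Table 2 can be recomputed from the per-reader values. The Python sketch below does this under stated assumptions: the names `TABLE2`, `column_means`, and `readers_not_improved` are illustrative and not part of the study, and because the inputs are the rounded table entries, a recomputed average may differ from the published one in the last digit.

```python
# Per-reader TPF10 values (%) copied from Table 2:
# (without CAD, with CAD, independent combination).
TABLE2 = {
    1: (41.1, 51.3, 43.3),
    2: (35.3, 51.5, 41.7),
    3: (16.0, 25.9, 26.3),
    4: (15.4, 25.2, 27.4),
    5: (18.3, 41.9, 26.7),
    6: (24.3, 32.3, 33.6),
    7: (24.8, 28.8, 30.2),
    8: (30.2, 25.7, 37.0),
    9: (20.2, 30.4, 30.0),
}
NON_RADIOLOGISTS = (1, 2, 3, 4, 5)
RADIOLOGISTS = (6, 7, 8, 9)


def column_means(readers):
    """Return the mean of each Table 2 column over the given readers."""
    rows = [TABLE2[r] for r in readers]
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def readers_not_improved():
    """Readers whose TPF10 with CAD is not higher than without CAD."""
    return [r for r, (without, with_cad, _) in TABLE2.items()
            if with_cad <= without]


all_means = column_means(list(TABLE2))
# Gains of the average reader, in percentage points.
gain_interactive = all_means[1] - all_means[0]
gain_independent = all_means[2] - all_means[0]

if __name__ == "__main__":
    print("Reader average:", [round(m, 1) for m in all_means])
    print("Non-radiologists:", [round(m, 1) for m in column_means(NON_RADIOLOGISTS)])
    print("Radiologists:", [round(m, 1) for m in column_means(RADIOLOGISTS)])
    print("Readers not improved with CAD:", readers_not_improved())
```

Run on the table values, the only reader returned by `readers_not_improved()` is reader 8, and the gain of the average reader is larger for interactive use than for independent combination, in line with the text.
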
The difference we found between interactive use of CAD and independent combination was not statistically significant.

As an example, a mammogram of a woman with an invasive ductal carcinoma is shown in Fig. 5. In this case, seven of the nine readers correctly localized the cancer in both sessions but rated their finding substantially more suspicious in the session with interactive CAD enabled; one reader located the cancer correctly only in the session where CAD was enabled; and one reader assigned a slightly lower rating to the cancer in the session with CAD. In Fig. 6, the same case is shown with the activated CAD region.

The average time to read a case was 84.7 ± 61.5 s without CAD and 85.9 ± 57.8 s with CAD (Table 3). This difference in reading time between the two sessions was not significant (p = 0.13). The radiologists read the cases faster than the non-radiologists (70.0 ± 45.1 s vs 97.0 ± 70.0 s per case without CAD).

Fig. 5 Mediolateral oblique mammographic views of a woman with an invasive ductal carcinoma indicated by the arrow. Seven of the nine readers correctly localized the cancer in both sessions but rated their finding substantially more suspicious in the session with interactive CAD enabled; one reader located the cancer correctly only in the session where CAD was enabled; and one reader assigned a slightly lower rating to the cancer in the session with CAD

Fig. 6 The same case as in Fig. 5 with the activated CAD region.
The red contour and a CAD score close to zero indicate a high probability that this is a cancer

Table 3 Mammogram reading times (average reading time per case, in seconds)

| Reader | Without CAD | With CAD | P value |
|---|---|---|---|
| Non-radiologists | | | |
| 1 | 83.6 ± 47.0 | 111.5 ± 70.3 | 0.001 |
| 2 | 84.3 ± 59.2 | 67.7 ± 42.1 | 0.03 |
| 3 | 131.1 ± 65.1 | 129.5 ± 56.9 | 0.51 |
| 4 | 158.8 ± 68.1 | 146.0 ± 62.3 | 0.23 |
| 5 | 33.4 ± 29.6 | 35.2 ± 29.0 | 0.45 |
| Average | 97.0 ± 70.0 | 96.7 ± 67.4 | 0.97 |
| Radiologists | | | |
| 6 | 63.1 ± 45.6 | 58.9 ± 37.8 | 0.57 |
| 7 | 57.8 ± 31.7 | 70.8 ± 44.6 | 0.002 |
| 8 | 73.1 ± 44.1 | 73.1 ± 31.4 | 0.42 |
| 9 | 86.7 ± 52.1 | 88.6 ± 39.1 | 0.12 |
| Average | 70.0 ± 45.1 | 72.8 ± 39.8 | 0.02 |
| Reader average | 84.7 ± 61.5 | 85.9 ± 57.8 | 0.13 |

The CAD system had a lesion-based sensitivity of 80.4% (41/51) at the operating level of 2.0 false-positive markings per image used in the study. The number of available CAD regions was 587. Table 4 shows that, on average, 274.2 of the 546 false-positive CAD regions (50.2%) were not queried. It also shows that, on average, 5 of the 41 true-positive CAD regions (12.2%) were neither queried nor reported. The radiologists queried fewer false-positive CAD regions than the non-radiologists (on average, 305 vs 249.6 false-positive regions were left unqueried).

Table 4 Number of CAD regions queried

| Reader | Queried CAD regions | Non-queried FP CAD regions | Non-queried, unreported TP CAD regions | Non-queried CAD regions but reported TP finding |
|---|---|---|---|---|
| Non-radiologists | | | | |
| 1 | 290 | 293 | 2 | 2 |
| 2 | 338 | 244 | 3 | 2 |
| 3 | 330 | 251 | 4 | 2 |
| 4 | 500 | 83 | 3 | 1 |
| 5 | 196 | 377 | 7 | 7 |
| Average | 330.8 | 249.6 | 3.8 | 2.8 |
| Radiologists | | | | |
| 6 | 176 | 396 | 8 | 7 |
| 7 | 262 | 319 | 6 | 0 |
| 8 | 209 | 365 | 9 | 4 |
| 9 | 444 | 140 | 3 | 0 |
| Average | 272.75 | 305 | 6.5 | 2.75 |
| Reader average | 305 | 274.22 | 5 | 2.78 |

There were 587 CAD regions in total: 546 false-positive CAD regions and 41 true-positive CAD regions
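
The percentages quoted from Table 4 follow from simple ratios, which the Python sketch below reproduces from the per-reader counts. The names `TABLE4`, `row_sums_match`, and `column_means` are illustrative and not part of the study. The check that each reader's four counts add up to the 587 available CAD regions is an observation about the table values, not a statement made in the text.

```python
TOTAL_REGIONS = 587  # all CAD regions available in the study
FP_REGIONS = 546     # false-positive CAD regions
TP_REGIONS = 41      # true-positive CAD regions

# Per-reader counts copied from Table 4:
# (queried, non-queried FP, non-queried unreported TP,
#  non-queried CAD region but reported TP finding).
TABLE4 = {
    1: (290, 293, 2, 2),
    2: (338, 244, 3, 2),
    3: (330, 251, 4, 2),
    4: (500, 83, 3, 1),
    5: (196, 377, 7, 7),
    6: (176, 396, 8, 7),
    7: (262, 319, 6, 0),
    8: (209, 365, 9, 4),
    9: (444, 140, 3, 0),
}


def row_sums_match():
    """True if every reader's four counts add up to the 587 CAD regions."""
    return all(sum(row) == TOTAL_REGIONS for row in TABLE4.values())


def column_means():
    """Mean of each Table 4 column over all nine readers."""
    rows = list(TABLE4.values())
    return tuple(sum(col) / len(rows) for col in zip(*rows))


means = column_means()
# Share of false-positive CAD regions that were not queried (about 50.2%).
frac_fp_not_queried = means[1] / FP_REGIONS
# Share of true-positive CAD regions neither queried nor reported (about 12.2%).
frac_tp_missed = means[2] / TP_REGIONS
# Lesion-based CAD sensitivity quoted in the text: 41 of 51 (about 80.4%).
cad_sensitivity = 41 / 51

if __name__ == "__main__":
    print("Rows sum to 587:", row_sums_match())
    print("Reader averages:", [round(m, 2) for m in means])
    print(f"FP regions not queried: {100 * frac_fp_not_queried:.1f}%")
    print(f"TP regions missed: {100 * frac_tp_missed:.1f}%")
    print(f"CAD sensitivity: {100 * cad_sensitivity:.1f}%")
```
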