Abstract: |
We performed a preliminary investigation of how similar radiologists’ interpretations of screening mammograms are. Our dataset consisted of 50 cancer cases and 50 normal cases that were read by 50 radiologists. To study similarity between pairs of radiologists, we compared their sensitivities, their specificities, and their interpretations on a case-by-case basis. We found no pair of radiologists who read all the cases, only the normal cases, or only the cancer cases identically. Very few radiologists read both cancer cases and normal cases in a similar manner. Even radiologists who had similar sensitivities or similar specificities differed substantially in their interpretation of individual cases. Our data indicate that there may be underlying variability between radiologists in the image features they use to detect cancers and in when they make a false detection. This underlying variability may make the development and implementation of model observers more difficult. © Springer International Publishing Switzerland 2016. |
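
The three comparison measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis code: it assumes each interpretation is recorded as a binary recall decision (1 = recalled, 0 = not recalled), which the abstract does not state, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def sensitivity(calls, truth):
    """Fraction of cancer cases (truth == 1) that the reader recalled."""
    cancer_calls = [c for c, t in zip(calls, truth) if t == 1]
    return sum(cancer_calls) / len(cancer_calls)


def specificity(calls, truth):
    """Fraction of normal cases (truth == 0) that the reader did not recall."""
    normal_calls = [c for c, t in zip(calls, truth) if t == 0]
    return sum(1 - c for c in normal_calls) / len(normal_calls)


def case_agreement(calls_a, calls_b):
    """Fraction of cases on which two readers made the same decision."""
    same = sum(a == b for a, b in zip(calls_a, calls_b))
    return same / len(calls_a)


def pairwise_agreement(readers):
    """Case-by-case agreement for every pair of readers."""
    return {
        (a, b): case_agreement(readers[a], readers[b])
        for a, b in combinations(sorted(readers), 2)
    }


# Toy data (assumed, not from the study): 4 cancer cases, then 4 normal cases.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
readers = {
    "A": [1, 1, 0, 0, 0, 0, 0, 1],
    "B": [0, 0, 1, 1, 0, 0, 1, 0],
    "C": [1, 1, 1, 0, 0, 0, 0, 0],
}

for name, calls in readers.items():
    print(name, sensitivity(calls, truth), specificity(calls, truth))
print(pairwise_agreement(readers))
```

In this toy data, readers A and B have identical sensitivity and specificity yet make the same decision on only 2 of the 8 cases, which is the kind of disagreement on individual cases that the abstract reports.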