Background: Detecting ossification areas of hand bones in X-ray images is an important task, e.g. as a preprocessing step in automated bone age estimation. Deep neural networks have recently emerged as the de facto standard for detection, but their drawback is the need for large annotated datasets. Finetuning pre-trained networks is a viable alternative, but it is not clear a priori whether training with a small annotated dataset will succeed, as this depends on the problem at hand. In this paper, we show that pre-trained networks can be used to produce an effective detector of ossification areas in pediatric X-ray images of hands.
Methods and findings: A publicly available Faster R-CNN network, pre-trained on the COCO dataset, was finetuned with 240 manually annotated radiographs from the RSNA Pediatric Bone Age Challenge dataset, which comprises over 14,000 pediatric radiographs. Validation was performed on another 89 radiographs from the same dataset, and performance was measured using Intersection-over-Union (IoU). To understand the effect of the training set size on the pre-trained network, the training data was subsampled and the training was repeated. Additionally, the network was trained from scratch without any pre-trained weights. Finally, to understand whether the trained model could be useful in practice, we compared the predictions of the network to the annotations of an expert radiologist. The finetuned network achieved a mean average precision at an IoU threshold of 0.5 (mAP@0.5IoU) of 92.92 ± 1.93. Apart from the wrist region, detection of all ossification areas benefited from more training data. In contrast, the network trained from scratch did not produce any correct results. When compared to the annotations of the expert radiologist, the network localized the regions well, with an average F1-score of 91.85 ± 1.06.
Conclusions: By finetuning a pre-trained deep neural network with 240 annotated radiographs, we were able to successfully detect ossification areas in pediatric hand radiographs.
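
For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how IoU between two bounding boxes and an F1-score at a fixed IoU threshold can be computed. It is an illustration only, not the evaluation code used in this work: the box format `(x_min, y_min, x_max, y_max)`, the function names `iou` and `f1_at_iou`, and the greedy one-to-one matching of predictions to annotations are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(pred: List[Box], truth: List[Box], thr: float = 0.5) -> float:
    """F1-score where a prediction counts as a true positive if it overlaps
    a not-yet-matched annotation with IoU >= thr.

    Hypothetical helper: greedy matching in prediction order is an
    assumption of this sketch, not a procedure taken from the paper.
    """
    unmatched = list(truth)
    tp = 0
    for p in pred:
        # Pick the remaining annotation that overlaps this prediction most.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)
            tp += 1
    fp = len(pred) - tp   # predictions without a matching annotation
    fn = len(unmatched)   # annotations that no prediction matched
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

With a threshold of 0.5, a predicted box must share at least half of the combined area with the annotated box to count as a correct detection, which is the criterion behind the mAP@0.5IoU figure reported above.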