Non-rigid Registration for Large Sets of Microscopic Images on Graphics Processors

J Signal Process Syst. 2009 Apr 1;55(1-3):229-250. doi: 10.1007/s11265-008-0208-4.


Microscopic imaging is an important tool for characterizing tissue morphology and pathology. 3D reconstruction and visualization of large tissue samples requires registering large sets of high-resolution images, and the scale of this problem presents a challenge for automatic registration methods. In this paper we present a novel method for efficient automatic registration using graphics processing units (GPUs) and parallel programming. Comparing a C++ implementation running on the CPU against an implementation running on the GPU with Compute Unified Device Architecture (CUDA) libraries and pthreads, we achieve a speed-up factor of up to 4.11× with a single GPU and 6.68× with a GPU pair. We present execution times for a benchmark composed of two sets of large-scale images: mouse placenta (16K × 16K pixels) and breast cancer tumors (23K × 62K pixels). Registering a typical sample composed of 500 consecutive slides takes more than 12 hours for the genetic case in C++; two GPUs reduce this to less than 2 hours. The method also shows very promising scalability, suggesting that these gains can be extended easily to a large number of GPUs in a distributed system.

Keywords: Feature detection; Graphics processors; High-performance computing; Image registration and segmentation; Microscopic imaging; Pattern analysis.