Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scans
Ghesu, Florin C. and Georgescu, Bogdan and Zheng, Yefeng and Grbic, Sasa and Maier, Andreas K. and Hornegger, Joachim and Comaniciu, Dorin
IEEE Transactions on Pattern Analysis and Machine Intelligence - 2019 via Local Bibsonomy
Robust and fast detection of anatomical structures is a prerequisite for both diagnostic and interventional medical image analysis. Current solutions for anatomy detection are typically based on machine learning techniques that exploit large annotated image databases in order to learn the appearance of the captured anatomy. These solutions are subject to several limitations, including the use of suboptimal feature engineering techniques and most importantly the use of computationally suboptimal search-schemes for anatomy detection. To address these issues, we propose a method that follows a new paradigm by reformulating the detection problem as a behavior learning task for an artificial agent.
Ghesu et al. reformulate landmark detection as a reinforcement learning (RL) task: an agent is trained to navigate the 3D image space efficiently and to stop as close to the target landmark as possible. This appears to be among the first applications of RL to this kind of detection problem.
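The navigation task can be made concrete with a small sketch. The summary does not spell out the action set or the reward, so both are assumptions here: six unit moves along the voxel axes, and a reward equal to the reduction in Euclidean distance to the landmark. The names `ACTIONS`, `step`, and `reward` are hypothetical.

```python
import math

# Assumed action set: one-voxel moves along each axis, in both directions.
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def step(position, action_index, volume_shape):
    """Apply one move and clamp the result to the bounds of the volume."""
    move = ACTIONS[action_index]
    return tuple(
        min(max(p + d, 0), size - 1)
        for p, d, size in zip(position, move, volume_shape)
    )


def reward(old_position, new_position, landmark):
    """Assumed reward: positive when the move brings the agent closer."""
    return math.dist(old_position, landmark) - math.dist(new_position, landmark)
```

With this reward, a move toward the landmark is rewarded and a move away from it is penalized by the same amount, so the cumulative reward of a trajectory equals the total distance gained.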
The authors use the Deep Q-Network (DQN) algorithm as the framework for the agent. DQN uses a convolutional neural network (CNN) to approximate the optimal action-value function $Q^*$ non-linearly. To ensure that the agent finds the object of interest regardless of its size, the authors propose a multi-scale search strategy in which a separate CNN operates on the image at each scale.
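A coarse-to-fine search of this kind might look like the sketch below. It is not the authors' implementation: the trained CNNs are replaced by a toy function (`make_toy_q`, hypothetical) that scores each move by the negative distance to a known target, the factor of 2 between scales is assumed, and the stopping rule is an assumed heuristic (stop when the greedy move would revisit a position, and keep whichever of the two oscillating positions has the lower best Q-value).

```python
import math

# Assumed action set: one-voxel moves along each axis, in both directions.
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def make_toy_q(target):
    """Stand-in for a trained network: one Q-value per action at a position."""
    def q_function(position):
        return [
            -math.dist(tuple(p + d for p, d in zip(position, move)), target)
            for move in ACTIONS
        ]
    return q_function


def greedy_search(q_function, start, max_steps=1000):
    """Follow the best action until the agent starts to oscillate."""
    position, visited = start, {start}
    for _ in range(max_steps):
        q_values = q_function(position)
        best = max(range(len(ACTIONS)), key=q_values.__getitem__)
        candidate = tuple(p + d for p, d in zip(position, ACTIONS[best]))
        if candidate in visited:
            # Assumed heuristic: at the landmark every move leads away from
            # it, so the best Q-value there is lower than at its neighbors.
            return min(position, candidate, key=lambda pos: max(q_function(pos)))
        visited.add(candidate)
        position = candidate
    return position


def multi_scale_search(q_functions, start_coarse, scale_factor=2):
    """Run one search per scale, coarsest first, refining the position."""
    position = start_coarse
    for level, q_function in enumerate(q_functions):
        if level > 0:
            # Map the coarse estimate into the coordinates of the finer grid.
            position = tuple(p * scale_factor for p in position)
        position = greedy_search(q_function, position)
    return position
```

The point of the design is that the coarse level covers most of the distance in a few large steps, and each finer level only has to correct the small error left by the level above it.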
The authors also use other standard techniques, such as an $\epsilon$-greedy policy, which selects a random action with probability $\epsilon$ and the highest-valued action otherwise. They also use experience replay: a buffer records the transitions the agent experiences along its trajectories, and samples drawn from this buffer are used to update the Q-network.
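Both techniques are generic DQN components and can be sketched in a few lines. This is a minimal illustration, not the authors' code; the names `epsilon_greedy` and `ReplayBuffer`, the transition layout, and the uniform sampling are assumptions.

```python
import random
from collections import deque


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.transitions = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.transitions.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        """Draw a uniform random mini-batch (assumed sampling scheme)."""
        count = min(batch_size, len(self.transitions))
        return rng.sample(list(self.transitions), count)

    def __len__(self):
        return len(self.transitions)
```

Sampling mini-batches from the buffer, rather than training on consecutive steps, breaks the strong correlation between neighboring positions along a trajectory.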
The authors test their method on an internal data set of CT scans using several metrics. The most important is the failure rate: a detection counts as a failure when the agent's final position is more than x mm from the landmark. The method achieved a 0% failure rate for landmark detection in CT scans.
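As an illustration of the metric, the sketch below computes a failure rate under the assumption that a detection fails when its Euclidean error exceeds a threshold in millimetres. The function name, the threshold parameter, and the assumption that positions are already expressed in millimetres are all hypothetical; the summary does not give the threshold value.

```python
import math


def failure_rate(predictions_mm, ground_truth_mm, threshold_mm):
    """Percentage of detections whose error exceeds threshold_mm."""
    if not predictions_mm or len(predictions_mm) != len(ground_truth_mm):
        raise ValueError("need two non-empty lists of equal length")
    failures = sum(
        math.dist(predicted, actual) > threshold_mm
        for predicted, actual in zip(predictions_mm, ground_truth_mm)
    )
    return 100.0 * failures / len(predictions_mm)
```

For example, with one exact detection and one that is 10 mm off, a 5 mm threshold gives a failure rate of 50%.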