Low-level tasks such as edge, contour and line detection are an essential precursor to downstream image analysis. However, most approaches to these problems work as isolated, autonomous processes and use no high-level image information such as context, global shape, or user input. Their errors therefore propagate through the pipeline with no opportunity for later correction. To address this problem, Kass et al. investigate an energy-minimization framework for edge, line and contour detection in 2D images. Although energy minimization had been used for similar tasks before, Kass et al.'s framework exposes a novel external force term, which allows external forces or stimuli to guide the “snake” towards a “correct answer”. Moreover, the framework exhibits active behaviour, since the snake is always minimizing its energy functional.

A “snake” is a controlled-continuity spline under the influence of the forces in the energy functional. The functional is made up of three terms, where $v(s)$ is the parametrized snake:

$$ E_{snake}^{*} = \int_{0}^{1} \big[ E_{internal}(v(s)) + E_{image}(v(s)) + E_{constraint}(v(s)) \big] \, ds $$

The internal energy term depends entirely on the shape of the curve and constrains the snake to be “continuous” and well behaved. It comprises two terms: one controls how much the curve stretches (through the first derivative of the curve), and the other ensures that the curve does not have too many bends (through the second derivative of the curve).

The image energy term determines which salient features the snake tracks. It comprises three terms, corresponding to lines, edges, and terminations.
The line term uses the raw image intensity as its energy, while the edge term uses the negative squared magnitude of the image gradient, so that the snake is attracted to contours with large image gradients. A termination term lets the snake find the terminations of line segments and corners in the image. The combination of the edge and termination terms attracts the snake to edges and terminations.

The constraint term models external stimuli, which can come from high-level processes or from user intervention. The framework allows a spring to be attached to the curve to constrain the snake or move it in a desired direction.

To test the framework, a user interface called “Snake Pit” was created, which gives the user control in the form of spring attachments. The approach was tested on a number of different images, including ones with illusory contours. The framework was also extended to stereo matching and motion tracking. For stereo, an extra energy term constrains the snakes in the two stereo images to stay close in coordinate space, which models the fact that disparities in images do not change too rapidly. However, the framework suffers when subjected to a high rate of change in both stereo matching and motion tracking.

The proposed framework performed acceptably in many challenging scenarios. However, its underlying assumption that edges and contours can be followed using image gradients may fail when images contain little gradient information.
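The line and edge terms described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function name `image_energies`, the use of `np.gradient` for the derivatives, and the toy step-edge image are all choices made for this example.

```python
import numpy as np

def image_energies(image):
    """Return (E_line, E_edge) energy maps for a 2D grayscale image."""
    img = np.asarray(image, dtype=float)
    # Line term: the raw intensity. With a positive weight the snake is
    # pulled towards dark pixels, with a negative weight towards bright ones.
    e_line = img
    # Edge term: negative squared gradient magnitude, so that strong
    # edges become minima of the energy.
    gy, gx = np.gradient(img)  # derivatives along rows (y) and columns (x)
    e_edge = -(gx ** 2 + gy ** 2)
    return e_line, e_edge

# Toy image (assumption for illustration): a vertical step edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
e_line, e_edge = image_energies(img)
```

On this toy image the edge energy is zero in the flat regions and negative only in the two columns next to the step, which is where a snake driven by the edge term would settle. It also shows the weakness noted above: in regions without gradient, the edge term exerts no force at all.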
## **Introduction**

This paper presents a general technique for matching a deformable model to an image through energy minimization. It proposes an active contour model called a snake: a continuous, deformable spline that always minimizes its energy under the influence of image forces, the internal energy of the spline, and external constraint forces. Because of this dynamic nature, snakes can be applied to image segmentation tasks (edges, lines, and subjective contours) as well as to motion tracking and stereo matching.

## **Method**

A snake can be represented parametrically as

$$ {v}(s) = (x(s), y(s)) $$

The energy functional of a snake consists of internal and external energy components:

$$ E_{snake}(C) = \underbrace{E_{ext}(C)}_{\text{data term}} + \underbrace{E_{int}(C)}_{\text{regularization term}} $$

This can be decomposed into the following components:

$$ E^{*}_{snake} = \int_{0}^{1} \Big[\: E_{int}({v}(s)) + E_{image}({v}(s)) + E_{constr}({v}(s)) \:\Big] \: ds $$

where the three terms represent the internal energy of the snake due to stretching and bending, the image forces, and the external constraint forces, respectively. The internal energy of the snake is

$$ E_{int} = \frac{1}{2} \Big\{ \alpha (s)\left | {v}_{s}(s) \right |^{2} \:+\: \beta (s)\left | {v}_{ss}(s) \right |^{2} \Big\} $$

where the first-order term controls the elasticity (stretching) of the snake and the second-order term controls its stiffness (bending). The energy minimization takes $O(n)$ time using sparse matrix methods (a semi-implicit Euler method), which is much faster than an $O(n^{2})$ fully explicit method.
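The semi-implicit Euler update mentioned above can be sketched as follows. The sketch assumes a closed snake with constant $\alpha$ and $\beta$ and unit spacing between control points, and treats the internal forces implicitly and the external force explicitly. The names `internal_matrix`, `snake_step`, `zero_force` and the step-size parameter `gamma` are illustrative choices. For brevity the sketch inverts a dense matrix, so it does not achieve the $O(n)$ cost; that requires a banded or sparse solver.

```python
import numpy as np

def internal_matrix(n, alpha, beta):
    """Cyclic pentadiagonal matrix A of a closed snake with n points.

    Row i discretizes -alpha * v_ss + beta * v_ssss at point i.
    """
    a = np.zeros((n, n))
    stencil = {-2: beta, -1: -alpha - 4 * beta, 0: 2 * alpha + 6 * beta,
               1: -alpha - 4 * beta, 2: beta}
    for i in range(n):
        for offset, coeff in stencil.items():
            a[i, (i + offset) % n] += coeff
    return a

def snake_step(v, ext_grad, step_matrix, gamma):
    """One update v_t = (A + gamma I)^-1 (gamma v_{t-1} - grad E_ext(v_{t-1}))."""
    return step_matrix @ (gamma * v - ext_grad(v))

def zero_force(points):
    """Hypothetical external force: no image or constraint energy at all."""
    return np.zeros_like(points)

# Toy run (assumption for illustration): a unit circle with no external force.
n, alpha, beta, gamma = 40, 0.5, 0.1, 1.0
t = 2 * np.pi * np.arange(n) / n
v = np.stack([np.cos(t), np.sin(t)], axis=1)
# Dense inverse for brevity; precomputed once and reused at every step.
step_matrix = np.linalg.inv(internal_matrix(n, alpha, beta) + gamma * np.eye(n))
for _ in range(10):
    v = snake_step(v, zero_force, step_matrix, gamma)
mean_radius = np.linalg.norm(v, axis=1).mean()
```

With no external force the internal energy alone drives the contour, so the circle shrinks a little at every step. In practice `ext_grad` would return the gradient of the image and constraint energies sampled at the snake points.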
The energy due to the image forces is composed of three components:

$$ E_{image} = w_{line}E_{line} + w_{edge}E_{edge} + w_{terminal}E_{terminal} $$

where

$$ E_{line} = -\big( G_{\sigma} * \nabla^{2}{I} \big)^{2}, $$

$$ E_{edge} = -\left | \nabla {I}(x,y) \right |^{2}, $$

$$ E_{terminal} = \frac{\partial \theta}{\partial {n}_{\perp}} = \frac{C_{yy}C_{x}^{2} - 2C_{xy}C_{x}C_{y} + C_{xx}C_{y}^{2}}{(C_{x}^{2} + C_{y}^{2})^\frac{3}{2}} $$

* The minima of this scale-space form of $E_{line}$ lie on the zero-crossings of $G_{\sigma} * \nabla^{2}{I}$, which define the edges of the image. In the simplest case $E_{line} = I(x,y)$ is the image intensity itself, and depending on the sign of $w_{line}$ the snake then aligns itself with the lightest or the darkest contour near it. Smoothing with a Gaussian allows the snake to reach equilibrium on a blurred energy functional; when the blurring is slowly reduced, the snake keeps its shape despite small-scale textural details because of its own smoothness constraint.
* Because of the $E_{edge}$ term, the snake is attracted to contours with large image gradients.
* Let $C(x,y) = G_{\sigma} * I(x,y)$ be a slightly smoothed version of the image. $E_{terminal}$ is the curvature of the level contours of $C(x,y)$, computed as the rate of change of the gradient angle $\theta = \tan^{-1}(C_{y}/C_{x})$ along the unit vector ${n}_{\perp}$ perpendicular to the gradient direction.

The combination of $E_{edge}$ and $E_{terminal}$ ensures that, during energy minimization, the snake is attracted to edges and/or terminations.

Snakes can also be applied to stereo matching by adding a constraint to the energy functional. The constraint reflects the observation that, in the human visual system, disparities do not change too rapidly across space, so the following term has to be minimized, where ${v}^{L}(s)$ and ${v}^{R}(s)$ represent the left and the right snake contours respectively:
$$ E_{stereo} = \big( {v}^{L}_{s}(s) - {v}^{R}_{s}(s) \big)^{2} $$

Similarly, snakes can be applied to motion tracking: because they "lock on" to salient visual features, they can track those features across frames.

## **Discussion and Shortcomings**

Snakes were the first variational approach to the image segmentation task, which makes this a seminal paper in image processing. By introducing the gradient (edge strength) as an energy term, the authors addressed shortcomings of previously published work. Because snakes are "active" and are therefore always minimizing their energy, they show hysteresis in response to moving input.

The energy minimization is performed by gradient descent, and since the energy functional is not convex, the snake can get stuck in local minima. A prerequisite for snakes is therefore that the curve be initialized sufficiently close to the desired solution. Moreover, because of numerical instabilities, the curve can develop self-intersections as the control points move over time, which is undesirable in image segmentation tasks.
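As a small numerical illustration of the stereo coupling term from the Method section, the sketch below evaluates a discrete version of $E_{stereo}$ for two open contours. The function name `stereo_energy`, the forward-difference approximation of ${v}_{s}$, and the toy contours are assumptions made for this example only.

```python
import numpy as np

def stereo_energy(v_left, v_right):
    """Sum of |v^L_s - v^R_s|^2 along two open contours of equal length.

    The derivative along the contour is approximated by forward differences.
    """
    d_left = np.diff(np.asarray(v_left, dtype=float), axis=0)
    d_right = np.diff(np.asarray(v_right, dtype=float), axis=0)
    return float(np.sum((d_left - d_right) ** 2))

# Toy contours (assumption for illustration): three points on a line.
left = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
shifted = left + np.array([5.0, 0.0])  # constant disparity along the contour
stretched = 2.0 * left                 # disparity grows along the contour

constant_cost = stereo_energy(left, shifted)
varying_cost = stereo_energy(left, stretched)
```

A constant disparity costs nothing, while a disparity that changes along the contour is penalized. This is the sense in which the term keeps disparities from changing too rapidly.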