## **Introduction**

This paper presents a novel interactive tool for efficient and reproducible boundary extraction with minimal user input. Even when the user marks the seed points inaccurately, the algorithm snaps the boundary to the nearest strong object edge. Unlike active contour models, where the user cannot anticipate how the final boundary will look after energy minimization (and, if unsatisfied, must repeat the entire process), this algorithm is interactive: the user watches the "live-wire boundary snapping" as it happens. Moreover, "boundary cooling" and "on-the-fly training" are two novel contributions of the paper that help reduce user input while maintaining the stability of the boundary.

## **Method**

Modeling boundary detection as a graph search, the objective becomes finding the optimal path from a start node (pixel) to a goal node (pixel), where the total cost of a path is the sum of the local costs along it. Let the local cost of a directed edge from pixel ${p}$ to a neighboring pixel ${q}$ be $$ l({p}, {q}) = \underbrace{0.43f_Z({q})}_{\text{Laplacian zero crossing}} + \underbrace{0.43f_G({q})}_{\text{gradient magnitude}} + \underbrace{0.14f_D({p}, {q})}_{\text{gradient direction}} $$ where the weights were chosen empirically by the authors. The gradient magnitude term $f_G({q})$ must map higher image gradients to lower costs in order to provide a "first-order" positioning of the live-wire boundary. It is therefore defined in terms of the gradient magnitude $G$ as $$ f_G = 1 - \frac{G}{\max(G)} $$ The Laplacian zero-crossing term is a binary edge feature that acts as a "second-order" fine-tuning term for boundary localization: $$ f_Z({q}) = \left\{ \begin{array}{@{}ll@{}} 0, & \text{if}\ I_L({q}) = 0 \text{ or a neighbor has the opposite sign} \\ 1, & \text{otherwise} \\ \end{array}\right.
$$ where $I_L$ represents the convolution of the image with a Laplacian edge operator. The gradient direction term penalizes sharp changes in the boundary direction and therefore effectively adds a boundary smoothness constraint. Let ${D(p)}$ be the unit vector normal to the image gradient at pixel ${p}$. Then the gradient direction cost can be written as $$ f_D({p}, {q}) = \frac{2}{3\pi}\big\{\cos^{-1}[d_p({p}, {q})] + \cos^{-1}[d_q({p}, {q})]\big\} $$ where $$ d_p({p}, {q}) = {D(p)}\cdot{L(p,q)}, $$$$ d_q({p}, {q}) = {L(p,q)}\cdot{D(q)}, \: \text{and} $$$$ {L(p,q)} = \left\{ \begin{array}{@{}ll@{}} {q-p}, & \text{if}\ {D(p)\cdot(q-p)} \geq 0 \\ {p-q}, & \text{otherwise} \\ \end{array}\right. $$ Here ${L(p,q)}$ represents the normalized bidirectional link between pixels ${p}$ and ${q}$, oriented so that the angle between ${D(p)}$ and the link direction is minimized. Intuitively, this cost is low when the gradient directions at the two pixels are similar to each other and to the link between them. Starting at a user-selected seed point, the boundary search expands in the direction of minimum cumulative cost, which creates a dynamic "wavefront" along the directions of highest interest (which happen to lie along the edges). For a path of length $n$, the boundary growing requires $n$ iterations, which is significantly more efficient than previously used approaches. For "on-the-fly" training, a distribution of edge features (such as the image gradient $f_G$) is sampled from the most recently accepted portion of the boundary, with the contribution of each of the closest $t$ pixels weighted by a monotonically decreasing function (either linear or Gaussian). The algorithm then follows the edge of current interest (rather than the strongest edge) by assigning lower costs to features belonging to the current edge and higher costs to features that do not.
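The two link-independent cost terms can be sketched in numpy. This is a minimal illustration, not the authors' implementation: the Sobel operator, the Gaussian pre-smoothing scale, and the 4-neighbor zero-crossing test are assumptions.

```python
import numpy as np
from scipy import ndimage

def static_cost_map(image, w_z=0.43, w_g=0.43):
    """Per-pixel part of the live-wire local cost l(p, q): the Laplacian
    zero-crossing term f_Z and the gradient magnitude term f_G. The
    link-dependent direction term f_D(p, q) would be added per edge."""
    img = image.astype(float)

    # f_G: strong gradients map to low cost (first-order positioning).
    gx = ndimage.sobel(img, axis=1)
    gy = ndimage.sobel(img, axis=0)
    G = np.hypot(gx, gy)
    f_G = 1.0 - G / G.max()

    # f_Z: 0 on Laplacian zero crossings, 1 elsewhere (second-order tuning).
    lap = ndimage.laplace(ndimage.gaussian_filter(img, sigma=1.0))
    sign = np.sign(lap)
    opposite = np.zeros(img.shape, dtype=bool)
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        opposite |= sign * np.roll(sign, shift, axis=(0, 1)) < 0
    f_Z = np.where((lap == 0) | opposite, 0.0, 1.0)

    return w_z * f_Z + w_g * f_G
```

Running a Dijkstra-style expansion over these per-pixel costs from the seed point then yields the minimum-cost paths that the live wire snaps to.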
The algorithm also exhibits what the paper calls "data-driven path cooling": as the cursor (and therefore the free point) moves farther from the seed point, progressively longer portions of the boundary become fixed, and only the new parts of the boundary need to be updated.

## **Results and Discussion**

Using the algorithm, boundaries are extracted in one-fifth of the time required to trace them manually, with 4.4x higher accuracy and 4.8x higher reproducibility. However, the paper admits that boundary detection can be difficult for objects with "hair-like boundaries". Moreover, this technique cannot be extended to N-dimensional images.
## **Introduction**

This paper presents a stochastic active contour model (STACS) for segmenting cardiac MR images. The proposed algorithm minimizes an energy functional by the level set method, incorporating stochastic region-based and edge-based information as well as a shape prior for the heart and local contour properties. Moreover, the algorithm uses a parameter annealing scheme to dynamically balance the weights of the components of the energy functional.

## **Method**

The paper locates a contour $C$ in a cardiac MR image that segments the image into two regions: the heart and the background. The corresponding objective energy functional can be written as $$ J(C) = \lambda_1 J_1(C) + \lambda_2 J_2(C) + \lambda_3 J_3(C) + \lambda_4 J_4(C) $$ Looking at the individual components:

* _Region-Based Term: Model Matching_ **$J_1(\phi)$**: To segment the image into two regions, let the regions inside and outside the contour $C$ be $\Omega_1$ and $\Omega_2$ respectively. For each region, consider a stochastic model describing the pixel statistics of that region. Assuming the intensities of all pixels in each region are statistically independent, the objective is to minimize the negative log-likelihood of the pixels belonging to their assigned regions.
* _Edge-Based Term_ **$J_2(\phi)$**: For the contour $C$ to align with the prominent edges in the image, an energy defined on the edge map of the image (which can be obtained by various smoothing methods such as Gaussian kernel blurring, edge-preserving anisotropic diffusion, Min/Max flow algorithms, etc.) has to be minimized.
* _Heart Shape Prior Term_ **$J_3(\phi)$**: To distinguish between similar-looking tissues in the foreground and the background, an elliptical heart-shape prior is used. An ellipse can be described with five parameters subject to certain constraints in the conic equation.
* _Contour Smoothing Term_ **$J_4(\phi)$**: To obtain a smooth contour of the segmented heart, the total Euclidean arc length of the contour $C$ should be minimized.

The parameters ($\lambda_1$, $\lambda_2$, and $\lambda_3$) of the energy functional $J(C)$ need to be updated dynamically during the energy minimization. For example, the region-based and edge-based terms should carry more weight in $J(C)$ during the initial steps of the segmentation; at the later stages, their weight should be reduced and that of the shape prior increased in order to keep the segmented output close to the desired shape.

## **Results**

The two metrics used to assess performance were area similarity and shape similarity. The algorithm was tested on 48 images covering 143 contours from six rat cardiac sequences of eight frames each, compared against contours manually annotated by an expert, and the results indicated excellent agreement with the manually traced contours.

## **Discussion and Shortcomings**

Since STACS uses stochastic models instead of deterministic ones, it can be applied to a large variety of images and is especially helpful for distinguishing between visually similar adjacent regions. Since STACS incorporates both region-based and edge-based information in its energy functional, it is more robust to noise and less sensitive to curve initialization. Perhaps the most distinctive feature of STACS, setting it apart from other active contour models, is that it incorporates a shape prior into the energy functional. This helps segment the heart from the chest wall, which is especially difficult since the two regions share similar texture. Moreover, the scheduled parameter annealing adjusts the weights of the components of the energy functional, dynamically varying the importance of different components at different stages of the segmentation process.
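The scheduled parameter annealing can be sketched as a simple interpolation between initial and final weights. The linear schedule and all endpoint values below are hypothetical illustrations; the paper's actual annealing functions are not reproduced here.

```python
def annealed_weights(step, n_steps,
                     region=(1.0, 0.2), edge=(1.0, 0.2), shape=(0.1, 1.0)):
    """Linearly interpolate the lambda weights over the minimization:
    the data terms (region, edge) dominate early, the shape prior
    dominates late. Endpoint values are illustrative, not the paper's."""
    t = step / max(n_steps - 1, 1)
    lerp = lambda pair: (1 - t) * pair[0] + t * pair[1]
    return {
        "lambda_region": lerp(region),
        "lambda_edge": lerp(edge),
        "lambda_shape": lerp(shape),
    }
```

At each minimization step, the current weights would multiply the corresponding terms of $J(C)$ before the level set update.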
## **Introduction**

This paper presents an active contour model to detect and segment objects in images whose boundaries may or may not be defined by gradients. The curve evolution is based on the Mumford-Shah energy functional and level set methods, making it less susceptible to curve initialization errors.

## **Method**

Using an implicit shape representation, the boundary (or boundaries) $C$ can be represented as the zero level set of an embedding function $\phi : \Omega \rightarrow$ **R** such that $$ C = \{x \in \Omega \:|\: \phi(x) = 0\} $$ Such a representation of the boundary does not require a choice of parameterization. The embedding function $\phi(\cdot)$ is defined as $$ \phi(x) = \pm \: distance(x,C) $$ where the sign is + or - depending on whether the point is inside or outside the curve. This means the boundary can be located by finding the points at which $\phi$ changes sign. According to the level set method [1], the embedding function is zero at every point on the curve $C$ at any time: $$ \phi(C(t),\:t) = 0 \:\: \forall t $$ Consider the piecewise-constant Mumford-Shah energy functional [2] with two regions $\Omega_{1}$ and $\Omega_{2}$ such that $\Omega = \Omega_{1} \cup \Omega_{2}$. Define the Heaviside step function as $$ H\phi \equiv H(\phi) = \left\{ \begin{array}{@{}ll@{}} 1, & \text{if}\ \phi > 0 \: \Rightarrow x \in \Omega_{1} \\ 0, & \text{otherwise}\ \Rightarrow x \in \Omega_{2} \\ \end{array}\right. $$ Moving from an energy functional defined on the boundary $C$ to one defined on the embedding function $\phi$, we get $$ E(\phi) = \int\limits_{\Omega_{1}} \: ({I}(x,y) - \mu_{1})^{2} \: dx + \int\limits_{\Omega_{2}} \: ({I}(x,y) - \mu_{2})^{2} \: dx + \nu \left| \partial \Omega_{1} \right| $$ where $\mu_{1}$ and $\mu_{2}$ approximate the average intensities of the pixels in regions $\Omega_{1}$ and $\Omega_{2}$ respectively.
The term $\left| \partial \Omega_{1} \right|$ represents the length of the boundary and is added as a penalizer/regularizer. Simplifying this expression, we get $$ \begin{align*} E(\phi) & = \int\limits_{\Omega} \: \Big[\big[({I}(x,y) - \mu_{1})^{2} - ({I}(x,y) - \mu_{2})^{2}\big]H\phi \\ & + ({I}(x,y) - \mu_{2})^{2}\Big]\:dx + \nu \int\limits_{\Omega}\left| \nabla H\phi \right|\:dx \end{align*} $$ Since $H(\phi)$ is not differentiable everywhere, we assume a slightly smoothed Heaviside function such that $$ \frac{\mathrm{d} H(\phi)}{\mathrm{d} \phi} = \delta(\phi) $$ One such choice of the smoothed delta function is $$ \delta_{\epsilon}(\phi) = \frac{1}{\pi} \frac{\epsilon}{\epsilon^{2} + \phi^{2}}, \: \epsilon > 0 $$ Now, the gradient descent equation can be computed as $$ \frac{\partial \phi}{\partial t} = \delta(\phi) \Big[\: \nu\:div\Big(\frac{\nabla \phi}{\left|\nabla \phi\right|}\Big) - ({I}(x,y) - \mu_{1})^{2} + ({I}(x,y) - \mu_{2})^{2}\Big] $$ As the embedding function $\phi(\cdot)$ evolves over time, the boundary points can be detected where it changes sign.

## **Discussion and Shortcomings**

Compared to edge-based segmentation approaches such as geodesic active contours [3], this method uses region-based segmentation and can therefore drive curve evolution that is not merely local. The segmentation results do not depend on the initialization of the curve. Because the embedding function $\phi$ is a signed distance function, curves can split and merge. Moreover, unlike geodesic active contours [3], new curves can also appear. This is a consequence of the numerical approximation of the delta function, which never actually reaches 0. It means that objects with interior contours, such as a ring, can also be segmented, which was not possible with previously available active contour models.

#### [1] S. Osher, and J. A. Sethian, "Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations," *Journal of Computational Physics*, vol. 79, no. 1, pp. 12-49, 1988.
#### [2] D. Mumford, and J. Shah, "Optimal approximations by piecewise smooth functions and associated variational problems," *Communications on Pure and Applied Mathematics*, vol. 42, no. 5, pp. 577-685, 1989.
#### [3] V. Caselles, R. Kimmel, and G. Sapiro, "Geodesic Active Contours," *International Journal of Computer Vision*, vol. 22, no. 1, pp. 61-79, February 1997.
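A single step of the level-set evolution described in the Method section can be sketched in numpy, taking $\phi > 0$ as the inside region, recomputing $\mu_1$ and $\mu_2$ from the current partition at every step, and using the smoothed $\delta_{\epsilon}$ above. The time step and parameter values are illustrative, and the signs follow the standard convention that $\phi$ grows where the pixel matches the inside mean better.

```python
import numpy as np

def chan_vese_step(phi, image, dt=0.5, nu=0.1, eps=1.0):
    """One gradient-descent update of the level set phi (minimal sketch)."""
    inside = phi > 0
    mu1 = image[inside].mean() if inside.any() else 0.0
    mu2 = image[~inside].mean() if (~inside).any() else 0.0

    # Curvature term div(grad phi / |grad phi|) via central differences.
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx**2 + gy**2) + 1e-8
    ky, _ = np.gradient(gy / norm)
    _, kx = np.gradient(gx / norm)
    curvature = kx + ky

    # Smoothed delta function concentrating the update near the zero level set.
    delta = (eps / np.pi) / (eps**2 + phi**2)

    force = nu * curvature - (image - mu1)**2 + (image - mu2)**2
    return phi + dt * delta * force
```

Iterating this step until $\phi$ stops changing, and extracting the sign changes of $\phi$, yields the segmentation boundary.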
## **Introduction**

This paper presents a generalized technique for matching a deformable model to an image through energy minimization. An active contour model called snakes is proposed: a continuous, deformable spline that always minimizes its energy under the influence of image forces, the internal energy of the spline, and external constraint forces. Because of its dynamic nature, snakes can be applied to image segmentation tasks (edges, lines, and subjective contours) as well as to motion tracking and stereo matching.

## **Method**

A snake can be represented parametrically as $$ {v}(s) = (x(s), y(s)) $$ The energy functional of a snake consists of internal and external energy components: $$ E_{snake}(C) = \underbrace{E_{ext}(C)}_{\text{data term}} + \underbrace{E_{int}(C)}_{\text{regularization term}} $$ This can be decomposed into the following components: $$ E^{*}_{snake} = \int_{0}^{1} \Big[\: E_{int}({v}(s)) + E_{image}({v}(s)) + E_{constr}({v}(s)) \:\Big] \: ds $$ where the terms represent the internal energy of the snake due to its shape and bending, the image forces, and the external constraint forces on the snake, respectively. The internal energy of the snake can be written as $$ E_{int} = \frac{1}{2} \{\alpha (s)\left | {v}_{s}(s) \right |^{2} \:+\: \beta (s)\left | {v}_{ss}(s) \right |^{2}\} $$ where the first-order term and the second-order term control the elasticity (tension) and the stiffness of the snake, respectively. The energy minimization runs in $O(n)$ time using sparse matrix methods (a semi-implicit Euler method), which is much faster than an $O(n^{2})$ fully explicit method.
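The semi-implicit update can be sketched for a closed snake. For clarity, this sketch builds the circulant pentadiagonal stiffness matrix explicitly and solves it densely; the paper's $O(n)$ solver exploits the banded structure instead. All parameter values are illustrative.

```python
import numpy as np

def snake_step(v, f_ext, alpha=0.1, beta=0.1, gamma=1.0):
    """One semi-implicit Euler step for a closed snake.

    v     : (n, 2) array of control points
    f_ext : (n, 2) gradient of the external energy at the control points
    Solves (A + gamma*I) v_new = gamma*v - f_ext, where A discretizes the
    internal forces -alpha*v_ss + beta*v_ssss with a pentadiagonal stencil.
    """
    n = len(v)
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = 2 * alpha + 6 * beta
        A[i, (i - 1) % n] = A[i, (i + 1) % n] = -alpha - 4 * beta
        A[i, (i - 2) % n] = A[i, (i + 2) % n] = beta
    return np.linalg.solve(A + gamma * np.eye(n), gamma * v - f_ext)
```

With zero external force, the elastic term alone makes a circular snake contract slightly on each step, which is the expected behavior of the tension term.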
The energy due to the image forces is composed of three components: $$ E_{image} = w_{line}E_{line} + w_{edge}E_{edge} + w_{term}E_{term} $$ where $$E_{line} = -\big( G_{\sigma} * \nabla^{2}{I} \big)^{2},$$ $$ E_{edge} = -\left | \nabla {I}(x,y) \right |^{2},$$ $$E_{term} = \frac{\partial \theta}{\partial {n}_{\perp}} = \frac{C_{yy}C_{x}^{2} - 2C_{xy}C_{x}C_{y} + C_{xx}C_{y}^{2}}{(C_{x}^{2} + C_{y}^{2})^\frac{3}{2}}$$

* The minima of the $E_{line}$ functional attract the snake to salient contours; depending on the sign of $w_{line}$, the snake aligns itself to the lightest or the darkest nearby contour. Smoothing the image with a Gaussian lets the snake reach equilibrium on a blurred energy functional, and as the blurring is slowly reduced, the snake holds its shape despite small-scale textural detail because of its own smoothness constraint.
* Because of the $E_{edge}$ term, the snake is attracted to contours with large image gradients.
* Let $C(x,y)$ be a slightly smoothed image obtained by applying a Gaussian filter. $E_{term}$ is the energy corresponding to the curvature of the level contours of $C(x,y)$, calculated as the rate of change of the gradient angle $\theta$ along the direction ${n}_{\perp}$ perpendicular to the gradient. The combination of $E_{edge}$ and $E_{term}$ ensures that during energy minimization the snake is attracted to edges and/or terminations.

Snakes can also be applied to the problem of stereo matching by adding a constraint to the energy functional: disparities in the human visual system do not change too rapidly with space, so the following disparity energy has to be minimized, where ${v}^{L}(s)$ and ${v}^{R}(s)$ represent the left and right snake contours respectively.
$$ E_{stereo} = \big( {v}^{L}_{s}(s) - {v}^{R}_{s}(s) \big)^{2} $$ Similarly, snakes can also be applied to motion tracking, since snakes "lock on" to salient visual features and can therefore track them across frames.

## **Discussion and Shortcomings**

Snakes were the first variational approach to the image segmentation task, making this a seminal paper in image processing. By introducing the gradient (edge strength) as an energy term, the authors improved on the shortcomings of prior work. Because snakes are "active" and are always minimizing their energy, they exhibit hysteresis in response to moving input. The energy minimization is achieved through gradient descent, and since the energy functional is not convex, it can get stuck in local minima. A prerequisite for snakes, therefore, is that the curve be initialized sufficiently close to the desired solution. Moreover, because of numerical instabilities, as the control points move over time the curve can self-intersect, which is undesirable in image segmentation tasks.
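The three image-energy terms from the Method section can be computed on a pixel grid as follows. This is a minimal sketch: the smoothing scale and the central-difference scheme are assumptions, not the paper's implementation.

```python
import numpy as np
from scipy import ndimage

def image_energies(image, sigma=1.0):
    """E_line, E_edge, E_term of the snake image energy (illustrative)."""
    img = ndimage.gaussian_filter(image.astype(float), sigma)

    # Line energy: negative squared smoothed Laplacian.
    e_line = -(ndimage.laplace(img)) ** 2

    # Edge energy: negative squared gradient magnitude.
    gy, gx = np.gradient(img)
    e_edge = -(gx**2 + gy**2)

    # Termination energy: curvature of the level contours of the smoothed image.
    cxx = np.gradient(gx, axis=1)
    cyy = np.gradient(gy, axis=0)
    cxy = np.gradient(gx, axis=0)
    denom = (gx**2 + gy**2) ** 1.5 + 1e-8
    e_term = (cyy * gx**2 - 2 * cxy * gx * gy + cxx * gy**2) / denom

    return e_line, e_edge, e_term
```

The weighted sum $w_{line}E_{line} + w_{edge}E_{edge} + w_{term}E_{term}$, sampled at the snake's control points, supplies the image force driving the minimization.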