[link]
Summary by Anmol Sharma 5 years ago
Over the last decade and a half, a plethora of image segmentation algorithms have been proposed, which can be categorized into belonging to roughly four categories, represented by a combination of two labels: explicit or implicit boundary representation, and variational or combinatorial methods. While classic methods like Snakes [1] and Level-Sets [2] belong to explicit/variational and implicit/variational category, there have been another set of algorithms falling under the combinatorial domain, which are DP or path-based algorithms which are explicit/combinatorial, and finally Graph-Cuts, which are implicit/combinatorial. The main difference between the categories is the space of solutions where search is performed. For variational methods, the search space is $\mathcal{R}^\infty$, while for combinatorial methods the search space is confined to $\mathcal{Z}^n$. An obvious advantage of combinatorial methods seem to better computational performance, but they also provide a globally optimal solution, which is global to the image. This makes the algorithm performance independent of numerical stability design decisions, and only dependent on the quality of global descriptors. Hence the algorithms provide a highly generalized, globally optimal framework which can be applied to a variety of problems, including image segmentation.
Graph-Cut methods are based upon the $s-t$ decomposition of a given image graph (where pixels are nodes and edges are formed between 8-connected neighbours). An $s-t$ decomposition of a graph $\mathcal{G} = \{V,E\}$ is a subset of edges $C \subset E$ such that the graph gets completely separated into individual components $s$ and $t$. The divided nodes are assigned to two terminal nodes, representing foreground and background of the image. The best-cut in an image graph is optimal if the cost of a cut (defined as $|C| = \sum_{e\in C}w_e$) is minimal. This corresponds to segmenting an image with a desirable balance of boundary and regional properties.
Graph-Cut exposes a general segmentation energy function (which constitutes the ``cost") as a combination of a regional term and boundary term, given as $E(A) = \lambda.R(A) + B(A)$. Regional term can be used to model apriori distribution of the pixel classes (probability of the pixel belonging to background or foreground). The boundary term can be represented by any boundary feature like local intensity gradients, zero-crossing, gradient direction or geometric costs. Although the region and boundary terms force the algorithm to find a boundary which strikes a good balance between the two, sometimes the lack of information for either or both terms may lead to incorrect segmentation. To offset these terms, hard constraints can be applied to the energy function. The constraints can be anything, for instance, a term that defines that pixels of particular intensities would belong to either foreground or background. The constraint can also be shape based, where shapes like circles or ellipses can be forced upon the final segmentation.
The proposed algorithm was applied on a variety of 2D and 3D images. Note that graph-cut can be generalized to N-D segmentation problems as well. A number of qualitative observations are reported. However there is a lack of quantitative foundation of the algorithm performance, compared to other state-of-art algorithms not necessarily from the same category as graph-cut.
more
less