Feature subset selection can be categorized into embedded approaches, filter approaches, and wrapper approaches. This paper presents the wrapper subset selection problem and some algorithms to obtain good subsets. Wrapper subset selection methods are black-box optimization techniques.
First let's look at what the wrapper search space looks like in the figure below. We want to find a subset of features which maximize the performance of our classification model so each node in the graph is a subset of all the features. For a set of $n$ features there are $2^n$ unique subsets. A wrapper method approaches the problem by only looking at the graph structure and optionally evaluating each node during a search to determine how well it performs. The edges in the graph represent adding and removing one feature from the subset.
![](http://i.imgur.com/is9WLJ9.png)
Kohavi presents two search algorithms hill-climbing and best-first search. Hill-climbing (aka greedy) evaluates all neighbor nodes and picks the best one to start searching from next. Best-first evaluates $k$ neighbors and then picks the best one.