This paper focuses on empirical results for various text classification algorithms; my interest in it was its description of the sleeping experts algorithm. Roughly, in the experts model, many "experts" each make a prediction on every sample, and we must learn which experts to trust: we predict using a weighted combination of the experts' predictions, and the weights may depend on the experts' past performance. In "sleeping experts", some of the experts make no prediction on some samples. The paper gives a good description of the sleeping experts algorithm, with pseudocode, but does not derive any bounds. It also describes Ripper, an algorithm for learning rule sets.
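To make the idea concrete, here is a minimal sketch of one round of a sleeping-experts style update. It is not the paper's pseudocode: the function name `sleeping_experts_round`, the absolute loss, the multiplicative rate `beta`, and the fallback guess of 0.5 are all assumptions chosen for illustration. A sleeping expert is marked with `None`. Only awake experts contribute to the prediction and get updated, and their weights are then rescaled so that the total weight of the awake set is unchanged, which leaves sleeping experts' relative standing untouched.

```python
from typing import List, Optional


def sleeping_experts_round(
    weights: List[float],
    predictions: List[Optional[float]],
    label: float,
    beta: float = 0.5,  # assumed update rate, 0 < beta < 1
) -> float:
    """Run one round; update `weights` in place and return the combined prediction.

    predictions[i] is in [0, 1], or None if expert i is asleep on this sample.
    """
    awake = [i for i, p in enumerate(predictions) if p is not None]
    if not awake:
        return 0.5  # assumption: uninformed guess when every expert sleeps

    # Predict with the weighted average of the awake experts only.
    total_before = sum(weights[i] for i in awake)
    y_hat = sum(weights[i] * predictions[i] for i in awake) / total_before

    # Penalise each awake expert multiplicatively by its absolute loss.
    for i in awake:
        loss = abs(predictions[i] - label)
        weights[i] *= beta ** loss

    # Rescale so the awake experts' total weight is the same as before.
    total_after = sum(weights[i] for i in awake)
    scale = total_before / total_after
    for i in awake:
        weights[i] *= scale

    return y_hat


weights = [1.0, 1.0, 1.0]
y_hat = sleeping_experts_round(weights, [1.0, 0.0, None], label=1.0)
```

In this example the third expert is asleep, so its weight stays at 1.0, while weight shifts from the second expert (which was wrong) to the first (which was right).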