First published: 2016/09/21 (8 years ago) Abstract: Molecular profiling data (e.g., gene expression) has been used for clinical
risk prediction and biomarker discovery. However, it is necessary to integrate
other prior knowledge like biological pathways or gene interaction networks to
improve the predictive ability and biological interpretability of biomarkers.
Here, we first introduce a general regularized Logistic Regression (LR)
framework with regularized term $\lambda \|\bm{w}\|_1 +
\eta\bm{w}^T\bm{M}\bm{w}$, which can reduce to different penalties, including
Lasso, elastic net, and network-regularized terms with different $\bm{M}$. This
framework can be easily solved in a unified manner by a cyclic coordinate
descent algorithm which can avoid inverse matrix operation and accelerate the
computing speed. However, if those estimated $\bm{w}_i$ and $\bm{w}_j$ have
opposite signs, then the traditional network-regularized penalty may not
perform well. To address it, we introduce a novel network-regularized sparse LR
model with a new penalty $\lambda \|\bm{w}\|_1 + \eta|\bm{w}|^T\bm{M}|\bm{w}|$
to consider the difference between the absolute values of the coefficients. And
we develop two efficient algorithms to solve it. Finally, we test our methods
and compare them with the related ones using simulated and real data to show
their efficiency.