Distributional Smoothing with Virtual Adversarial Training
Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Ken Nakae, Shin Ishii
arXiv e-Print archive - 2015
Keywords:
stat.ML, cs.LG
First published: 2015/07/02
Abstract: We propose local distributional smoothness (LDS), a new notion of smoothness for statistical models that can be used as a regularization term to promote the smoothness of the model distribution. We name the LDS-based regularization virtual adversarial training (VAT). The LDS of a model at an input datapoint is defined as the KL-divergence-based robustness of the model distribution against local perturbation around the datapoint. VAT resembles adversarial training, but distinguishes itself in that it determines the adversarial direction from the model distribution alone, without using label information, which makes it applicable to semi-supervised learning. The computational cost of VAT is relatively low: for neural networks, the approximated gradient of the LDS can be computed with no more than three pairs of forward and back propagations. When we applied our technique to supervised and semi-supervised learning on the MNIST dataset, it outperformed all training methods other than the current state-of-the-art method, which is based on a highly advanced generative model. We also applied our method to SVHN and NORB, and confirmed its superior performance over the current state-of-the-art semi-supervised method applied to these datasets.
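As a sketch of the quantity the abstract describes (the notation $r$, $\epsilon$, and $\theta$ is assumed here, not quoted from the paper), the LDS at a datapoint $x$ can be written as the negative KL divergence between the model distribution and its perturbed counterpart, where the perturbation is the "virtual adversarial" direction of bounded norm that maximizes that divergence:

\[
r_{\mathrm{v\text{-}adv}}
  \;=\; \operatorname*{arg\,max}_{\|r\|_2 \le \epsilon}
        \; \mathrm{KL}\!\left[\, p(y \mid x, \theta) \,\big\|\, p(y \mid x + r, \theta) \,\right],
\qquad
\mathrm{LDS}(x, \theta)
  \;=\; -\,\mathrm{KL}\!\left[\, p(y \mid x, \theta) \,\big\|\, p(y \mid x + r_{\mathrm{v\text{-}adv}}, \theta) \,\right].
\]

Because this definition involves only the model distribution $p(y \mid x, \theta)$ and not the true label, the regularizer can be evaluated on unlabeled data, which is what makes VAT applicable to semi-supervised learning; the abstract's claim of "no more than three pairs of forward and back propagations" refers to approximating the inner maximization rather than solving it exactly.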