Supervised Contrastive Learning
Prannay Khosla
and
Piotr Teterwak
and
Chen Wang
and
Aaron Sarna
and
Yonglong Tian
and
Phillip Isola
and
Aaron Maschinot
and
Ce Liu
and
Dilip Krishnan
arXiv e-Print archive - 2020 via Local arXiv
Keywords:
cs.LG, cs.CV, stat.ML
First published: 2024/12/22 (just now) Abstract: Contrastive learning applied to self-supervised representation learning has
seen a resurgence in recent years, leading to state of the art performance in
the unsupervised training of deep image models. Modern batch contrastive
approaches subsume or significantly outperform traditional contrastive losses
such as triplet, max-margin and the N-pairs loss. In this work, we extend the
self-supervised batch contrastive approach to the fully-supervised setting,
allowing us to effectively leverage label information. Clusters of points
belonging to the same class are pulled together in embedding space, while
simultaneously pushing apart clusters of samples from different classes. We
analyze two possible versions of the supervised contrastive (SupCon) loss,
identifying the best-performing formulation of the loss. On ResNet-200, we
achieve top-1 accuracy of 81.4% on the ImageNet dataset, which is 0.8% above
the best number reported for this architecture. We show consistent
outperformance over cross-entropy on other datasets and two ResNet variants.
The loss shows benefits for robustness to natural corruptions and is more
stable to hyperparameter settings such as optimizers and data augmentations. In
reduced data settings, it outperforms cross-entropy significantly. Our loss
function is simple to implement, and reference TensorFlow code is released at
https://t.ly/supcon.