The Do's and Don'ts for CNN-based Face Verification
Ankan Bansal, Carlos Castillo, Rajeev Ranjan, and Rama Chellappa
arXiv e-Print archive - 2017 via Local arXiv
Keywords: cs.CV
First published: 2017/05/21
Abstract: While the research community appears to have developed a consensus on the
methods of acquiring annotated data, design and training of CNNs, many
questions still remain to be answered. In this paper, we explore the following
questions that are critical to face recognition research: (i) Can we train on
still images and expect the systems to work on videos? (ii) Are deeper datasets
better than wider datasets? (iii) Does adding label noise lead to improvement
in performance of deep networks? (iv) Is alignment needed for face recognition?
We address these questions by training CNNs using CASIA-WebFace, UMDFaces, and
a new video dataset and testing on YouTube-Faces, IJB-A and a disjoint portion
of the UMDFaces dataset. Our new dataset, which will be made publicly available,
has 22,075 videos and 3,735,476 human annotated frames extracted from them.