Stacked Hourglass Networks for Human Pose Estimation
Alejandro Newell, Kaiyu Yang, and Jia Deng
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.CV
First published: 2016/03/22

Abstract: This work introduces a novel Convolutional Network architecture for the task
of human pose estimation. Features are processed across all scales and
consolidated to best capture the various spatial relationships associated with
the body. We show how repeated bottom-up, top-down processing used in
conjunction with intermediate supervision is critical to improving the
performance of the network. We refer to the architecture as a 'stacked
hourglass' network based on the successive steps of pooling and upsampling that
are done to produce a final set of estimates. State-of-the-art results are
achieved on the FLIC and MPII benchmarks, outcompeting all recent methods.
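
The structure described in the abstract (successive pooling and upsampling, repeated several times, with a supervised output after each repetition) can be illustrated with a toy sketch. This is not the authors' implementation: it runs on plain nested lists, has no learned convolutional layers, and every function name (`max_pool2`, `upsample2`, `hourglass`, `stacked_hourglass`) is hypothetical. The 2x2 max pooling, nearest-neighbor upsampling, and additive skip connection are assumptions chosen only to make the bottom-up, top-down data flow visible.

```python
# Toy sketch of the stacked-hourglass data flow on a 2D grid of numbers.
# Assumption: no learned layers; the pooling/upsampling/merge choices below
# are illustrative and are not taken from the abstract.

def max_pool2(x):
    """2x2 max pooling with stride 2 (height and width assumed even)."""
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, len(x[0]), 2)]
            for i in range(0, len(x), 2)]

def upsample2(x):
    """Nearest-neighbor upsampling by a factor of 2 in both dimensions."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def add(a, b):
    """Element-wise sum of two grids of the same shape."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def hourglass(x, depth):
    """One hourglass: pool `depth` times (bottom-up), then upsample back
    (top-down), merging the features kept at each scale on the way up."""
    if depth == 0:
        return x  # lowest resolution; a real network would process features here
    skip = x  # features preserved at the current scale
    low = hourglass(max_pool2(x), depth - 1)
    return add(skip, upsample2(low))

def stacked_hourglass(x, num_stacks, depth):
    """Chain hourglasses end to end and return every intermediate output,
    so that a loss could be attached to each one (intermediate supervision)."""
    outputs = []
    for _ in range(num_stacks):
        x = hourglass(x, depth)
        outputs.append(x)
    return outputs

if __name__ == "__main__":
    grid = [[1] * 4 for _ in range(4)]
    for k, out in enumerate(stacked_hourglass(grid, num_stacks=2, depth=2)):
        print(f"stack {k}: {len(out)}x{len(out[0])}, top-left value {out[0][0]}")
```

Each hourglass returns a grid at the input resolution, which is what allows the modules to be stacked and a prediction to be read off after every stack. In a trained network each of those per-stack outputs would be compared against the ground truth, which is one way to read the "intermediate supervision" the abstract refers to.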