[CVPR2021/PaperSummary] Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation

1. Introduction

  1. First, heatmap-based methods suffer from quantization error: the precision of a keypoint prediction is inherently limited by the spatial resolution of the heatmap. Larger heatmaps improve precision but are costly to process at higher resolutions.
  2. Second, when two keypoints of the same type (i.e., class) appear in close proximity to one another, their overlapping heatmap signals may be mistaken for a single keypoint.
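
Both limitations can be reproduced with a toy example. The sketch below is not the paper's code: the helper names (`render_heatmap`, `decode_argmax`, `count_peaks`), the Gaussian width, the strides, and the keypoint coordinates are all assumptions chosen for illustration. It renders Gaussian heatmaps with NumPy, decodes a keypoint by taking the brightest cell, and counts local maxima when two same-class keypoints sit close together.

```python
import numpy as np

def render_heatmap(points, size, sigma=2.0):
    """Sum of unit-height 2D Gaussians centred at (x, y), given in heatmap pixels."""
    ys, xs = np.mgrid[0:size, 0:size]
    heatmap = np.zeros((size, size))
    for x, y in points:
        heatmap += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return heatmap

def decode_argmax(heatmap, stride):
    """Map the brightest heatmap cell back to input-image coordinates."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.array([col, row], dtype=float) * stride

def count_peaks(heatmap, threshold=0.5):
    """Count cells above the threshold that are strictly larger than all 8 neighbours."""
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    is_peak = heatmap > threshold
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
            is_peak &= heatmap > neighbour
    return int(is_peak.sum())

# 1. Quantization error: one keypoint in a 256x256 image (assumed values),
#    decoded from heatmaps that are 4x and 8x smaller than the image.
image_size = 256
true_xy = np.array([101.0, 59.0])
errors = {}
for stride in (4, 8):
    heatmap = render_heatmap([true_xy / stride], image_size // stride)
    decoded = decode_argmax(heatmap, stride)
    errors[stride] = float(np.linalg.norm(decoded - true_xy))
    print(f"stride {stride}: decoded {decoded}, error {errors[stride]:.2f} px")

# 2. Overlapping signals: two same-class keypoints, first 2 cells apart, then 12.
close_peaks = count_peaks(render_heatmap([(20, 20), (22, 20)], 48))
far_peaks = count_peaks(render_heatmap([(20, 20), (32, 20)], 48))
print(f"peaks found: close pair = {close_peaks}, distant pair = {far_peaks}")
```

Under these assumed settings, the localization error grows with the stride (it is bounded by half a cell diagonal in image pixels), and the two nearby keypoints merge into a single peak while the distant pair stays separable.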

2. Related Work

  1. Heatmap-free Keypoint Detection: DeepPose regressed keypoint coordinates directly from images using a cascade of deep neural networks that iteratively refined the keypoint predictions. Heatmap regression has nevertheless remained prevalent in human pose estimation, despite the computational inefficiencies associated with generating heatmaps and the inherent issue of quantization error. Direct keypoint regression has also been attempted using Transformers [2].
  2. Single-stage Human Pose Estimation: Human pose estimation generally falls into two categories:

3. Proposed Method

3.1 Architectural Details

3.2 Loss Function

3.3 Inference

3.4 Limitations

4. Experiments

4.1 COCO Keypoints

4.2 CrowdPose

4.3 Ablation Studies

5. Conclusion


Writer’s Conclusion


