[ICCV2019/PaperSummary] Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization

Person Attribute Recognition
  1. Introduction
The paper makes three contributions:
  1. An end-to-end trainable framework that performs attribute-specific localization at multiple scales to discover the most discriminative attribute regions in a weakly supervised manner.
  2. A feature pyramid architecture that leverages both low-level details and high-level semantics, so that multi-scale attribute localization and region-based feature learning reinforce each other. The multi-scale attribute predictions are then fused by a voting scheme.
  3. Extensive experiments on three publicly available pedestrian attribute datasets (PETA, RAP, and PA-100K), showing significant improvement over the previous state-of-the-art methods.
Figure 1. Overview of the proposed framework. The input pedestrian image is fed into the main network with both bottom-up and top-down pathways. Features combined from different levels are fed into multiple Attribute Localization Modules (Figure 2), which perform attribute-specific localization and region-based feature learning. Outputs from different branches are trained with deep supervision and aggregated through an element-wise maximum operation for inference. M is the total number of attributes. Best viewed in color.
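The inference-time fusion mentioned in the Figure 1 caption (an element-wise maximum over the branch outputs) can be sketched as below. This is a minimal illustration rather than the authors' code: the function name `fuse_predictions`, the example scores, and the 0.5 decision threshold are assumptions made for the example.

```python
def fuse_predictions(branch_scores):
    """Element-wise maximum over per-branch score vectors.

    branch_scores: list of lists, one list of M attribute scores per branch.
    """
    if not branch_scores:
        raise ValueError("need at least one branch")
    m = len(branch_scores[0])
    if any(len(scores) != m for scores in branch_scores):
        raise ValueError("all branches must score the same M attributes")
    # zip(*...) groups the scores of each attribute across branches.
    return [max(column) for column in zip(*branch_scores)]


# Hypothetical probabilities for M = 3 attributes from three branches.
branches = [
    [0.10, 0.80, 0.40],
    [0.30, 0.60, 0.45],
    [0.20, 0.90, 0.55],
]
fused = fuse_predictions(branches)
labels = [int(p >= 0.5) for p in fused]  # 0.5 threshold is an assumption
```

Taking the maximum means an attribute is predicted as present if any single branch is confident about it, which suits attributes that are only visible at one feature scale.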
Eq1: Feature concatenation
Eq2: Dimensionality reduction
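The two captions above describe combining features from different levels by concatenation, followed by a dimensionality reduction. A rough sketch of that idea on tiny nested lists is given here. It assumes that the higher-level map is upsampled 2x with nearest-neighbour interpolation before concatenation and that the reduction is a 1x1 convolution; both choices, and all function names, are assumptions for illustration and are not confirmed by the summary.

```python
def upsample_nearest_2x(feat):
    """Nearest-neighbour 2x upsampling of a [C][H][W] nested list."""
    out = []
    for channel in feat:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def concat_channels(a, b):
    """Concatenate two [C][H][W] maps along the channel axis."""
    if len(a[0]) != len(b[0]) or len(a[0][0]) != len(b[0][0]):
        raise ValueError("spatial sizes must match")
    return a + b


def conv1x1(feat, weights):
    """1x1 convolution without bias; weights has shape [C_out][C_in]."""
    c_in, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [[sum(wk[c] * feat[c][y][x] for c in range(c_in)) for x in range(w)]
         for y in range(h)]
        for wk in weights
    ]


# Hypothetical inputs: a 1-channel 2x2 low-level map and a 1-channel 1x1 high-level map.
low_level = [[[1.0, 2.0], [3.0, 4.0]]]
high_level = [[[10.0]]]
combined = concat_channels(low_level, upsample_nearest_2x(high_level))
reduced = conv1x1(combined, weights=[[1.0, 0.5]])  # 2 channels -> 1 channel
```

In a real network the 1x1 weights are learned; fixed values are used here only to make the arithmetic easy to follow.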
Figure 2. Details of the proposed Attribute Localization Module (ALM), which consists of a tiny channel attention sub-network and a simplified spatial transformer. The ALM takes the combined features Xi as input and produces an attribute-specific prediction. Each ALM only serves one attribute at a single level.
Eq3: Transformation matrix
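To make the idea of a simplified transformation concrete, the sketch below assumes the matrix keeps only scaling and translation, i.e. the 2x3 form [[s_x, 0, t_x], [0, s_y, t_y]], so that the sampled region is always an axis-aligned box. That form, the normalised [-1, 1] coordinate convention, and the function names are assumptions for illustration; the summary itself only names the equation.

```python
def simplified_affine(sx, sy, tx, ty):
    """2x3 matrix with scaling and translation only (no rotation or shear)."""
    return [[sx, 0.0, tx],
            [0.0, sy, ty]]


def source_coords(theta, x_t, y_t):
    """Map a normalised target coordinate in [-1, 1] to the source feature map."""
    x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
    y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
    return x_s, y_s


def region_box(theta):
    """(x_min, y_min, x_max, y_max) of the source region the transform samples."""
    x0, y0 = source_coords(theta, -1.0, -1.0)
    x1, y1 = source_coords(theta, 1.0, 1.0)
    return min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)


# Hypothetical parameters: half the width, a quarter of the height,
# shifted towards positive y (the lower part of the map if y grows downwards).
theta = simplified_affine(sx=0.5, sy=0.25, tx=0.0, ty=0.5)
box = region_box(theta)
```

With only four free parameters, the predicted region can be read directly as a bounding box, which is what makes the localization results easy to visualize.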
Eq4: Classification formula
Eq5: Loss calculation
Eq6: Total Loss
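Since the branches are trained with deep supervision, one plausible reading of the two loss captions is a per-branch weighted binary cross-entropy whose values are summed over all branches. The sketch below follows that reading. The per-attribute weight exp(-a), where a is the positive ratio of the attribute in the training set, is an assumption for illustration, as are the function names; the code is not numerically hardened for very large logits.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weighted_bce(logits, targets, pos_ratio):
    """Mean weighted binary cross-entropy over M attributes for one sample."""
    total = 0.0
    for z, y, a in zip(logits, targets, pos_ratio):
        gamma = math.exp(-a)  # assumed weighting: rarer attributes weigh more
        p = sigmoid(z)
        total -= gamma * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(logits)


def total_loss(branch_logits, targets, pos_ratio):
    """Deep supervision: sum the per-branch losses against the same targets."""
    return sum(weighted_bce(logits, targets, pos_ratio) for logits in branch_logits)


# Hypothetical example: two branches, M = 2 attributes.
targets = [1, 0]
pos_ratio = [0.0, 0.0]              # zero ratio gives weight exp(0) = 1
branch_logits = [[0.0, 0.0], [0.0, 0.0]]
loss = total_loss(branch_logits, targets, pos_ratio)
```

Every branch sees the same ground-truth labels, so each feature level is pushed to be discriminative on its own before the outputs are fused at inference.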
Eq7: Mean accuracy (mA) across attributes
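The label-based mA metric commonly used on these datasets averages, for each attribute, the accuracy on positive samples and the accuracy on negative samples, and then averages over all M attributes. A small sketch, with a hypothetical function name and toy labels:

```python
def mean_accuracy(y_true, y_pred):
    """Label-based mA: mean over attributes of (TP/P + TN/N) / 2.

    y_true, y_pred: lists of samples, each a list of M binary labels.
    Returns (mA, per-attribute values).
    """
    m = len(y_true[0])
    per_attr = []
    for j in range(m):
        pos = sum(1 for row in y_true if row[j] == 1)
        neg = len(y_true) - pos
        if pos == 0 or neg == 0:
            raise ValueError(f"attribute {j} needs both positive and negative samples")
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        tn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 0)
        per_attr.append(0.5 * (tp / pos + tn / neg))
    return sum(per_attr) / m, per_attr


# Toy example: 4 samples, M = 2 attributes.
y_true = [[1, 0], [1, 1], [0, 0], [0, 1]]
y_pred = [[1, 0], [0, 1], [0, 0], [0, 0]]
mA, per_attr = mean_accuracy(y_true, y_pred)
```

Because positives and negatives are weighted equally per attribute, a model cannot score well on a rare attribute simply by always predicting it as absent.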
Table 1. Performance comparisons on RAP dataset
Figure 3. Attribute-wise mA comparison on the RAP dataset between the proposed method and the baseline model.
Figure 4. Visualization of attribute localization results at different feature levels. Best viewed in color.
Figure 5. Case studies of different attribute-specific localization methods on three different attributes: Boots (Top), Glasses (Middle), and Box (Bottom).
Table 2. Quantitative comparisons against previous methods on PETA and RAP datasets. We divide these methods into four groups: holistic methods, relation-based methods, attention-based methods, and part-based methods.
Table 3. Quantitative comparisons on PA-100K dataset
  • Lightweight model with high inference speed (FPS).
  • Poor at distinguishing between similar attributes.
  • Needs an attribute-agnostic voting scheme.



