HardMo: A Large-Scale Hardcase Dataset for Motion Capture

1 Beijing University of Posts and Telecommunications, 2 Institute of Automation, Chinese Academy of Sciences, 3 Centre for Artificial Intelligence and Robotics, HKISI, CAS, 4 University of Chinese Academy of Sciences, 5 University of Science and Technology Beijing, 6 Qinghai University of Science and Technology

Abstract

With the advent of deep learning techniques and large-scale datasets, recent years have witnessed rapid progress in monocular human mesh recovery. Despite their impressive performance on public benchmarks, existing methods are vulnerable to unusual poses, which prevents their practical deployment in scenarios such as dance and martial arts. This issue is mainly attributed to the domain gap induced by data scarcity in the relevant cases: most public datasets are captured under constrained settings and lack samples of such complex movements.

To mitigate data scarcity, we propose a pipeline for automatic data crawling, precise annotation, and hardcase mining. Based on this pipeline, we establish a large dataset in a short time. The dataset, named HardMo, contains 7M images along with precise annotations, covering 15 categories of dance and 14 categories of martial arts. According to our observations, failures in these two scenarios are mainly characterized by incorrect hand-wrist and foot-ankle poses. To investigate these two hardcases further, we leverage the proposed automatic pipeline to filter the collected data and establish two subsets named HardMo-Hand and HardMo-Foot.
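The filtering step that yields the two subsets can be illustrated with a minimal sketch. This page does not specify the actual mining criterion, so everything below is an assumption made for illustration: the joint names, the sample layout (`id`, `pred`, `ref`), the use of mean 2D joint error, and the 20-pixel threshold are hypothetical and are not HardMo's pipeline.

```python
from math import hypot

# Hypothetical joint groups; the names are illustrative, not HardMo's schema.
HAND_JOINTS = ("left_wrist", "right_wrist")
FOOT_JOINTS = ("left_ankle", "right_ankle")


def mean_joint_error(pred, ref, joints):
    """Mean 2D Euclidean distance (in pixels) over the given joints."""
    dists = [hypot(pred[j][0] - ref[j][0], pred[j][1] - ref[j][1]) for j in joints]
    return sum(dists) / len(dists)


def mine_hardcases(samples, threshold=20.0):
    """Split samples into hand and foot hardcase subsets.

    Each sample is a dict with "id", "pred" and "ref", where "pred" and "ref"
    map joint names to (x, y) pixel coordinates. A sample joins a subset when
    its mean error on that joint group exceeds `threshold` (an assumed value).
    A sample may land in both subsets, or in neither.
    """
    hand_subset, foot_subset = [], []
    for s in samples:
        if mean_joint_error(s["pred"], s["ref"], HAND_JOINTS) > threshold:
            hand_subset.append(s["id"])
        if mean_joint_error(s["pred"], s["ref"], FOOT_JOINTS) > threshold:
            foot_subset.append(s["id"])
    return hand_subset, foot_subset
```

Here `pred` stands for the output of an existing mesh recovery model projected to 2D, and `ref` for the reference annotation. Any per-joint error measure could be substituted for the 2D distance without changing the structure of the filter.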

Extensive experiments demonstrate the efficacy of the annotation pipeline and the collected dataset. Specifically, after being trained on HardMo, HMR, an early pioneering method, can even outperform the current state of the art, 4DHumans, on our benchmarks.

Video

More Comparisons

From left to right: 1. HardMo-4DHumans (Ours), 2. 4DHumans, 3. ProHMR,

HardMo Datasets

HardMo is a large-scale hardcase dataset with the following features:
  • More than 300 different scenarios
  • 7M images with precise annotations
  • 15 categories of dance and 14 categories of martial arts
  • Two hardcase subsets: HardMo-Foot (over 500K) and HardMo-Hand (over 400K)