
[Paper Reading]: NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis


NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

(CVPR 2016)

Amir Shahroudy, Jun Liu, Tian-Tsong Ng, Gang Wang



1、introduce a large-scale dataset for RGB+D human action recognition

2、propose a new recurrent neural network structure to model the long-term temporal correlation of the features for each body part

Limitations in Previous 3D Action Recognition Benchmarks

1、the small number of subjects and very narrow range of performers’ ages, which makes the intra-class variation of the actions very limited.

2、only a very small number of classes are available.

3、the highly restricted camera views. For most of the datasets, all the samples are captured from a front view with a fixed camera viewpoint.

4、the highly limited number of video samples prevents us from applying the most advanced data-driven learning methods to this problem.


Details of NTU RGB+D

1、the number of RGB+D video samples: 56,880

2、40 different human subjects

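3、60 action classes in total: 40 daily actions, 9 health-related actions, and 11 mutual (two-person) actions

4、Hardware: Microsoft Kinect v2

5、Data modalities: RGB videos, depth map sequences, 3D skeletal data (25 body joints), and infrared frames

6、80 distinct camera viewpoints, collected over 17 camera setups

7、The age range of the subjects in the dataset is from 10 to 35 years

8、limited to indoor scenes, but variation in ambiance is provided by capturing under various background conditions

9、cross-subject and cross-view evaluation criteria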



Benchmark Evaluations


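1、Cross-Subject Evaluation: the 40 subjects are split into a training group and a testing group of 20 subjects each

2、Cross-View Evaluation: samples from camera 1 are used for testing, and samples from cameras 2 and 3 for training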
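
The two evaluation protocols can be sketched as a small split helper. This is only an illustration, not the authors' code. It assumes that sample names follow the released dataset's `SsssCcccPpppRrrrAaaa` pattern (setup, camera, performer, replication, action), that the cross-subject training IDs are the 20 listed in the paper, and that cameras 2 and 3 are used for training in the cross-view protocol. The function names `parse_sample_name` and `split_of` are hypothetical.

```python
import re

# Assumption: names look like "S001C002P003R002A013" (setup, camera,
# performer/subject, replication, action), as in the released dataset.
NAME_PATTERN = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")

# Cross-subject protocol: training subject IDs as listed in the paper.
TRAIN_SUBJECTS = {1, 2, 4, 5, 8, 9, 13, 14, 15, 16, 17, 18, 19,
                  25, 27, 28, 31, 34, 35, 38}
# Cross-view protocol: cameras 2 and 3 for training, camera 1 for testing.
TRAIN_CAMERAS = {2, 3}


def parse_sample_name(name):
    """Extract the integer IDs encoded in a sample name (hypothetical helper)."""
    match = NAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    setup, camera, subject, replication, action = (int(g) for g in match.groups())
    return {"setup": setup, "camera": camera, "subject": subject,
            "replication": replication, "action": action}


def split_of(name, protocol):
    """Return "train" or "test" for a sample under the given protocol."""
    info = parse_sample_name(name)
    if protocol == "cross_subject":
        return "train" if info["subject"] in TRAIN_SUBJECTS else "test"
    if protocol == "cross_view":
        return "train" if info["camera"] in TRAIN_CAMERAS else "test"
    raise ValueError(f"unknown protocol: {protocol!r}")
```

For example, `split_of("S001C002P003R002A013.skeleton", "cross_view")` looks only at the camera ID (2 here), while the cross-subject protocol looks only at the performer ID (3 here).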



Part-Aware LSTM Network

In our model, we group the body joints into five part groups: torso, two hands, and two legs. 
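
A minimal sketch of this grouping, assuming the 25-joint Kinect v2 skeleton order. The notes only name the five groups, so the exact assignment of joints to groups in `PART_GROUPS` is an assumption made for illustration, and `split_frame_by_part` is a hypothetical helper.

```python
# Kinect v2 joints in sensor order (0-based here).
JOINT_NAMES = [
    "spine_base", "spine_mid", "neck", "head",
    "left_shoulder", "left_elbow", "left_wrist", "left_hand",
    "right_shoulder", "right_elbow", "right_wrist", "right_hand",
    "left_hip", "left_knee", "left_ankle", "left_foot",
    "right_hip", "right_knee", "right_ankle", "right_foot",
    "spine_shoulder", "left_hand_tip", "left_thumb",
    "right_hand_tip", "right_thumb",
]

# Assumed membership of the five part groups (not taken from the notes).
PART_GROUPS = {
    "torso": ["spine_base", "spine_mid", "neck", "head", "spine_shoulder"],
    "left_hand": ["left_shoulder", "left_elbow", "left_wrist",
                  "left_hand", "left_hand_tip", "left_thumb"],
    "right_hand": ["right_shoulder", "right_elbow", "right_wrist",
                   "right_hand", "right_hand_tip", "right_thumb"],
    "left_leg": ["left_hip", "left_knee", "left_ankle", "left_foot"],
    "right_leg": ["right_hip", "right_knee", "right_ankle", "right_foot"],
}

JOINT_INDEX = {name: i for i, name in enumerate(JOINT_NAMES)}


def split_frame_by_part(frame):
    """Turn one frame of 25 (x, y, z) joints into one flat vector per part."""
    if len(frame) != len(JOINT_NAMES):
        raise ValueError("expected 25 joints per frame")
    parts = {}
    for part, joints in PART_GROUPS.items():
        vector = []
        for joint in joints:
            vector.extend(frame[JOINT_INDEX[joint]])  # append x, y, z
        parts[part] = vector
    return parts
```

Each per-part vector can then be fed to the part of the network dedicated to that body part, instead of concatenating all 75 coordinates into one input.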

1、Traditional RNN and LSTM

2、Proposed P-LSTM
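
The difference between the two can be sketched with a single time step. This is a rough, pure-Python illustration, not the authors' implementation. It assumes the structure described in the paper: each body part keeps its own input gate, forget gate, candidate (modulation) gate and memory cell, while one output gate is shared across parts and the hidden state is computed from the concatenated part cells. Biases are omitted, weights are random, and the class name `PartAwareLSTMCell` is hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, vector):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class PartAwareLSTMCell:
    """One P-LSTM step with per-part cells and a shared output gate (sketch)."""

    def __init__(self, part_input_sizes, cell_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = cell_size * len(part_input_sizes)
        # Per-part gate weights act on [x_part ; h_prev].
        self.part_weights = [
            {gate: random_matrix(cell_size, size + self.hidden_size, rng)
             for gate in ("input", "forget", "candidate")}
            for size in part_input_sizes
        ]
        # The shared output gate acts on [x_all_parts ; h_prev].
        self.output_weights = random_matrix(
            self.hidden_size, sum(part_input_sizes) + self.hidden_size, rng)

    def step(self, part_inputs, prev_hidden, prev_cells):
        new_cells = []
        for x, c_prev, w in zip(part_inputs, prev_cells, self.part_weights):
            z = list(x) + list(prev_hidden)
            i = [sigmoid(a) for a in affine(w["input"], z)]
            f = [sigmoid(a) for a in affine(w["forget"], z)]
            g = [math.tanh(a) for a in affine(w["candidate"], z)]
            # Each part updates only its own memory cell.
            new_cells.append([fk * ck + ik * gk
                              for fk, ck, ik, gk in zip(f, c_prev, i, g)])
        all_inputs = [v for x in part_inputs for v in x]
        o = [sigmoid(a)
             for a in affine(self.output_weights, all_inputs + list(prev_hidden))]
        flat_cells = [v for c in new_cells for v in c]
        hidden = [ok * math.tanh(ck) for ok, ck in zip(o, flat_cells)]
        return hidden, new_cells
```

In a standard LSTM there would be a single cell vector updated from the whole input; here the long-term context of each part is stored separately, and only the output gate and hidden state mix information across parts.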



Experimental Results



Source: https://blog.csdn.net/qq_36627158/article/details/119907320