[Paper Reading] NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

(2016 CVPR)

Amir Shahroudy, Jun Liu, Tian-Tsong Ng, Gang Wang

Notes

Contribution

1、introduce a large-scale dataset for RGB+D human action recognition

2、propose a new recurrent neural network structure to model the long-term temporal correlation of the features for each body part

Limitations in Previous 3D Action Recognition Benchmarks

1、the small number of subjects and very narrow range of performers’ ages, which makes the intra-class variation of the actions very limited.

2、only a very small number of classes are available.

3、the highly restricted camera views. For most of the datasets, all the samples are captured from a front view with a fixed camera viewpoint.

4、the highly limited number of video samples prevents us from applying the most advanced data-driven learning methods to this problem.

 

Details of NTU RGB+D

1、the number of RGB+D video samples: 56,880

2、40 different human subjects

3、60 action classes in total: 40 daily actions, 9 health-related actions, and 11 mutual actions

4、Hardware: Microsoft Kinect v2

5、Data modalities: RGB videos, depth sequences, 3D skeleton data (25 body joints), and infrared frames

6、80 distinct camera viewpoints, collected over 17 camera setups

7、The age range of the subjects in our dataset is from 10 to 35 years

8、limited to indoor scenes, but we provide the ambiance inconstancy by capturing in various background conditions

9、cross-subject and cross-view evaluation criteria

 

 

Benchmark Evaluations

1、Cross-Subject Evaluation

2、Cross-View Evaluation
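
The two protocols can be sketched as a small helper that assigns a sample to the training or testing split. This is only an illustration under stated assumptions: the file-name pattern `SsssCcccPpppRrrrAaaa` (setup, camera, performer, replication, action) is the convention I recall from the public release, the 20 training subject IDs and the use of cameras 2 and 3 for training (camera 1 for testing) are as I recall them from the paper, and the names `parse_name` and `split_of` are hypothetical. Verify all of these against the official release before relying on them.

```python
import re

# Assumed naming convention: S=setup, C=camera, P=performer (subject),
# R=replication, A=action class, each followed by three digits.
NAME_RE = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")

# Cross-subject: training subject IDs as recalled from the paper (20 of 40).
TRAIN_SUBJECTS = {1, 2, 4, 5, 8, 9, 13, 14, 15, 16, 17, 18, 19,
                  25, 27, 28, 31, 34, 35, 38}

# Cross-view: cameras 2 and 3 for training, camera 1 for testing.
TRAIN_CAMERAS = {2, 3}


def parse_name(name):
    """Extract the numeric fields from a sample name (hypothetical helper)."""
    match = NAME_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    setup, camera, subject, replication, action = (int(g) for g in match.groups())
    return {"setup": setup, "camera": camera, "subject": subject,
            "replication": replication, "action": action}


def split_of(name, protocol):
    """Return 'train' or 'test' for a sample under the given protocol."""
    info = parse_name(name)
    if protocol == "cross_subject":
        return "train" if info["subject"] in TRAIN_SUBJECTS else "test"
    if protocol == "cross_view":
        return "train" if info["camera"] in TRAIN_CAMERAS else "test"
    raise ValueError(f"unknown protocol: {protocol!r}")
```

For example, `split_of("S001C001P001R001A001.skeleton", "cross_view")` looks only at the camera field, so the same sample can fall into different splits under the two protocols.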

 

 

Part-Aware LSTM Network

In our model, we group the body joints into five part groups: torso, two hands, and two legs. 
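
A rough sketch of this grouping, assuming one skeleton frame is a 25 x 3 array of joint coordinates in the Kinect v2 joint order. The index lists in `PART_JOINTS` are an illustrative assumption, not the paper's exact assignment, and `split_into_parts` is a hypothetical helper name.

```python
import numpy as np

# Illustrative 0-based joint indices per body part (assumption, not taken
# from the paper). Together they cover all 25 joints exactly once.
PART_JOINTS = {
    "torso": [0, 1, 2, 3, 20],
    "left_hand": [4, 5, 6, 7, 21, 22],
    "right_hand": [8, 9, 10, 11, 23, 24],
    "left_leg": [12, 13, 14, 15],
    "right_leg": [16, 17, 18, 19],
}


def split_into_parts(frame):
    """Split one (25, 3) skeleton frame into five flat per-part vectors."""
    frame = np.asarray(frame, dtype=float)
    if frame.shape != (25, 3):
        raise ValueError(f"expected shape (25, 3), got {frame.shape}")
    # Select the joints of each part and flatten their xyz coordinates.
    return {part: frame[idx].reshape(-1) for part, idx in PART_JOINTS.items()}
```

Each part then gets its own input vector per time step, which is what lets a part-aware network keep a separate memory for each part.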

1、Traditional RNN and LSTM

2、Proposed P-LSTM
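
As I understand the paper's description, the P-LSTM keeps a separate memory cell, with its own input, forget, and input-modulation gates, for each body part, while a single output gate is shared across all parts and the hidden state is computed from the concatenated part cells. The NumPy sketch below illustrates one time step under that reading. The function names, the bias-free weights, the random initialization, and the dimensions are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_plstm(part_dims, cell_dim, seed=0):
    """Random weights for a toy P-LSTM (hypothetical helper, no biases)."""
    rng = np.random.default_rng(seed)
    hidden_dim = cell_dim * len(part_dims)
    # One weight matrix per part producing its i, f, g pre-activations.
    w_parts = [rng.normal(0.0, 0.1, size=(3 * cell_dim, d + hidden_dim))
               for d in part_dims]
    # One shared output gate that sees every part's input plus h_{t-1}.
    w_out = rng.normal(0.0, 0.1, size=(hidden_dim, sum(part_dims) + hidden_dim))
    return {"W_parts": w_parts, "W_o": w_out}


def plstm_step(params, x_parts, h_prev, c_prev_parts):
    """One time step: per-part cells, shared output gate."""
    new_cells = []
    for w, x, c_prev in zip(params["W_parts"], x_parts, c_prev_parts):
        z = w @ np.concatenate([x, h_prev])
        i, f, g = np.split(z, 3)
        i, f, g = sigmoid(i), sigmoid(f), np.tanh(g)
        # Each part updates only its own memory cell.
        new_cells.append(f * c_prev + i * g)
    o = sigmoid(params["W_o"] @ np.concatenate(list(x_parts) + [h_prev]))
    # The hidden state combines all part cells through the shared gate.
    h = o * np.tanh(np.concatenate(new_cells))
    return h, new_cells
```

A call such as `plstm_step(params, x_parts, h_prev, c_prev_parts)` with five part vectors returns a hidden state of size `5 * cell_dim` and five updated cells. A standard LSTM would instead use one set of gates and one cell over the whole concatenated input.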

 

 

Experimental Results

 

 

Source: https://blog.csdn.net/qq_36627158/article/details/119907320