[paper reading][CVPR 2020] Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

2021-11-18 20:01:08 Author: Internet

Contents

2 Related Work
- General Video Classification
3
- 3.2 Spatio-Temporal Graph

CVPR 2020
https://openaccess.thecvf.com/content_CVPR_2020/papers/Pan_Spatio-Temporal_Graph_for_Video_Captioning_With_Knowledge_Distillation_CVPR_2020_paper.pdf
spatio-temporal graph model for video captioning that exploits object interactions in space and time
two-branch, knowledge distillation

General Video Classification

3D conv
two-stream, optical flow
wider temporal range
SlowFast, multiple time scales, two pathways
feature bank, long-term, correlated, short-term
raw pixels, in contrast, objects within scenes

3

two-branch, distill
scene, 2D, resnet, 3D, I3D
object features: \(N_T\) objects, each \(o_t^j\) has the same dimension
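
A minimal sketch of the "two-branch, distill" idea noted above, assuming the distillation term is a KL divergence between the word distributions predicted by the two branches. The function names, the direction of the KL term (object branch as the reference, scene branch as the one being regularized), and the toy logits are assumptions for illustration, not taken from the paper's code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(object_logits, scene_logits):
    """Hypothetical distillation term: pull the scene branch's word
    distribution toward the object branch's distribution."""
    p_obj = softmax(object_logits)
    p_scene = softmax(scene_logits)
    return kl_divergence(p_obj, p_scene)

# Toy per-word logits over a 4-word vocabulary (made-up numbers).
object_logits = [2.0, 0.5, 0.1, -1.0]
scene_logits = [1.0, 1.0, 0.0, -0.5]
loss = distillation_loss(object_logits, scene_logits)
```

In training, a term like this would be added to the usual captioning loss of each branch; the weighting between the terms is left out here.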

3.2 Spatio-Temporal Graph

decompose our graph into two components: the spatial graph and the temporal graph
Spatial: normalized Intersection over Union (IoU) value, explicitly
Temporal: object transformations, semantic similarities, \(\cos\)
imagine: # - % = $ x @ structure
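
The two graph components above can be sketched as follows: spatial edges from pairwise IoU between object boxes in the same frame, temporal edges from cosine similarity between object features in adjacent frames. The softmax row normalization, the `(x1, y1, x2, y2)` box format, and all function names are assumptions made for illustration.

```python
import math

def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def softmax_rows(scores):
    """Normalize each row of a score matrix so that it sums to 1."""
    out = []
    for row in scores:
        exps = [math.exp(s) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def spatial_graph(boxes):
    """Adjacency among objects within one frame, from pairwise IoU."""
    return softmax_rows([[iou(a, b) for b in boxes] for a in boxes])

def temporal_graph(feats_t, feats_next):
    """Adjacency from objects in frame t to objects in frame t+1,
    from pairwise cosine similarity of their features."""
    return softmax_rows([[cosine(u, v) for v in feats_next] for u in feats_t])

# Toy example: three boxes in one frame, two objects in the next frame.
boxes = [(0, 0, 2, 2), (1, 1, 3, 3), (10, 10, 12, 12)]
g_space = spatial_graph(boxes)

feats_t = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feats_next = [[1.0, 0.1], [0.1, 1.0]]
g_time = temporal_graph(feats_t, feats_next)
```

Here the first two boxes overlap, so they get a stronger spatial edge than either has with the distant third box. In the temporal graph, each object is linked most strongly to the next-frame object whose feature points in the most similar direction.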

Source: https://www.cnblogs.com/minor-second/p/15574107.html