简体中文 | English
We optimized TSM model and proposed PPTSM in this paper. Without increasing the number of parameters, the accuracy of TSM was significantly improved in UCF101 and Kinetics-400 datasets. Please refer to Tricks on ppTSM for more details.
Please refer to Kinetics400 data download and preparation doc k400-data
Please refer to UCF101 data download and preparation doc ucf101-data
Please download ResNet50_vd_ssld_v2 as pretraind model:
wget https://videotag.bj.bcebos.com/PaddleVideo/PretrainModel/ResNet50_vd_ssld_v2_pretrained.pdparams
and add path to MODEL.framework.backbone.pretrained
in config file as:
MODEL:
framework: "Recognizer2D"
backbone:
name: "ResNet"
pretrained: your weight path
You can start training with different dataset using different config file. For UCF-101 dataset, we use 4 cards to train:
python -B -m paddle.distributed.launch --gpus="0,1,2,3" --log_dir=log_pptsm main.py --validate -c configs/recognition/tsm/pptsm.yaml
For Kinetics400 dataset, we use 8 cards to train:
python -B -m paddle.distributed.launch --gpus="0,1,2,3,4,5,6,7" --log_dir=log_pptsm main.py --validate -c configs/recognition/tsm/pptsm_k400.yaml
-
Args
-c
is used to specify config file. -
For finetune please download our trained model ppTSM.pdparams,and specify file path with
--weights
. -
For the config file usage,please refer to config.
python3 main.py --test -c configs/recognition/tsm/pptsm.yaml -w output/ppTSM/ppTSM_best.pdparams
- Download the published model ppTSM.pdparams, then you need to set the
--weights
for model testing
Accuracy on Kinetics400:
seg_num | target_size | Top-1 |
---|---|---|
8 | 224 | 0.735 |
Accuracy on UCF101:
seg_num | target_size | Top-1 |
---|---|---|
8 | 224 | 0.8997 |
To get model architecture file ppTSM.pdmodel
and parameters file ppTSM.pdiparams
, use:
python3 tools/export_model.py -c configs/recognition/tsm/pptsm_k400.yaml \
-p output/ppTSM/ppTSM_best.pdparams \
-o inference/ppTSM
- Args usage please refer to Model Inference.
python3 tools/predict.py --video_file data/example.avi \
--model_file inference/ppTSM/ppTSM.pdmodel \
--params_file inference/ppTSM/ppTSM.pdiparams \
--use_gpu=True \
--use_tensorrt=False
example of logs:
Current video file: data/example.avi
top-1 class: 5
top-1 score: 0.9621570706367493
we can get the class name using class id and map file data/k400/Kinetics-400_label_list.txt
. The top1 prediction of data/example.avi
is archery
.
- TSM: Temporal Shift Module for Efficient Video Understanding, Ji Lin, Chuang Gan, Song Han
- Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean