
Dataset Information


Video captioning based on vision transformer and reinforcement learning.


ABSTRACT: Global encoding of visual features is important for improving description accuracy in video captioning. In this paper, we propose a video captioning method that combines a Vision Transformer (ViT) with reinforcement learning. First, ResNet-152 and ResNeXt-101 are used to extract features from videos. Second, the encoder block of the ViT network is applied to encode the video features. Third, the encoded features are fed into a Long Short-Term Memory (LSTM) network to generate a description of the video content. Finally, the accuracy of the description is further improved by fine-tuning with reinforcement learning. We conducted experiments on MSR-VTT, a benchmark dataset for video captioning. The results show that, compared with current mainstream methods, the proposed model improves by 2.9%, 1.4%, 0.9% and 4.8% on the four evaluation metrics BLEU-4, METEOR, ROUGE-L and CIDEr-D, respectively.
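
The abstract does not say which reinforcement learning objective is used for fine-tuning. As an illustration only, the sketch below assumes a self-critical policy-gradient loss, a common choice for captioning, in which the reward of a sampled caption (for example its CIDEr-D score) is compared against the reward of the greedily decoded caption. The function name `self_critical_loss` and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def self_critical_loss(
    sample_logprobs: Sequence[float],
    sampled_reward: float,
    greedy_reward: float,
) -> float:
    """Policy-gradient loss for one sampled caption (assumed objective).

    sample_logprobs: log-probability of each word in the sampled caption.
    sampled_reward:  metric score of the sampled caption (e.g. CIDEr-D).
    greedy_reward:   metric score of the greedy caption, used as baseline.
    """
    # Positive advantage: the sampled caption beat the greedy baseline.
    advantage = sampled_reward - greedy_reward
    # Minimizing this loss raises the probability of captions with a
    # positive advantage and lowers it for captions with a negative one.
    return -advantage * sum(sample_logprobs)


if __name__ == "__main__":
    # Hypothetical values: a three-word caption scoring 0.9 vs. a 0.6 baseline.
    loss = self_critical_loss([-0.5, -1.0, -0.25], 0.9, 0.6)
    print(f"loss = {loss:.3f}")
```

In an actual training loop the log-probabilities would come from the LSTM decoder and the loss would be averaged over a batch and backpropagated; this sketch only shows how the baseline turns a metric score into a training signal.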

SUBMITTER: Zhao H 

PROVIDER: S-EPMC9044334 | biostudies-literature | 2022

REPOSITORIES: biostudies-literature


Publications

Video captioning based on vision transformer and reinforcement learning.

Zhao Hong, Chen Zhiwen, Guo Lan, Han Zeyu

PeerJ Computer Science, 2022-03-16



Similar Datasets

| S-EPMC11312936 | biostudies-literature
| S-EPMC9940339 | biostudies-literature
| S-EPMC8356660 | biostudies-literature
| S-EPMC9570951 | biostudies-literature
| S-EPMC8926160 | biostudies-literature
| S-EPMC8691725 | biostudies-literature
| S-EPMC10892138 | biostudies-literature
| S-EPMC10928431 | biostudies-literature
2022-01-05 | GSE188791 | GEO
| S-EPMC8583247 | biostudies-literature