PyTorch implementation of SD-MVSum, together with the S-VideoXum and S-MrHiSum datasets, released as part of our paper "SD-MVSum: Script-Driven Multimodal Video Summarization Method and Datasets".
script video-summarization video-dataset video-script cross-attention vision-language-models videoxum s-videoxum multimodal-video-summarization sd-mvsum s-mrhisum mrhisum weighted-cross-modal-attention
Updated May 8, 2026 - Python