Towards Real-World Power Grid Scenarios: Video Action Detection with Cross-scale Selective Context Aggregation

Lingwen Meng; Siwu Yu; Shasha Luo; Anjun Li

doi:10.5755/j01.itc.54.2.41005

Authors

Lingwen Meng, Electric Power Research Institute of Guizhou Power Grid Co. Ltd, Guiyang 550002, China
Siwu Yu, Electric Power Research Institute of Guizhou Power Grid Co. Ltd, Guiyang 550002, China
Shasha Luo, Electric Power Research Institute of Guizhou Power Grid Co. Ltd, Guiyang 550002, China
Anjun Li, Electric Power Research Institute of Guizhou Power Grid Co. Ltd, Guiyang 550002, China

DOI:

https://doi.org/10.5755/j01.itc.54.2.41005

Keywords:

Action Detection, Deep Learning, Video Understanding

Abstract

In this study, we propose a single-stage model for video action detection, together with POWER, an action detection dataset collected from real power operation scenarios. While previous studies have made significant progress in overall classification and localization performance, they often struggle with short-duration actions, which limits their practical application. To address this, we introduce the Cross-scale Selective Context Aggregation Network (CSCAN), which focuses on improving the detection of short actions. The network integrates three key components: 1) a cross-scale feature conduction structure combined with a tailored alignment mechanism; 2) a selective context aggregation module based on a gating mechanism; and 3) a scale-invariant consistency training strategy that enables the model to learn scale-invariant action representations. We evaluate our method on the self-collected POWER dataset and on the widely used action detection benchmarks THUMOS14 and ActivityNet v1.3. The results show that our model outperforms other approaches, especially in detecting real-world short actions.
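
The abstract does not specify how the gating in the selective context aggregation module is computed. As a rough illustration of the general idea only, the sketch below lets each time step decide, through a sigmoid gate, how much of a local temporal context to mix into its own feature. The function names, the mean-pooled context, the scalar gate, and the residual form `x + g * ctx` are all assumptions made for this example and are not taken from CSCAN.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_context(features, radius):
    """Mean of the frames within `radius` steps of each position.

    The window is clipped at the sequence boundaries (assumed choice).
    """
    num_steps = len(features)
    num_channels = len(features[0])
    context = []
    for t in range(num_steps):
        lo, hi = max(0, t - radius), min(num_steps, t + radius + 1)
        window = features[lo:hi]
        context.append(
            [sum(f[c] for f in window) / len(window) for c in range(num_channels)]
        )
    return context


def gated_aggregate(features, radius, weights, bias):
    """Hypothetical gated aggregation: out[t] = x[t] + g[t] * ctx[t].

    g[t] = sigmoid(w . concat(x[t], ctx[t]) + b) is a scalar gate per step,
    so `weights` must have twice as many entries as there are channels.
    """
    context = local_context(features, radius)
    output = []
    for x, ctx in zip(features, context):
        gate = sigmoid(sum(w * v for w, v in zip(weights, x + ctx)) + bias)
        output.append([xi + gate * ci for xi, ci in zip(x, ctx)])
    return output
```

With zero weights and zero bias the gate is 0.5 everywhere, so half of the context is added at every step; a strongly negative bias closes the gate and leaves the input features almost unchanged. In a trained model the weights would be learned, so the gate could in principle suppress context around short actions where neighbouring frames are mostly background.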
