Small Sample Time Series Classification Based on Data Augmentation and Semi-supervised Learning

Authors

  • Jing-Jing Liu, College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; Key Laboratory of Agricultural Information Acquisition Technology (Beijing), Ministry of Agriculture, Beijing 100083, China
  • Jie-Peng Yao, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Zhuo Wang, College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; Key Laboratory of Modern Precision Agricultural System Integration (Beijing), Ministry of Education, Beijing 100083, China
  • Zhong-Yi Wang, College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; Key Laboratory of Modern Precision Agriculture System Integration Research, Ministry of Education, Beijing 100083, China; Key Laboratory of Agricultural Information Acquisition Technology, Ministry of Agriculture, Beijing 100083, China
  • Lan Huang, College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; Key Laboratory of Agricultural Information Acquisition Technology (Beijing), Ministry of Agriculture, Beijing 100083, China

DOI:

https://doi.org/10.5755/j01.itc.53.2.35797

Abstract

Real-world scenarios produce both labeled and unlabeled data, but labeling time series data is difficult. A semi-supervised classification model must therefore make effective use of the relationship between labeled and unlabeled data. This paper presents a novel semi-supervised classification method, Data Augmentation-Fast Shapelet Semi-Supervised Classification, which employs a data augmentation module to increase data diversity and improve the generalization ability of the model, and a feature fusion module to enhance the semi-supervised network. A conditional generative adversarial network synthesizes high-quality labeled time series samples to enrich the homogeneous data in the sample space. The fast shapelets method quickly extracts the important shape feature vectors from the time series. Self-supervised and supervised learning are combined to learn from both the unlabeled and the labeled data in the time series dataset. The joint loss function combines the loss functions of the two networks to optimize multiple objectives. Reinforcement learning determines the weight coefficients of the joint loss function, and the reward function is modified to bias toward the supervised loss, which improves the classification performance of the model when labeled data are limited. The proposed method is validated on the UCR benchmark dataset, an electrocardiogram dataset, and an electroencephalogram dataset. The results show that it classifies time series more accurately than the comparison methods. The method is also tested on a plant electrical signal dataset obtained from actual measurements. Visualization analysis shows the role of the model in the semi-supervised classification task, and the experimental results demonstrate the effectiveness and applicability of the proposed method.
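
The joint loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the weighted-sum form, the function name `joint_loss`, and the parameters `w_sup` and `w_self` are assumptions made for illustration. In the paper the weight coefficients are chosen by reinforcement learning; here they are fixed inputs.

```python
def joint_loss(supervised_loss, self_supervised_loss, w_sup, w_self):
    """Combine two loss terms as a weighted sum.

    Assumption: the abstract only says the joint loss "combines the loss
    functions of the two networks" with weight coefficients. A weighted sum
    is one common way to do that, not necessarily the authors' formula.
    """
    if w_sup < 0 or w_self < 0:
        raise ValueError("weights must be non-negative")
    return w_sup * supervised_loss + w_self * self_supervised_loss


# Illustrative numbers only, not results from the paper.
# A larger w_sup weights the supervised term more heavily.
total = joint_loss(
    supervised_loss=2.0,
    self_supervised_loss=4.0,
    w_sup=0.75,
    w_self=0.25,
)
print(total)
```

In this form, raising `w_sup` relative to `w_self` makes the labeled-data term dominate the optimization, which is one plausible reading of biasing toward the supervised loss when labels are scarce.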

Published

2024-06-26

Section

Articles