Time series classification method for distributed system label noise

Affiliation:

1. Business School, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433, China

Corresponding author:

E-mail: fan_chj@163.com

CLC number:

TP391

    Abstract:

    Time series data are widespread on distributed edge devices in industrial, healthcare, and other application domains. Because such data often carry features that humans cannot readily recognize, time series classification on real-world data commonly suffers from "data islands" and labeling errors. To address this difficulty in distributed data environments, a federated temporal filtering framework is proposed. The framework exploits the strength of self-supervised contrastive learning in extracting representations of complex time series data and combines it with federated learning to address the privacy and security concerns of distributed systems while reducing communication cost. First, a set of benchmark samples is maintained on the server, and a time-series-augmentation-based pre-supervised strategy built on a discriminative contrastive loss and a predictive contrastive loss is applied; pre-training followed by fine-tuning yields a pre-supervised model whose time series representations generalize well. Then, a new label noise filtering method is introduced: pseudo-labels guided by the pre-supervised model are used together with the locally annotated labels to filter noisy data on each device, and the resulting clean dataset is used to train the global model. Finally, the framework's effectiveness is validated under various types of label noise, the influence of the proportion of benchmark data on the framework is examined, and ablation experiments verify the filtering effect of each loss in the pre-supervised model.
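The abstract does not state the exact rule by which pseudo-labels and local labels are combined, so the following is only a minimal sketch of one plausible reading: a device keeps a sample as clean when the pseudo-label predicted by the pre-supervised model agrees with the locally annotated label. The function name `filter_by_agreement`, the `min_confidence` threshold, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def filter_by_agreement(
    probs: Sequence[Sequence[float]],
    local_labels: Sequence[int],
    min_confidence: float = 0.0,
) -> Tuple[List[int], List[int]]:
    """Return (kept_indices, pseudo_labels) for one device.

    probs[i] holds the class probabilities that the pre-supervised model
    assigns to sample i; local_labels[i] is the possibly noisy local label.
    A sample is kept only if the pseudo-label matches the local label and
    the model's confidence reaches min_confidence (an assumed extra check).
    """
    if len(probs) != len(local_labels):
        raise ValueError("probs and local_labels must have the same length")

    kept: List[int] = []
    pseudo_labels: List[int] = []
    for i, (p, y) in enumerate(zip(probs, local_labels)):
        # Pseudo-label = index of the most probable class.
        y_hat = max(range(len(p)), key=p.__getitem__)
        pseudo_labels.append(y_hat)
        if y_hat == y and p[y_hat] >= min_confidence:
            kept.append(i)
    return kept, pseudo_labels


# Toy data: the third sample's local label disagrees with the model.
toy_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
toy_labels = [0, 1, 1, 1]
kept, pseudo = filter_by_agreement(toy_probs, toy_labels)
print(kept)    # indices of samples treated as clean
print(pseudo)  # pseudo-labels from the (hypothetical) model outputs
```

In a federated setting, each device would run such a filter locally and contribute only the retained samples to global model training, so raw data never leaves the device. The paper's actual filtering criterion may differ from this agreement rule.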

Cite this article

林子谦, 张坤, 樊重俊, et al. Time series classification method for distributed system label noise [J]. Control and Decision (控制与决策), 2024, 39(12): 4118-4126

History
  • Published online: 2024-11-20
  • Publication date: 2024-12-20