Operators from various industries have been pushing the adoption of wireless sensing nodes for industrial monitoring, and such efforts have produced sizeable condition monitoring datasets that can be used to build diagnosis algorithms capable of warning maintenance engineers of impending failure or identifying current system health conditions. However, an individual operator may not own a fleet of systems or component units large enough to collect the data needed to develop data-driven algorithms. One potential solution to the challenge of limited representative datasets is to merge datasets from multiple operators with the same type of assets. However, directly sharing data across company borders raises privacy concerns. Federated learning (FL) has emerged as a promising solution for leveraging datasets from multiple operators to train a decentralized asset fault diagnosis model while maintaining data confidentiality. However, the performance of traditional FL algorithms degrades when local clients’ datasets are heterogeneous. Such dataset heterogeneity is particularly prevalent in fault diagnosis applications due to the high diversity of operating conditions and system configurations. To address this challenge, this paper proposes a novel clustering-based FL algorithm in which clients are clustered based on the similarity of their datasets. Dataset similarity between clients is estimated without explicitly sharing data by training probabilistic deep learning models and having each client examine the predictive uncertainty of the other clients’ models on its local dataset. Clients are then clustered for FL based on relative prediction accuracy and uncertainty. Experiments on three bearing fault datasets, two publicly available and one newly collected for this work, show that our algorithm significantly outperforms FedAvg and a cosine similarity-based algorithm by 5.1% and 30.7% on average over the three datasets. Furthermore, the probabilistic classification model has the added advantage of quantifying its own predictive uncertainty, and we show that it does so exceptionally well.
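As a rough illustration of the cross-evaluation idea described in the abstract, the sketch below shows how pairwise accuracy and predictive-uncertainty scores could be combined into a client dissimilarity matrix and a hierarchical clustering of clients. This is not the authors' implementation; the toy data generator, the helper names, and the way accuracy and uncertainty are combined are all assumptions made for the demonstration.

```python
# Minimal sketch (assumptions, not the paper's code): cluster FL clients by
# cross-evaluating each client's probabilistic model on the other clients' data.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
n_clients, n_samples, n_classes = 6, 200, 4

# Stand-in for each client's local fault labels.
labels = [rng.integers(0, n_classes, n_samples) for _ in range(n_clients)]

def predictive_probs(model_id, data_id):
    """Placeholder for the class probabilities that client `model_id`'s trained
    probabilistic model would output on client `data_id`'s local data.
    Models from a similar (toy) group are more accurate and more confident."""
    similar = (model_id % 2) == (data_id % 2)
    conf = 0.8 if similar else 0.35
    probs = np.full((n_samples, n_classes), (1 - conf) / (n_classes - 1))
    hit = rng.random(n_samples) < conf
    preds = np.where(hit, labels[data_id], rng.integers(0, n_classes, n_samples))
    probs[np.arange(n_samples), preds] = conf
    return probs

# Pairwise scores: accuracy and mean predictive entropy of model j on client i's data.
acc = np.zeros((n_clients, n_clients))
unc = np.zeros((n_clients, n_clients))
for i in range(n_clients):          # evaluating client (holds the data)
    for j in range(n_clients):      # client whose model is evaluated
        p = predictive_probs(j, i)
        acc[i, j] = np.mean(p.argmax(axis=1) == labels[i])
        unc[i, j] = (-(p * np.log(p)).sum(axis=1)).mean()

def norm(m):
    return (m - m.min()) / (m.max() - m.min() + 1e-9)

# Higher score = model j is accurate and confident on client i's data.
score = norm(acc) + (1.0 - norm(unc))
sym = (score + score.T) / 2.0
dissim = 1.0 - sym / sym.max()

# Hierarchical clustering on the dissimilarity matrix; each resulting group
# would then run standard federated averaging among its own members.
Z = linkage(squareform(dissim, checks=False), method="average")
clusters = fcluster(Z, t=2, criterion="maxclust")
print("cluster assignment per client:", clusters)
```

In this toy setting the even- and odd-indexed clients end up in separate clusters, mirroring the intended behavior: clients whose models transfer well to one another (high accuracy, low uncertainty) are grouped for federated training, while dissimilar clients are kept apart.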