Abstract

In federated learning, standard machine learning (ML) techniques are adapted so they can be applied to data held by separate participants without exchanging that data and while preserving privacy. Other data modelling techniques, such as singular value decomposition, have been similarly federated, enabling federated principal component analysis (PCA), a popular preprocessing step for ML tasks. Supervised PCA improves on standard PCA by using labeled data to retain information more relevant to supervised ML problems. However, a federated version of supervised PCA does not exist in the literature. In this paper, we propose a federated version of supervised PCA and its dual and kernel variations, called FeS-PCA, dual FeS-PCA, and FeSK-PCA, respectively. We keep FeS-PCA and dual FeS-PCA private using random orthogonal matrix masking, while FeSK-PCA is kept private using an approximation of the standard approach. We evaluate our proposed approaches by recreating visualization, classification, and regression experiments from the original unfederated supervised PCA paper, and we further add a real-world federated dataset to test the scalability and fidelity of our approach. Our analysis and results indicate that FeS-PCA and dual FeS-PCA are faithful, lossless, and private versions of their unfederated counterparts. Furthermore, despite being an approximation, FeSK-PCA achieves nearly identical performance to standard kernel supervised PCA in many cases, with the added benefits of reduced runtime and a smaller memory footprint.
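The masking protocol itself is not spelled out in the abstract; as a minimal illustrative sketch (not the paper's actual protocol), the following shows why left-multiplying a data matrix by a random orthogonal matrix can hide individual samples while leaving the statistics that PCA depends on unchanged. All variable names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data matrix: 100 samples, 5 features.
X = rng.normal(size=(100, 5))

# Draw a random orthogonal matrix P via QR decomposition of a Gaussian matrix.
P, _ = np.linalg.qr(rng.normal(size=(100, 100)))

# Mask the data: each masked row is a mixture of all original samples,
# so no individual sample is revealed.
X_masked = P @ X

# Because P.T @ P = I, the feature covariance (and hence the PCA solution)
# is unchanged by the masking.
assert np.allclose(X_masked.T @ X_masked, X.T @ X)

# The singular values are preserved as well.
assert np.allclose(np.linalg.svd(X_masked, compute_uv=False),
                   np.linalg.svd(X, compute_uv=False))
```

This invariance under orthogonal transformations is what makes such masking attractive for federated PCA-style methods: participants can share masked data that is useless for reconstructing individual records yet yields exactly the same decomposition.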