It is well known that the prediction errors of principal component regression (PCR) and partial least-squares regression (PLSR) can be reduced by using both labeled and unlabeled data to stabilize the latent subspaces in the calibration step. An approach based on Kalman filtering has been proposed to make optimal use of unlabeled data with PLSR. This work proposes a sequential version of this optimized PLSR, together with two new PLSR models that use unlabeled data: PCA-based PLSR (PLSR applied to PCA-preprocessed data) and imputation PLSR (an iterative procedure that imputes the missing labels). It is shown analytically, and verified with both simulated and real data, that the sequential version of the optimized PLSR is equivalent to PCA-based PLSR.
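As an illustration of the PCA-based PLSR idea, the following is a minimal sketch and not the authors' implementation. It assumes that the PCA subspace is estimated from the pooled labeled and unlabeled predictor data, that the labeled data are then projected onto that subspace, and that a single-response PLSR (NIPALS PLS1) is fitted on the resulting scores. The function names `pls1_coefficients` and `fit_pca_plsr`, the centering choices, and the parameters `n_pca` and `n_pls` are hypothetical.

```python
import numpy as np


def pls1_coefficients(T, y, n_components):
    """Regression vector of a NIPALS PLS1 fit on centered T (n x k) and y (n,)."""
    T = T.copy()
    y = y.copy()
    k = T.shape[1]
    W = np.zeros((k, n_components))  # weight vectors
    P = np.zeros((k, n_components))  # loading vectors
    q = np.zeros(n_components)       # response loadings
    for a in range(n_components):
        w = T.T @ y
        w /= np.linalg.norm(w)
        t = T @ w                    # latent score for this component
        tt = t @ t
        p = T.T @ t / tt
        q[a] = y @ t / tt
        T -= np.outer(t, p)          # deflate predictors
        y -= q[a] * t                # deflate response
        W[:, a] = w
        P[:, a] = p
    return W @ np.linalg.solve(P.T @ W, q)


def fit_pca_plsr(X_labeled, y_labeled, X_unlabeled, n_pca, n_pls):
    """Assumed workflow: PCA on all X, then PLS1 on the labeled PCA scores."""
    X_all = np.vstack([X_labeled, X_unlabeled])
    mu = X_all.mean(axis=0)          # assumption: center with the pooled mean
    _, _, Vt = np.linalg.svd(X_all - mu, full_matrices=False)
    V = Vt[:n_pca].T                 # PCA loadings from labeled + unlabeled data
    T = (X_labeled - mu) @ V         # PCA scores of the labeled samples
    t_mean = T.mean(axis=0)
    y_mean = y_labeled.mean()
    b = pls1_coefficients(T - t_mean, y_labeled - y_mean, n_pls)
    coef = V @ b                     # map back to the original variables
    intercept = y_mean - (mu @ V + t_mean) @ b
    return coef, intercept


# Synthetic, noise-free example: 8 variables driven by 3 latent factors.
rng = np.random.default_rng(0)
Z = rng.normal(size=(60, 3))
A = rng.normal(size=(3, 8))
X = Z @ A
y = X @ rng.normal(size=8) + 2.0

# 10 labeled samples, 40 unlabeled samples, 10 held-out test samples.
coef, intercept = fit_pca_plsr(X[:10], y[:10], X[10:50], n_pca=3, n_pls=3)
y_pred = X[50:] @ coef + intercept
```

In this sketch the unlabeled samples influence only the PCA loadings `V`; the labels enter solely through the PLS1 step on the labeled scores.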
Florent Gérard Krzakala, Lenka Zdeborová, Hugo Chao Cui
Nicolas Henri Bernard Flammarion, Aditya Vardhan Varre
Alexandre Caboussat, Dimitrios Gourzoulidis