Deep neural networks often fail to generalize when deployed in dynamic and unpredictable real-world environments, where data distributions at test time can differ significantly from those encountered during training. Since simulating all possible distribution shifts during supervised training is computationally impractical, this dissertation investigates how to adapt vision-based models at inference time, without requiring labeled test data or access to the original training set, within the paradigm of source-free test-time learning.
The thesis first explores test-time adaptation of inputs rather than models. For vision tasks such as MRI segmentation with contrast-based domain shifts, we show that targeted image transformations effectively reduce the gap between training and test distributions. Our approach learns image augmentations via unsupervised loss functions so that test inputs mimic the style of the training images, without accessing the original dataset. This gradient-driven method improves performance when input alignment is critical but parameter updates are impractical. However, for real-world deployment with continuously evolving data distributions, dynamic test-time adaptation offers a more robust solution by actively adjusting model parameters to changing conditions.
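To make the mechanism concrete, the following is a minimal sketch of gradient-driven input adaptation, assuming prediction entropy as the unsupervised objective and a simple learnable gain/bias intensity transform as the augmentation family; the transformations and losses studied in the thesis may differ.

```python
import torch
import torch.nn.functional as F

def adapt_input(model, image, steps=10, lr=0.05):
    """Adapt the test image, not the model: learn a simple intensity
    transform (gain/bias) that lowers an unsupervised loss.

    Illustrative sketch only: prediction entropy stands in for the
    unsupervised objective, and gain/bias for the augmentation family.
    """
    model.eval()
    for p in model.parameters():               # keep the source model frozen
        p.requires_grad_(False)

    gain = torch.ones(1, requires_grad=True)   # learnable contrast
    bias = torch.zeros(1, requires_grad=True)  # learnable brightness
    opt = torch.optim.Adam([gain, bias], lr=lr)

    for _ in range(steps):
        logits = model(gain * image + bias)    # e.g. (B, C, H, W) for segmentation
        probs = F.softmax(logits, dim=1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
        opt.zero_grad()
        entropy.backward()                     # gradients flow into the input transform
        opt.step()

    with torch.no_grad():
        return model(gain * image + bias)
```

The key design choice is that all gradients flow into the input transformation while the model stays frozen, so adaptation never risks corrupting the pretrained parameters.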
Extending this idea, the thesis develops a self-distillation framework for domain shifts that simple image transformations alone cannot resolve, such as staining differences in histopathology or lighting variations in outdoor scene segmentation. By strategically crafting adversarial augmentations that exaggerate the domain shift, the model is trained to remain consistent with pseudo-label supervision provided by a corresponding mean-teacher model. This substantially enhances robustness under severe distributional changes, enabling reliable performance in more challenging real-world scenarios.
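A minimal sketch of one such adaptation step follows, assuming an FGSM-style pixel perturbation stands in for the adversarial augmentation and a KL consistency loss against the mean teacher; the actual framework may craft augmentations in a richer space than raw pixels.

```python
import copy
import torch
import torch.nn.functional as F

def self_distill_step(student, x, opt, teacher=None, eps=0.05, ema=0.999):
    """One adaptation step: craft an adversarial augmentation that
    exaggerates the domain shift, then enforce consistency with a
    mean-teacher pseudo-label. Hypothetical sketch; eps and ema are
    illustrative choices."""
    if teacher is None:
        teacher = copy.deepcopy(student)
        for p in teacher.parameters():
            p.requires_grad_(False)

    with torch.no_grad():
        pseudo = F.softmax(teacher(x), dim=1)          # teacher pseudo-labels

    # 1) adversarial augmentation: perturb x to maximize disagreement
    x_adv = x.clone().requires_grad_(True)
    loss = F.kl_div(F.log_softmax(student(x_adv), dim=1), pseudo,
                    reduction='batchmean')
    loss.backward()
    x_adv = (x + eps * x_adv.grad.sign()).detach()

    # 2) consistency: train the student to match the teacher on x_adv
    opt.zero_grad()
    loss = F.kl_div(F.log_softmax(student(x_adv), dim=1), pseudo,
                    reduction='batchmean')
    loss.backward()
    opt.step()

    # 3) mean teacher: exponential moving average of student weights
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(ema).add_(ps, alpha=1 - ema)
    return teacher
```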
Nonetheless, most existing test-time adaptation methods fail in realistic deployment settings where the data distribution evolves gradually and samples exhibit temporal correlations. To address this, the thesis introduces a novel test-time normalization recalibration strategy. By disentangling and dynamically updating skewed batch statistics through an online unmixing mechanism, our method tracks evolving data distributions using a similarity-based clustering of incoming test instances. This yields more stable and reliable adaptation over time, particularly in environments where test samples arrive in temporally correlated streams and conditions shift progressively.
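The sketch below illustrates one plausible form of such recalibration, assuming K candidate statistic sets, a nearest-mean assignment of each instance, and exponential-moving-average updates; the thesis's unmixing mechanism may differ in these details.

```python
import torch

class UnmixingNorm(torch.nn.Module):
    """Maintain K candidate normalization statistics, assign each
    incoming instance to the closest set, and update only that set.
    Hypothetical sketch: k, momentum, and the distance metric are
    illustrative choices."""
    def __init__(self, num_features, k=4, momentum=0.1, eps=1e-5):
        super().__init__()
        self.k, self.momentum, self.eps = k, momentum, eps
        self.register_buffer('means', torch.zeros(k, num_features))
        self.register_buffer('vars', torch.ones(k, num_features))

    def forward(self, x):                       # x: (B, C, H, W)
        inst_mean = x.mean(dim=(2, 3))          # per-instance statistics
        inst_var = x.var(dim=(2, 3), unbiased=False)
        dist = torch.cdist(inst_mean, self.means)    # similarity-based assignment
        assign = dist.argmin(dim=1)                  # (B,)
        out = torch.empty_like(x)
        for kk in assign.unique():
            idx = (assign == kk).nonzero(as_tuple=True)[0]
            m = inst_mean[idx].mean(0)
            v = inst_var[idx].mean(0)
            # EMA update tracks the evolving distribution of cluster kk
            self.means[kk].mul_(1 - self.momentum).add_(m, alpha=self.momentum)
            self.vars[kk].mul_(1 - self.momentum).add_(v, alpha=self.momentum)
            mu = self.means[kk].view(1, -1, 1, 1)
            sd = (self.vars[kk] + self.eps).sqrt().view(1, -1, 1, 1)
            out[idx] = (x[idx] - mu) / sd
        return out
```

Because each cluster's statistics are updated only by the instances assigned to it, a temporally correlated burst of samples from one condition does not skew the statistics used for the others.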
Finally, the thesis presents a plug-in framework for prolonged test-time learning in continuously evolving environments. Unlike conventional single-model approaches that suffer from catastrophic forgetting and inter-domain interference, our framework employs adaptive online clustering based on domain style features to detect and track evolving target domains, enabling domain-specific adaptation.
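A minimal sketch of the online clustering component, assuming channel-wise feature mean and standard deviation as the domain style descriptor and a distance threshold for spawning new clusters (both illustrative choices, not the thesis's exact design):

```python
import torch

class DomainTracker:
    """Online clustering of domain style features: assign each batch to
    the nearest style centroid, or spawn a new cluster when no existing
    centroid is close enough. Hypothetical sketch."""
    def __init__(self, threshold=1.0, momentum=0.05):
        self.centroids = []                  # one style vector per domain
        self.threshold = threshold
        self.momentum = momentum

    @staticmethod
    def style(feat):                         # feat: (B, C, H, W)
        mu = feat.mean(dim=(0, 2, 3))
        sd = feat.std(dim=(0, 2, 3))
        return torch.cat([mu, sd])           # channel-wise style descriptor

    def assign(self, feat):
        s = self.style(feat)
        if not self.centroids:
            self.centroids.append(s)
            return 0                         # first domain discovered
        d = torch.stack([(s - c).norm() for c in self.centroids])
        k = int(d.argmin())
        if d[k] > self.threshold:            # unseen style: new domain
            self.centroids.append(s)
            return len(self.centroids) - 1
        # track drift of the existing domain with an EMA update
        self.centroids[k] = (1 - self.momentum) * self.centroids[k] \
            + self.momentum * s
        return k
```

In the full framework, each discovered cluster would carry its own domain-specific parameters, so adapting to one domain does not interfere with, or overwrite, adaptation to another.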
These contributions significantly advance source-free test-time learning, moving vision models toward robust and reliable deployment under continuously evolving real-world distribution shifts.