Publication

Beyond fine-tuning: LoRA modules boost near-OOD detection and LLM security

Abstract

Under resource constraints, LLMs are usually fine-tuned with additional knowledge using Parameter Efficient Fine-Tuning (PEFT), using Low-Rank Adaptation (LoRA) modules. In fact, LoRA injects a new set of small trainable matrices to adapt an LLM to a new task, while keeping the latter frozen. At deployment, LoRA weights are subsequently merged with the LLM weights to speed up inference. In this work, we show how to exploit the unmerged LoRA’s embedding to boost the performance of Out-Of-Distribution (OOD) detectors, especially in the more challenging near-OOD scenarios. Accordingly, we demonstrate how improving OOD detection also helps in characterizing wrong predictions in downstream tasks, a fundamental aspect to improve the reliability of LLMs. Moreover, we will present a use-case in which the sensitivity of LoRA modules and OOD detection are employed together to alert stakeholders about new model updates. This scenario is particularly important when LLMs are out-sourced. Indeed, test functions should be applied as soon as the model changes the version in order to adapt prompts in the downstream applications. In order to validate our method, we performed tests on Multiple Choice Question Answering datasets, by focusing on the medical domain as a fine-tuning task. Our results motivate the use of LoRA modules even after deployment, since they provide strong features for OOD detection for fine-tuning tasks and can be employed to improve the security of LLMs.

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.