Robert West, Caglar Gulcehre, Giuseppe Russo
A rapidly growing number of applications rely on a small set of closed-source language models (LMs). This dependency might introduce novel security risks if LMs develop self-recognition capabilities. Inspired by human identity verification methods, we prop ...
Association for Computational Linguistics, 2024