Attention-based transformers have achieved tremendous success across a variety of disciplines, including natural language. To deepen our understanding of their sequence-modeling capabilities, there is growing interest in using Markov input processes to ...
In recent years, transformer-based models have revolutionized deep learning, particularly in sequence modeling. To better understand this phenomenon, researchers have shown growing interest in using Markov input processes to study transformers. However, our current un ...
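As a minimal illustration (not taken from either abstract) of the kind of Markov input process such analyses feed to a transformer, here is a sketch that samples a binary first-order Markov chain with assumed switching probabilities `p` and `q`:

```python
import random

def sample_markov_chain(p, q, length, seed=0):
    """Sample a {0,1}-valued first-order Markov chain.

    Transition probabilities (an illustrative choice):
      P(next=1 | current=0) = p
      P(next=0 | current=1) = q
    """
    rng = random.Random(seed)
    state, chain = 0, []
    for _ in range(length):
        chain.append(state)
        # Flip the state with the probability attached to the current state.
        flip = p if state == 0 else q
        if rng.random() < flip:
            state = 1 - state
    return chain

chain = sample_markov_chain(p=0.2, q=0.3, length=10)
```

Sequences drawn this way have controllable memory, which is what makes them a convenient testbed for sequential models.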
Large language models (LLMs) have recently gained much popularity due to their surprising ability to generate human-like English sentences. LLMs are essentially predictors, estimating the probability of a sequence of words given the past. Therefore, it i ...
Inspired by Sibson’s alpha-mutual information, we introduce a new parametric class of universal predictors. This class interpolates between two well-known predictors: the mixture estimator, which includes the Laplace and Krichevsky-Trofimov predictors, and the ...
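For concreteness, the two mixture estimators named in the abstract are classic add-constant sequential predictors; the sketch below (an illustration, not code from the paper) shows them for a binary alphabet, where Laplace corresponds to the constant beta = 1 and Krichevsky-Trofimov (KT) to beta = 1/2:

```python
def add_constant_predictor(sequence, symbol, beta):
    """Probability that `symbol` (0 or 1) follows `sequence`.

    Adds a pseudo-count `beta` to each symbol's empirical count:
      P(symbol | sequence) = (count(symbol) + beta) / (n + 2 * beta)
    """
    count = sequence.count(symbol)
    return (count + beta) / (len(sequence) + 2 * beta)

past = [0, 1, 1]

# Laplace (beta = 1): P(1 | 0,1,1) = (2 + 1) / (3 + 2) = 0.6
laplace = add_constant_predictor(past, 1, beta=1.0)

# KT (beta = 1/2): P(1 | 0,1,1) = (2 + 0.5) / (3 + 1) = 0.625
kt = add_constant_predictor(past, 1, beta=0.5)
```

Both assign nonzero probability to symbols not yet seen, which is what makes them universal over memoryless sources.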
We revise the proof of low-rate upper bounds on the reliability function of discrete memoryless channels for ordinary and list-decoding schemes, in particular Berlekamp and Blinovsky's zero-rate bound, as well as Blahut's bound for low rates. The available ...
We derive an upper bound on the reliability function of mismatched decoding for zero-rate codes. The bound is based on a result by Komlos that shows the existence of a subcode with certain symmetry properties. The bound is shown to coincide with the expurg ...