Microscopy is one of the essential tools for quantifying and advancing our understanding of living systems. Modern digital microscopes enable measurements at unprecedented spatiotemporal scale, so automated processing methods are required for tasks such as object detection, segmentation, and tracking, which in turn enable further biological analysis. While such methods can be carefully engineered for specific problems and datasets, the paradigm of supervised machine learning offers an attractive alternative: functions can be learned from paired input-output examples, e.g. images paired with manually created object masks for segmentation. This paradigm has revolutionized data processing, but it relies on manually annotating considerable amounts of data for each problem to be solved, which is often the main bottleneck in developing and deploying current bioimage processing algorithms. This thesis proposes three complementary approaches for using human annotations efficiently.
The first approach is to develop algorithms that are robust to variations in imaging conditions or biological samples, thereby reducing the amount of manual annotation needed for novel datasets. Such domain gaps are particularly pronounced in volume electron microscopy (EM) images. To this end, I propose a simple and practical pipeline, with a 3D convolutional network at its core, for segmenting targeted subcellular structures. It requires only curated coarse annotations and successfully balances model capacity against generalization to slightly different datasets. The pipeline exhibits promising generalizability across volume EM datasets and employs efficient fine-tuning to improve model performance on previously unseen data. It is used to segment structures of different scales and complexity, for example mitochondria, endoplasmic reticulum, and nuclear pores.
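To make the core building block of such a pipeline concrete, the following is a minimal illustrative sketch (not the thesis implementation) of a single-channel 3D convolution, the operation a 3D convolutional network stacks and learns, written in plain NumPy:

```python
import numpy as np

def conv3d(volume, kernel):
    """Naive valid-mode 3D convolution of a single-channel volume.

    volume: (D, H, W) array; kernel: (kd, kh, kw) array.
    Returns an array of shape (D-kd+1, H-kh+1, W-kw+1).
    """
    kd, kh, kw = kernel.shape
    D, H, W = volume.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # Weighted sum over the local 3D neighborhood
                patch = volume[z:z + kd, y:y + kh, x:x + kw]
                out[z, y, x] = np.sum(patch * kernel)
    return out

# Toy example: a 3x3x3 mean filter applied to a constant volume
vol = np.ones((5, 5, 5))
kern = np.full((3, 3, 3), 1 / 27)
result = conv3d(vol, kern)
print(result.shape)  # (3, 3, 3)
```

In a trained network, many such kernels are learned per layer and interleaved with nonlinearities; deep learning frameworks provide heavily optimized versions of this operation.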
The second approach is to train one unified large deep learning model that efficiently combines existing annotations from different domains to solve a general task, reducing the need for fine-tuning on new datasets. I propose a transformer-based model for cell tracking in live-cell microscopy that successfully approximates the discrete problem of linking cell detections across time frames to extract ancestry lineages. This unified yet flexible approach yields state-of-the-art cell tracking across diverse imaging datasets without the need for parameter tuning, ranging from cell cultures with fluorescently tagged nuclei and bacterial colonies to developing embryos imaged in 3D. Additionally, it enables robust cell tracking even at low frame rates.
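The transformer model itself is beyond the scope of this summary, but the discrete linking problem it approximates can be illustrated with a classical baseline: greedily matching detections between two consecutive frames by centroid distance. The sketch below is purely illustrative (all names and the distance threshold are assumptions, not the thesis method):

```python
import numpy as np

def link_frames(dets_t, dets_t1, max_dist=10.0):
    """Greedily link detections between two frames by centroid distance.

    dets_t: (N, 2) array of (y, x) centroids at time t;
    dets_t1: (M, 2) array at time t+1.
    Returns a list of (i, j) index pairs. In a full tracker, unmatched
    detections would start new tracks or end existing ones.
    """
    # Pairwise Euclidean distances between all detections
    dists = np.linalg.norm(dets_t[:, None, :] - dets_t1[None, :, :], axis=-1)
    links, used_i, used_j = [], set(), set()
    # Accept candidate pairs from closest to farthest
    for i, j in zip(*np.unravel_index(np.argsort(dists, axis=None), dists.shape)):
        if dists[i, j] > max_dist:
            break
        if i not in used_i and j not in used_j:
            links.append((int(i), int(j)))
            used_i.add(i)
            used_j.add(j)
    return links

cells_t = np.array([[0.0, 0.0], [10.0, 10.0]])
cells_t1 = np.array([[9.0, 9.0], [1.0, 1.0]])
print(link_frames(cells_t, cells_t1))  # [(0, 1), (1, 0)]
```

Such hand-crafted linking breaks down under cell divisions, dense colonies, and low frame rates, which is precisely where a learned, unified model offers an advantage.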
The third approach is to develop self-supervised learning algorithms that can extract information from unannotated data, which represents the majority of acquired microscopy datasets. After presenting initial results on contrastive pre-training for instance segmentation, I propose a novel pre-training strategy for live imaging videos that uses time arrow prediction as a pretext task.
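The appeal of such a pretext task is that labels come for free from the data itself. As a minimal sketch of the idea (function and variable names are illustrative, not from the thesis): pairs of consecutive frames are extracted from an unannotated video, randomly presented in forward or reversed order, and a network is trained to predict that order.

```python
import numpy as np

def make_time_arrow_pairs(video, rng):
    """Build a time-arrow-prediction training set from an unannotated video.

    video: (T, H, W) array of frames. Each consecutive frame pair is
    flipped with probability 0.5; the binary label records whether the
    pair is in forward (0) or reversed (1) temporal order.
    """
    pairs, labels = [], []
    for t in range(len(video) - 1):
        flip = rng.random() < 0.5
        a, b = (video[t + 1], video[t]) if flip else (video[t], video[t + 1])
        pairs.append(np.stack([a, b]))
        labels.append(int(flip))
    return np.stack(pairs), np.array(labels)

rng = np.random.default_rng(0)
video = rng.random((8, 16, 16))  # toy 8-frame video
X, y = make_time_arrow_pairs(video, rng)
print(X.shape, y.shape)  # (7, 2, 16, 16) (7,)
```

A network that solves this task must pick up on irreversible dynamic processes such as growth and division, and the learned representations can then be transferred to downstream analysis tasks.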