The goal of habitat suitability mapping is to predict the locations in which a given species could be present. This is typically accomplished by statistical models which use environmental variables to predict species observation data. The relationship between the environmental characteristics of a location and the species that live there is likely to be quite complex, so deep learning models would seem natural to use. In practice, there are biases in the training data which present obstacles to standard deep learning approaches. First, large-scale species observation collections typically consist of presence-only data, which means we only have locations where a species has been observed (not where it has been confirmed to be absent). Second, the class distribution tends to be long-tailed. In this work we examine training techniques to mitigate these challenges: (i) a method for sharing species information between nearby observations and (ii) a curriculum learning strategy to reduce class imbalance early in training. These methods enable us to outperform state-of-the-art results on the GeoLifeCLEF 2020 dataset and suggest fruitful directions for future work.
Christophe Ancey, Mehrdad Kiani Oshtorjani
Francesco Mondada, Alexandre Massoud Alahi, Vaios Papaspyros