The validation of long-term cloud data sets retrieved from satellites is challenging due to their worldwide coverage going back as far as the 1980s. A trustworthy reference cannot easily be found for every location and time. Mountainous regions present a particular problem since ground-based measurements are sparse. Moreover, as retrievals from passive satellite radiometers are difficult in winter due to the presence of snow on the ground, it is particularly important to develop new ways to evaluate and to correct satellite data sets over elevated areas. In winter, for ground levels above 1000 m (a.s.l.) in Switzerland, the cloud occurrence in the newly released cloud property data sets of the ESA Climate Change Initiative Cloud_cci Project (Advanced Very High Resolution Radiometer afternoon series (AVHRR-PM) and Moderate-Resolution Imaging Spectroradiometer (MODIS) Aqua series) is 132 to 217% of that in surface synoptic (SYNOP) observations, corresponding to a rate of false cloud detections between 24 and 54%. Furthermore, the overestimations increase with the altitude of the sites and are associated with particular retrieved cloud properties. In this study, a novel post-processing approach is proposed to reduce the number of false cloud detections in the satellite data sets. A combination of ground-based downwelling longwave and shortwave radiation and temperature measurements is used to provide independent validation of the cloud cover over 41 locations in Switzerland. An agreement of 85% is obtained when the cloud cover is compared to surface synoptic observations (90% within +/- 1 okta difference). The validation data are then co-located with the satellite observations, and a decision tree model is trained to automatically detect the overestimations in the satellite cloud masks.
Cross-validated results show that 62 +/- 13% of these overestimations can be identified by the model, reducing the systematic error in the satellite data sets from 14.4 +/- 15.5% to 4.3 +/- 2.8%. The number of errors is lower and, importantly, their distribution is more homogeneous as well. These corrections come at the cost of a global increase of 7 +/- 2% in missed clouds. Using this model, it is possible to significantly improve the cloud detection reliability in elevated areas in the Cloud_cci AVHRR-PM and MODIS-Aqua products.
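
The link between the reported occurrence ratios (132 to 217% of SYNOP) and the quoted false detection rates (24 to 54%) can be checked with a short calculation. The sketch below rests on a simplifying assumption that is not stated in the abstract: every cloud seen by SYNOP is also detected by the satellite, so all excess satellite detections are false. The function name `false_detection_rate` is hypothetical and used for illustration only.

```python
def false_detection_rate(occurrence_ratio: float) -> float:
    """Share of satellite cloud detections that are false.

    Assumption (ours, for illustration): the satellite detects every
    cloud that SYNOP reports, so the excess detections are all false.
    With satellite occurrence = ratio * SYNOP occurrence, the false
    share is (ratio - 1) / ratio = 1 - 1 / ratio.
    """
    if occurrence_ratio < 1.0:
        raise ValueError("a ratio below 1 means there is no net overestimation")
    return 1.0 - 1.0 / occurrence_ratio


# Occurrence ratios quoted in the abstract, expressed as fractions.
for ratio in (1.32, 2.17):
    rate = false_detection_rate(ratio)
    print(f"{ratio:.0%} of SYNOP -> {rate:.0%} false detections")
```

Under that assumption the two ratios give roughly 24% and 54%, which is consistent with the range reported above. The actual study may derive these rates differently, for example from co-located observation pairs.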