Cloud removal from satellite images: why Nimbo does it better

Cloud cover is a major obstacle to Earth observation. Nimbo’s exclusive cloud mask goes beyond the state of the art to provide the clearest satellite views out there. Here’s how.

In the urgent fight for climate and the environment, we have only just begun to tap into satellite imagery’s incredible wealth of information to analyse, learn and act.

Yet working with satellite images is no small feat. It requires expertise both in remote sensing, to interpret the data, and in artificial intelligence, to scale up the analysis. Plus, all over our good old planet, there are clouds. Lots of them, so much so that, more often than not, images are incomplete or simply of no use. The way through them lies in two words: cloud masking, a complex and subtle AI process which we have pushed beyond state-of-the-art performance to produce Nimbo’s satellite basemaps.

Unsatisfactory cloud masking methods

While solutions to remove clouds from the sky do not exist yet (fortunately), there are ways to solve this issue in satellite imagery. Some have been around for quite some time. Our aim was to use one of them to develop a new methodology for pre-processing optical satellite images and automatically generating large-scale satellite views, also known as mosaics, on a monthly basis. However, after researching and testing several machine-learning methods for generating cloud-free images, we concluded that the state of the art did not provide a satisfactory solution for our purpose.

Indeed, the various so-called “Best Pixel”-type methods delivered correct results on zoomed-out views, but proved inconsistent when zooming in. These poor results meant that we were unable to carry out temporal tracking. In addition, these methods are limited by the availability of images: if an area is under cloud cover for an entire month, there is no way to produce a map of it.
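
To make that last limitation concrete, here is a minimal Python sketch of a “Best Pixel”-style composite. It is an illustration only, not the method of any particular product: the scoring rule (keep the observation with the lowest cloud score), the `max_cloud` cut-off, the function name `best_pixel_composite` and the toy data are all assumptions.

```python
from typing import List, Optional, Tuple

# One observation of a pixel: (reflectance value, cloud score in [0, 1]).
Observation = Tuple[float, float]


def best_pixel_composite(
    stack: List[List[Observation]], max_cloud: float = 0.5
) -> List[Optional[float]]:
    """For each pixel, keep the value from the date with the lowest cloud score.

    `stack[d][p]` is the observation of pixel `p` on date `d`.
    A pixel whose best score is still above `max_cloud` gets None (no data).
    """
    n_pixels = len(stack[0])
    composite: List[Optional[float]] = []
    for p in range(n_pixels):
        # Assumed scoring rule: the "best" observation is the least cloudy one.
        value, score = min((date[p] for date in stack), key=lambda obs: obs[1])
        composite.append(value if score <= max_cloud else None)
    return composite


# Toy month: three acquisition dates, three pixels (made-up numbers).
month = [
    [(0.21, 0.9), (0.30, 0.1), (0.80, 0.95)],
    [(0.20, 0.2), (0.55, 0.8), (0.82, 0.90)],
    [(0.19, 0.4), (0.31, 0.3), (0.79, 0.99)],
]
result = best_pixel_composite(month)
print(result)  # [0.2, 0.3, None]
```

In this toy run, the first two pixels are taken from different dates, which is one plausible source of the inconsistencies seen when zooming in, and the third pixel stays empty because it is cloudy on every date of the month.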

France’s National Center for Space Studies (CNES) also offers a monthly mosaic service. The technologies chosen and the development work carried out enabled us to produce maps based on satellite data, but those did not fit our purpose. Among other things, the cloud detection method was too slow, since it required time series of images over a period of more than a month. In addition, scaling up to continent level was near impossible as things stood.

We therefore undertook our own R&D work in order to design a solution that would 1) detect all clouds and 2) merge the cloud-free images to produce a map conveying the reality of observations. The ultimate aim is to produce cloud-free satellite maps of the world’s entire landmass, every month, with a quality never before achieved.

Using a neural network to detect clouds

To achieve better results, we chose to develop a new methodology from scratch, using a deep-learning solution. This meant setting up a neural network based on a U-Net architecture to classify satellite images and correctly detect clouds.

The first step was to build a machine learning database, including a part dedicated to item validation. Samples were verified manually to ensure that those retained were of the highest possible quality for optimal model training.

We then set about building our U-Net model, coupled with a testing phase, with the following output classes:

The eight individual classes: cloud, shadow, land, water, snow, built-up, cirrus and “no data”.
The same classes merged by type: clouds together with their shadows and cirrus; water, land and built-up; and finally snow.
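
The relationship between the two sets of output classes can be sketched as a simple lookup in Python. This is only an illustration of the grouping described above: the group names (`cloud_or_shadow`, `surface`, `no_data`) and the choice to keep “no data” as its own group are assumptions, not Nimbo’s actual labels.

```python
from typing import Dict, List

# The eight output classes named in the text.
CLASSES = ["cloud", "shadow", "land", "water", "snow", "built-up", "cirrus", "no data"]

# Merged groups following the description in the text. Group names are
# hypothetical, and keeping "no data" separate is an assumption.
MERGED: Dict[str, str] = {
    "cloud": "cloud_or_shadow",
    "shadow": "cloud_or_shadow",
    "cirrus": "cloud_or_shadow",
    "water": "surface",
    "land": "surface",
    "built-up": "surface",
    "snow": "snow",
    "no data": "no_data",
}


def merge_labels(label_map: List[List[str]]) -> List[List[str]]:
    """Replace each fine-grained class label with its merged group."""
    return [[MERGED[label] for label in row] for row in label_map]


# Toy 2x3 tile of per-pixel labels (made-up data).
tile = [
    ["cloud", "cirrus", "land"],
    ["shadow", "water", "snow"],
]
merged_tile = merge_labels(tile)
for row in merged_tile:
    print(row)
```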

Classifying other surface types was also necessary, as some of these elements may be incorrectly labeled as clouds or shadows. This is obviously the case with snow, but also with urban areas, which, depending on the sun angle, can mislead the AI model into classifying them as clouds, and with water, which is often misinterpreted as shadow.

A probabilistic approach to cloud removal

An annoying issue with documented cloud masking processes is that they are binary: they assume that either there is a cloud or there is not. Reality is far more complex, as illustrated by the case of semi-transparent clouds. Depending on the situation and on the purpose of the analysis, a Geographic Information System (GIS) user might want them to stay in the picture. That is why we have chosen to base Nimbo’s cloud removal method on a probabilistic approach.

Our plan was to use the probability values for clouds, shadows or snow directly, rather than binary labels. This allowed us to add nuance and take into account areas detected as lightly cloudy, giving access to more information on what is happening on the ground.
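
The difference between a binary label and a probability can be illustrated for a single pixel. The sketch below assumes that per-class scores are turned into probabilities with a softmax, which is a common choice for classification networks but is not stated in this article; the scores themselves are made up.

```python
import math
from typing import Dict


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - peak) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


# Hypothetical raw scores for one pixel under a thin veil of cloud.
pixel_scores = {"cloud": 1.2, "shadow": -0.5, "snow": -2.0, "land": 1.5}
probs = softmax(pixel_scores)

# A hard decision keeps only the most likely class...
hard_label = max(probs, key=probs.get)
print(hard_label)  # land

# ...whereas the probability keeps the information that the pixel is hazy.
print(round(probs["cloud"], 2))
```

Here the hard label is “land”, yet the cloud probability remains a little under 0.4: exactly the kind of lightly cloudy pixel that a binary mask would silently treat as clear.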

Thanks to this approach, we are able to supply cloud masks enhanced with thresholds. Each ground pixel of the monthly cloud-free synthesis comes with a cloud probability, and GIS users can adjust the threshold according to the level of opacity they are ready to live with for a given task (wisps of cloud that you can still see through, for instance).
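
On the user side, adjusting such a threshold might look like the following Python sketch. The function name `apply_cloud_threshold`, the nested-list data layout and the sample values are assumptions made for illustration; they do not describe Nimbo’s actual file format or API.

```python
from typing import List, Optional


def apply_cloud_threshold(
    values: List[List[float]],
    cloud_prob: List[List[float]],
    threshold: float,
) -> List[List[Optional[float]]]:
    """Keep a pixel when its cloud probability is at or below `threshold`.

    Masked pixels are returned as None.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [
        [v if p <= threshold else None for v, p in zip(v_row, p_row)]
        for v_row, p_row in zip(values, cloud_prob)
    ]


# Toy 2x2 tile: pixel values and their cloud probabilities (made-up data).
values = [[0.20, 0.35], [0.50, 0.90]]
cloud_prob = [[0.05, 0.30], [0.60, 0.95]]

# A strict user masks anything that might be cloudy...
strict = apply_cloud_threshold(values, cloud_prob, threshold=0.1)
print(strict)    # [[0.2, None], [None, None]]

# ...while a tolerant user keeps thin, see-through clouds.
tolerant = apply_cloud_threshold(values, cloud_prob, threshold=0.7)
print(tolerant)  # [[0.2, 0.35], [0.5, None]]
```

The same probability layer therefore serves both users: only the threshold changes, not the underlying data.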

A satellite view of a forest area in Peru, before and after Nimbo’s cloud removal (© Kermap)

Reliable, scalable satellite basemaps

This work, which we have been conducting for two years, has helped us build an ultra-efficient U-Net-type model to detect and remove clouds and shadows. And we are constantly improving it, as we visually identify elements whose classification proves problematic: salt lakes or muddy waters incorrectly labeled as cloud and snow, for instance. When this happens, we extract specific samples from the model’s output and feed them into our training database, the former having outperformed the latter.

Thanks to our new method, we managed to obtain images that are homogeneous with one another while retaining a natural look and bringing relevant information for land monitoring. When comparing Nimbo’s results to those of other basemap producers, we are quite satisfied with the finished product.

But for Nimbo to be time-efficient, and thus cost-efficient, the challenge was also to optimize our model so that it could be both quick to deliver and able to achieve worldwide coverage. That we have done too, with processing time significantly outperforming other cloud masking models such as MAJA or Fmask. As a result, we now have cloud masks ready and available for every Sentinel-2 image acquired over Europe since 2021. And more can be quickly produced upon request over any other area, for any month since 2018. Take your pick!