Providing Access to Energy with UNDP

Context

Countries are raising their ambitions to meet the Paris Agreement greenhouse gas emission reduction targets through their Nationally Determined Contributions (NDCs) while ensuring national development targets are met.

Take access to electricity as an example. An estimated 600 million people in Africa lack access to energy, a substantial obstacle for the households and businesses that need electricity for their daily activities. The shortfall also holds back the socio-economic development of entire regions and lowers the quality of life of millions of people. Expanding access must go hand in hand with renewable energy sources and must reach the vulnerable groups farthest from the grid. Solar mini-grids offer a cost-effective and sustainable way to improve energy access, especially in remote areas with abundant sunlight.

One of the missing links is good data for planning, estimating, and monitoring climate change mitigation actions, which can be challenging to obtain due to a lack of institutional or technical capacity. Countries do not always have georeferenced information on where existing solar mini-grid sites are, and even less so on rooftop solar home systems, which adds another challenge to the planning process. The location of solar assets is one of the many pieces of information that may be needed to implement the NDC. Satellite imagery and machine learning (ML), a form of artificial intelligence (AI), can provide insights that improve planning and monitoring and accelerate climate action.

United Nations Development Programme (UNDP) and Datatonic partnered in 2022-2023 to develop a Proof of Concept of how data on potential solar cells created with satellite imagery and ML can be useful in the regions most in need.

UNDP, through its flagship project, Climate Promise, has been supporting the enhancement and implementation of NDCs in over 120 countries. UNDP provides direct support to climate change mitigation and adaptation in over 110 countries, including support to catalyze mini-grids in over 20 countries on the African continent through the Africa Minigrid Programme. 

Datatonic, a leading company specialising in Machine Learning innovation, has partnered with the UNDP to pursue the objectives mentioned above. Datatonic has undertaken a project focused on the remote identification and cataloguing of solar panels across the African continent using satellite imagery. 

What is a mini-grid? A mini-grid, also known as a micro-grid or isolated grid, represents an off-grid system that involves small-scale electricity generation ranging from ~10 kW to 10 MW, serving a limited number of consumers through a distribution grid capable of functioning independently from national electricity transmission networks. Essentially, a mini-grid comprises interconnected small-scale electricity generators linked to a distribution network that supplies power to a localised group of households, operating autonomously from the national transmission grid. 

While inventories of large solar farms worldwide have been successfully compiled through similar efforts, and studies have showcased the feasibility of implementing small solar grids within European cities, comparable work has yet to be undertaken on the African continent. The European models were not directly usable for two main reasons: they were built on high-resolution imagery from plane flyovers, and the context of the images was significantly different. To address these challenges, we aimed to adapt these models to the African context and use transfer learning so that models initially trained on high-resolution aerial imagery would work with lower-resolution satellite imagery, which is the most readily available and scalable option for Africa.

This project, therefore, aimed to contribute to the advancement of geospatial AI capabilities by: 

  1. Developing a Proof of Concept algorithm to detect solar energy assets using satellite imagery in geographic areas with less data and different characteristics from high-income countries
  2. Contributing open-source code and process learnings in an open space for replication and reuse (GitHub)

Moreover, the project serves to inform future efforts to support countries in using fast-advancing geospatial AI technology for their NDC implementation, building a digital ecosystem within the countries as opposed to relying on off-the-shelf products. This blog compiles the process taken and the lessons learned.

Existing Resources

A report on the state of the global mini-grid market [1], published in June 2019 by the Energy Sector Management Assistance Program (ESMAP), compiled a dataset of 19,163 installed and 7,507 planned mini-grids from 128 countries. It found that the majority of mini-grids in the dataset were between 10 and 100 kW in size, with fewer than 10% of installed mini-grids being smaller than 10 kW. Further, all of the planned mini-grids were larger than 10 kW.

This report appears to be the most up-to-date collection of mini-grid data available at this time. Local associations may nevertheless be able to provide more recent datasets. In Africa, the following associations may have this information: Africa Mini-grid Developers Association (AMDA) [2], African Mini-Grids [3], and Rural Elec [4]. OpenStreetMap [5], a non-commercial crowd-sourced mapping effort, also includes solar installations.

Existing ML Solutions

Researchers from the University of Oxford, Descartes Labs and the World Resources Institute used machine learning to detect contiguous solar installations in excess of 10 kW using SPOT and Sentinel satellite imagery [6]. The authors developed a machine learning pipeline to identify installation locations and determine installation dates using two globally available satellite imagery sources: high-resolution (1.5-meter pixel resolution) 4-band SPOT-6/7 imagery and medium-resolution (10-meter pixel resolution) 12-band Sentinel-2 imagery. The pipeline leverages the strengths of both imagery types. Sentinel-2 offers extensive spectral coverage, which effectively detects the spectral signatures of PV panels, and a frequent revisit rate, which allows for accurate installation date measurements, while the high-resolution SPOT imagery provides precise measurements of the installation footprint. The authors’ model locates a total of 68,661 facilities for solar power generation globally.

We note that the majority of solar installations in the training set are located in Europe and the United States due to the availability of OpenStreetMap labels. Also, the model focuses only on installations in excess of 10 kW due to their ability to be identified by satellite imagery (and by Sentinel spectral bands specifically). The model understandably performs best on large installations (greater than 10,000 m²), achieving a precision and recall of 98.6% and 90%, respectively. The authors released their source code [7] and coordinates of detected installations [8] but have not yet made their trained model publicly available.

Researchers from Stanford University, LMU München, and the University of Freiburg used aerial imagery of 10 cm/pixel resolution and 3D building data to develop a system that performs better in detecting smaller solar installations [9], which is particularly relevant to our objective of detecting mini-grids. They use deep neural networks for image classification and segmentation, along with 3D spatial data processing techniques. This approach goes beyond current methods by also determining the azimuth and tilt angles of PV systems. They achieve a classification performance of 0.98 ROC-AUC. All related datasets and pre-trained models are available online [10].

Scope

The process of identifying solar cells across an entire continent requires extensive resources. Thus, the first step in this endeavour is to evaluate the feasibility of such a project and learn as much as possible on a small scale. To accomplish this, an initial Proof-of-Concept (PoC) solution was developed, involving the creation of a machine learning classification model capable of identifying solar installations, including both mini-grids and solar farms, by analysing satellite imagery. This preliminary phase serves to assess the viability of the project while also aiding in the identification of potential challenges or areas that require further analysis.

Approach

To initiate the process, the first crucial step was to delineate an Area of Interest (AoI) that satisfied specific requirements essential for evaluating the project’s feasibility. These requirements were as follows:

 

  1. The smallest surface area with a potentially high density of solar panels: This criterion ensured that the dataset would include a substantial presence of solar panels, enabling the Machine Learning (ML) model to accurately identify them using a supervised learning approach.
  2. Surface area with characteristics similar to African cities: Considering the ultimate goal of applying the ML model to African cities, it was vital to select an area that shared similar characteristics to ensure the model’s relevance and effectiveness.
  3. Availability of high-quality imagery: Imagery of superior quality, free from factors such as cloud cover, cloud shadows, reflections, and other visual distortions for the selected area had to be obtainable.

 

After conducting an assessment of regions with a notable density of solar panels, an AoI encompassing a combination of imagery from locations in the South of Spain and South Africa was chosen (see figure below).

Figure 1: Top countries in installed concentrated solar power (CSP) in 2021.

 

After the AoI had been agreed, UNDP obtained access to a substantial collection of over 40,000 satellite images (50GB in size) through its G-EGD partnership with the US State Department. These images were captured by the WorldView-3 satellite and encompassed the surface areas previously mentioned.

 

Figure 2: Highlighted areas belonging to the newly redefined AoI.

Image pre-processing

Before utilising the available imagery for training a classification model to identify solar installations, the images underwent a preprocessing phase. This process involved the application of several techniques, namely splitting, filtering, and pan-sharpening. 

Figure 3: Pre-processing steps.

 

To facilitate input into the ML model, the images needed to be divided into tiles of an appropriate size. As illustrated in the diagram below, very large tiles would make solar panels appear too small for the model to identify, whereas tiles that are too small may not contain an entire solar installation, which also makes detection more difficult. A final tile size of 80m by 80m was chosen, which also allowed for easy human navigation on a standard 24-inch screen during labelling.

Figure 4: different tile sizes (left to right, 30m x 30m, 50m x 50m, 70m x 70m).
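The tiling step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `tile_windows` is hypothetical, the 0.5 m/pixel ground sampling distance is the effective resolution reported later in this post, and dropping partial tiles at the image edges is an assumption made for simplicity.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (col_offset, row_offset, width, height) in pixels


def tile_windows(width_px: int, height_px: int,
                 tile_m: float = 80.0, gsd_m: float = 0.5) -> Iterator[Window]:
    """Yield non-overlapping square pixel windows that cover an image.

    tile_m is the tile edge length on the ground, gsd_m the ground
    sampling distance (metres per pixel). Partial tiles at the right
    and bottom edges are dropped (an assumption of this sketch).
    """
    tile_px = round(tile_m / gsd_m)  # 80 m at 0.5 m/pixel -> 160 pixels
    for row in range(0, height_px - tile_px + 1, tile_px):
        for col in range(0, width_px - tile_px + 1, tile_px):
            yield col, row, tile_px, tile_px


# A hypothetical 1000 x 500 pixel scene yields 6 columns x 3 rows of tiles.
windows = list(tile_windows(width_px=1000, height_px=500))
print(len(windows), windows[0], windows[-1])
```

Each window can then be passed to whatever raster library is used to read the corresponding pixels from the scene.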

 

The next stage of the preprocessing pipeline was a filtering step that reduced the original dataset to a subset enriched with images of the positive class. We queried the OpenStreetMap API for building footprints and retained only those tiles that intersected with at least one building footprint. This reduced the dataset significantly, to a number of images we could hope to label within the available time budget.

Figure 5: OpenStreetMap footprints used to filter the AoI.
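The footprint filter can be illustrated with a small sketch. It assumes the building footprints have already been downloaded and reduced to axis-aligned bounding boxes in the same coordinate system as the tiles; the real footprints are polygons, and the query itself is not shown. All names here are hypothetical.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def boxes_intersect(a: BBox, b: BBox) -> bool:
    """Return True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def keep_tiles_with_buildings(tiles: List[BBox], footprints: List[BBox]) -> List[BBox]:
    """Keep only the tiles that intersect at least one building footprint."""
    return [t for t in tiles if any(boxes_intersect(t, f) for f in footprints)]


# Three adjacent 80 m tiles and one building that straddles the first two.
tiles = [(0, 0, 80, 80), (80, 0, 160, 80), (160, 0, 240, 80)]
footprints = [(70, 10, 95, 30)]
kept = keep_tiles_with_buildings(tiles, footprints)
print(kept)
```

For a full Area of Interest, a spatial index would avoid comparing every tile with every footprint; the brute-force loop is kept here for readability.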

 

Subsequently, the images underwent a sharpening process. The original images comprised a high-resolution panchromatic band (left image of Figure 6) and lower-resolution multispectral bands (middle image of Figure 6). Pansharpening, also known as panchromatic sharpening, was used to fuse the two into a high-resolution colour image. The fusion used a deep learning approach, which outperforms traditional interpolation methods. To achieve this, a pre-trained model specifically designed for WorldView-3 imagery was leveraged [11]. Because the model was pre-trained, only a few training steps were needed to minimise spectral and spatial loss at the tile level.

Figure 6: from left to right, panchromatic band, multispectral bands, and pansharpened image.
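To make the idea of pansharpening concrete, the sketch below implements a classical Brovey-style ratio method with NumPy. This is a simple baseline shown for illustration only, not the deep learning model used in the project. The scale factor of 4 between the multispectral and panchromatic grids and the nearest-neighbour upsampling are assumptions of this sketch.

```python
import numpy as np


def upsample_nearest(ms: np.ndarray, scale: int) -> np.ndarray:
    """Repeat each multispectral pixel `scale` times along both spatial axes."""
    return ms.repeat(scale, axis=1).repeat(scale, axis=2)


def brovey_pansharpen(ms: np.ndarray, pan: np.ndarray,
                      scale: int = 4, eps: float = 1e-6) -> np.ndarray:
    """Fuse low-resolution bands (bands, h, w) with a panchromatic band (H, W).

    Each upsampled band is rescaled so that the mean across bands matches
    the panchromatic intensity at every pixel.
    """
    ms_up = upsample_nearest(ms.astype(np.float64), scale)
    if ms_up.shape[1:] != pan.shape:
        raise ValueError("panchromatic band must be `scale` times larger than ms")
    intensity = ms_up.mean(axis=0)
    return ms_up * (pan / (intensity + eps))


# Synthetic data: a 3-band 2 x 2 multispectral patch and an 8 x 8 panchromatic patch.
rng = np.random.default_rng(0)
ms = rng.uniform(0.1, 1.0, size=(3, 2, 2))
pan = rng.uniform(0.1, 1.0, size=(8, 8))
sharp = brovey_pansharpen(ms, pan)
print(sharp.shape)
```

The output keeps the colour ratios of the multispectral bands while taking its spatial detail from the panchromatic band, which is the same goal the learned model pursues with less spectral distortion.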

 

Building a labelling platform

In addition to acquiring high-quality imagery, another crucial requirement for the project was obtaining a sufficient number of labelled images for training and evaluating the model.

To address this need, Datatonic implemented a labelling platform utilising two open-source tools: CVAT and FiftyOne. The platform was deployed on Google Cloud and properly configured to grant UNDP volunteers access to the imagery.

 

Imagery labelling

With the aforementioned labelling platform in place, UNDP volunteers were able to utilise a set of labelling tools to identify solar installations within their assigned images.

Over two weeks, a group of 9 United Nations volunteers successfully labelled the entire dataset, identifying the presence of solar installations by outlining individual polygons.

Model development

Despite performing the previous pre-processing steps, we encountered a significant data quality challenge: the images we had received had been captured at an off-nadir angle, resulting in lower resolution than anticipated. Instead of the expected 30 cm/pixel resolution, the effective resolution in our final dataset was closer to 50 cm/pixel. This lower resolution posed a difficulty in identifying small solar cells visually. Thus, our PoC aimed to determine if an ML model could successfully detect solar panels even when they were not easily discernible by human observers.

In this phase, we used transfer learning to adapt the model in [10], which had previously been trained to identify photovoltaic (PV) installations on rooftops in Germany. The model was fine-tuned using our own imagery. This approach was selected because leveraging pre-trained models can be highly advantageous when working with data that may not possess the required quality. As the original model had been trained on imagery with a higher resolution of 10 cm/pixel, we retrained the model using downscaled training imagery similar in resolution to the imagery available to us. Further details regarding the specifications of the model can be found below.

 

  • Architecture: InceptionV3
  • Framework: PyTorch
  • Image resolution: 50 cm/pixel
  • Image size: 160 x 160 px (80 x 80 m)
  • Dataset size: ~40,000 images
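The downscaling of the 10 cm/pixel training imagery to roughly 50 cm/pixel can be sketched with NumPy. The post does not state which resampling method was used, so the block averaging below is an assumption, and the function name is hypothetical.

```python
import numpy as np


def downscale_block_mean(image: np.ndarray, factor: int = 5) -> np.ndarray:
    """Downscale an (H, W) or (H, W, C) image by averaging factor x factor blocks.

    A factor of 5 turns 10 cm/pixel imagery into 50 cm/pixel imagery.
    Rows and columns that do not fill a complete block are cropped.
    """
    h, w = image.shape[:2]
    h_c, w_c = h - h % factor, w - w % factor
    cropped = image[:h_c, :w_c]
    blocks = cropped.reshape(
        (h_c // factor, factor, w_c // factor, factor) + image.shape[2:]
    )
    return blocks.mean(axis=(1, 3))


# An 80 m tile at 10 cm/pixel is 800 x 800 px and becomes 160 x 160 px.
tile_10cm = np.zeros((800, 800, 3))
tile_50cm = downscale_block_mean(tile_10cm)
print(tile_50cm.shape)

# Small check on a 10 x 10 ramp image with values 0..99.
small = downscale_block_mean(np.arange(100, dtype=float).reshape(10, 10))
print(small)
```

The resulting 160 x 160 pixel tiles match the image size listed in the model specifications above.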

 

On the original test dataset from Germany, the open-source model performed very strongly, scoring 0.98 on the area under the receiver operating characteristic curve (ROC-AUC), a metric that ranges from 0 to 1, where 1 denotes a perfect classifier that correctly identifies every single image with solar cells present. After we tuned the model with the downscaled training dataset from Germany to prepare it for use on our lower-resolution imagery, the performance dropped to 0.66, which shows that coarsening the resolution from 10 cm to 50 cm makes the prediction task significantly harder.
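For readers unfamiliar with the metric, the sketch below computes the area under the ROC curve from its ranking interpretation: the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one. The labels and scores are made-up illustration data, not project results, and the pairwise loop is only practical for small datasets.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores: one solar tile is ranked below one non-solar tile.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
print(roc_auc(labels, scores))  # 0.75
```

A model that gives every image the same score obtains 0.5 under this definition, which is why 0.5 is described as equivalent to choosing at random.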

Next, we applied the model fine-tuned for lower-resolution imagery and tested it on our test dataset from Spain and South Africa. The performance of 0.5 AUC-ROC was very poor, equivalent to choosing at random. This showed us that the aerial imagery from Germany was too different from the satellite imagery we were planning to use for the model to rely on the same pattern recognition. Fine-tuning with our own labels and satellite imagery collected for Spain and South Africa was required. With fine-tuning, we managed to match the performance of the model on the downscaled German dataset of 0.66 AUC-ROC. This shows that we can substitute satellite imagery for aerial imagery at comparable resolution without a significant drop in performance.

Model performance

The performance of the fine-tuned model is lower than what we require to build a reliable inventory of mini-grids. With this model, we would be able to locate about 70% of all solar cells (recall), but with low precision: only every fifth positive prediction is a true positive. Our results therefore show that (1) image resolution is the key factor in determining performance, and we require a resolution finer than 50 cm/pixel for the task at hand, and (2) transfer learning with models trained on higher-resolution imagery is a viable approach for this problem when the training set available is insufficient to train good models from scratch.
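To see what these figures mean in practice, the following sketch converts a recall of about 0.7 and a precision of about 0.2 (one true positive in every five positive predictions) into counts. The 100 solar tiles are a hypothetical round number chosen for illustration.

```python
from typing import Tuple


def counts_from_rates(n_positive: int, recall: float,
                      precision: float) -> Tuple[float, float, float]:
    """Return (true positives, false positives, false negatives).

    recall    = TP / (TP + FN)
    precision = TP / (TP + FP)
    """
    tp = recall * n_positive
    predicted_positive = tp / precision
    fp = predicted_positive - tp
    fn = n_positive - tp
    return tp, fp, fn


# Hypothetical area containing 100 tiles with solar installations.
tp, fp, fn = counts_from_rates(n_positive=100, recall=0.7, precision=0.2)
print(round(tp), round(fp), round(fn))  # 70 280 30
```

Under these assumptions, a reviewer would have to inspect 350 flagged tiles to confirm 70 real installations, while 30 installations would be missed entirely.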

Figure 7 illustrates a few prediction examples from the best-performing model we trained, with their associated prediction scores. Figure 8 shows our estimates of how model precision can be expected to scale with imagery resolution. At a recall of 0.9 (the share of solar cells detected), we expect an achievable precision (the share of positive predictions that are correct) in the range of 0.4-0.6 when using WorldView-3 satellite imagery that has undergone Maxar's proprietary super-resolution processing to increase the resolution to 15 cm.

Figure 7: examples of True Positives, False Positives, False Negatives and True Negatives, with their corresponding prediction score.

 

Figure 8: Estimation of the relationship between model precision and imagery resolution (recall of ~90-95%)

Next steps

Our conclusion from the PoC is that it is possible to produce an inventory of sufficient quality for solar mini-grids in Africa using satellite imagery, but only with the highest resolution imagery available on the market. As the next steps in this project, we aim to work with UNDP to acquire higher resolution images, label additional images to improve our training set, train a new model and build up a processing pipeline to create the mini-grid inventory at scale.

References

[1] https://openknowledge.worldbank.org/handle/10986/31926

[2] https://africamda.org/publications/

[3] https://www.africanminigrids.com/

[4] https://www.ruralelec.org/

[5] https://www.openstreetmap.org/

[6] Kruitwagen, L., Story, K.T., Friedrich, J. et al. A global inventory of photovoltaic solar energy generating units. Nature 598, 604–610 (2021). https://doi.org/10.1038/s41586-021-03957-7

[7] https://github.com/Lkruitwagen/solar-pv-global-inventory

[8] https://zenodo.org/record/5005868#.YeWU4_7P0Q8 

[9] Mayer, K., Rausch, B., Arlt, M.L., Gust, G., Wang, Z., Neumann, D., Rajagopal, R. 3D-PV-Locator: Large-scale detection of rooftop-mounted photovoltaic systems in 3D. Applied Energy 310, 118469 (2022). https://doi.org/10.1016/j.apenergy.2021.118469

[10] https://github.com/kdmayer/3D-PV-Locator

[11] https://github.com/matciotola/Z-PNN
