- Data for Good at Meta is open-sourcing the data used to train our AI-powered population maps.
- We’re hoping that researchers and other organizations around the world will be able to leverage these tools to assist with a wide range of projects including those on climate adaptation, public health and disaster response.
- The dataset and code are available now on GitHub.
To support the ongoing work of researchers, governments, nonprofits, and humanitarians around the world, the Data for Good at Meta program is open-sourcing the first set of training data and sample code used to construct Meta’s AI-powered population maps.
As the world looks towards the increasing threat of climate change, Meta’s AI-powered population maps, and the data behind them, offer significant opportunities to direct investments in disaster preparedness through improved estimation of global flood exposure and in climate adaptation planning.
By open-sourcing these tools, we hope that other researchers can generate new insights for speeding the delivery of sustainable energy and climate-resilient infrastructure around the world.
Why we need better population maps
Accurate estimates of population are taken for granted in many countries. Governments in advanced economies can rely on a variety of sources including tax records or census datasets to better estimate their population and make informed decisions on the delivery of services. However, in other parts of the world, accurate population data is hard to come by. In certain low- and middle-income countries, the most recent census may have been conducted decades ago or lack accurate representation of vulnerable populations. Furthermore, estimates between censuses are often fraught with inaccuracies and remote populations may be entirely missing from official sources. As a result, uncounted communities may live outside the reach of critical programs.
To combat this challenge, Meta began the process of mapping the world’s population using artificial intelligence and satellite imagery in 2017. Alongside other leading population mapping institutions like Columbia University’s Center for International Earth Science Information Network (CIESIN) and WorldPop at the University of Southampton, we have openly published hundreds of high-resolution population maps and datasets. These have been used around the world by governments and nonprofits for social programs ranging from the targeting of COVID-19 interventions to the delivery of clean water. As the world’s natural resource and energy demands scale, accurate population estimates also offer significant opportunities to improve sustainability efforts.
Background on Meta’s AI-powered population maps
Data for Good’s AI-powered population maps estimate the number of people living within 30-meter grid tiles in nearly every country around the world. These maps use computer vision techniques, similar to those used to identify objects in photos for the visually impaired, to identify human-made structures in satellite imagery. The outputs of Meta’s AI model are then combined with population stock estimates from CIESIN to approximate the number of people living in each tile.
In addition to total population counts, Meta’s population maps also include demographic breakdowns for groups such as the number of children under five, women of reproductive age, youth, and the elderly.
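The combination step for total population counts can be illustrated with a small sketch. This is not Meta's or CIESIN's actual procedure: it assumes, purely for illustration, that each administrative unit's population is spread evenly across the tiles in that unit where a building was detected. The function name, inputs, and unit ids are all hypothetical.

```python
from collections import defaultdict


def allocate_population(tile_has_building, tile_unit, unit_population):
    """Spread each unit's population evenly over its built-up tiles.

    Assumption (not from the source): even allocation per built-up tile.

    tile_has_building: list of bools, one per 30 m tile (detector output).
    tile_unit: list of administrative-unit ids, one per tile.
    unit_population: dict mapping unit id -> population count for that unit.
    Returns a list of per-tile population estimates.
    """
    # Count the built-up tiles in each administrative unit.
    built_tiles = defaultdict(int)
    for has_building, unit in zip(tile_has_building, tile_unit):
        if has_building:
            built_tiles[unit] += 1

    # Tiles without a detected building receive zero population.
    estimates = []
    for has_building, unit in zip(tile_has_building, tile_unit):
        if has_building:
            estimates.append(unit_population[unit] / built_tiles[unit])
        else:
            estimates.append(0.0)
    return estimates


# Six hypothetical tiles split across two hypothetical units, "A" and "B".
tiles = [True, False, True, True, False, True]
units = ["A", "A", "A", "B", "B", "B"]
population = {"A": 100, "B": 60}
print(allocate_population(tiles, units, population))
# prints [50.0, 0.0, 50.0, 30.0, 0.0, 30.0]
```

Under this assumption, each unit's total is preserved while tiles with no detected structures are left empty, which is the basic idea behind redistributing census counts onto a fine grid.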
AI-powered population estimates have been scientifically evaluated to be among the most accurate in the world for mapping population distribution for a variety of geographies and use-cases. For example, this 2022 paper by researchers at the University of Southampton and University of Ghana in Nature – Scientific Reports compares various population density estimates for use in mapping flooding risk in West Africa. Other studies have investigated use-cases such as mapping landslide risk and malaria eradication, in countries including Haiti, Malawi, Madagascar, Nepal, Rwanda, and Thailand.
Open-sourcing training data for our AI population maps
This initial set of training data consists of almost 10 million human labels covering more than 126 gigabytes of satellite imagery. Each label indicates whether a building is present in a satellite imagery patch. The labels were created on imagery dating from 2011 to 2020; however, even labels made on older imagery are useful to train the next generation of machine vision models (like Meta’s Segment Anything) to more accurately identify buildings in a range of land-cover environments. In addition to this first batch, we plan to release additional data and code for computer vision training in the future.
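As a rough illustration of how per-patch building labels like these could be used, the sketch below scores a building detector's predictions against human labels. It does not reflect the file layout or code in the GitHub release: `PatchLabel`, the patch ids, and the prediction dictionary are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class PatchLabel:
    patch_id: str        # hypothetical identifier for an imagery patch
    has_building: bool   # human label: is a building present?


def precision_recall(labels, predictions):
    """Compare model predictions (patch_id -> bool) with human labels.

    Patches missing from `predictions` are treated as "no building".
    """
    tp = fp = fn = 0
    for label in labels:
        predicted = predictions.get(label.patch_id, False)
        if predicted and label.has_building:
            tp += 1      # building correctly detected
        elif predicted and not label.has_building:
            fp += 1      # building predicted where none was labeled
        elif not predicted and label.has_building:
            fn += 1      # labeled building missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


labels = [
    PatchLabel("p1", True),
    PatchLabel("p2", False),
    PatchLabel("p3", True),
    PatchLabel("p4", False),
]
predictions = {"p1": True, "p2": True, "p3": False, "p4": False}
print(precision_recall(labels, predictions))
# prints (0.5, 0.5)
```

Precision and recall are a common way to compare building detectors on labeled patches, since a model can score well on raw accuracy simply because most patches contain no building.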
Open-sourcing Meta’s training data and code allows population mapping partners like CIESIN and WorldPop to continue the progress made in the last decade. These tools reduce development costs for research units to generate even more accurate population estimates and also allow researchers working on building detection to improve their methods, especially when combined with more recent satellite imagery. Future data released from CIESIN and data collaborations like GRID3 will continue to push the boundaries of spatial resolution and accuracy as a result of their work with many African countries to generate, validate, and use core spatial datasets in support of sustainable development.
To better visualize village settlement locations and calculate service coverage, World Vision turned to an innovative dataset developed by Meta’s Data for Good (D4G) and Columbia University’s Center for International Earth Science Information Network (CIESIN). The resulting High Resolution Settlement Layer (HRSL) has been a game-changer for visualizing the geography of clean water.
–Allen Hollenbach, Technical Director for World Vision Water and Sanitation
Applications in sustainable electrification, clean water, and climate change adaptation
Nonprofit organizations and governments around the world have already leveraged Meta’s AI-powered population maps for a range of social impact programs, including the World Bank’s rural electrification efforts in Somalia and Benin and similar efforts in Uganda by the World Resources Institute.
World Vision has also used these datasets to accelerate progress on five-year plans for water and sanitation in places like Rwanda and Zambia. It recently announced that it had reached one million additional Rwandans with clean water, using insights from these maps to track progress towards universal water coverage.
Innovation in global population mapping is only possible through the type of collaboration Meta continues to have with Columbia University and WorldPop. A shared commitment to open source enables researchers and governments around the world to participate in this process.
Please visit the Data for Good website for more information about Meta’s Data for Good program. And please visit this blog for more information about how we protect user privacy in our tools.
Acknowledgements
We’d like to thank our external collaborators, Professor Andy Tatem, Director of WorldPop at University of Southampton, UK, and Greg Yetman, Associate Director for Geospatial Applications at CIESIN, Columbia University, for their partnership and support on this work.