Tracing from Google’s Open Buildings dataset

Comparing this CC-BY dataset with OpenStreetMap

Nick Doiron
4 min read · Aug 1, 2021

Google recently published a dataset of 516 million buildings traced from satellite imagery across Africa. Unlike their usual map data, this has a Creative Commons Attribution license, meaning it can be reused with credit. OpenStreetMap, however, requires that owners of CC-BY data specifically fill out a waiver. Updated Sep 16, 2021: Google has also released the buildings under the Open Database License (ODbL), but there is a known 10% error; please manually review.

Edit Aug 1, 2021: it appears Google intentionally left Libya, Cameroon, Chad, Gabon, South Sudan, Mali, Morocco, Western Sahara, and parts of Somalia, Mozambique, Sudan, and Ethiopia out of the downloads. Sorry for overlooking the difference between their paper and the download page.
For future updates, see this GitHub Readme.

Parsing the data

The dataset is divided into several CSVs with the building footprint in Well Known Text (WKT) format:

latitude,longitude,area_in_meters,confidence,geometry,full_plus_code
2.96283507,30.87489292,41.2623,0.8782,"POLYGON((30.8749202634832 2.96280455119714, 30.874925633046 2.96285967688244, 30.8748655898075 2.96286558849087, 30.874860220248 2.96281046280741, 30.8749202634832 2.96280455119714))",6GJGXV7F+4XJG
0.45898762,32.61744969,25.3538,0.8084,"POLYGON((32.6174811987239 0.45898121919608, 32.6174506687015 0.459019966803207, 32.6174181844152 0.458994027306467, 32.6174487144376 0.458955279700022, 32.6174811987239 0.45898121919608))",6GGJFJ58+HXWF

The regional CSV had about 20.4 million buildings (5GB unzipped).

I decided to manually review buildings near the Ugandan school where I taught with the Kasiisi Project and One Laptop per Child in summer 2010. I ran this Python script to select buildings with a lat/lng center within 0.015 degrees (~1 mile radius) of a focus point, and convert them into a GeoJSON for reference in the JOSM editor.
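For readers who want to try something similar, here is a minimal sketch of that kind of filter, not the script I actually ran. It assumes the column names from the CSV sample above, assumes every footprint is a single-ring POLYGON, and treats "within 0.015 degrees" as a simple box test on latitude and longitude. The function names and the focus point (picked to sit next to the second sample row) are made up for illustration.

```python
import csv
import io
import json

# Two rows copied from the sample above, so the sketch runs without the 5GB download.
SAMPLE_CSV = '''latitude,longitude,area_in_meters,confidence,geometry,full_plus_code
2.96283507,30.87489292,41.2623,0.8782,"POLYGON((30.8749202634832 2.96280455119714, 30.874925633046 2.96285967688244, 30.8748655898075 2.96286558849087, 30.874860220248 2.96281046280741, 30.8749202634832 2.96280455119714))",6GJGXV7F+4XJG
0.45898762,32.61744969,25.3538,0.8084,"POLYGON((32.6174811987239 0.45898121919608, 32.6174506687015 0.459019966803207, 32.6174181844152 0.458994027306467, 32.6174487144376 0.458955279700022, 32.6174811987239 0.45898121919608))",6GGJFJ58+HXWF
'''


def wkt_polygon_to_coords(wkt):
    """Turn 'POLYGON((lng lat, lng lat, ...))' into GeoJSON polygon coordinates.

    Assumption: one outer ring and no holes; anything else raises ValueError.
    """
    text = wkt.strip()
    if not (text.startswith("POLYGON((") and text.endswith("))")):
        raise ValueError("unsupported WKT: " + text[:30])
    ring = text[len("POLYGON(("):-2]
    return [[[float(x), float(y)] for x, y in (pt.split() for pt in ring.split(","))]]


def select_nearby(rows, focus_lat, focus_lng, max_delta=0.015):
    """Keep rows whose center is within max_delta degrees of the focus point on both axes."""
    features = []
    for row in rows:
        lat = float(row["latitude"])
        lng = float(row["longitude"])
        if abs(lat - focus_lat) > max_delta or abs(lng - focus_lng) > max_delta:
            continue
        features.append({
            "type": "Feature",
            "properties": {
                "confidence": float(row["confidence"]),
                "area_in_meters": float(row["area_in_meters"]),
                "full_plus_code": row["full_plus_code"],
            },
            "geometry": {
                "type": "Polygon",
                "coordinates": wkt_polygon_to_coords(row["geometry"]),
            },
        })
    return {"type": "FeatureCollection", "features": features}


# csv.DictReader streams row by row, so the same loop would work on the full file
# by passing open("your_region.csv", newline="") instead of the StringIO sample.
collection = select_nearby(csv.DictReader(io.StringIO(SAMPLE_CSV)), 0.459, 32.617)
geojson_text = json.dumps(collection)
print(len(collection["features"]), "building(s) selected")
```

Writing geojson_text to a .geojson file gives something JOSM can open as a reference layer. A box test is cheaper than a true distance calculation and is close enough near the equator, where a degree of longitude and a degree of latitude cover about the same ground.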

OpenStreetMap screenshots of two affiliated schools

Current status

Many buildings in the Kasiisi area were traced by Bert Araali about 6 years ago. We should compare them with Google’s data.
The village of Kanyawara was mostly untraced (see satellite image at right) and the school had a misspelled name on Wikidata.

In the nearby city of Fort Portal, there are edits from Ugandans affiliated with local universities — Mbabani Allan, Ebong C137, plus I remember a user from Mountains of the Moon or Makerere University but could not find them again today.

How does it shape up?

The script found 1,444 buildings (more than I’d expected), producing a 397 KB GeoJSON.

I was dissatisfied by my first comparison. At the Kasiisi School, the nursery school building on the left side of the image had a foundation in 2010 and was open by mid-2012. Other buildings aren’t aligned.

Two screenshots of OSM vs. Google

The alignment and presence of buildings seemed to match the fuzzy Mapbox Satellite layer much more closely than Bing. This gives us a hint about what resolution and vintage of imagery the model saw. [edit: the paper says they used 50cm imagery]
Compare the two images below:

Mapbox Satellite Imagery (above)
Bing Imagery (2nd)

If this were given to me to QA in HOT Task Manager, I would not accept the work. Overall this was disappointing. I would like to get an AOK from Google and then add missing parts of less-mapped villages, but there would need to be significant error-checking.
I don’t post this to crap on the team’s work. This was a huge scale project to train an ML model, run it on a whole continent of imagery, and publish all of those buildings. I didn’t try filters such as Google’s confidence value for the buildings. This part of Uganda is under-imaged and could get much better with more training and imagery. I also didn’t evaluate this in urban areas where human annotation is particularly difficult in close quarters.
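I did not try the confidence filter for this post, but as a rough sketch it could look like the snippet below. The 0.85 threshold is an arbitrary value chosen for illustration, not a recommendation from Google or from my testing, and the function name is made up. The two sample rows are trimmed to the columns the filter needs.

```python
import csv
import io

# Confidence and plus code values copied from the two sample rows earlier in the post.
SAMPLE_ROWS = '''latitude,longitude,area_in_meters,confidence,full_plus_code
2.96283507,30.87489292,41.2623,0.8782,6GJGXV7F+4XJG
0.45898762,32.61744969,25.3538,0.8084,6GGJFJ58+HXWF
'''


def filter_by_confidence(rows, min_confidence):
    """Yield only the rows whose confidence column is at or above min_confidence."""
    for row in rows:
        if float(row["confidence"]) >= min_confidence:
            yield row


# Hypothetical threshold: keeps the 0.8782 row and drops the 0.8084 row.
kept = list(filter_by_confidence(csv.DictReader(io.StringIO(SAMPLE_ROWS)), 0.85))
print([row["full_plus_code"] for row in kept])
```

A higher threshold should discard more doubtful footprints at the cost of missing real buildings, so any value would need checking against imagery before relying on it.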

Comparison to RapiD

A friend reminded me that a similar layer exists through Facebook’s RapiD.
I zoomed into the same area for a closer look.
I see a little less weirdness but still some buildings missing when compared to Maxar (which is usually the latest imagery). If I saw this on HOT Task Manager, I could see accepting this after filling in these buildings.

Maxar imagery, RapiD suggestions

One final thing,

Please consider making an impact with a donation to the Kasiisi Project, MapUganda, or the Humanitarian OpenStreetMap Team.
