Incremental improvements to Nightlights mapping thanks to R-Bloggers

My recent post Nightlights: cool data, bad geocoding highlighted some of the geocoding challenges Steve Mosher has been finding as he works with this interesting “light pollution” data set.

It was also my first article reposted on Tal Galili’s fantastic R-Bloggers site which I have been following for a while. But even better than the surge of new visitors were the great comments and suggestions posted by members of the community. In this post, I’m going to walk through each suggestion to illustrate just how generous and helpful this community can be.

Our starting point is where we ended up in my first post, using ggplot2 to display the raster nightlights data and map overlay:

library(RCurl)
library(R.utils)
library(rgdal)
library(raster)

url_radianceCalibrated = "ftp://ftp.ngdc.noaa.gov/DMSP/web_data/x_rad_cal/rad_cal.tar"
calibratedLights = "rad_cal.tar"
hiResTif = "world_avg.tif"

download.file(url_radianceCalibrated,calibratedLights,mode="wb")
untar(calibratedLights)
gunzip( paste(hiResTif, '.gz', sep='') )

hiResLights = raster( hiResTif )

# Eastern Mass., Cape Cod & Islands:
e = extent(-71.5, -69.5, 41, 43)

r = crop(hiResLights,e)
p = rasterToPoints(r)
df = data.frame(p)
colnames(df) = c("lon", "lat", "dn")

g = ggplot( data=df) + geom_point(aes(x=lon, y=lat, color=dn) )
g = g + scale_colour_gradient(low="black", high="white", trans="sqrt")
g = g + theme_bw() + xlim(c(-71.5,-69.5)) + ylim(c(41,43))
g = g + opts( axis.text.x = theme_blank(), axis.text.y = theme_blank() )
g = g + borders("state", colour="yellow", alpha=0.5)

Note the mismatch between the data and map overlay and the weirdness in the map where points are missing on the North Shore:

original data, positioning, and borders in ggplot2

Ben Bolker suggested a way to eliminate the artifacts which led me to this discussion on R-sig-Geo between Hadley Wickham and Paul Hiemstra which tipped me off to the existence of geom_path layer in addition to the geom_polygon layer which borders() usually produces. Polygons are closed but paths need not be, so that helps. And ggplot2’s map_data() function seems to grab the same data as borders():

b = map_data("state")

g = ggplot( data=df) + geom_point(aes(x=lon, y=lat, color=dn) )
g = g + scale_colour_gradient(low="black", high="white", trans="sqrt")
g = g + theme_bw() + xlim(c(-71.5,-69.5)) + ylim(c(41,43))
g = g + opts( axis.text.x = theme_blank(), axis.text.y = theme_blank() )
g = g + geom_path(data=b, aes(x=long,y=lat,group=group), colour="yellow", alpha=0.5)

Bonus: geom_path() obeys the “alpha=0.5” directive to set the transparency:

Worst artifacts solved by switching to map_data() and geom_path()

But Robert Hijmans really hit it out the park with two great suggestions. First, he pointed me towards a much, much better source of coastline data by using raster’s getData() function to grab data from the GADM database of Global Administrative Areas:

usa = getData('GADM', country="USA", level=0)

Level 0 will get you country boundaries, Level 1 for state/province, and so on. So we’ll lose state boundaries, but these files are pretty big to start with and can take a lot longer to plot.

Also, be warned: apparently somebody sinned against The Church of GNU, so you may need to run gpclibPermit() manually before running fortify() on the SpatialPolygonsDataFrame:

> f_usa = fortify(usa)
Using GADMID to define regions.

	Note: polygon geometry computations in maptools
 	depend on the package gpclib, which has a
 	restricted licence. It is disabled by default;
 	to enable gpclib, type gpclibPermit()

Checking rgeos availability as gpclib substitute:
FALSE 
Error: isTRUE(gpclibPermitStatus()) is not TRUE
> gpclibPermit()
[1] TRUE

With that hoop cleared, we can fortify() and plot this new layer:

f_usa = fortify(usa)

g = ggplot( data=df) + geom_point(aes(x=lon, y=lat, color=dn) )
g = g + scale_colour_gradient(low="black", high="white", trans="sqrt")
g = g + theme_bw() + xlim(c(-71.5,-69.5)) + ylim(c(41,43))
g = g + opts( axis.text.x = theme_blank(), axis.text.y = theme_blank() )
g = g + geom_path(data=f_usa, aes(x=long,y=lat,group=group), colour="yellow", alpha=0.5)

The coordinates are still shifted, but—wow—what a beautiful coast line. Cape Ann on the North Shore is really there now:

beautiful coastline data from GADM

Robert also points out an important mismatch in that GDAL returns the top left corner and resolution, so we could be off by a pixel or so. A quick call to xmax() and ymax() will fix this in our original raster:

xmax(hiResLights) = 180
ymin(hiResLights) = -90

r = crop(hiResLights,e)
p = rasterToPoints(r)
df = data.frame(p)
colnames(df) = c("lon", "lat", "dn")

g = ggplot( data=df) + geom_point(aes(x=lon, y=lat, color=dn) )
g = g + scale_colour_gradient(low="black", high="white", trans="sqrt")
g = g + theme_bw() + xlim(c(-71.5,-69.5)) + ylim(c(41,43))
g = g + opts( axis.text.x = theme_blank(), axis.text.y = theme_blank() )
g = g + geom_path(data=f_usa, aes(x=long,y=lat,group=group), colour="yellow", alpha=0.5)

Hey, that’s not bad at all:

final try, after adjusting GDAL pixel shift

Looking at the final version leads me to wonder how much of the geocoding problem is position, and how much is resolution/blurring/smearing. The lights of Provincetown, for instance, look pretty good. Maybe the blob is too north by a few pixels, but at least it’s well contained by land now. On Nantucket, the blur is half in the harbor. Then again, on Nantucket, most of the lights are right on the harbor, from the ferry terminal and running east to main street. So the lights are just about where they should be. Perhaps they’re just blurred and therefore spill into the harbor?

But the real point of the post is to highlight the generosity of this community. For that, thanks. And again: welcome R-Bloggers readers!

About these ads

8 Responses to “Incremental improvements to Nightlights mapping thanks to R-Bloggers”

  1. Tal Galili Says:

    What a kind (and cool) post you wrote, you just made my day! :)

    Good luck in your future posts.

    Best,
    Tal

  2. Steven Mosher Says:

    Wow,

    Robert is a life saver. That shift in nightlights has had me and a couple other guys stumped for a while. After looking at your stuff I wrote him and tried to explain the problem and one simple cal to GDALinfo explained it all.

    I’ve contacted the agency responsible for the data, we will see if they fix the metadata in the file. Anyway’s I’m starting to look at ggplot2 .

    • Jeffrey Breen Says:

      Hi Steven:

      Robert Hijmans is the man, indeed. (He and Jacob van Etten wrote the raster package, so I guess we shouldn’t be surprised… :)

      I am glad I could help push things forward.

      Jeffrey

  3. Steven Mosher Says:

    Oh, on the position/intensity of the lights,

    1. the date of the image is circa 1995 ish

    2. the true resolution is 2.7km which was splatted to 1km

    3. U might want to transform it into an actual radiance as opposed to the DN number. The file conatins the DN number which squashes the dynamic range a bit.

  4. Jan Verbesselt Says:

    Very nice post!

    However after trying the code here on R 2.12 on Mac OSx 10.6.4 and installing all the required packages. I am getting the following error after running the first ggplot statement
    > g = ggplot( data=df) + geom_point(aes(x=lon, y=lat, color=dn) )
    > g = g + scale_colour_gradient(low=”black”, high=”white”, trans=”sqrt”)
    > g = g + theme_bw() + xlim(c(-71.5,-69.5)) + ylim(c(41,43))
    > g = g + opts( axis.text.x = theme_blank(), axis.text.y = theme_blank() )
    > g = g + borders(“state”, colour=”yellow”, alpha=0.5)
    Error in fortify(map(map, region, plot = FALSE, fill = TRUE)) :
    could not find function “map”

    Are there other packages required? Or is there a problem with the installation of spatstat on Mac (I did obtain a lot of warnings when installing it).
    The rgdal package I have successfully installed via http://www.compmath.com/blog/2010/07/installing-package-on-mac-os-x/

    Thanks for your excellent post,Jan

  5. Jan Verbesselt Says:

    Owkay! I solved it…it took a while to install the packages but now:

    After adding
    require(maps)

    I all started working. :-) very nicely.

    Only this link did not work via R
    download.file(url_radianceCalibrated,calibratedLights,mode=”wb”)
    so I downloaded it myself.

    Any advice is welcome.
    Cheers,
    Jan


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 62 other followers

%d bloggers like this: