Visualizing Population Density Across The Great Lakes Region in 2023


Introduction #

A good visualization will be able to answer a question without words.

In my practical data analytics class, we were given a small individual project: create a large visualization. We had spent class time going over different design philosophies and ways of thinking about data visualization. A good visual is not just eye-catching; it can answer questions the viewer did not know they had. For my project, I wanted to include a spatial component, so I began searching for a dataset that would make a nice-looking map. After a few hours, I came across the Kontur population density data, which divides the world into hexagons at a 400-meter spatial resolution. That is far too much data to plot at a global scale, so I decided to narrow the output to my favorite region of the United States - The Great Lakes!
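For a rough sense of scale (back-of-envelope only; I am assuming the 400 m figure refers to the hexagon edge length, and the region area is a loose round number), the area of one hexagon and the approximate number of cells covering the six states work out to:

```python
import math

EDGE_M = 400.0  # assumed hexagon edge length, per the dataset description

# Area of a regular hexagon with edge s: A = (3 * sqrt(3) / 2) * s^2
hex_area_km2 = (3 * math.sqrt(3) / 2) * EDGE_M**2 / 1e6

# Very rough combined land area of the six Great Lakes states,
# on the order of a million square kilometers
region_km2 = 1.0e6
approx_hexes = region_km2 / hex_area_km2  # millions of hexagons
```

Even with generous rounding, that is a couple million hexagons for the region alone, which explains why the global file is so unwieldy.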

Parsing Files #

The files are downloaded in .gpkg (GeoPackage) format. A GeoPackage is a SQLite-based container that can store a large amount of spatial information, but it is not directly suitable for plotting or feeding into other programs. Therefore, I started by opening the file with geopandas, a very powerful Python library for spatial data wrangling.

import geopandas as gpd
import numpy as np

# Load the GeoPackage and make sure "geometry" is the active geometry column
gdf = gpd.read_file('pop23.gpkg')
gdf = gdf.set_geometry("geometry")
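As an aside, because a GeoPackage is just a SQLite database under the hood, you can peek at its layer listing with the standard library alone. Here is a sketch against an in-memory database with a simplified `gpkg_contents` table (a real file would have more columns, and the `population` layer name is hypothetical):

```python
import sqlite3

# A .gpkg file is a SQLite database; its layers are registered in gpkg_contents.
# We fake a minimal gpkg_contents table in memory to show the query shape.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT)")
con.execute("INSERT INTO gpkg_contents VALUES ('population', 'features')")

layers = [row[0] for row in con.execute("SELECT table_name FROM gpkg_contents")]
con.close()
```

Against an actual download, you would pass the .gpkg path to `sqlite3.connect` instead.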

After importing, there was an h3 column that was not going to be helpful, so I dropped it. I also replaced zero-population values with NaN so that empty hexagons would be treated as missing data rather than rendered as flat cells, which would have caused large problems later on.

gdf = gdf.drop(columns="h3")                               # drop the unused H3 index column
gdf["population"] = gdf["population"].replace(0, np.nan)   # treat zero population as missing

Now that I had a GeoDataFrame containing only hexagons and their population densities, it was time to filter down to the states I wanted to visualize. I selected the Great Lakes region, figuring its population density would be somewhat unique: not laid out like the more grid-like West Coast, and not as compact and dense as the East Coast. I also made sure to reproject the state boundaries into the coordinate system used by the population density data so the clip would run smoothly.

us_states = gpd.read_file('states.shp')
states_of_interest = us_states[us_states['name'].isin(
    ['Michigan', 'Ohio', 'Illinois', 'Indiana', 'Wisconsin', 'Minnesota'])]
states_of_interest = states_of_interest.to_crs(gdf.crs)  # match the hexagon CRS
filtered_gdf = gpd.clip(gdf, states_of_interest)         # keep only hexagons inside the states

Once I had my filtered GeoDataFrame, it was very straightforward to save it to a shapefile and read it into Blender.

filtered_gdf.to_file('test.shp', driver='ESRI Shapefile')

Blender #

For the actual visual component, I used Blender, a tremendous open-source 3D modeling program. The Blender GIS add-on allowed me to import the shapefile and map the population value to the z-axis, so the higher the population density, the taller the hexagon. After importing, I made sure to pick my materials and lighting carefully. I knew I wanted green, as maps tend to look strange when viewed in other colors (seriously, try it!). After some trial and error, I got a good-looking map out of it. It is important to be mindful of the lighting, as too much shadow makes the map unreadable, while too little flattens out the depth. After a grueling three-hour render, I was left with an extremely high resolution image.
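The population-to-height mapping itself is just a linear rescale. This is a minimal sketch of the idea, not Blender GIS's actual code; the `max_height_m` cap and the sample values are hypothetical:

```python
def population_to_height(pop, max_pop, max_height_m=2000.0):
    """Linearly map a population value to an extrusion height in meters."""
    if pop is None or max_pop <= 0:
        return 0.0  # missing data or degenerate input renders flat
    return (pop / max_pop) * max_height_m

# Example: three hexagons scaled against a hypothetical regional maximum
heights = [population_to_height(p, max_pop=10000.0) for p in (0, 2500, 10000)]
```

In practice you might swap the linear map for a logarithmic one, since a few dense urban hexagons can otherwise dwarf everything else.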

Photoshop #

In Photoshop, I found a clean serif font and began labeling the visual. I decided to include zoomed-in chunks of notable features, along with their maximum values, to help the viewer ease into the visual. I also wanted to highlight the cities I have been to and liked! After some labeling, I put together a legend and made some final adjustments.

Overall, I really enjoyed doing this type of creative visual, and I think I am going to keep my eye out for more datasets that could benefit from being graphed in such a way. Thank you for checking it out!