Apportioning data from one geography to another

Question

We have two geographies: census tracts and a regular square grid. The grid dataset only has information on population count, while for the census tracts we have the total income of each tract. We would like to apportion these income data from the census tracts to the grid cells.

This is a very common problem in geographical analysis, and there are probably many ways to address it. We want the apportionment to account not only for the spatial overlap between census tracts and grid cells but also for the population of each cell. This mainly avoids problems with large census tracts whose residents live in only a small part of the tract.

Below is a reproducible example (using R and the sf package) of the solution we have found so far, using a sample extracted from our geographies. We would appreciate hearing whether others have alternative (more efficient) solutions, and help checking whether our results are correct.

library(sf)
library(dplyr)
library(readr)

# Files
download.file("https://github.com/ipeaGIT/acesso_oport/raw/master/test/shapes.RData", "shapes.RData")
load("shapes.RData")

# Open tracts and calculate area
tract <- tract %>%
  mutate(area_tract = st_area(.))

# Open grid squares and calculate area
square <- square %>%
  mutate(area_square = st_area(.))


ui <-
  # Create spatial units for all intersections between the tracts and the squares (we're calling each one a "piece")
  st_intersection(square, tract) %>%
  # Calculate the area of each piece
  mutate(area_piece = st_area(.)) %>%
  # Compute the share of the tract's area that falls in each piece
  mutate(area_prop_tract = area_piece / area_tract) %>%
  # Compute the share of the square's area that falls in each piece
  mutate(area_prop_square = area_piece / area_square) %>%
  # Based on the square's population, estimate the population living in each piece
  mutate(pop_prop_square = square_pop * area_prop_square) %>%
  # Sum the estimated piece populations within each tract
  group_by(id_tract) %>%
  mutate(pop_tract = sum(pop_prop_square)) %>%
  ungroup() %>%
  # Compute each piece's share of its tract's estimated population
  mutate(pop_prop_square_in_tract = pop_prop_square / pop_tract) %>%
  # Allocate the tract's income to each piece according to that population share
  mutate(income_piece = tract_incm * pop_prop_square_in_tract)

# Final aggregation by squares
ui_fim <- ui %>%
  # Group by squares and population and sum the income for each piece
  group_by(id_square, square_pop) %>%
  summarise(square_income = sum(income_piece, na.rm = TRUE))
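One way to check the arithmetic is to run the same steps on a small table of pieces where the answer can be worked out by hand. The sketch below uses base R only; the `pieces` data frame and all of its numbers are made up for illustration, and `area_prop_square` is assumed to have been computed already from the intersection. The apportioned income should add up to the sum of the tract incomes.

```r
# Hypothetical pieces: tract A (income 1000) overlaps s1 fully and half of s2;
# tract B (income 500) overlaps the other half of s2 and all of s3 (unpopulated).
pieces <- data.frame(
  id_square        = c("s1", "s2", "s2", "s3"),
  id_tract         = c("A",  "A",  "B",  "B"),
  square_pop       = c(100,  50,   50,   0),
  area_prop_square = c(1,    0.5,  0.5,  1),
  tract_incm       = c(1000, 1000, 500,  500)
)

# Estimated population living in each piece
pieces$pop_piece <- pieces$square_pop * pieces$area_prop_square

# Total estimated population of each tract, repeated on every row of that tract
pieces$pop_tract <- ave(pieces$pop_piece, pieces$id_tract, FUN = sum)

# Income allocated to each piece by its share of the tract's population
pieces$income_piece <- pieces$tract_incm * pieces$pop_piece / pieces$pop_tract

# Aggregate the pieces back to squares
square_income <- tapply(pieces$income_piece, pieces$id_square, sum)
print(square_income)

# Income is conserved: the total over squares equals the total over tracts
stopifnot(isTRUE(all.equal(sum(square_income), 1000 + 500)))
```

Note that a tract whose pieces all have zero estimated population would produce a division by zero here, so its income would be lost (or become NaN) unless it is handled separately.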

Thank you!

chris prener · Accepted Answer

Depending on the approach to interpolation you want to use, I may have a solution for you that I've helped develop. The areal package implements areal weighted interpolation, and I use it in my own research for interpolating between U.S. census geography and grid squares. You can check out the package's website and its associated vignettes. Hope this is useful!
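As a rough illustration of what a call looks like, here is a minimal sketch using the example data bundled with areal (`ar_stl_race` as the source tracts and `ar_stl_wards` as the target polygons), not the data from the question. To adapt it, the grid squares would take the place of the wards and the census tracts the place of `ar_stl_race`, assuming both layers share the same projected coordinate system. Note that this weights by area only and does not use the population of the target cells.

```r
library(sf)
library(areal)

# Interpolate the extensive (count-like) variable TOTAL_E from the
# source tracts (ar_stl_race) into the target polygons (ar_stl_wards).
# weight = "sum" assumes the source data are fully covered by the targets.
wards_pop <- aw_interpolate(
  ar_stl_wards,            # target features
  tid       = WARD,        # target id column
  source    = ar_stl_race, # source features
  sid       = "GEOID",     # source id column
  weight    = "sum",
  output    = "sf",
  extensive = "TOTAL_E"
)

head(wards_pop)
```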

Tags:

r

geospatial

Kauê Braga

1 Answer

chris prener
