
Speedup MatchIt

I am running a matching procedure in R using the MatchIt package. I use propensity score matching, that is: I estimate treatment selection with a logit model and pick the nearest match.

The dataset is huge (4 million rows). Is there any way to speed it up?

To make it clear what I have done:

require(MatchIt)
m.out <- matchit(treatment ~ age + agesq + male + income + ..., data = data, method = "nearest")
Repmat asked Dec 10 '25 03:12



1 Answer

I was similarly frustrated but found a solution for my case.

Essentially, I found a substantial run-time reduction by splitting the propensity score matching into 3 steps:

  1. Run the regression model and append the fitted values (i.e., your propensity scores) to your data.
  2. Trim your data columns down to only what you need: i.e., the unique record identifier and the appended propensity score. I saved the trimmed data to disk (not shown), but your implementation would likely still speed up if everything is kept in memory.
  3. Run matchit on the trimmed data with your propensity scores as a user-supplied distance, then join all the columns of your full original data back onto the matched records.

library(MatchIt)
library(dplyr)  # select(), inner_join(), %>%

# Step 1: fit the propensity model once and store the fitted probabilities
data$myfit <- fitted(glm(treatment ~ age + agesq + male + income + ..., data = data, family = "binomial"))

# Step 2: keep only the identifier, the propensity score, and the treatment flag
trimmed_data <- select(data, unique_id, myfit, treatment)

# Step 3: match on the precomputed score, then join the full data back on
m.out <- matchit(treatment ~ unique_id, data = trimmed_data, method = "nearest", distance = trimmed_data$myfit)
matched_unique_ids_etc <- match.data(m.out, data = trimmed_data)
matched_unique_ids <- select(matched_unique_ids_etc, unique_id)
matched_data <- matched_unique_ids %>% inner_join(data, by = "unique_id")

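Because the propensity scores are supplied through distance, the right-hand side of the formula is not used to compute the distance, so the formula does not affect the nearest-neighbor matching process.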
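One way to check the claim about the formula on your own installation is to match twice with the same supplied distance but different right-hand sides and compare the pairs. This is only a sketch on simulated data: the data frame toy and its columns id, x1, x2, and ps are hypothetical, and it assumes a MatchIt version that accepts a numeric vector for distance.

```r
library(MatchIt)

set.seed(1)
n <- 500

# Hypothetical toy data: an identifier, two covariates, a binary treatment.
toy <- data.frame(id = seq_len(n), x1 = rnorm(n), x2 = rnorm(n))
toy$treatment <- rbinom(n, 1, plogis(-1 + 0.5 * toy$x1 - 0.3 * toy$x2))

# Propensity scores estimated once, outside matchit.
toy$ps <- fitted(glm(treatment ~ x1 + x2, data = toy, family = binomial))

# Same user-supplied distance, two different right-hand sides.
m_a <- matchit(treatment ~ id, data = toy,
               method = "nearest", distance = toy$ps)
m_b <- matchit(treatment ~ x1 + x2, data = toy,
               method = "nearest", distance = toy$ps)

# If the formula plays no role in pairing, both runs pick the same controls.
same_pairs <- identical(unname(m_a$match.matrix), unname(m_b$match.matrix))
print(same_pairs)
```

If same_pairs comes back FALSE on your version, the formula is influencing the match and the trimmed-data shortcut should be re-checked before relying on it.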

When I wrote this, the default distance and link for matchit were glm and logit, so the three-step code above applies to that default case.
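Since step 1 replaces matchit's own estimation, it may be worth confirming that fitted() on a binomial glm returns probabilities on the logit link, which is what a propensity score is expected to be here. A minimal base R sketch on simulated data follows; the data frame toy and its columns age and male are hypothetical stand-ins for the real covariates.

```r
set.seed(42)
n <- 200

# Hypothetical toy data standing in for the real covariates.
toy <- data.frame(age = rnorm(n, 40, 10), male = rbinom(n, 1, 0.5))
toy$treatment <- rbinom(n, 1, plogis(-2 + 0.04 * toy$age + 0.3 * toy$male))

fit <- glm(treatment ~ age + male, data = toy,
           family = binomial(link = "logit"))

# fitted() returns predictions on the response (probability) scale.
ps_fitted <- fitted(fit)

# The same values, rebuilt as the inverse logit of the linear predictor.
ps_manual <- plogis(predict(fit, type = "link"))

max_diff <- max(abs(ps_fitted - ps_manual))
in_unit_interval <- all(ps_fitted > 0 & ps_fitted < 1)
print(c(max_diff = max_diff, in_unit_interval = in_unit_interval))
```

If you want a different link (for example probit), change the family in step 1 accordingly rather than relying on matchit's defaults.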

TmB answered Dec 11 '25 20:12



