
Out of bound error when trying to combine plots using data.table

Tags: r, data.table

I'm trying to create a combined plot by first creating sub-plots grouped by sub-categories and then combining them into a single plot grouped by a parent category.

An MWE is below. Here, I'm trying to plot mpg against weight (wt) by first grouping by carb and am to create the child plots, and then grouping by am alone to combine all the child plots under each am value into a single plot. The desired final output is one plot for each value of am, composed of sub-plots for the individual values of carb.

However, I keep running into an index out-of-bounds error.

Note that patchwork::wrap_plots accepts "multiple ggplots or a list containing ggplot objects".


library(data.table)
library(ggplot2)

mtDT <- setDT(copy(mtcars))

mtDT[, .(plot = list(ggplot(data = .SD) + geom_point(aes(x=wt, y=mpg)))), by=.(carb,am)][,
.(list(patchwork::wrap_plots(plot))), by=am]

> Error: Index out of bounds

I also tried the following approach, but that threw a type error about inputs to wrap_plots not being ggplots.

mtDT[, .(plot = list(ggplot(data = .SD) + geom_point(aes(x=wt, y=mpg)))), by=.(carb,am)][,
  .(plotlist = list(plot)), by=am][,
   comboplot:=patchwork::wrap_plots(plotlist)]

> Error: Only know how to add ggplots and/or grobs

I'm aware of facets in ggplot2, but I'm doing it this way for finer control over the individual plots.

Paul asked Dec 12 '25 16:12



1 Answer

I don't fully understand why this happens, but it appears to be a bug in data.table. It results from data.table using lapply where I would use a for loop (because no return value is needed). I suggest you report this to the data.table issue tracker.

The offending call within [.data.table is to the internal runlock helper. The following reproduces the error outside of [.data.table:

tmp <- mtDT[, .(plot = list(ggplot(data = .SD) + geom_point(aes(x=wt, y=mpg)))), 
     by=.(carb,am)]
a <- tmp[,.(list(patchwork::wrap_plots(plot)))][[1]]

MAX_DEPTH = 5L
runlock = function(x, current_depth = 1L) {
  if (is.list(x) && current_depth <= MAX_DEPTH) {  # is.list() used to be is.recursive(), #4814
    if (inherits(x, 'data.table')) .Call(data.table:::C_unlock, x)
    else return(lapply(x, runlock, current_depth = current_depth + 1L))
  }
  return(invisible())
}

runlock(a)
#Error in `X[[i]]`:
#! Index out of bounds
#Run `rlang::last_trace()` to see where the error occurred.

If I replace the lapply loop with a for loop, it appears to work:

runlock = function(x, current_depth = 1L) {
  if (is.list(x) && current_depth <= MAX_DEPTH) {  # is.list() used to be is.recursive(), #4814
    if (inherits(x, 'data.table')) .Call(data.table:::C_unlock, x)
    else return(for(y in x) runlock(y, current_depth = current_depth + 1L))
  }
  return(invisible())
}

runlock(a)
#no error
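
As a minimal base-R sketch of one way lapply and a for loop can diverge on the same object: lapply evaluates X[[i]] for each element, which dispatches to a class-specific [[ method, whereas for walks the underlying list directly. The class "picky" and its [[ method are hypothetical, made up purely for illustration; whether this is exactly what happens with the ggplot objects above is an assumption on my part.

```r
# Hypothetical S3 class: a plain length-2 list carrying the class "picky".
picky <- structure(list(1, 2), class = "picky")

# Hypothetical `[[` method that refuses any index above 1,
# even though the underlying list has two elements.
`[[.picky` <- function(x, i) {
  if (i > 1L) stop("Index out of bounds")
  unclass(x)[[i]]
}

# lapply() calls X[[i]] for i = 1, 2, so it hits the method and fails at i = 2.
lapply_result <- tryCatch(
  lapply(picky, identity),
  error = function(e) conditionMessage(e)
)

# for() iterates over the underlying list without calling the `[[` method.
for_values <- list()
for (y in picky) for_values[[length(for_values) + 1L]] <- y

print(lapply_result)
print(length(for_values))
```

If the objects stored in the plot column behave like this (an object whose [[ method disagrees with its underlying length), that would be consistent with lapply failing in runlock while the for loop version runs cleanly.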

I suspect the issue results from this code in lapply:

if (!is.vector(X) || is.object(X)) 
        X <- as.list(X)

This turns fun stuff like environments or objects of class "uneval" into lists. And apparently there can be dragons if you then try to iterate over these lists.
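
To illustrate the coercion step on its own, here is a small base-R sketch using an environment, which is not a vector and therefore goes through as.list inside lapply. The environment e and the variables a and b are made up for the example; it only shows the coercion and does not reproduce the ggplot failure.

```r
# An environment holding two values (names chosen for illustration only).
e <- new.env()
assign("a", 1, envir = e)
assign("b", 2, envir = e)

# Environments are not vectors, so lapply() takes the as.list() branch.
not_a_vector <- !is.vector(e)

# The same coercion done explicitly: a named list of the environment's contents.
coerced <- as.list(e)

# lapply() therefore ends up iterating over that named list.
res <- lapply(e, function(v) v * 10)

print(not_a_vector)
print(sort(names(coerced)))
print(res[["a"]])
```

For a plain environment this is harmless. The trouble seems to start only when the coerced object keeps a class whose methods do not behave like those of an ordinary list.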

Roland answered Dec 14 '25 07:12
