
storing an R function output list in a column of lists for further processing

Tags: list, r, dplyr

OK, this is a pretty basic question, but I am having a hard time finding an answer to it. I have functions that return several results bundled together as a list. I want to store that output list in a data frame, in a column of lists. The data frame also holds the variables that are passed to the function. For example:

library(dplyr)
### function
testFunc <- function(a){
  a = a
  b = a+1
  c = list(out1=a, out2=b)
  return(c)
}
### data
dat <- data.frame(x=1:5)

### dplyr processing
datProcessed <- dat %>%
  mutate(calcd = testFunc(x))

### Fails: `calcd` must be size 5 or 1, not 2.

However, if I extract a single item from the output, it works, of course:

datProcessed <- dat %>%
  mutate(calcd = testFunc(x)$out2)

How do I store a list output in the dataframe column of lists using a dplyr pipe?

TC1 asked Oct 31 '25 02:10


2 Answers

Here are some options, depending wholly on your expected output and what you're going to do with it next.

(BTW: I'm using tibble(dat) instead of dat only because a tibble's printed output makes the difference between vector-columns and list-columns visible; your production code does not need tibble(..).)

  1. If you want both vectors returned from testFunc() as individual columns in dat, then we can just do

    tibble(dat) |>
      mutate(as.data.frame(testFunc(x)))
    # # A tibble: 5 × 3
    #       x  out1  out2
    #   <int> <int> <dbl>
    # 1     1     1     2
    # 2     2     2     3
    # 3     3     3     4
    # 4     4     4     5
    # 5     5     5     6
    

    This works because mutate(.) (and similar verb-functions in dplyr) unpacks an unnamed argument into separate columns when its value is a data frame. A plain named list is not unpacked this way, even though the two are structurally very similar.

  2. If you want each of the pairs of the return values stored in a list-column per-row in dat, then we can use purrr::transpose:

    out <- dat |>
      mutate(calcd = purrr::transpose(testFunc(x)))
    out
    #   x calcd
    # 1 1  1, 2
    # 2 2  2, 3
    # 3 3  3, 4
    # 4 4  4, 5
    # 5 5  5, 6
    
    tibble(out)
    # # A tibble: 5 × 2
    #       x calcd           
    #   <int> <list>          
    # 1     1 <named list [2]>
    # 2     2 <named list [2]>
    # 3     3 <named list [2]>
    # 4     4 <named list [2]>
    # 5     5 <named list [2]>
    
    out$calcd[[1]]
    # $out1
    # [1] 1
    # $out2
    # [1] 2
    

    In this second form, each element of out$calcd is a named list holding that row's out1 value and out2 value (based on how your testFunc(.) works).

Both methods assume that the return from testFunc(.) is a named list of vectors where each vector is the same length as the number of rows.
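
That assumption can be checked before piping. The following is only a minimal sketch: testFunc and dat are the question's own definitions, while summaryFunc is a hypothetical counterexample added purely for illustration.

```r
# Definitions from the question (testFunc written slightly more compactly)
testFunc <- function(a) {
  b <- a + 1
  list(out1 = a, out2 = b)
}
dat <- data.frame(x = 1:5)

# TRUE when `res` is a named list whose elements each have `n` values
fits_rows <- function(res, n) {
  is.list(res) && !is.null(names(res)) && all(lengths(res) == n)
}

ok <- fits_rows(testFunc(dat$x), nrow(dat))

# Hypothetical function whose results are NOT one-per-row
summaryFunc <- function(a) list(total = sum(a), n = length(a))
not_ok <- fits_rows(summaryFunc(dat$x), nrow(dat))

ok      # TRUE
not_ok  # FALSE
```

If the check fails, neither as.data.frame(.) nor purrr::transpose(.) will line the results up with the rows of dat.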

If you aren't familiar with what purrr::transpose does, compare the change:

str(testFunc(dat$x))
# List of 2
#  $ out1: int [1:5] 1 2 3 4 5
#  $ out2: num [1:5] 2 3 4 5 6

str(purrr::transpose(testFunc(dat$x)))
# List of 5
#  $ :List of 2
#   ..$ out1: int 1
#   ..$ out2: num 2
#  $ :List of 2
#   ..$ out1: int 2
#   ..$ out2: num 3
#  $ :List of 2
#   ..$ out1: int 3
#   ..$ out2: num 4
#  $ :List of 2
#   ..$ out1: int 4
#   ..$ out2: num 5
#  $ :List of 2
#   ..$ out1: int 5
#   ..$ out2: num 6
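
Since the list-column is meant for further processing, here is a hedged sketch of two ways to get values back out of it. It assumes the out frame from option 2 above; the use of vapply() and tidyr::unnest_wider() is my suggestion and not something the question asked for.

```r
library(dplyr)

# Definitions from the question
testFunc <- function(a) {
  b <- a + 1
  list(out1 = a, out2 = b)
}
dat <- data.frame(x = 1:5)

# Option 2 from above: one named list per row
out <- dat |>
  mutate(calcd = purrr::transpose(testFunc(x)))

# Pull a single element out of every per-row list (base R, type-checked)
out2_vec <- vapply(out$calcd, function(z) z$out2, numeric(1))

# Or spread the per-row lists back into ordinary columns
wide <- tidyr::unnest_wider(out, calcd)

out2_vec
names(wide)
```

vapply() fails loudly if any element is not a single number, which is usually what you want when the list-column feeds later calculations.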
r2evans answered Nov 02 '25 18:11


You probably want to apply your function to each row individually, in which case you could do:

library(tidyverse)
dat %>%
  mutate(calcd = apply(across(x), 1, testFunc))

This returns:

  x calcd
1 1  1, 2
2 2  2, 3
3 3  3, 4
4 4  4, 5
5 5  5, 6

Its structure, as reported by str(), is:

'data.frame':   5 obs. of  2 variables:
 $ x    : int  1 2 3 4 5
 $ calcd:List of 5
  ..$ :List of 2
  .. ..$ out1: Named int 1
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 2
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 2
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 3
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 3
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 4
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 4
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 5
  .. .. ..- attr(*, "names")= chr "x"
  ..$ :List of 2
  .. ..$ out1: Named int 5
  .. .. ..- attr(*, "names")= chr "x"
  .. ..$ out2: Named num 6
  .. .. ..- attr(*, "names")= chr "x"
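
The names attributes in the structure above appear because apply() hands each row to testFunc as a named vector. As a possible alternative (a sketch only, assuming testFunc needs just the single column x), lapply() or rowwise() call the function once per value and leave the elements unnamed:

```r
library(dplyr)

# Definitions from the question
testFunc <- function(a) {
  b <- a + 1
  list(out1 = a, out2 = b)
}
dat <- data.frame(x = 1:5)

# One call to testFunc per element of x; the result is a list-column
by_element <- dat %>%
  mutate(calcd = lapply(x, testFunc))

# Same idea with rowwise(); list() wraps each row's result
by_row <- dat %>%
  rowwise() %>%
  mutate(calcd = list(testFunc(x))) %>%
  ungroup()

str(by_element$calcd[[1]])
```

For functions of several columns, purrr::pmap() or rowwise() would be the natural extension, but that goes beyond what the question shows.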

deschen answered Nov 02 '25 19:11


