I'm trying to get all combinations of the values in one column with itself, while keeping the corresponding values from a second column:
library(dplyr)
library(tidyr)
dt0 <-
  data.frame(
    row = letters[1:10],
    n1 = c(2, 2, 1, 3, 1, 5, 1, 3, 2, 2)
  )
dt0 |>
  expand(
    row1 = row,
    row2 = row
  ) |>
  filter(row1 < row2) |>
  left_join(
    dt0 |>
      rename(n1.x = n1),
    by = join_by(row1 == row)
  ) |>
  left_join(
    dt0 |>
      rename(n1.y = n1),
    by = join_by(row2 == row)
  )
The expected result is:
# A tibble: 45 × 4
   row1  row2   n1.x  n1.y
   <chr> <chr> <dbl> <dbl>
 1 a     b         2     2
 2 a     c         2     1
 3 a     d         2     3
 4 a     e         2     1
 5 a     f         2     5
 6 a     g         2     1
 7 a     h         2     3
 8 a     i         2     2
 9 a     j         2     2
10 b     c         2     1
# ℹ 35 more rows
# ℹ Use `print(n = ...)` to see more rows
But I don't know how to generalize this to all combinations of the rows of the data.frame taken m at a time. So my question is:
How can I generalize this pattern to any number of row arguments (row1, row2, ...) in expand(...)? For example, with three:
dt0 |>
  expand(
    row1 = row,
    row2 = row,
    row3 = row
  ) |>
  filter(row1 < row2) |>
  filter(row2 < row3) |>
  left_join(
    dt0 |>
      rename(n1.x = n1),
    by = join_by(row1 == row)
  ) |>
  left_join(
    dt0 |>
      rename(n1.y = n1),
    by = join_by(row2 == row)
  ) |>
  left_join(
    dt0 |>
      rename(n1.z = n1),
    by = join_by(row3 == row)
  )
# A tibble: 120 × 6
   row1  row2  row3   n1.x  n1.y  n1.z
   <chr> <chr> <chr> <dbl> <dbl> <dbl>
 1 a     b     c         2     2     1
 2 a     b     d         2     2     3
 3 a     b     e         2     2     1
 4 a     b     f         2     2     5
 5 a     b     g         2     2     1
 6 a     b     h         2     2     3
 7 a     b     i         2     2     2
 8 a     b     j         2     2     2
 9 a     c     d         2     1     3
10 a     c     e         2     1     1
# ℹ 110 more rows
# ℹ Use `print(n = ...)` to see more rows
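As a quick sanity check on the row counts in the two outputs above (45 and 120): they are the number of ways to choose 2 and 3 of the 10 rows, which base R's choose() confirms. This is only a check on the counts, not part of the pipeline itself.

```r
# Number of combinations of 10 rows taken m at a time
n <- 10
pairs   <- choose(n, 2)  # 45, matches the 45-row tibble
triples <- choose(n, 3)  # 120, matches the 120-row tibble
c(pairs, triples)
```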
I guess combn, rather than expand, fits your purpose better:
f <- function(dt0, k) {
    with(
        dt0,
        cbind(
            setNames(data.frame(t(combn(row, k))), paste0("row", seq(k))),
            setNames(data.frame(t(combn(n1, k))), paste0("n1.", seq(k)))
        )
    )
}
or, looping over every column of dt0 instead of naming them:
f <- function(dt0, k) {
    do.call(
        cbind,
        lapply(
            seq_along(dt0),
            \(i) setNames(
                data.frame(t(combn(dt0[[i]], k))),
                paste0(names(dt0[i]), ".", seq(k))
            )
        )
    )
}
such that (using the second version, which names the columns row.1, row.2, ...)
> head(f(dt0, 2), 10)
   row.1 row.2 n1.1 n1.2
1      a     b    2    2
2      a     c    2    1
3      a     d    2    3
4      a     e    2    1
5      a     f    2    5
6      a     g    2    1
7      a     h    2    3
8      a     i    2    2
9      a     j    2    2
10     b     c    2    1
> head(f(dt0, 3), 10)
   row.1 row.2 row.3 n1.1 n1.2 n1.3
1      a     b     c    2    2    1
2      a     b     d    2    2    3
3      a     b     e    2    2    1
4      a     b     f    2    2    5
5      a     b     g    2    2    1
6      a     b     h    2    2    3
7      a     b     i    2    2    2
8      a     b     j    2    2    2
9      a     c     d    2    1    3
10     a     c     e    2    1    1
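A possible variant, sketched here under assumptions rather than taken from the answer above: compute the combinations once on row positions with combn(nrow(df), k), then use those positions to index every column. The helper name comb_rows is hypothetical. Note that the column order differs from the output above (row.1, n1.1, row.2, n1.2, ... instead of all row.* columns first).

```r
# Hypothetical helper (not from the answer above): build combinations of row
# positions once, then pull the matching rows for each of the k slots.
comb_rows <- function(df, k) {
  idx <- combn(nrow(df), k)  # k x choose(n, k) matrix of row positions
  parts <- lapply(seq_len(k), function(j) {
    part <- df[idx[j, ], , drop = FALSE]      # rows used in slot j
    names(part) <- paste0(names(df), ".", j)  # e.g. row.1, n1.1
    rownames(part) <- NULL
    part
  })
  do.call(cbind, parts)
}

# Same data as in the question
dt0 <- data.frame(
  row = letters[1:10],
  n1 = c(2, 2, 1, 3, 1, 5, 1, 3, 2, 2)
)

res <- comb_rows(dt0, 3)
dim(res)  # 120 6
head(res, 3)
```

Because the combinations are built from positions rather than values, this sketch does not depend on the values of row being sorted or unique, which the filter(row1 < row2) approach in the question does rely on.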