Merge a list of dataframes into a single dataframe in R

Tags:

merge

dataframe

r

I have a very large list (just one list) with 13,500 elements in it. Each element is a dataframe with 1 row and 12 columns, and every dataframe is structured the same way (same column names and similar data in each column). I want to merge all elements of this list into one dataframe, so the new dataframe will have 13,500 rows and 12 columns. I need everything in one dataframe so that I can use it with ggplot and work with the data as a dataframe. Can someone suggest the best way to do this? Thanks for the help.

I tried using purrr together with merge() and was not successful. Or at least, the process did not finish after more than 10 minutes and I had to terminate RStudio.

Here are some data from the list:

list(
  structure(
    list(n1 = 10, n2 = 10, mean_1 = 0, mean_2 = 0, var_1 = 1, var_2 = 1,
         tpooled = 2.93152220266846, pvalue_pooled = 0.00891647393074033,
         result_pooled = 1, t_unpooled = 2.93152220266846,
         pvalue_unpooled = 0.00931815204271521, result_unpooled = 1),
    class = "data.frame", row.names = "n1"
  ),
  structure(
    list(n1 = 30, n2 = 10, mean_1 = 0, mean_2 = 0, var_1 = 1, var_2 = 1,
         tpooled = -0.312649684961248, pvalue_pooled = 0.756256229272491,
         result_pooled = 0, t_unpooled = -0.248766791009062,
         pvalue_unpooled = 0.808124700588531, result_unpooled = 0),
    class = "data.frame", row.names = "n2"
  )
)
asked Oct 20 '25 17:10 by trailblazer_1

1 Answer

You can use bind_rows from dplyr, which will create one dataframe from the list of dataframes, and is a fairly efficient option.

library(dplyr)

# ll is the list of dataframes from the question
bind_rows(ll)

Results

       n1 n2 mean_1 mean_2 var_1 var_2    tpooled pvalue_pooled result_pooled t_unpooled pvalue_unpooled result_unpooled
n1...1 10 10      0      0     1     1  2.9315222   0.008916474             1  2.9315222     0.009318152               1
n1...2 30 10      0      0     1     1 -0.3126497   0.756256229             0 -0.2487668     0.808124701               0
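
To get a feel for how this behaves at the size described in the question, here is a minimal sketch. The list `ll_demo` is a hypothetical stand-in for the real data: it holds 13,500 one-row dataframes with only three of the twelve columns, filled with made-up values.

```r
library(dplyr)

# Hypothetical stand-in for the real list: 13,500 one-row dataframes
# that all share the same columns (three columns here, for brevity).
set.seed(1)
ll_demo <- lapply(seq_len(13500), function(i) {
  data.frame(n1 = 10, n2 = 10, tpooled = rnorm(1))
})

# Stack every element into a single dataframe, matching columns by name.
combined <- bind_rows(ll_demo)

dim(combined)            # 13500 rows, 3 columns
is.data.frame(combined)  # TRUE, so it can be passed straight to ggplot
```

Because every element has the same column names, the result has one row per list element and the columns are unchanged.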

However, as @nicola mentioned, rbindlist from data.table will likely be the fastest option.

data.table::rbindlist(ll)

Then, you can always turn the data.table back into a dataframe, if you do not want to work with a data.table:

# Convert the data.table back to a plain dataframe
as.data.frame(data.table::rbindlist(ll))
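
As a small self-contained sketch of the same idea, the example below uses a hypothetical two-element list, `ll_demo`, with rounded values loosely based on the question's data. The `use.names` and `idcol` arguments are optional extras that the original answer does not use: `use.names = TRUE` matches columns by name, and `idcol` adds a column recording which list element each row came from (the column name `element` is chosen here just for illustration).

```r
library(data.table)

# Hypothetical two-element list with illustrative (rounded) values.
ll_demo <- list(
  data.frame(n1 = 10, n2 = 10, tpooled = 2.93),
  data.frame(n1 = 30, n2 = 10, tpooled = -0.31)
)

# Bind rows by column name and record each row's position in the list.
dt <- rbindlist(ll_demo, use.names = TRUE, idcol = "element")

# Convert back to a plain dataframe for code that expects one.
df <- as.data.frame(dt)

class(df)  # "data.frame"
names(df)  # "element" "n1" "n2" "tpooled"
```

If the extra column is not needed, dropping the `idcol` argument gives the same call as in the answer above.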
answered Oct 23 '25 07:10 by AndrewGB