
Converting each list within a dataframe to a normal column

I build a data frame from several web sources. Each source is cleaned beforehand and stored as its own data frame, and I then select them with

cleans <- ls() 
cleans <- cleans[grepl("Clean_News", cleans)]

My first attempt to bind them together was inspired by a solution on Stack Overflow:

All_News <- mapply(get, grep("Clean_News", ls(), value = TRUE))
All_News <- data.frame(t(All_News))
All_News <- as.data.frame(All_News)

However, this is a problem for me, because the result is a data frame in which each column is a list of integers or characters. So my main question is how to convert each list column into a normal column of the data frame. I tried many hand-made functions from Stack Overflow, but none worked for me (due to my inexperience, I guess...). The data frame has the form

All_News <- data.frame(a=I(list(1,1:2,1:3)), b=I(list(4:6,7:9,10:11)))

Alternatively, I tried the following, which works:

All_News <- do.call(rbind, lapply(cleans, get))

But it has the huge disadvantage that I did not manage to get the names of the data frames into the result, either as row names or as a first column. So my second question is how to attach the name of each source data frame to every one of its rows in the combined data frame, instead of a numeric id like the one produced by the line below.

t2 <- rbindlist(lapply(cleans, get), idcol = "id") 

This does not help much, since I need the name of each data frame repeated once per row as an identifier. And since this is an automated process over thousands of web pages, I do not know the number of rows in each data frame beforehand. The data looks like:

 news1 data1 data2
 news1 data5 data6
 news2 data3 data4
 and so on.

I tried something along these lines

nr <- length(cleans)
names <- rep(cleans, nr)
names <- sort(names)

But without much success.

asked Sep 06 '25 02:09 by litotes


1 Answer

We can do this by looping through the columns of the dataset and unlisting the list columns:

lst <- lapply(All_News, unlist)

Then make the list elements the same length by padding NA at the end of those with fewer elements, based on the maximum length (max(lengths(lst))), and convert the result to a data.frame:

data.frame(lapply(lst, `length<-`, max(lengths(lst))))
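
Put together, the two steps above can be tried on the sample data from the question. This is only a minimal sketch: the names n and flat are introduced here for illustration and are not part of the original answer.

```r
# Sample data from the question: two list columns whose flattened
# lengths differ (6 values in a, 8 values in b).
All_News <- data.frame(a = I(list(1, 1:2, 1:3)), b = I(list(4:6, 7:9, 10:11)))

# Step 1: flatten each list column into a plain atomic vector.
lst <- lapply(All_News, unlist)

# Step 2: pad the shorter vectors with NA up to the longest length,
# then assemble the padded vectors into a data frame.
n <- max(lengths(lst))   # 'n' is an illustrative name
flat <- data.frame(lapply(lst, `length<-`, n))

str(flat)
```

Note that unlist() concatenates the values of every row, so the rows of flat no longer line up with the rows of the original data frame. Whether that is acceptable depends on how the data is used afterwards.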

answered Sep 08 '25 15:09 by akrun
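
The second question (attaching the name of each source data frame to its rows) is not covered by the answer above. One possible approach in base R, offered as a sketch rather than a definitive solution, is to collect the data frames with mget(), which returns a named list, and repeat each name once per row. The two Clean_News data frames and the column names x and y below are hypothetical stand-ins for the real data.

```r
# Hypothetical stand-ins for the cleaned data frames from the question.
Clean_News1 <- data.frame(x = c("data1", "data5"), y = c("data2", "data6"))
Clean_News2 <- data.frame(x = "data3", y = "data4")

# Select the data frames by name, as in the question.
cleans <- grep("Clean_News", ls(), value = TRUE)

# mget() returns a named list, so the object names travel with the data.
frames <- mget(cleans)

# Repeat each name once per row of its data frame; the row counts are
# computed here, so they do not need to be known in advance.
ids <- rep(names(frames), times = vapply(frames, nrow, integer(1)))

# Bind all rows together and put the source name in the first column.
All_News <- cbind(id = ids, do.call(rbind, frames))
rownames(All_News) <- NULL
print(All_News)
```

If data.table is already in use, passing the same named list to rbindlist(mget(cleans), idcol = "id") should fill the id column with the list names instead of integers, since rbindlist only falls back to integer ids when the input list is unnamed.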