
"dims [product 0] do not match the length of object" error in R when using daply for frequency counts

I have a list of data.frames that looks like this:

df=data.frame(
data_id=rep(LETTERS[1:10],each=1),
data_value=c(1,2,2,3,3,2,3,1,1,3))
df2=data.frame(
data_id=rep(LETTERS[1:10],each=1),
data_value=c(2,1,3,1,1,1,2,1,2,1))
df3=data.frame(
data_id=rep(LETTERS[1:10],each=1),
data_value=c(2,2,3,3,1,2,2,1,2,3))
df.list <- list(df, df2, df3)

A single data.frame looks like this:

         data_id    data_value
1        A          1
2        B          2
3        C          2
4        D          3
5        E          3
6        F          2
7        G          3
8        H          1
9        I          1
10       J          3

I want to have a frequency count of how often each unique value appears in data_value. I can do this:

library(plyr)  # provides daply() and .()
for(i in 1:length(df.list)){
    daply(df.list[[i]], .(df.list[[i]]$data_value), nrow) -> freq
}

Which gives me the frequency count (in this case just the last one, for df3):

1 2 3 
2 5 3 

My actual dataset is far bigger so I cannot post it here. It has the exact same structure, however. The problem is that when I try to get the frequency counts for my actual dataset, I get the following error message:

Error in dim(out_array) <- out_dim : dims [product 0] do not match the length of object [1]

Can anyone tell me where I need to start looking to fix this? I don't understand where 'dim()' comes in and what it does. Many thanks.

Annemarie asked Oct 20 '25 09:10


1 Answer

You can do one better than that by replacing the for-loop with plyr's laply, which takes a list as input and returns a matrix/array.

o <- laply(df.list, function(x) {
    table(x$data_value)
})
> o
#      1 2 3
# [1,] 3 3 4
# [2,] 6 3 1
# [3,] 2 5 3

To track down the reason for your error, what happens when you try this with llply, which returns a list instead of an array?

o <- llply(df.list, function(x) {
    table(x$data_value)
})
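Once each result is kept as a list element, comparing their shapes is a quick way to spot the element that behaves differently. Here is a minimal base-R sketch of that check, using the small example list from the question; `lapply` stands in for `llply` so it runs without plyr, and the names `tabs` and `shape` are made up for illustration:

```r
# Rebuild the example list from the question so the block is self-contained.
ids <- LETTERS[1:10]
df.list <- list(
  data.frame(data_id = ids, data_value = c(1, 2, 2, 3, 3, 2, 3, 1, 1, 3)),
  data.frame(data_id = ids, data_value = c(2, 1, 3, 1, 1, 1, 2, 1, 2, 1)),
  data.frame(data_id = ids, data_value = c(2, 2, 3, 3, 1, 2, 2, 1, 2, 3))
)

# One frequency table per data.frame, kept as a plain list.
tabs <- lapply(df.list, function(x) table(x$data_value))

# Summarise each element: its row count and how many distinct values it has.
shape <- data.frame(
  rows     = sapply(df.list, nrow),
  n_values = lengths(tabs)
)
print(shape)
```

On the real data, elements with zero rows, or with an `n_values` that differs from the rest, are probably the first ones worth inspecting.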

Edit: To make the error more understandable, let us create these data.frames:

d1 <- data.frame(a=1:4)
d2 <- data.frame(a=1:5)
d3 <- data.frame(a=1:6)
d4 <- data.frame(a=1:7)

dl <- list(d1,d2,d3,d4)

Now run laply:

laply(dl, function(x) table(x$a))
# Error: Results must have the same dimensions.

Why? To see that, let's print each result:

> laply(dl, function(x) print(table(x$a)))

# 1 2 3 4 
# 1 1 1 1 
# 
# 1 2 3 4 5 
# 1 1 1 1 1 
# 
# 1 2 3 4 5 6 
# 1 1 1 1 1 1 
# 
# 1 2 3 4 5 6 7 
# 1 1 1 1 1 1 1 

# Error: Results must have the same dimensions.

You see the problem? The number of elements in each result is different, so the results can NOT be stacked as rows of a matrix (unless you pad the shorter ones so that all rows have the same length).
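If a matrix is what you want, one way to do that padding is to tabulate against a shared set of levels, so values missing from a data.frame are counted as 0. This is a sketch in base R rather than plyr; `all_levels` and `counts` are names chosen for this example:

```r
d1 <- data.frame(a = 1:4)
d2 <- data.frame(a = 1:5)
d3 <- data.frame(a = 1:6)
d4 <- data.frame(a = 1:7)
dl <- list(d1, d2, d3, d4)

# Collect every value that occurs in any of the data.frames.
all_levels <- sort(unique(unlist(lapply(dl, function(x) x$a))))

# Tabulate against the shared levels: every table now has the same length,
# and values absent from a data.frame get a count of 0.
counts <- t(sapply(dl, function(x) table(factor(x$a, levels = all_levels))))
print(counts)
```

Each row of `counts` corresponds to one data.frame and each column to one value. The same anonymous function should also work inside `laply`, since all results then have equal length.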

Instead, return a list with llply: each table becomes a list element that can be accessed later using [[number]] syntax.

llply(dl, function(x) table(x$a))

# [[1]]
# 
# 1 2 3 4 
# 1 1 1 1 
# 
# [[2]]
# 
# 1 2 3 4 5 
# 1 1 1 1 1 
# 
# [[3]]
# 
# 1 2 3 4 5 6 
# 1 1 1 1 1 1 
# 
# [[4]]
# 
# 1 2 3 4 5 6 7 
# 1 1 1 1 1 1 1 
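To illustrate the [[number]] access, here is a small sketch. It uses base R's `lapply` in place of `llply` so that it runs without plyr, and `res` and `second` are hypothetical names:

```r
dl <- list(data.frame(a = 1:4), data.frame(a = 1:5),
           data.frame(a = 1:6), data.frame(a = 1:7))

# One table per list element, as with the llply call.
res <- lapply(dl, function(x) table(x$a))

# [[ ]] extracts a single table from the list.
second <- res[[2]]
print(names(second))   # the distinct values found in the second data.frame
print(second["3"])     # the count for the value 3 in that data.frame

# Number of distinct values per data.frame.
print(lengths(res))
```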

Hope this clears things up.

Arun answered Oct 21 '25 22:10


