integer64 class in r data.table, sum() and by=.()

Tags: r, data.table

I just noticed an issue with a data.table column that turned out to be of class integer64. I was reading the data with fread from a URL and did not realize the column was being read as integer64, a class I am not familiar with. The issue is how this class behaves in a data.table when sum() is combined with by. It has come up in two other questions on here, but in the context of using it as an ID value (Q1 and Q2).

When performing a sum() by group on this integer64 column, it does not behave as expected (that is, like a numeric column) when the column contains negative values. Why is this? Is it a bug?

library(data.table); library(bit64)

z <- data.table(
  group = c("A","A","A"),
  int64 = as.integer64(c(10,20,-10)),
  numeric = c(10,20,-10)
)

To start, it works fine without the by statement:

z[, sum(int64)]  #20
z[, sum(int64, na.rm=TRUE)] #20

And outside of data.table:

sum(z$int64)
sum(z$int64, na.rm = TRUE)

But when including the by statement, it gets fishy:

    z[, sum(int64, na.rm=FALSE), by=group] #only the negative value
    #group  V1
    #A     -10

    z[, sum(int64, na.rm=TRUE), by=group] #excluding the negative value
    #group  V1
    #A      30

    z[, sum(as.numeric(int64)), by=group] #expected answer
    #group  V1
    #A      20

This is worrying to me: on the surface there is no reason to believe anything is wrong with the numbers in z$int64, and I only noticed because there were very few rows.
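
One way to catch this earlier is to check the column classes right after loading the data. The following is only a sketch using the example table from above; `i64_cols` is a name introduced here for illustration.

```r
library(data.table)
library(bit64)

z <- data.table(
  group = c("A", "A", "A"),
  int64 = as.integer64(c(10, 20, -10)),
  numeric = c(10, 20, -10)
)

# First class of every column; integer64 columns stand out here
sapply(z, function(col) class(col)[1])

# Names of the columns stored as integer64
i64_cols <- names(z)[sapply(z, inherits, what = "integer64")]
i64_cols
```

Running a check like this on freshly read data would flag the column before any grouped aggregation is done on it.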

moman822 asked Mar 16 '26 02:03


1 Answer

This has since been fixed in data.table; see https://github.com/Rdatatable/data.table/issues/1647. The grouped sums now return the expected result:

z[, sum(int64, na.rm=FALSE), by=group]
#    group    V1
#   <char> <i64>
#1:      A    20

z[, sum(int64, na.rm=TRUE), by=group]
#    group    V1
#   <char> <i64>
#1:      A    20

z[, sum(as.numeric(int64)), by=group]
#    group    V1
#   <char> <num>
#1:      A    20
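
On a data.table version that still shows the problem, a possible workaround is to keep integer64 out of the summed column altogether. The sketch below assumes the values fit in a double without loss of precision (magnitudes below 2^53); the CSV text and the names `csv`, `dt`, `dt2` and `i64_cols` are made up for illustration. It uses fread's `integer64` argument to read large integers as doubles, and alternatively converts an existing integer64 column in place.

```r
library(data.table)
library(bit64)

# Hypothetical input: values beyond the 32-bit integer range
# make fread choose integer64 by default
csv <- "group,value\nA,3000000000\nA,6000000000\nA,-3000000000\n"

# Option 1: read such columns as double from the start
dt <- fread(text = csv, integer64 = "double")
class(dt$value)
dt[, sum(value), by = group]

# Option 2: convert integer64 columns to double after reading
dt2 <- fread(text = csv)
i64_cols <- names(dt2)[sapply(dt2, inherits, what = "integer64")]
if (length(i64_cols) > 0) {
  dt2[, (i64_cols) := lapply(.SD, as.numeric), .SDcols = i64_cols]
}
dt2[, sum(value), by = group]
```

Both options trade exact 64-bit integer arithmetic for ordinary double arithmetic, so they are only appropriate when the column holds quantities rather than large IDs.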
Waldi answered Mar 19 '26 10:03