I have this code:
dat<-dat[,list(colA,colB
,RelativeIncome=Income/.SD[Nation=="America",Income]
,RelativeIncomeLog2=log2(Income)-log2(.SD[Nation=="America",Income])) #Read 1)
,by=list(Name,Nation)]
1) I would like to be able to say "RelativeIncomeLog2=log2(RelativeIncome)", but "RelativeIncome" is not available in j's scope?
2) I tried the following instead (per the data.table FAQ). Now "RelativeIncome" is available but it doesn't add the columns:
dat<-dat[,{colA;colB;RelativeIncome=Income/.SD[Nation=="America",Income];
RelativeIncomeLog2=log2(RelativeIncome)}
,by=list(Name,Nation)]
A column can be added to an existing data.table with the := operator, which assigns by reference. The same operator can add several columns in one call.
You can create and assign objects in j, just use { curly braces }.
You can then pass these objects (or functions and calculations of the objects) out of j and assign them as columns of the data.table. To assign more than one column at a time, wrap the names on the LHS in c(.), make sure the column names are strings, and have j (i.e., the "return" value) be a list:
dat[ , c("NewIncomeColumn", "AnotherNewColumn") := {
RelativeIncome <- Income/.SD[Nation == "America", Income];
RelativeIncomeLog2 <- log2(RelativeIncome);
## this last line is what will be assigned.
list(RelativeIncomeLog2 * 100, "hello")
# A length-1 value is recycled to fill each group.
# Recent data.table versions raise an error for other lengths that do not match the group size.
}
, by = list(Name, Nation)
]
You can loosely think of j as a function within the environment of dat.
You can also get a lot more sophisticated and complex if required, and you can name the grouping columns in the by argument using by=list(<someName>=col).
In fact, as with functions, simply creating an object in j and assigning it a value does not mean that it will be available outside of j. In order for it to be assigned to your data.table, you must return it. j automatically returns the value of its last expression; if that value is a list, each element of the list will be handled as a column. If you are assigning by reference (i.e., using := ) then you will achieve the results you are expecting.
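As a minimal sketch of the point above, the following uses a small made-up table (the data, the helper name rel, and grouping by Name alone are assumptions chosen so that every group contains an "America" row). The intermediate object exists only inside the braces, and only the returned list ends up as columns.

```r
library(data.table)

# Hypothetical toy data; column names follow the question.
dat <- data.table(
  Name   = c("x", "x", "y", "y"),
  Nation = c("America", "France", "America", "Japan"),
  Income = c(100, 50, 80, 160)
)

# `rel` is a temporary object inside j; it never becomes a column itself.
# The final list() is what := assigns, one element per new column.
dat[, c("RelativeIncome", "RelativeIncomeLog2") := {
  rel <- Income / Income[Nation == "America"]
  list(rel, log2(rel))
}, by = Name]

# Without :=, the same braces return a new table instead of modifying dat.
out <- dat[, {
  rel <- Income / Income[Nation == "America"]
  list(RelativeIncome = rel, RelativeIncomeLog2 = log2(rel))
}, by = Name]
```

Here RelativeIncomeLog2 is computed from the temporary rel, which is the reuse the question asks about.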
On a separate note, I noticed the following in your code:
Income / .SD[Nation == "America", Income]
# Which instead could simply be:
Income / Income[Nation == "America"]
.SD is a wonderful shorthand. However, invoking it when you do not need all of the columns it encapsulates burdens your code with extra memory costs. If you are using only a single column, consider naming that column explicitly, or add the .SDcols argument (after j) and name the needed columns there.
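The two alternatives can be sketched side by side. The table below is made up for illustration (the Other column stands in for columns that j does not need), and grouping by Name alone is an assumption so that each group contains an "America" row.

```r
library(data.table)

# Hypothetical toy data with an extra column that j never uses.
dat <- data.table(
  Name   = c("x", "x", "y", "y"),
  Nation = c("America", "France", "America", "Japan"),
  Income = c(100, 50, 80, 160),
  Other  = 1:4
)

# Option 1: name the column directly, so .SD is not needed at all.
direct <- dat[, Income / Income[Nation == "America"], by = Name]

# Option 2: keep .SD, but limit it to the columns actually used.
limited <- dat[, .SD[Nation == "America", Income],
               by = Name, .SDcols = c("Nation", "Income")]
```

Both results hold the computed values in a column named V1, since j returns an unnamed vector.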