Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Store first non-missing value in a new column

Tags:

r

extract

apply

Ciao, I have several columns that represents scores. For each STUDENT I want to take the first non-NA score and store it in a new column called TEST.

Here is my replicating example. This is the data I have now:

df <- data.frame(STUDENT=c(1,2,3,4,5), 
                 CLASS=c(90,91,92,93,95),
                 SCORE1=c(10,NA,NA,NA,NA), 
                 SCORE2=c(2,NA,8,NA,NA), 
                 SCORE3=c(9,6,6,NA,NA),
                 SCORE4=c(NA,7,5,1,9),
                 ROOM=c(01,02, 03, 04, 05))

This is the column I am aiming to add:

df$FIRST <- c(10,6,8,1,9)

This is my attempt:

df$FIRSTGUESS <- max.col(!is.na(df[3:6]), "first")
like image 952
bvowe Avatar asked Dec 01 '25 22:12

bvowe


2 Answers

This is exactly what coalesce from package dplyr does. As described in its documentation:

Given a set of vectors, coalesce() finds the first non-missing value at each position.

Therefore, you can simplify do:

library(dplyr)
df$FIRST <- do.call(coalesce, df[grepl('SCORE', names(df))])

This is the result:

> df
  STUDENT CLASS SCORE1 SCORE2 SCORE3 SCORE4 ROOM FIRST
1       1    90     10      2      9     NA    1    10
2       2    91     NA     NA      6      7    2     6
3       3    92     NA      8      6      5    3     8
4       4    93     NA     NA     NA      1    4     1
5       5    95     NA     NA     NA      9    5     9
like image 188
mt1022 Avatar answered Dec 04 '25 11:12

mt1022


You can do this with apply and which.min(is.na(...))

df$FIRSTGUESS <- apply(df[, grep("^SCORE", names(df))], 1, function(x) 
    x[which.min(is.na(x))])
df
#  STUDENT CLASS SCORE1 SCORE2 SCORE3 SCORE4 ROOM FIRSTGUESS
#1       1    90     10      2      9     NA    1         10
#2       2    91     NA     NA      6      7    2          6
#3       3    92     NA      8      6      5    3          8
#4       4    93     NA     NA     NA      1    4          1
#5       5    95     NA     NA     NA      9    5          9

Note that we need is.na instead of !is.na because FALSE corresponds to 0 and we want to return the first (which.min) FALSE value.

like image 21
Maurits Evers Avatar answered Dec 04 '25 12:12

Maurits Evers