R- Replace all values in rows of dataframe after first NA by NA

Question

I have a dataframe of 3500 observations and 278 variables. For each row going from the first column, I want to replace all values occurring after the first NA by NAs. For instance, I want to go from a dataframe like so:

X1 X2 X3 X4 X5
 1  3 NA  6  9
 1 NA  4  6 18
 6  7 NA  3  1 
10  1  2 NA  2

To something like

X1 X2 X3 X4 X5
 1  3 NA NA NA
 1 NA NA NA NA
 6  7 NA NA NA 
10  1  2 NA NA

I tried using the following for loop, but it is not terminating:

for(i in 2:3500){
 firstna <- min(which(is.na(df[i,])))
 df[i, firstna:278] <- NA
}

Is there a more efficient way to do this? Thanks in advance.

Jota · Accepted Answer

You could do something like this:

# sample data
mat <- matrix(1, 10, 10)
set.seed(231)
mat[sample(100, 7)] <- NA

You can use apply with cumsum and is.na to keep track of where NAs need to be placed (i.e. places across the row where the cumulative sum of NAs is greater than 0). Then, use those locations to assign NAs to the original structure in the appropriate places.

mat[t(apply(is.na(mat), 1, cumsum)) > 0 ] <- NA
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]    1    1    1    1    1    1   NA   NA   NA    NA
# [2,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA
# [3,]    1    1    1    1    1    1    1    1    1     1
# [4,]    1    1    1    1    1    1    1    1    1     1
# [5,]    1    1    1   NA   NA   NA   NA   NA   NA    NA
# [6,]    1    1    1    1    1    1    1    1    1     1
# [7,]    1   NA   NA   NA   NA   NA   NA   NA   NA    NA
# [8,]    1    1    1    1    1    1    1    1    1     1
# [9,]    1    1    1    1    1    1    1    1    1     1
#[10,]    1    1   NA   NA   NA   NA   NA   NA   NA    NA
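
To make the intermediate steps visible, here is a minimal sketch on a tiny matrix. The data and the names m, na_count and mask are made up for illustration and are not part of the example above. Note that apply over rows returns each row's result as a column, which is why the t() is needed.

```r
# Small illustrative matrix (hypothetical data)
m <- matrix(c(1, NA, 3,
              4,  5, NA), nrow = 2, byrow = TRUE)

# Running count of NAs along each row; apply() stores each row's
# result as a column, so t() restores the original orientation
na_count <- t(apply(is.na(m), 1, cumsum))

# TRUE at the first NA of each row and at every position after it
mask <- na_count > 0

# Logical-matrix indexing assigns NA at exactly those positions
m[mask] <- NA
```

Without the t(), the mask would have the dimensions of the transposed matrix and would only line up by accident for square input.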

This works fine with data frames too. Using the provided example data:

d<-read.table(text="
X1 X2 X3 X4 X5
 1  3 NA  6  9
 1 NA  4  6 18
 6  7 NA  3  1 
10  1  2 NA  2 ", header=TRUE)

d[t(apply(is.na(d), 1, cumsum)) > 0 ] <- NA
#  X1 X2 X3 X4 X5
#1  1  3 NA NA NA
#2  1 NA NA NA NA
#3  6  7 NA NA NA
#4 10  1  2 NA NA
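
For comparison, a loop like the one in the question can also be made to work. One likely problem with the original is that min(which(...)) returns Inf (with a warning) for a row that has no NA, and Inf:278 then fails. The sketch below is only an illustration, assuming all columns are numeric; the function name fill_after_first_na is hypothetical.

```r
# Hypothetical helper: loop version that skips rows without any NA
fill_after_first_na <- function(df) {
  m <- as.matrix(df)  # matrix row indexing is much cheaper than df[i, ]
  for (i in seq_len(nrow(m))) {
    first_na <- which(is.na(m[i, ]))[1]  # NA when the row has no missing value
    if (!is.na(first_na)) {
      m[i, first_na:ncol(m)] <- NA
    }
  }
  as.data.frame(m)
}

# Small made-up example, including a row without any NA
example_df <- data.frame(X1 = c(1, 1, 6), X2 = c(3, NA, 7), X3 = c(NA, 4, 5))
result <- fill_after_first_na(example_df)
```

The vectorised cumsum approach above is shorter and avoids the explicit loop.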

akrun · Answer

We can use rowCumsums from the matrixStats package:

library(matrixStats)
d*NA^rowCumsums(+(is.na(d)))
#  X1 X2 X3 X4 X5
#1  1  3 NA NA NA
#2  1 NA NA NA NA
#3  6  7 NA NA NA
#4 10  1  2 NA NA
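
The multiplication works because NA^0 is 1 in R, while NA raised to any positive power is NA. A minimal sketch of that step on a single row, with made-up values and hypothetical variable names:

```r
# Running NA count for one row (hypothetical values)
counts <- c(0, 0, 1, 1, 2)

# NA^0 is 1, NA^k for k > 0 is NA
multiplier <- NA^counts

# Multiplying by 1 keeps a value; multiplying by NA turns it into NA
row_values <- c(1, 3, NA, 6, 9)
masked <- row_values * multiplier
```

Because it relies on multiplication, this trick applies to numeric columns only.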

Or a base R option is

d*NA^do.call(cbind,Reduce(`+`,lapply(d, is.na), accumulate=TRUE))
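
To see what the Reduce step produces, here is a small sketch that breaks the one-liner into parts. The data frame and the names na_flags, running and counts are made up for illustration.

```r
# Small made-up data frame
example_df <- data.frame(X1 = c(1, 1), X2 = c(3, NA), X3 = c(NA, 4))

# One logical vector per column: is the value missing?
na_flags <- lapply(example_df, is.na)

# accumulate = TRUE keeps every partial sum, so element k holds the
# number of NAs seen in columns 1..k for each row
running <- Reduce(`+`, na_flags, accumulate = TRUE)

# Bind the partial sums back into a matrix shaped like the data
counts <- do.call(cbind, running)
```

Here counts plays the same role as the rowCumsums result in the matrixStats version.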

Tags: r, na

Prasad
