I have the following data frame in R:
ID  Information
1    Yes
1    NA
1    NA
1    Yes
2    No
2    NA
2    NA
3    NA
3    NA
3    Maybe
3    NA
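For anyone who wants to reproduce this, the table above can be rebuilt roughly as follows. This is only a sketch: the name df1 and the choice of a character column rather than a factor are assumptions, not something stated in the post.

```r
# Rebuild the example data from the question.
# Assumptions: the data frame is called df1 and Information is a character column.
df1 <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  Information = c("Yes", NA, NA, "Yes", "No", NA, NA, NA, NA, "Maybe", NA),
  stringsAsFactors = FALSE
)
```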
I need to fill out the rows that contain NA's with whatever information is contained in one of the rows corresponding to that ID. I would like to have this:
ID  Information
1   Yes
1   Yes
1   Yes
1   Yes
2   No
2   No
2   No
3   Maybe
3   Maybe
3   Maybe
3   Maybe
As far as I know, the information (i.e. Yes/No/Maybe) does not conflict within an ID, but it may be repeated. (Sorry about the ugly format; I am a newbie and cannot post pictures.)
Thank you!
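Since the fill only makes sense if each ID carries a single distinct value, it may be worth checking that assumption first. A minimal base R sketch, assuming the data frame is called df1 (the helper names n_distinct_vals, conflicting and empty are made up for illustration):

```r
# Example data as in the question (df1 is an assumed name)
df1 <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  Information = c("Yes", NA, NA, "Yes", "No", NA, NA, NA, NA, "Maybe", NA),
  stringsAsFactors = FALSE
)

# Count the distinct non-NA values of Information within each ID
n_distinct_vals <- tapply(df1$Information, df1$ID,
                          function(x) length(unique(x[!is.na(x)])))

# IDs with more than one distinct value would be ambiguous to fill
conflicting <- names(n_distinct_vals)[n_distinct_vals > 1]

# IDs with no non-NA value at all have nothing to fill from
empty <- names(n_distinct_vals)[n_distinct_vals == 0]
```

If both conflicting and empty come back with length zero, every ID has exactly one value to fill from.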
One option is to use data.table. We convert the 'data.frame' to a 'data.table' with setDT(df1) and, grouped by 'ID', assign (:=) 'Information' as the unique non-NA element. This relies on each ID having exactly one distinct non-NA value.
library(data.table) # v1.9.5+
setDT(df1)[, Information:=unique(Information[!is.na(Information)]), by = ID]
df1
#     ID Information
#  1:  1         Yes
#  2:  1         Yes
#  3:  1         Yes
#  4:  1         Yes
#  5:  2          No
#  6:  2          No
#  7:  2          No
#  8:  3       Maybe
#  9:  3       Maybe
# 10:  3       Maybe
# 11:  3       Maybe
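Or we can join the dataset with its own unique rows after removing the 'NA' rows. Here, I use the devel version of data.table (for the 'on' argument). Note that df1['ID'] is data.frame-style indexing, so this assumes df1 is still a plain data.frame, i.e. setDT(df1) has not been run on it yet.
setDT(unique(na.omit(df1)))[df1['ID'], on='ID']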
Or we use dplyr: grouped by 'ID', we arrange by 'Information' so that the 'NA' values come last, then set 'Information' to its first value within each group. Note that arrange() reorders the rows of the result.
library(dplyr)
df1 %>%
    group_by(ID) %>%
    arrange(Information) %>%
    mutate(Information = first(Information))
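Here is an option using na.locf (from zoo) with ddply (from plyr), where d is the data frame. By default na.locf drops leading NAs, so we first carry values forward with na.rm = FALSE and then backward with fromLast = TRUE, which fills the NAs on both sides of the known value:
library(zoo)
library(plyr)
ddply(d, .(ID), mutate, Information = na.locf(na.locf(Information, na.rm = FALSE), fromLast = TRUE))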
#   ID Information
#1   1         Yes
#2   1         Yes
#3   1         Yes
#4   1         Yes
#5   2          No
#6   2          No
#7   2          No
#8   3       Maybe
#9   3       Maybe
#10  3       Maybe
#11  3       Maybe
Or in base R, where dat is your data frame:
uniqueCombns <- unique(dat[complete.cases(dat), ])
merge(dat["ID"], uniqueCombns, by = "ID", all.x = TRUE)
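If keeping the original row order matters, another base R possibility is ave(), which applies a function within each ID and writes the result back in place. This is a different technique from the merge approach, not a restatement of it, and it is only a sketch: it assumes Information is a character column and that dat holds the example data.

```r
# Example data as in the question (dat is the name used in the base R answer)
dat <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  Information = c("Yes", NA, NA, "Yes", "No", NA, NA, NA, NA, "Maybe", NA),
  stringsAsFactors = FALSE
)

# Within each ID, take the first non-NA value; ave() recycles it over the group.
# An ID with no non-NA value simply stays NA.
dat$Information <- ave(dat$Information, dat$ID,
                       FUN = function(x) x[!is.na(x)][1])
```

Because ave() returns a vector aligned with the input, no join or re-sorting is involved.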
Since DF$Information is a valid "factor" and there are no conflicts, you could also do the following (unless I'm missing something):
levels(DF$Information)[approxfun(DF$ID, DF$Information, method = "constant")(DF$ID)]
# [1] "Yes"   "Yes"   "Yes"   "Yes"   "No"    "No"    "No"    "Maybe" "Maybe" "Maybe" "Maybe"