I have wide longitudinal data that I would like to reshape into long data. This is a sample:
sex group id sex.1 group.1 status1 beg1 end1 status2 beg2 end2
1 1000 1 a 1000 1 a Vocational <NA> S2007 HE S2007 S2008
2 1001 1 a 1001 1 a Vocational <NA> S2007 HE S2008 S2012
3 1004 1 a 1004 1 a Vocational <NA> S2008 999 <NA> <NA>
4 1006 2 a 1006 2 a Vocational <NA> S2007 Army S2012 <NA>
5 1007 1 a 1007 1 a HE <NA> S2007 999 <NA> <NA>
6 1008 1 a 1008 1 a Vocational S2013 <NA> 999 <NA> <NA>
I need to get it in this shape, compatible with SPELL format:
id sex group index status beg end
1000 1 a 1 Vocational NA S2007
1000 1 a 2 HE S2008 S2012
...
I am using the following command:
spell <- reshape(data,
varying=names(data)[4:60],
direction="long",
idvar=c("id","sex","group"),
sep="")
And I get the following error message:
Error in `row.names<-.data.frame`(`*tmp*`, value = paste(d[, idvar], times[1L], :
duplicate 'row.names' are not allowed
In addition: Warning message: non-unique value when setting 'row.names': ‘NA.1’
I have tried setting the NA values to 999 this way, but it does not work:
data[is.na(data)] <- 999
Do you know what might get this to work? Thanks a lot in advance!
That error message indicates that the id variable(s) do not uniquely identify each row: you either have duplicate rows or missing values in the id variable(s).
Check for duplicates first:
with(data, any(duplicated(cbind(id, sex, group))))
If TRUE, then there's your answer.
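To see which rows are involved, rather than only whether any exist, something like the following may help. It is a minimal sketch on made-up toy data (`toy` and its values are assumptions, not the asker's data); swap in your own data frame and id columns.

```r
# Toy data standing in for the real `data`; all values are invented.
toy <- data.frame(
  id      = c(1000, 1001, 1001, 1004),
  sex     = c(1, 1, 1, 2),
  group   = c("a", "a", "a", "a"),
  status1 = c("Vocational", "Vocational", "HE", "HE"),
  stringsAsFactors = FALSE
)

# The columns that are supposed to identify a row uniquely.
keys <- toy[c("id", "sex", "group")]

# TRUE for every row whose key combination occurs more than once.
# The fromLast pass also flags the first occurrence of each duplicate.
dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

# Show the offending rows so they can be inspected or merged.
print(toy[dup, ])
```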
If FALSE, then you probably have missing values in the id variable(s), perhaps even entirely empty rows, most likely at the end of the data. This can happen when the source data itself contains blank rows, or when the import command reads too many rows, for example read_excel with a range argument that extends past the last record. In any case, check the data carefully for missing values in the id variable(s). Replacing them all with 999 won't help.
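As a rough illustration of that check, here is a sketch on made-up toy data in which the last row mimics a blank line picked up at import. The data frame `toy`, the column positions 4:9, and the choice of `idvar = "id"` are assumptions for this example only; in the toy data `id` alone is unique, so `sex` and `group` are simply carried along.

```r
# Toy data: two real records plus one all-NA row, as a blank
# spreadsheet line might produce. All values are invented.
toy <- data.frame(
  id      = c(1000, 1001, NA),
  sex     = c(1, 1, NA),
  group   = c("a", "a", NA),
  status1 = c("Vocational", "Vocational", NA),
  beg1    = rep(NA_character_, 3),
  end1    = c("S2007", "S2007", NA),
  status2 = c("HE", "HE", NA),
  beg2    = c("S2007", "S2008", NA),
  end2    = c("S2008", "S2012", NA),
  stringsAsFactors = FALSE
)

id_cols <- c("id", "sex", "group")

# Count missing values in each identifying column.
print(colSums(is.na(toy[id_cols])))

# Keep only rows where every identifying column is present.
clean <- toy[complete.cases(toy[id_cols]), ]

# Reshape to long; sep = "" splits names such as "status1"
# into the variable "status" and the time 1.
spell <- reshape(clean,
                 varying   = names(clean)[4:9],
                 direction = "long",
                 idvar     = "id",
                 timevar   = "index",
                 sep       = "")

# Order by person, then by spell index.
spell <- spell[order(spell$id, spell$index), ]
print(spell)
```

Whether dropping the incomplete rows is appropriate depends on why they are missing; if they are real records with a lost id, the import step is the place to fix it.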