I have a wide format table with first 3 rows used to describe data that is shown in table. Eg:
Company: | Company A | Company B | Company C | | Company N
Data source: | Budget | Actual | Budget | ... | ...
Currency: | USD | EUR | USD | | ...
Indicator:
Sales 500 1000 1500 ... ...
Gross Income 200 300 400 ... ...
... ... ... ... ... ...
Indicator J ... ... ... ...
I want to reshape it to long format with following layout:
Indicator | Company | Currency | Data Source | Value
Sales | Company A | USD | Budget | 500
Sales | Company B | EUR | Actual | 1000
... | ... | ... | ... | ...
I've tried to melt it with reshape2 package, but didn't manage to convert rows 2 and 3 to variables
dput(AAA)
structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income",
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"),
V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"),
V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"),
V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"),
V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1",
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
-8L))
Here is a solution, involves to transpose your data and do some cleaning. Rest is done via 'melt':
AAA <- structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income",
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"),
V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"),
V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"),
V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"),
V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1",
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
-8L))
# transpose data
dft <- data.frame(t(AAA), stringsAsFactors=FALSE)
require(reshape2)
# set colnames
colnames(dft) <- dft[1, ]
dft <- dft[-1, ]
# remove empty indicator col
dft[ , 4] <- NULL
# melt data
melt(dft, id.vars=c('Company:', 'Source:', 'Currency:'), variable.name='Indicator:')
# Company: Source: Currency: Indicator: value
# 1 Company A Budget USD Sales 1000
# 2 Company B Actual USD Sales 1500
# 3 Company C Budget EUR Sales 2000
# 4 Company D Actual EUR Sales 2500
Maybe you'll need some more cleaning (now every col is character, maybe also set the colnames before transposing...).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With