Melt and Cast are popular operations to handle data in R. In F# that would be sequences of records of same type or something close to it.
Are you aware of any such functions in F#?
(If not, who would be interested in making some strongly typed version of them...)
More information:
Melt takes a table as input. It has column title (our record fields), and a series of rows. Those column can be grouped into a set of 'identifier' and a set of 'variables'
Melt puts this table in a new canonical form with the columns being now : the identifiers, the column named @"variable", the column named @"value"
If you had 10 'variables' originally, like size, weight, etc.. you will have for each previous record, 10 records in the canonical form, with the values in the column @'variable' being filled with the title of the previous columns from your 'variables'
Cast, conversely, reconstruct a table from a melted one.
A short example in R, melt takes data (dat) that looks like this:
a b c
1 1 0.48411551 0.2372291
2 2 0.58850308 0.3968759
3 3 0.74412592 0.9718320
4 4 0.93060118 0.8665092
5 5 0.01556804 0.2512399
and makes it look like this:
> melt(dat,id.vars = "a")
a variable value
1 1 b 0.48411551
2 2 b 0.58850308
3 3 b 0.74412592
4 4 b 0.93060118
5 5 b 0.01556804
6 1 c 0.23722911
7 2 c 0.39687586
8 3 c 0.97183200
9 4 c 0.86650918
10 5 c 0.25123992
cast essentially does the reverse.
Those 2 operations are extremely powerful on a day to day basis. Once you have them it changes your thinking, very much like FP does.
Assuming melt is similar to SQL Server's unpivot, this ought to do the trick:
let melt keys (table: DataTable) =
let out = new DataTable()
let keyCols, otherCols =
table.Columns
|> Seq.cast<DataColumn>
|> Seq.toArray
|> Array.partition (fun c -> keys |> Seq.exists (fun k -> k = c.ColumnName))
for c in keyCols do
out.Columns.Add(c.ColumnName) |> ignore
out.Columns.Add("Key", typeof<string>) |> ignore
out.Columns.Add("Value") |> ignore
for r in table.Rows do
for c in otherCols do
let values = [|
for c in keyCols do yield r.[c]
yield box c.ColumnName
yield r.[c]
|]
out.Rows.Add(values) |> ignore
out
Here's a little test to try it out:
let table = new DataTable()
[|"Country", typeof<string>
"2001", typeof<int>
"2002", typeof<int>
"2003", typeof<int>|]
|> Array.map (fun (name, typ) -> new DataColumn(name, typ))
|> table.Columns.AddRange
[
"Nigeria", 1, 2, 3
"UK", 2, 3, 4
]
|> List.iter (fun (a, b, c, d) -> table.Rows.Add(a, b, c, d) |> ignore)
let table2 = table |> melt ["Country"]
table2.Rows
|> Seq.cast<DataRow>
|> Seq.iter (fun r ->
for (c: DataColumn) in table2.Columns do
printfn "%A: %A" c.ColumnName r.[c]
printfn "")
which yields
"Country": "Nigeria"
"Key": "2001"
"Value": "1"
"Country": "Nigeria"
"Key": "2002"
"Value": "2"
...
Assuming cast goes the other way (i.e. pivot), you should be able to take this code and come up with a translation.
If you're doing this a lot, you might find it easier to load your data into SQL Server and use the built-in operators.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With