Assume I have this sample data:
Sample.csv:
Dog,25
Cat,23
Cat,20
Dog,0
I want to load it into an IDataView, transform it so it is ready for ML (no string columns and so on), then save it again as a .csv file, for example to analyze it with another tool or language.
// Load data:
var sampleCsv = Path.Combine("Data", "Sample.csv");
var columns = new[]
{
new TextLoader.Column("type", DataKind.String, 0),
new TextLoader.Column("age", DataKind.Int16, 1),
};
var mlContext = new MLContext(seed: 0);
var dataView = mlContext.Data.LoadFromTextFile(sampleCsv, columns, ',');
// Transform
var pipeline =
mlContext.Transforms.Categorical.OneHotEncoding("type",
// OutputKind.Key yields a single key-typed column; the other output kinds (for example Indicator) produce a vector column instead:
outputKind: OneHotEncodingEstimator.OutputKind.Key);
var transformedDataView = pipeline.Fit(dataView).Transform(dataView);
// transformedDataView:
// Dog,1,25
// Cat,2,23
// Cat,2,20
// Dog,1,0
How can I get the two numeric columns and write them to a .csv file?
You can create a class for your output data:
class TempOutput
{
// Note that the property types must match the column types in the DataView
public UInt32 type { get; set; }
public Int16 age { get; set; }
}
Then use CreateEnumerable<> to read all rows from the DataView and write them to a .csv file:
File.WriteAllLines(sampleCsv + ".output",
mlContext.Data.CreateEnumerable<TempOutput>(transformedDataView, false)
.Select(t => string.Join(',', t.type, t.age)));
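As an alternative to defining an output class, ML.NET also has a built-in text saver, mlContext.Data.SaveAsText. The following is only a minimal sketch, assuming the mlContext and transformedDataView from the question; the class name DataViewCsvExport and the method name SaveNumericColumns are hypothetical. I have not verified how the saver renders the key-typed "type" column, so inspect the output file before relying on it.

```csharp
using System.IO;
using Microsoft.ML;

// Hypothetical helper, not part of ML.NET.
public static class DataViewCsvExport
{
    public static void SaveNumericColumns(
        MLContext mlContext, IDataView transformedDataView, string outputPath)
    {
        // Keep only the two columns we want in the output file.
        var selected = mlContext.Transforms
            .SelectColumns("type", "age")
            .Fit(transformedDataView)
            .Transform(transformedDataView);

        // Write comma-separated text with a header row and
        // without the "#@ TextLoader" schema comment block.
        using (var stream = File.Create(outputPath))
        {
            mlContext.Data.SaveAsText(
                selected, stream,
                separatorChar: ',', headerRow: true, schema: false);
        }
    }
}
```

It could be called as DataViewCsvExport.SaveNumericColumns(mlContext, transformedDataView, sampleCsv + ".output"). The CreateEnumerable<> approach above gives you full control over the formatting of each value, while SaveAsText avoids the extra class.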
I use the following code in my own project to create a .csv file. Hope this helps.
var predictions = mlContext.Data.CreateEnumerable<SpikePrediction>(transformedData, reuseRowObject: false);
SavePredictions(predictions.ToArray());
private void SavePredictions(SpikePrediction[] predictions) {
if (dict.Count() != predictions.Count()) {
Console.WriteLine("> Cannot save predictions because it does not correspond with the dataset length");
return;
}
List<string> predictionsCol = _dataCol.ToList();
predictionsCol.Add("Label");
var fullResultFilePath = Path.Combine(_dataPath, FileHandeling.resultFolder, $"{_modelName}.csv");
using (var stream = File.CreateText(fullResultFilePath)) {
stream.WriteLine(string.Join(",", predictionsCol));
for (int i = 0; i < predictions.Count(); i++) {
var label = predictions[i];
stream.WriteLine(string.Join(",", new string[] { dict[i].Item1.Split("T")[0].Substring(1), dict[i].Item2, label.Prediction[0].ToString() }));
}
}
}