
ML.NET IDataView back to csv

Assume I have this sample data:

Sample.csv:

Dog,25
Cat,23
Cat,20
Dog,0

I want to load it into an IDataView, then transform it so that it is ready for ML (no string columns and so on), and then save it again as a .csv file, for example to analyze it with another tool or language.

// Load data:
var sampleCsv = Path.Combine("Data", "Sample.csv");
var columns = new[]
{
    new TextLoader.Column("type", DataKind.String, 0),
    new TextLoader.Column("age", DataKind.Int16, 1),
};
var mlContext = new MLContext(seed: 0);
var dataView = mlContext.Data.LoadFromTextFile(sampleCsv, columns, ',');

// Transform
var pipeline =
    mlContext.Transforms.Categorical.OneHotEncoding("type",
        // This outputKind will add just one column, while others will add some:
        outputKind: OneHotEncodingEstimator.OutputKind.Key);
var transformedDataView = pipeline.Fit(dataView).Transform(dataView);
//  transformedDataView:
//  Dog,1,25
//  Cat,2,23
//  Cat,2,20
//  Dog,1,0

How can I get the two numeric columns and write them to a .csv file?

baruchiro asked Oct 25 '25 15:10


2 Answers

You can create a class for your output data:

class TempOutput
{
    // Note: the property types must match the column types in the DataView
    public UInt32 type { get; set; }
    public Int16 age { get; set; }
}

Then use CreateEnumerable<> to read all rows from the DataView and write them to a .csv file:

File.WriteAllLines(sampleCsv + ".output",
    mlContext.Data.CreateEnumerable<TempOutput>(transformedDataView, false)
    .Select(t => string.Join(',', t.type, t.age)));
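The projection step above can also be tried in isolation, without ML.NET. The following is only a minimal sketch under a few assumptions: the hard-coded rows stand in for the result of CreateEnumerable<TempOutput> and use the values from the transformed data shown in the question, the names CsvSketch and ToCsvLines and the output file name are hypothetical, and the header line is an addition that the snippet above does not write.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

class TempOutput
{
    public uint type { get; set; }
    public short age { get; set; }
}

static class CsvSketch
{
    // Hypothetical helper: yields a header line, then one CSV line per row.
    public static IEnumerable<string> ToCsvLines(IEnumerable<TempOutput> rows)
    {
        yield return "type,age";
        foreach (var row in rows)
        {
            // Invariant culture keeps the number format stable across machines.
            yield return string.Join(",",
                row.type.ToString(CultureInfo.InvariantCulture),
                row.age.ToString(CultureInfo.InvariantCulture));
        }
    }
}

class Program
{
    static void Main()
    {
        // Assumption: stand-in for
        // mlContext.Data.CreateEnumerable<TempOutput>(transformedDataView, false)
        var rows = new[]
        {
            new TempOutput { type = 1, age = 25 },
            new TempOutput { type = 2, age = 23 },
            new TempOutput { type = 2, age = 20 },
            new TempOutput { type = 1, age = 0 },
        };

        var lines = CsvSketch.ToCsvLines(rows).ToList();
        File.WriteAllLines("Sample.output.csv", lines); // hypothetical file name
        lines.ForEach(Console.WriteLine);
    }
}
```

With real data, the rows array would be replaced by the CreateEnumerable call; the header simply makes the file easier to open in other tools.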
baruchiro answered Oct 28 '25 06:10


I use the following code in my own project to create a .csv file. Hope this helps.

var predictions = mlContext.Data.CreateEnumerable<SpikePrediction>(transformedData, reuseRowObject: false);

SavePredictions(predictions.ToArray());

private void SavePredictions(SpikePrediction[] predictions) {
    // dict, _dataCol, _dataPath, _modelName and FileHandeling are defined elsewhere in the project (not shown).
    if (dict.Count() != predictions.Length) {
        Console.WriteLine("> Cannot save predictions because it does not correspond with the dataset length");
        return;
    }

    // Header: the original data columns plus a "Label" column for the prediction.
    List<string> predictionsCol = _dataCol.ToList();
    predictionsCol.Add("Label");

    var fullResultFilePath = Path.Combine(_dataPath, FileHandeling.resultFolder, $"{_modelName}.csv");
    using (var stream = File.CreateText(fullResultFilePath)) {
        stream.WriteLine(string.Join(",", predictionsCol));
        for (int i = 0; i < predictions.Length; i++) {
            var label = predictions[i];
            stream.WriteLine(string.Join(",", new string[] { dict[i].Item1.Split("T")[0].Substring(1), dict[i].Item2, label.Prediction[0].ToString() }));
        }
    }
}
spoilerd do answered Oct 28 '25 04:10