CNTK: Create MinibatchSource from numpy array for multi GPU training

Question

I have my pre-processed image data in a numpy array, and my script works fine on a single GPU when I feed it the numpy array directly. As I understand it, multi-GPU training requires a MinibatchSource. I'm looking at the distributed training example ConvNet_CIFAR10_DataAug_Distributed.py, but it uses *_map.txt files, which are essentially lists of paths to image files (e.g. PNG). What is the best way to create a MinibatchSource from a numpy array, without converting the array back to PNG files?

Nikos Karampatziakis · Accepted Answer

You can create a composite reader that combines multiple image deserializers into one source. First, create two map files (with dummy labels): one lists all input images and the other lists the corresponding target images. The following code is a minimal implementation, assuming the files are called map1.txt and map2.txt:

import sys

import cntk as C
import cntk.io.transforms as xforms

def create_reader(map_file1, map_file2):
    # Resize every image to 224x224 with 3 channels.
    transforms = [xforms.scale(width=224, height=224, channels=3, interpolations='linear')]
    # One deserializer per map file; each exposes its images under its own stream name.
    source1 = C.io.ImageDeserializer(map_file1, C.io.StreamDefs(
        source_image=C.io.StreamDef(field='image', transforms=transforms)))
    source2 = C.io.ImageDeserializer(map_file2, C.io.StreamDefs(
        target_image=C.io.StreamDef(field='image', transforms=transforms)))
    # Combine both deserializers into a single composite source.
    return C.io.MinibatchSource([source1, source2], max_samples=sys.maxsize, randomize=True)

# Input image and target image, both in CHW layout.
x = C.input_variable((3, 224, 224))
y = C.input_variable((3, 224, 224))
# world's simplest model
model = C.layers.Convolution((3, 3), 3, pad=True)
z = model(x)
loss = C.squared_error(z, y)

reader = create_reader("map1.txt", "map2.txt")
trainer = C.Trainer(z, loss, C.sgd(z.parameters, C.learning_rate_schedule(.00001, C.UnitType.minibatch)))

minibatch_size = 2

# Bind each network input to the matching stream of the reader.
input_map = {
    x: reader.streams.source_image,
    y: reader.streams.target_image
}

for i in range(30):
    data = reader.next_minibatch(minibatch_size, input_map=input_map)
    print(data)
    trainer.train_minibatch(data)
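
The two map files are not shown in the code above. As a rough sketch, they could be generated with plain Python, assuming the usual ImageDeserializer map-file layout of one tab-separated "path, label" pair per line. The helper name write_map_file and the image paths are hypothetical, and the sketch assumes that line i of map1.txt and line i of map2.txt describe the same training example.

```python
def write_map_file(map_path, image_paths, dummy_label=0):
    """Write one '<image path><TAB><label>' line per image (hypothetical helper)."""
    with open(map_path, "w") as f:
        for image_path in image_paths:
            f.write("{}\t{}\n".format(image_path, dummy_label))

# Hypothetical file names; replace them with the real image locations.
# Both lists are kept in the same order so that inputs and targets line up.
source_images = ["inputs/img_{:04d}.png".format(i) for i in range(4)]
target_images = ["targets/img_{:04d}.png".format(i) for i in range(4)]

write_map_file("map1.txt", source_images)
write_map_file("map2.txt", target_images)
```

The label column is only a placeholder here, since the training loop above reads the image stream of each deserializer and never uses the labels.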

Tags: python, numpy, deep-learning, cntk

Naoto Usuyama
