I am testing printed digits (0-9) on a convolutional neural network. It gives 99+% accuracy on the MNIST dataset, but when I trained it on images generated from fonts installed on my computer (Arial, Calibri, Cambria, Cambria Math, Times New Roman), the training error rate does not go below 80%, i.e. 20% accuracy. The generated set is 104 images per font (25 fonts in total, 4 images per font with small differences between them). Why?
Here is a sample of the images for the digit "2":

I resized every image to 28 x 28.
Here is more detail:
Training data size = 28 x 28 images. The network follows the LeNet5 architecture:
Input Layer - 28x28
| Convolutional Layer - (Relu Activation)
| Pooling Layer - (Tanh Activation)
| Convolutional Layer - (Relu Activation)
| Local Layer(120 neurons) - (Relu)
| Fully Connected (Softmax Activation, 10 outputs)
This works, giving 99+% accuracy on MNIST. Why is it so bad with computer-generated fonts? A CNN can handle a lot of variance in data.
A CNN usually converges faster on inputs in the range 0 to 1 than on inputs in the range 0 to 255, so we divide every pixel value by 255 to scale the data from [0..255] to [0..1]. Keeping the inputs small keeps activations and gradients in a well-behaved range, which helps the model learn features when the raw data has a large scale.
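The scaling step described above can be sketched in a few lines. This is a minimal illustration using plain Python lists; the function name `scale_pixels` and the sample values are made up for the example, and in a real pipeline the same division is normally done on a NumPy array or tensor.

```python
def scale_pixels(image):
    """Map 8-bit pixel values (0..255) to floats in [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


# Hypothetical 2 x 3 grayscale patch, values in 0..255.
sample = [[0, 128, 255],
          [64, 192, 32]]

scaled = scale_pixels(sample)
```

The same scaling has to be applied to the font images at training time and at test time; a network trained on [0..1] inputs and then fed [0..255] inputs will not behave sensibly.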
Handwritten digit recognition can be performed using a convolutional neural network. The MNIST (Modified National Institute of Standards and Technology) database, together with a CNN, forms the basic structure of my project.
Yes, you can use a CNN. CNNs are not limited to just images. Use a 1D convolution, not a 2D convolution; you have 1D data, so a 1D convolution is more appropriate.
The applications of digit recognition include postal mail sorting, bank check processing, form data entry, etc. The main problem lies in developing an efficient algorithm that can recognize handwritten digits submitted by users via a scanner, tablet, or other digital device.
I see two likely problems:
Preprocessing: MNIST is not only 28px x 28px, but also:
The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.
Source: MNIST website
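To make the font images look more like MNIST, the two steps quoted above (fit the glyph in a 20x20 box, then center it by center of mass in a 28x28 field) can be approximated as follows. This is only a rough sketch under several assumptions: the image is a list of rows with non-zero "ink" on a zero background (so black-on-white font renders would need inverting first), the helper names are invented for this example, and the resize is plain nearest-neighbor, so it does not reproduce the anti-aliased grey levels of the real MNIST pipeline.

```python
def bounding_box(img):
    """Return (top, bottom, left, right) of the ink; bottom/right are exclusive."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:
        raise ValueError("image contains no ink")
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbor resize (no anti-aliasing, unlike MNIST itself)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def center_of_mass(img):
    """Intensity-weighted mean (row, column) of the image."""
    total = sum(sum(row) for row in img)
    if total == 0:  # fall back to the geometric center
        return (len(img) - 1) / 2, (len(img[0]) - 1) / 2
    cy = sum(r * sum(row) for r, row in enumerate(img)) / total
    cx = sum(c * v for row in img for c, v in enumerate(row)) / total
    return cy, cx


def mnist_style(img, box=20, size=28):
    """Crop to the ink, fit it in box x box, center it by mass in size x size."""
    top, bottom, left, right = bounding_box(img)
    glyph = [row[left:right] for row in img[top:bottom]]
    h, w = len(glyph), len(glyph[0])
    scale = box / max(h, w)  # preserves the aspect ratio
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    glyph = resize_nearest(glyph, new_h, new_w)

    # Offset that puts the glyph's center of mass at the canvas center,
    # clamped so the glyph always stays inside the canvas.
    cy, cx = center_of_mass(glyph)
    off_r = min(max(round((size - 1) / 2 - cy), 0), size - new_h)
    off_c = min(max(round((size - 1) / 2 - cx), 0), size - new_w)

    canvas = [[0] * size for _ in range(size)]
    for r in range(new_h):
        for c in range(new_w):
            canvas[off_r + r][off_c + c] = glyph[r][c]
    return canvas


# Hypothetical input: a 40 x 10 vertical bar placed off-center in a 60 x 60 image.
bar = [[255 if 5 <= r < 45 and 30 <= c < 40 else 0 for c in range(60)]
       for r in range(60)]
out = mnist_style(bar)
```

Simply resizing a full font render to 28 x 28 leaves the digit at a different scale and position than the MNIST digits, which is one plausible reason a network tuned on MNIST struggles with it.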
Overfitting:
Remarks
Interesting idea! Did you try simply applying the trained MNIST network on your data? What are the results?