
Segmenting cursive characters (Arabic OCR)

I want to segment an Arabic word into individual characters. Based on the histogram/profile, I assume I can do the segmentation by cutting the word along its baseline (where the pixel values are similar). Unfortunately, I am still stuck on writing the code to make this work.

% Original code by Soumyadeep Sinha
% Saves each segmented character as a separate file
function [segm] = trysegment (a)
    myFolder = 'D:\1. Thesis FINISH!!!\Data set\trial';
    level = graythresh(a);
    bw = im2bw(a, level);
    b = imcomplement(bw);
    i = padarray(b, [0 10]);
    verticalProjection = sum(i, 1);
    set(gcf, 'Name', 'Trying Segmentation for Cursive', 'NumberTitle', 'Off')
    subplot(2, 2, 1); imshow(i);
    subplot(2, 2, 3);
    plot(verticalProjection, 'b-'); % this line plots the histogram (projection profile)
    % hist(reshape(input,[],3),1:max(input(:)));
    grid on;
    % % t = verticalProjection;
    % % t(t==0) = inf;
    % % mayukh = min(t)
    % 0 where there is background, 1 where there are letters
    letterLocations = verticalProjection > 0;
    % Find rising and falling edges
    d = diff(letterLocations);
    startingColumns = find(d>0);
    endingColumns = find(d<0);
    % Extract each region
    y = 1;
    for k = 1 : length(startingColumns)
        % Get sub image of just one character...
        subImage = i(:, startingColumns(k):endingColumns(k));
        % se = strel('rectangle',[2 4]);
        % dil = imdilate(subImage, se);
        th = bwmorph(subImage, 'thin', Inf);
        n = imresize(th, [64 NaN], 'bilinear');
        figure, imshow(n);
        [L, num] = bwlabeln(n);
        for z = 1 : num
            bw = ismember(L, z);
            % Construct filename for this particular image.
            baseFileName = sprintf('char %d.png', y);
            y = y + 1;
            % Prepend the folder to make the full file name.
            fullFileName = fullfile(myFolder, baseFileName);
            % Do the write to disk.
            imwrite(bw, fullFileName);
            % subplot(2,2,4);
            % pause(2);
            % imshow(bw);
        end
        % y=y+1;
    end
    segm = (n);

The word image is as follows: [image: Segmenting cursive character]

Why doesn't the code work? Can you recommend other code, or suggest an algorithm that segments cursive characters well?

Thanks in advance.

Ana Ain asked Jan 18 '26 18:01


1 Answer

Replace this part of the posted code:

% 0 where there is background, 1 where there are letters
 letterLocations = verticalProjection > 0; 
 % Find Rising and falling edges
 d = diff(letterLocations);
 startingColumns = find(d>0);
 endingColumns = find(d<0);

with this new code:

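threshold = max(verticalProjection)/3;
thresholdedProjection = verticalProjection > threshold;
count = 0;   % length of the current run of below-threshold columns
startingColumnsIndex = 0;
% The loop variable is col rather than i, because i holds the padded image in the posted code.
for col = 1:length(thresholdedProjection)
    if thresholdedProjection(col)
        if (count > 0)
            startingColumnsIndex = startingColumnsIndex+1;
            % Cut in the middle of the below-threshold run that just ended
            startingColumns(startingColumnsIndex) = col-floor(count/2);
            count = 0;
        end
    else
        count = count+1;
    end
end
% Each segment ends just before the next one starts
endingColumns = [startingColumns(2:end)-1 col-floor(count/2)];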

No changes needed for the rest of the code.
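To see why this helps, here is a minimal sketch that runs both rules on a small synthetic projection. The numbers are made up for illustration (three tall strokes joined by a baseline of height 1), and `origStart`/`origEnd` are hypothetical names used only here. It uses base MATLAB only, no toolbox functions.

```matlab
% Synthetic vertical projection (made-up values): three tall strokes
% joined by a thin baseline of height 1, with empty padding at both ends.
verticalProjection = [0 0 6 7 1 1 1 8 6 1 1 9 7 0 0];

% Original rule: any ink counts as "letter", so the connecting baseline
% keeps the whole word in a single region.
d = diff(verticalProjection > 0);
origStart = find(d > 0);
origEnd = find(d < 0);

% Thresholded rule: columns at or below max/3 are treated as gaps.
threshold = max(verticalProjection) / 3;
thresholdedProjection = verticalProjection > threshold;

count = 0;               % length of the current run of low columns
startingColumns = [];
for col = 1:numel(thresholdedProjection)
    if thresholdedProjection(col)
        if count > 0
            % Cut in the middle of the low run that just ended.
            startingColumns(end+1) = col - floor(count/2);
            count = 0;
        end
    else
        count = count + 1;
    end
end
endingColumns = [startingColumns(2:end)-1, ...
                 numel(thresholdedProjection) - floor(count/2)];

disp([origStart; origEnd]);              % one region for the whole word
disp([startingColumns; endingColumns]);  % one column per segment
```

On this toy input the original rule yields a single region, while the thresholded rule splits the word at the midpoints of the baseline runs. The divisor 3 is the answer's choice; real images may need a different value, and letters whose strokes stay close to the baseline can still be merged or cut incorrectly.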

Rijul Sudhir answered Jan 21 '26 07:01