I've looked everywhere, but there doesn't seem to be a standard way (that I could find) of checking whether an image is blank in C#.
I have a way of doing this, but would love to know what the correct way is, so that everyone else can find it in the future too.
I'm not going to copy and paste a bunch of code in (if you want me to, it will be my pleasure); first I just want to explain how I go about checking whether an image is blank.
You take a .jpg image and get its width, for example 500 pixels. Then you divide that by 2, giving you 250.
Then you check the colour of every pixel at location (250, i), where i iterates through the height of the image.
What this does is check only the middle vertical line of pixels of the image. It goes through all of those pixels, checking whether the colour is anything except white. I've done this so you won't have to search ALL 500*height pixels, since you will almost always come across a colour in the middle of the page.
It's working... a bit slow... There must be a better way to do this? You can change it to search 2/3/4 vertical lines to increase your chance of spotting a page that's not blank, but that will take even longer.
(Also note: using the file size of the image to check whether it contains something will not work in this case, since the sizes of a page with two sentences on it and a blank page are too close to one another.)
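For illustration only, the approach described above could be sketched roughly as follows. This is not the original code (which is not shown); it assumes System.Drawing and Bitmap.GetPixel, and the names BlankCheckSketch, LooksBlank and the columns parameter are made up for the example.

```csharp
using System;
using System.Drawing;

static class BlankCheckSketch
{
    // Returns true if every sampled pixel is pure white (255, 255, 255).
    // columns = number of evenly spaced vertical lines to sample;
    // columns == 1 samples only the middle line (x = width / 2).
    public static bool LooksBlank(Bitmap image, int columns = 1)
    {
        if (columns < 1) throw new ArgumentOutOfRangeException(nameof(columns));

        for (int c = 1; c <= columns; c++)
        {
            // Evenly spaced x positions, always inside [0, width).
            int x = image.Width * c / (columns + 1);

            for (int y = 0; y < image.Height; y++)
            {
                Color p = image.GetPixel(x, y);
                if (p.R != 255 || p.G != 255 || p.B != 255)
                    return false; // found a non-white pixel
            }
        }
        return true;
    }
}
```

With a 500-pixel-wide image and columns = 1 this checks x = 250; with columns = 3 it checks x = 125, 250 and 375, at three times the cost.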
After solution has been added.
Resources to help with the implementation and understanding of the solution.
(Note that on the first website, the stated "Pizelformat" should actually be "PixelFormat". It's a small error, I know; I'm just mentioning it because it might cause some confusion.)
After I implemented the method to speed up the pixel hunting, the speed didn't increase that much. So I would think I'm doing something wrong.
Old time = 15.63 for 40 images.
New time = 15.43 for 40 images
I saw in the great article DocMax linked that the code "locks" a set of pixels (or that's how I understood it). So what I did is lock the middle column of pixels of each page. Would that be the right move?
// Requires: using System.IO; using System.Linq; using System.Drawing;
private int testPixels(String sourceDir)
    {
        // Select the .jpg files in the directory (case-insensitive).
        var q = from string x in Directory.GetFiles(sourceDir)
                where x.ToLower().EndsWith(".jpg")
                select new FileInfo(x);
        int holder = 1;
        foreach (var z in q)
        {
            // getPixelData2 disposes the bitmap when it is done with it.
            Bitmap mybm = Bitmap.FromFile(z.FullName) as Bitmap;
            int blank = getPixelData2(mybm);

            // 0 means no non-white pixel was found, i.e. the page looks blank.
            if (blank == 0)
            {
                holder = 0;
                break;
            }
        }
        return holder;
    }
And then the method:
private unsafe int getPixelData2(Bitmap bm)
        {
            // Lock only the middle column (1 pixel wide). Requesting Format24bppRgb
            // guarantees 3 bytes per pixel, stored in B, G, R order.
            BitmapData bmd = bm.LockBits(new System.Drawing.Rectangle((bm.Width / 2), 0, 1, bm.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, System.Drawing.Imaging.PixelFormat.Format24bppRgb);
            try
            {
                for (int y = 0; y < bmd.Height; y++)
                {
                    // Scan0 already points at the locked column, so the pixel starts at offset 0.
                    // (The original indexed with bmd.Width / 2, which is 0 here, so it read the same byte three times.)
                    byte* row = (byte*)bmd.Scan0 + (y * bmd.Stride);
                    int blue = row[0];
                    int green = row[1];
                    int red = row[2];
                    // Check to see if there is some form of color
                    if ((blue != 255) || (green != 255) || (red != 255))
                    {
                        return 1;
                    }
                }
                return 0;
            }
            finally
            {
                // Always release the lock before disposing the bitmap.
                bm.UnlockBits(bmd);
                bm.Dispose();
            }
        }
If you can tolerate the chance of getting it wrong, the approach seems fine; I have done something very similar in my case, although I always had a visual confirmation to deal with errors.
For the performance, the key open question is how you are getting the pixels to test. If you are using Bitmap.GetPixel, you are bound to have performance problems. (Search for "Bitmap.GetPixel slow" in Google to see lots of discussion.)
Far better performance will come from getting all the pixels at once and then looping over them. I personally like Bob Powell's LockBits discussion for clarity and completeness. With that approach, checking all of the pixels may well be reasonable depending on your performance needs.
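As a rough sketch of that idea (not taken from the linked article), the following checks every pixel using LockBits and Marshal.Copy, without unsafe code. The class and method names are made up for the example, and it assumes that locking as Format24bppRgb is acceptable for your images.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static class LockBitsSketch
{
    // Returns true only if every pixel in the image is pure white.
    public static bool IsAllWhite(Bitmap image)
    {
        var area = new Rectangle(0, 0, image.Width, image.Height);

        // Ask for 24bpp so each pixel is exactly 3 bytes (B, G, R).
        BitmapData data = image.LockBits(area, ImageLockMode.ReadOnly,
                                         PixelFormat.Format24bppRgb);
        try
        {
            int rowBytes = data.Width * 3;      // pixel bytes only, no padding
            byte[] row = new byte[rowBytes];

            for (int y = 0; y < data.Height; y++)
            {
                // Copy one scan line at a time; Stride includes padding
                // and may be negative for bottom-up bitmaps.
                IntPtr line = IntPtr.Add(data.Scan0, y * data.Stride);
                Marshal.Copy(line, row, 0, rowBytes);

                for (int i = 0; i < rowBytes; i++)
                {
                    if (row[i] != 255)
                        return false;           // some channel is not white
                }
            }
            return true;
        }
        finally
        {
            image.UnlockBits(data);             // always release the lock
        }
    }
}
```

The method returns as soon as it sees a non-white byte, so pages with content near the top are rejected quickly; only genuinely blank pages pay for a full scan.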
If you're using System.Drawing.Bitmap you can speed things up (substantially) by:
Notes:
Edit: Beat to the punch by DocMax.
In any case for speed you can also try using alternative libraries such as the excellent FreeImage which includes C# wrappers.