Windows Forms: Solve a captcha using Tesseract OCR in C#

This tutorial shows how to read the text in a captcha image using Tesseract OCR in a C# .NET Windows Forms application.

Drag a PictureBox, a TextBox, and two Buttons (Browse and Detect) from the Visual Studio toolbox onto your form to build a simple UI that lets you select an image from disk. Clicking the Detect button then runs text recognition on the selected image.

To solve a captcha in C#, you can use the AForge and Tesseract libraries. Both are free, and you can search for and install them through Manage NuGet Packages in Visual Studio.

Create an OCR method that performs the text recognition, as shown below.

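private string OCR(Bitmap bmp)
{
    // "tessdata" is the folder holding the language data; "eng" selects English.
    using (TesseractEngine engine = new TesseractEngine(@"tessdata", "eng", EngineMode.Default))
    {
        // Restrict recognition to digits and upper-case letters.
        engine.SetVariable("tessedit_char_whitelist", "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ");
        engine.SetVariable("tessedit_unrej_any_wd", true);
        // Treat the whole image as a single line of text.
        using (var page = engine.Process(bmp, PageSegMode.SingleLine))
        {
            return page.GetText();
        }
    }
}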
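
The text returned by page.GetText() may contain spaces or a trailing line break, and depending on the engine version the whitelist may not be applied strictly. A minimal sketch of a cleanup step follows; OcrTextCleaner and Clean are hypothetical names that are not part of Tesseract, and the allowed set simply repeats the whitelist from the OCR method above.

```csharp
using System;
using System.Linq;

public static class OcrTextCleaner
{
    // Same character set as the tessedit_char_whitelist value in the OCR method.
    private const string Allowed = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Hypothetical helper: drops every character that is not in the whitelist,
    // which also removes spaces and line breaks.
    public static string Clean(string raw)
    {
        if (string.IsNullOrEmpty(raw))
            return string.Empty;

        return new string(raw.Where(c => Allowed.IndexOf(c) >= 0).ToArray());
    }
}
```

If you adopt something like this, the OCR method could return OcrTextCleaner.Clean(page.GetText()) instead of the raw text.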

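Download the tesseract-ocr data, unzip it, and copy the folder into your project. The OCR method above loads its data from a folder named tessdata, so make sure the language files end up under that name next to your executable. To do that, set the Copy to Output Directory property of every file in the folder to "Copy always".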
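
A missing data folder is a common reason for the engine failing at startup, so it can help to check for the file before creating the engine. The sketch below assumes the standard Tesseract naming of language files (eng.traineddata for "eng"); TessdataCheck and HasLanguageData are hypothetical names introduced only for this example.

```csharp
using System;
using System.IO;

public static class TessdataCheck
{
    // Returns true when <baseDir>/tessdata/<language>.traineddata exists.
    public static bool HasLanguageData(string baseDir, string language)
    {
        string file = Path.Combine(baseDir, "tessdata", language + ".traineddata");
        return File.Exists(file);
    }
}
```

For example, calling TessdataCheck.HasLanguageData(AppDomain.CurrentDomain.BaseDirectory, "eng") when the form loads lets you show a clear message instead of an exception from the engine.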

Next, create the DeCaptcha method, which cleans up the captcha image with AForge filters and then passes the result to the OCR method.

private string DeCaptcha(Image img)
{
    // Work on a 24bpp RGB copy, a pixel format the AForge filters below accept.
    using (Bitmap source = new Bitmap(img))
    using (Bitmap bmp = source.Clone(new Rectangle(0, 0, source.Width, source.Height), System.Drawing.Imaging.PixelFormat.Format24bppRgb))
    {
        Invert inverter = new Invert();
        // Keep only near-white pixels (200-255 in every channel).
        ColorFiltering cor = new ColorFiltering();
        cor.Blue = new AForge.IntRange(200, 255);
        cor.Red = new AForge.IntRange(200, 255);
        cor.Green = new AForge.IntRange(200, 255);
        Opening open = new Opening();
        // Drop blobs less than 10 pixels tall, which are likely noise.
        BlobsFiltering bc = new BlobsFiltering() { MinHeight = 10 };
        GaussianSharpen gs = new GaussianSharpen();
        ContrastCorrection cc = new ContrastCorrection();
        FiltersSequence seq = new FiltersSequence(gs, inverter, open, inverter, bc, inverter, open, cc, cor, bc, inverter);
        // Show the cleaned image, then run OCR on it.
        pictureBox1.Image = seq.Apply(bmp);
        return OCR((Bitmap)pictureBox1.Image);
    }
}
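
To make the color filtering and inversion steps in this sequence easier to follow, here is a small per-pixel sketch in plain C#. It is an illustration only, not AForge's implementation: pixels are packed as 0xRRGGBB integers, NearWhiteFilter is a hypothetical name, and it assumes the filter is left at its default of filling rejected pixels with black.

```csharp
using System;

public static class NearWhiteFilter
{
    // Keeps a pixel only when red, green and blue all lie in [min, max];
    // any other pixel is replaced with black.
    public static int Keep(int rgb, int min, int max)
    {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        bool inside = r >= min && r <= max
                   && g >= min && g <= max
                   && b >= min && b <= max;
        return inside ? rgb : 0x000000;
    }

    // Flips every channel: black becomes white and white becomes black.
    public static int Invert(int rgb)
    {
        return ~rgb & 0xFFFFFF;
    }
}
```

With the range 200 to 255 used above, a white pixel passes through Keep unchanged and a pure red pixel becomes black. The final Invert then turns the surviving light strokes dark on a white background, which is the form in which the image is handed to the OCR method.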

Don't forget to add the following namespaces to your form.

using AForge.Imaging.Filters;
using System;
using System.Drawing;
using System.Windows.Forms;
using Tesseract;

To open an image, handle the Click event of the btnBrowse button with the following code.

private void btnBrowse_Click(object sender, EventArgs e)
{
    using (OpenFileDialog ofd = new OpenFileDialog() { Filter = "JPG|*.jpg|PNG|*.png" })
    {
        if (ofd.ShowDialog() == DialogResult.OK)
            pictureBox1.Image = Image.FromFile(ofd.FileName);
    }
}

Finally, handle the Click event of the Detect button, which runs the OCR pipeline on the loaded captcha image.

private void btnDetect_Click(object sender, EventArgs e)
{
    // Nothing to recognize until an image has been loaded.
    if (pictureBox1.Image == null)
        return;
    txtOuput.Text = DeCaptcha(pictureBox1.Image);
}

You have now seen how to use the Tesseract OCR engine to read a captcha image in C#. Tesseract is a free, open source OCR engine, used here through its .NET wrapper.
