I’ve developed a simple OCR CAPTCHA decoder in PHP. It’s pretty fast and breaks, with a 100% success rate, all CAPTCHAs that meet some requirements.
Requirements:
- Vertically constant background color pattern.
- Simple characters, without twists or deformations.
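To make the two requirements concrete, here is a minimal sketch of why they matter. It is not the decoder's actual code: it assumes the image has already been loaded into a hypothetical grid `$pixels[$y][$x]` of colour values, and that the top pixel of each column shows that column's background colour. Under those assumptions, any pixel that differs from the top pixel of its column is text, and undeformed characters can be separated by looking for empty columns.

```php
<?php
// Sketch only: $pixels is a hypothetical grid of colour values indexed as
// $pixels[$y][$x]; the real decoder may work differently.

/**
 * Marks every pixel that differs from the background colour of its column.
 * The background of a column is assumed to be the colour of its top pixel.
 */
function foregroundMask(array $pixels): array
{
    $mask = [];
    foreach ($pixels as $y => $row) {
        foreach ($row as $x => $colour) {
            $mask[$y][$x] = ($colour !== $pixels[0][$x]) ? 1 : 0;
        }
    }
    return $mask;
}

/**
 * Splits the mask into character ranges: runs of adjacent columns that
 * contain at least one foreground pixel. Returns [firstX, lastX] pairs.
 */
function characterColumns(array $mask): array
{
    $ranges = [];
    $start = null;
    $width = count($mask[0]);
    for ($x = 0; $x < $width; $x++) {
        $hasInk = false;
        foreach ($mask as $row) {
            if ($row[$x] === 1) {
                $hasInk = true;
                break;
            }
        }
        if ($hasInk && $start === null) {
            $start = $x;                      // a character begins here
        } elseif (!$hasInk && $start !== null) {
            $ranges[] = [$start, $x - 1];     // the character ended one column ago
            $start = null;
        }
    }
    if ($start !== null) {
        $ranges[] = [$start, $width - 1];     // character touching the right edge
    }
    return $ranges;
}
```

Twisted or overlapping characters defeat the empty-column split, which is one plausible reason for the second requirement.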
Types of CAPTCHAs that I was able to break:

Types of CAPTCHAs that I was NOT able to break:

That’s it, I won’t be explaining much more due to obvious reasons (although the source code is pretty straightforward). You can obtain it here.
Does anyone have ideas on how to improve it?
Fantastic. Do you think it’s possible to evolve that CAPTCHA decoder even more? Google’s CAPTCHA wasn’t possible to decode, but will it be some day?
Hope you get enough feedback, even some code feedback from the PHP community.
Yes. A good algorithm that finds and maps high-contrast areas in the CAPTCHA could filter out the text part, and the extracted paths could later be processed with neural networks to find the text itself. But that’s just theory; in practice it is much more complex.
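As a rough illustration of the first half of that idea only (isolating high-contrast areas), here is a minimal thresholding sketch. It assumes a hypothetical grid `$gray[$y][$x]` of brightness values from 0 to 255 and dark text on a light background; the function names are made up for this example, and the neural-network recognition step is not shown.

```php
<?php
// Sketch only: $gray is a hypothetical grid of brightness values (0-255),
// indexed as $gray[$y][$x]. Requires PHP 7.4+ for arrow functions.

/** Picks a threshold halfway between the darkest and brightest pixel. */
function midpointThreshold(array $gray): int
{
    $flat = array_merge(...$gray);
    return intdiv(min($flat) + max($flat), 2);
}

/** Returns 1 for pixels darker than the threshold (assumed text), else 0. */
function binarize(array $gray, int $threshold): array
{
    return array_map(
        fn(array $row): array => array_map(
            fn(int $v): int => $v < $threshold ? 1 : 0,
            $row
        ),
        $gray
    );
}
```

A fixed midpoint threshold is the simplest possible choice; noisy or gradient backgrounds would need something more adaptive.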
For further reading I’ll leave you with some interesting links: http://www.cs.sfu.ca/~mori/research/gimpy/ and http://www.brains-n-brawn.com/aiCaptcha
Hi, looks good.
I’m very inexperienced at this kind of thing.
How would I use the code you provided to break simple CAPTCHAs on websites?
How to use PHP CAPTCHA Decoder?
Igor Cemim
To use this OCR, save the PHP code as ocr-class.php or some other name.
Create another PHP file, let’s say test.php, and include the OCR class at the top of it: include("ocr-class.php");
Make a folder called OCR and make it writable (chmod).
The files will look like this:
root/OCR/ (chmod so writable)
root/ocr-class.php
root/test.php
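If you would rather check the folder from PHP than by hand, something like the following could go at the top of test.php. This is only a sketch: `ensureWritableDir()` is a hypothetical helper, not part of the OCR class, and the 0775 mode is an assumption you may need to adjust for your host.

```php
<?php
// Sketch only: ensureWritableDir() is a hypothetical helper, not part of
// the OCR class.

/**
 * Creates the directory if it is missing and reports whether PHP can
 * write to it afterwards.
 */
function ensureWritableDir(string $dir): bool
{
    if (!is_dir($dir) && !mkdir($dir, 0775, true)) {
        return false;   // the directory is missing and could not be created
    }
    return is_writable($dir);
}

// Possible usage in test.php:
// if (!ensureWritableDir(__DIR__ . '/OCR')) {
//     exit('The OCR folder is missing or not writable.');
// }
```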
Here is an example of using the class:
<?php
include("ocr-class.php");
$ocr = new OCR;
// Example: $ocr->Train('captcha training id', 'captcha training code', 'captcha training img url');
echo $ocr->Train('Test', 'B6148', 'http://www.alixaxel.com/wordpress/wp-content/2007/06/captcha_1.gif');
// In the code above, 'Test' is the id used for this example. If you have lots of different types of CAPTCHAs to handle, give each type its own id.
// 'B6148' is the CAPTCHA code shown in the selected training image.
//*****************************************//
// Once you have trained the OCR with about 30 different CAPTCHA values, you are ready for the next step: seeing whether it can crack the CAPTCHA.
// Example: $ocr->Read('id', 'captcha img url', 'background colour of the image');
$ocr->Read('Test', 'captcha.php', 'FFFFFF');
?>
If you get stuck, add me on MSN or email me and I can help: im-jst-me (AT) hotmail (DOT) com
Thanks Joe
Hello.
Good job. I know how it works, but I have a question…
First I must “train” the OCR with some images, and later I can read from CAPTCHA images, yes?
All is OK, but there is one thing I can’t do. First I train the script with single letters, for example “1” and “a”. Later I want to crack a CAPTCHA that contains “1a”, and here is the problem: I don’t see a result.
Can I do it?
Please help me.
Damian
@joe:
Thanks!!!
You have to call the method Train() for each and every letter in the CAPTCHA image, let me give you an example:
$OCR->Train('example-captcha', 'A', 'photoshoped-A-cropped.gif');
$OCR->Train('example-captcha', 'B', 'photoshoped-B-cropped.gif');
$OCR->Train('example-captcha', 'C', 'photoshoped-C-cropped.gif');
and so on…
I hope it helps.
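Writing one Train() call per letter gets repetitive, so the calls can be generated in a loop. The sketch below is only an illustration: `trainLetters()` is a hypothetical helper, and it assumes your cropped images follow a file-name pattern such as photoshoped-A-cropped.gif. It takes the training function as a callable so that it does not depend on the OCR class itself.

```php
<?php
// Sketch only: trainLetters() and the file-name pattern are hypothetical.
// $train is any callable taking (id, letter, image path), for example a
// closure that forwards to $OCR->Train().

/**
 * Calls $train once per letter, building each image path from $pattern
 * (a sprintf() pattern with one %s for the letter). Returns the number
 * of training calls made.
 */
function trainLetters(callable $train, string $id, array $letters, string $pattern): int
{
    $count = 0;
    foreach ($letters as $letter) {
        $train($id, $letter, sprintf($pattern, $letter));
        $count++;
    }
    return $count;
}

// Possible usage, assuming $OCR is an instance of the OCR class:
// trainLetters(
//     function ($id, $letter, $image) use ($OCR) { $OCR->Train($id, $letter, $image); },
//     'example-captcha',
//     range('A', 'Z'),
//     'photoshoped-%s-cropped.gif'
// );
```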
Here is the one you couldn’t break. Of course it needed C to do it (some people use C# or Delphi, maybe), but here is a neural network YouTube CAPTCHA.
There are some NeuralNetwork-Classes for PHP. Maybe you can combine NN with your script?
Alex, thank you for your comment! I’ve been playing around with some ANN implementations for PHP; however, I haven’t had the time to figure out how to blend them with my OCR library.
If you have any thoughts please share them.
Hey,
Your script looks great, I’m just having trouble trying to get it to work. I get an error: “Fatal error: Call to a member function on a non-object in /home/*****1/domains/**********.***/public_html/***/test.php”
when I try to train the class. My code is:
Train('tracer-l', 'D', 'letters/D.gif');
?>
Any idea what could be causing this problem? I’d love to be able to use your very helpful class.
Carroll, it seems to me that you haven’t created an instance of the class, like this:
include('ocr.php');
$ocr = new OCR;
$ocr->train(blablabla);
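The "member function on a non-object" error means the variable in front of -> does not hold an object at the moment of the call. Here is a small, self-contained sketch of the difference; `DemoOcr` and `canTrain()` are hypothetical stand-ins made up for this example, not the real OCR class.

```php
<?php
// Sketch only: DemoOcr is a hypothetical stand-in for the real OCR class.
class DemoOcr
{
    public function Train(string $id, string $code, string $image): string
    {
        return "trained $id with $code from $image";
    }
}

/** Returns true when $candidate is an object that has a Train() method. */
function canTrain($candidate): bool
{
    return is_object($candidate) && method_exists($candidate, 'Train');
}

$ocr = null;                // never instantiated: calling $ocr->Train() here would be a fatal error
$before = canTrain($ocr);   // false
$ocr = new DemoOcr;         // create the instance first
$after = canTrain($ocr);    // true
```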
Andrew
I’m very bad with classes. Can you please explain how to use this one, step by step?
How do I host it using XAMPP on WinXP? I tried it with this code:
[code]
<?php
include('ocr-class.php');
$OCR = new OCR;
$OCR->Train('example-captcha', 'A', 'A-crop.jpg');
$OCR->Train('example-captcha', 'B', 'B-crop.jpg');
$OCR->Train('example-captcha', 'C', 'C-crop.jpg');
$OCR->Train('example-captcha', 'D', 'D-crop.jpg');
$OCR->Train('example-captcha', 'E', 'E-crop.jpg');
$OCR->Train('example-captcha', 'F', 'F-crop.jpg');
$OCR->Train('example-captcha', 'G', 'G-crop.jpg');
$OCR->Train('example-captcha', 'H', 'H-crop.jpg');
$OCR->Train('example-captcha', 'I', 'I-crop.jpg');
$OCR->Train('example-captcha', 'J', 'J-crop.jpg');
$OCR->Train('example-captcha', 'K', 'K-crop.jpg');
$OCR->Train('example-captcha', 'L', 'L-crop.jpg');
$OCR->Train('example-captcha', 'M', 'M-crop.jpg');
$OCR->Train('example-captcha', 'O', 'O-crop.jpg');
$OCR->Train('example-captcha', 'P', 'P-crop.jpg');
$OCR->Train('example-captcha', 'Q', 'Q-crop.jpg');
$OCR->Train('example-captcha', 'R', 'R-crop.jpg');
$OCR->Train('example-captcha', 'S', 'S-crop.jpg');
$OCR->Train('example-captcha', 'T', 'T-crop.jpg');
$OCR->Train('example-captcha', 'U', 'U-crop.jpg');
$OCR->Train('example-captcha', 'V', 'V-crop.jpg');
$OCR->Train('example-captcha', 'W', 'W-crop.jpg');
$OCR->Train('example-captcha', 'X', 'X-crop.jpg');
$OCR->Train('example-captcha', 'Y', 'Y-crop.jpg');
$OCR->Train('example-captcha', 'Z', 'Z-crop.jpg');
echo $OCR->Read('example-captcha', 'http://localhost/OCR/default.png', 'FFFFFF');
?>
[/code]
I host the files in:
c:\program files\xampp\htdocs\OCR\ocrtest.php
c:\program files\xampp\htdocs\OCR\ocr-class.php
c:\program files\xampp\htdocs\OCR\[alphabet]-crop.jpg
and the result is the source code of ocr-class.php… confusing.