Breaking CAPTCHA with automated humans

We’re all familiar with CAPTCHA right? That impenetrable fortress of crazy squiggly characters that only a real human can decipher. Whilst they tend to drive us a bit nuts, they do actually provide a valuable function in that they prevent the automation of requests against online services. For example, you can’t get yourself a Google account without first wrapping your head around what on earth this one says:

CAPTCHA before creating a Google account

Why does Google do this? Well, once you create yourself a Google account you’ve now got GMail and G+ and all sorts of other platforms which could be used to perform such nefarious activities as generating spam, distributing malware or creating false identities. Once you can automate this, these activities can be performed en masse.

It’s a similar deal with Western Union:

Western Union's CAPTCHA

It’s easy to imagine that being able to automatically create accounts at a financial institution might open the gateway to all sorts of monetary shenanigans. And if you’ve already been able to create a GMail account, you’ve got everything you need to begin batching the creation of identities at Western Union.

So CAPTCHA prevents all this, right? Only humans can break the code and complete these signup processes, right? But what if we could automate the humans; I mean what if we could take CAPTCHAs and solve them at such a rate that these registration processes could be easily automated? Well it turns out you can and it will only cost you a couple of bucks.

Commoditised CAPTCHA solving with Antigate

The inspiration for this post goes back to a piece I read recently from renowned security writer Brian Krebs about Virtual Sweatshops Defeating Bot-or-Not Tests. In the article, Brian talks about how CAPTCHA solving is being outsourced by spammers to services in low cost markets which use real people to churn through large numbers of the obfuscated text for a fraction of a cent each time; we’re talking up to $1 for 1,000 CAPTCHAs. That’s right, one tenth of one cent to solve a CAPTCHA. The service is consumable via an API which conceivably means the spammers’ automated scripts can simply pass the CAPTCHA off and get a speedy response containing the actual text which can then be used in their spamming efforts.

The article talks about which caters to those wishing to join the team of elite CAPTCHA crackers. They’re an impressive looking team too:

CAPTCHA crackers

Personally, if I was a spammer, I’d be comforted by the knowledge that my CAPTCHA crackers were adorned in jacket and tie and overseen by an attractive supervisor in a pants suit. But I digress.

I thought it would be interesting to actually take up this service and see just how efficient outsourced CAPTCHA cracking can be. Whilst might be the coalface for would be CAPTCHA crackers, if we want to consume the cracking service we need to sign up with Antigate over at (they both run off the same subnet):

Antigate's welcome screen

Antigate offers a very attractive pricing structure with payment possible in a number of different popular formats. However, it’s important to note that the service cannot be used for spam operations lest it generate “butthurt”. I wasn’t entirely sure what this meant, but after Googling it I decided I was probably best not to participate in the butthurting and it would be safer just to build my own little test app instead.

Registration is pretty much like registering for any old service:

Registering for Antigate

In fact it’s so much like many common registration services that there’s no SSL. And it doesn’t like strong passwords. And it sends me back my nice 1Password generated crazy string of characters:

Antigate disallowing a strong password

Password toned down a bit, everything seems to be ok, I think:

Russian response message after registration with Antigate

Once logged in, everything looks pretty straightforward with nice quick start options. What I really wanted to do was write my own code, which is fortunately one of the options:

Options for consuming Antigate's service

Wow, they’ve even got a marketplace! Maybe another time, for now I just want to start solving CAPTCHAs. Looking at the API, the process basically goes like this:

  1. Send the CAPTCHA and API key via an HTTP POST request
  2. A response is returned with an ID
  3. Wait 10 seconds then send the ID back in another request
  4. A response with either the resolved text or a “not ready” status is sent back
  5. If it’s not ready, wait 5 seconds then ask for the status again (rinse, lather, repeat)
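The polling half of that process (steps 3 to 5) can be sketched in a few lines. This is only an illustration in Python, not Antigate’s sample code: `check_status` is a hypothetical stand-in for the second HTTP request, and `max_checks` is an assumed safety limit so the loop can’t spin forever.

```python
import time
from typing import Callable, Optional


def wait_for_solution(
    captcha_id: str,
    check_status: Callable[[str], Optional[str]],
    sleep: Callable[[float], None] = time.sleep,
    initial_wait: float = 10.0,
    retry_wait: float = 5.0,
    max_checks: int = 20,
) -> str:
    """Poll until the solved text comes back.

    check_status is a hypothetical callable wrapping the status request:
    it returns the solved text, or None while the answer is "not ready".
    """
    sleep(initial_wait)                      # step 3: wait 10 seconds first
    for _ in range(max_checks):
        text = check_status(captcha_id)      # step 4: ask for the result
        if text is not None:
            return text
        sleep(retry_wait)                    # step 5: not ready, wait 5 seconds
    raise TimeoutError(f"CAPTCHA {captcha_id} not solved after {max_checks} checks")


# Dry run with a fake service that answers on the third check and a fake
# sleep that just records the waits, so nothing actually blocks.
answers = iter([None, None, "solved text"])
waits = []
print(wait_for_solution("12345", lambda _id: next(answers), sleep=waits.append))
print(waits)
```

Passing `sleep` in as a parameter keeps the waiting logic testable without actually pausing for 20 seconds.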

It seems like a very good use case for long polling. Never mind, the API is straightforward; the interesting bit will be the success rate achieved.

Before we move onto the crux of this post – actually “breaking” CAPTCHA – there’s one more useful service provided by Antigate and that’s the ability to get a quick health check on the status of the operators. Actually, you can load this yourself without authenticating and you should see something similar to this:

Current status of Antigate operators

It makes a whole lot more sense when you see the underlying XML in the source:


This makes it very convenient to work out when to load up the service with requests to solve CAPTCHAs. More on that later.

Building a CAPTCHA enabled site to break

Of course none of this is going to be very interesting if we don’t have a CAPTCHA enabled site to start breaking. To avoid the butthurting, I created an ASP.NET MVC 3 website and grabbed the Web Helpers Library from Microsoft. The web helpers make it very simple to drop a CAPTCHA into any page and then validate it on submission.

Actually, it’s a reCAPTCHA implementation and the distinction is important; acquired by Google a few years back, reCAPTCHA is designed to assist in the digitisation of text books so this exercise is going to make the world a slightly better place by hopefully making more information available to more people. Plus there’s the fact that Google serves about 200 million of them every day so it’s a good high-profile implementation and reflective of what Antigate’s service is probably being used to solve already.

Moving on, I built a typical registration form as follows:

Target site registration form

What’s really important here is how the CAPTCHA renders to HTML; we need to understand this in order to actually download the image from Google, send it to Antigate then submit the correct form values with the registration. Keep in mind that everything we’re about to look at is easily available in the HTML source of any site implementing CAPTCHA.

Taking the registration form from above, the CAPTCHA is embedded via the following markup:

<script type="text/javascript">
  var RecaptchaOptions = { "theme": "Red", "lang": "en", "tabindex": 0 };
</script>
<script src="
type="text/javascript"></script>
<noscript>
  <iframe frameborder="0" height="300px"
  width="500px"></iframe>
  <br /> <br />
  <textarea cols="40" name="recaptcha_challenge_field" rows="3"></textarea>
  <input name="recaptcha_response_field" type="hidden" value="manual_challenge" />
</noscript>

The simplest way to look at this is to focus on the content in the <noscript> tag which is what’s going to be parsed if the browser doesn’t support JavaScript (or has it turned off). This saves us from dealing with all the logic in the external script files which is otherwise used to embed the CAPTCHA image in most browsers.
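As a rough sketch of that idea, here’s how the iframe source could be pulled out of the <noscript> block using nothing but Python’s standard library. The sample markup and the `captcha.example` URL are made up for illustration; only the tag and field names come from the markup above.

```python
from html.parser import HTMLParser
from typing import List, Optional


class NoscriptIframeFinder(HTMLParser):
    """Collect the src of every <iframe> that sits inside a <noscript> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_noscript = False
        self.iframe_sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._in_noscript = True
        elif tag == "iframe" and self._in_noscript:
            src = dict(attrs).get("src")
            if src:
                self.iframe_sources.append(src)

    def handle_endtag(self, tag):
        if tag == "noscript":
            self._in_noscript = False


def find_captcha_iframe(html: str) -> Optional[str]:
    """Return the first iframe src found inside <noscript>, or None."""
    finder = NoscriptIframeFinder()
    finder.feed(html)
    finder.close()
    return finder.iframe_sources[0] if finder.iframe_sources else None


# Hypothetical page fragment; the URL is a placeholder, not the real one.
SAMPLE = """
<noscript>
  <iframe frameborder="0" height="300px"
          src="https://captcha.example/noscript?k=PUBLIC_KEY" width="500px"></iframe>
  <textarea cols="40" name="recaptcha_challenge_field" rows="3"></textarea>
</noscript>
"""
print(find_captcha_iframe(SAMPLE))
```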

The important bit is the iframe source which is where the image will be embedded. In this case, you can see the path is

This page will render a very basic CAPTCHA implementation – remember that this is the one intended for folks without JavaScript:

CAPTCHA contents of the iframe embedded in a form

The CAPTCHA image is different from the earlier one in the form as we’ve loaded the iframe twice, which has caused it to refresh: once when we loaded the registration page and then again when I loaded the iframe separately for the screen grab above. When I automate this in the next section it will only be loaded once.

Inspecting the source of the iframe page, we can easily find the CAPTCHA image embedded in the markup:

<img width="300" height="57" alt="" src="image?

And of course we can then extract the actual URL of the image itself. This is the whole point of the exercise as this is the guy we need to send off to Antigate:

The last thing we need to know is how to construct the form submission to the target site. Obviously this will include values such as the name and address along with the solved CAPTCHA, but there’s a little bit more to it than that. Let’s take a look at what gets submitted by watching the HTTP request with Fiddler:

Form fields submitted to the target site

What we see here is a bunch of form fields into which I’ve just entered “aaa” then two CAPTCHA related fields: a challenge and a response. The challenge is simply the query string parameter from the CAPTCHA image above and the response is obviously the solved CAPTCHA. We now know everything we need to build the CAPTCHA cracker.
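To make those two CAPTCHA fields concrete, here’s a minimal Python sketch of extracting the challenge from the image URL and folding it into a form body. The field names `recaptcha_challenge_field` and `recaptcha_response_field` are the ones in the markup earlier; the query string parameter name `c`, the sample token and the `FirstName` field are assumptions for illustration only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def challenge_from_image_url(image_url: str, param: str = "c") -> str:
    """Pull the challenge token out of the image URL's query string.

    The default parameter name "c" is an assumption, not taken from the post.
    """
    values = parse_qs(urlsplit(image_url).query).get(param)
    if not values:
        raise ValueError(f"no {param!r} parameter in {image_url!r}")
    return values[0]


def build_registration_body(fields: dict, challenge: str, solved_text: str) -> str:
    """URL-encode the normal form fields plus the two CAPTCHA fields."""
    form = dict(fields)
    form["recaptcha_challenge_field"] = challenge
    form["recaptcha_response_field"] = solved_text
    return urlencode(form)


challenge = challenge_from_image_url("image?c=ABC123")   # made-up token
print(build_registration_body({"FirstName": "aaa"}, challenge, "mungo odatesp"))
```

Note the challenge has to be the one that belongs to the image that was actually solved, which is why the iframe should only be loaded once per attempt.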

Building the CAPTCHA cracker

Antigate try to be helpful here and provide a little C# sample to get you started. I ended up rewriting it myself for both brevity and to ensure I understood exactly what was going on. Plus, of course, it actually needs to automate the form submission in our CAPTCHA enabled app which is naturally a bespoke requirement.

I ended up with a console app which does this:

Sequence diagram of how CAPTCHA will be circumvented

In short:

  1. Request the registration page from the target site
  2. Request the iframe source used to embed the CAPTCHA image
  3. Request the CAPTCHA image used in the site and save it locally
  4. Send the CAPTCHA to the Antigate service
  5. Antigate assigns the CAPTCHA to a human operator who then solves it and sends it back to them
  6. Wait 10 seconds, then check back with Antigate for the CAPTCHA text (repeat every 5 seconds until solved)
  7. Send a registration to the target site with the fields filled out (I’ve just defined a static set of sample data) plus the CAPTCHA challenge and solved text

After all this is complete, I’ve also added some logging because I want to track things like throughput and success rate plus the duration of each stage of the process. The success of the process is determined by the response from the form submission; obviously if you get the CAPTCHA right you’re going to receive a very different response body to if you get it wrong.
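The shape of that app, with the per-stage timing, might look something like the following. It’s a Python sketch rather than the console app itself, and the three stages are stubs; in a real run each would make the HTTP requests described in the steps above.

```python
import time
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[dict], None]]


def run_pipeline(
    stages: List[Stage],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[dict, Dict[str, float]]:
    """Run each stage in order against a shared context and time it."""
    context: dict = {}
    durations: Dict[str, float] = {}
    for name, stage in stages:
        started = clock()
        stage(context)                       # each stage reads/writes the context
        durations[name] = clock() - started  # logged per stage for later analysis
    return context, durations


# Stub stages standing in for the real work (all values are placeholders).
stages = [
    ("prepare image", lambda ctx: ctx.update(challenge="ABC123")),
    ("solve", lambda ctx: ctx.update(text="solved text")),
    ("submit form", lambda ctx: ctx.update(result="Ok")),
]
context, durations = run_pipeline(stages)
print(context["result"], sorted(durations))
```

Keeping the durations separate per stage is what makes it possible to see where the time actually goes once the thing is running at volume.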

That’s it – it’s really that simple. But this isn’t a free service so we’ll need some credit before proceeding.

Topping up the Antigate account

The last thing we need to do before the serious CAPTCHA cracking begins is to put some money into the Antigate account. They make things rather easy by delegating the financial bits off to Avangate who sell refill codes for various values:

Purchasing refills for the Antigate service

Avangate is a pretty renowned e-retailer of software products which usually means you’re purchasing license numbers. Over on their site, the (currently) strong Aussie dollar means we’re looking at 96 cents to break 1,000 CAPTCHAs. Nice:

Avangate's purchase page

Payment is via PayPal and I went through the usual steps to authorise it after which I was sent back to Avangate:

Receipt for purchasing $1 worth of Antigate CAPTCHA cracking

A short while later and the code is conveniently delivered via email:

Activation key for the Antigate service

And… we’re up and running:

Successful topup of the Antigate service

Right, now the really interesting bit begins.

Breaking CAPTCHA

Let’s run it up! I’ve added quite a bit of output verbosity which was very useful during the build process:

Running the CAPTCHA cracker

Here we can see the iframe source path followed by the CAPTCHA image path and then the query string extracted from it (remember, this is the challenge we need to submit with the form). The image is then saved locally, submitted to Antigate and a response with the ID returned which in this case is 42244161. You can see the process then sleeping for 10 seconds followed by a total of three more requests, each five seconds apart, until a response is returned with the text “mungo odatesp”. This is the first “Wow!” moment; a human somewhere has actually solved this and sent it back to me!

But of course the real proof of success is once the form is submitted. The second last line of text shows this has returned “Ok” so the form has actually returned a response body consistent with a successful registration. Lastly, the entire process took just over 27 seconds and the CAPTCHA cracker also logged the process successfully:

Log entry after automating the registration

In this case here, because I also control the website we submitted the registration to I can do a sanity check and make sure the registration was actually submitted:

New user registration in the database

Yep, looks right! This has all the sample data I configured the CAPTCHA cracker to post and the CreateDate on the record falls just after the CaptchaCompleteDate in the log. So there you have it – successful programmatic CAPTCHA circumvention using an automated human. Problem is, 27 seconds is not exactly blistering. But there’s a better way to eke out performance and it’s something programmers have known about for a long time: multithreading.

Employing the multithreaded humans pattern

What if we start multithreading the humans? I mean rather than just running up a single instance of the CAPTCHA cracker, how about, say, 30 simultaneous instances? Of course the success of this model depends on having 30 operators who are able to simultaneously work on what is, in essence, a sequential process (one operator can only solve one CAPTCHA at a time). But as we saw earlier on, it’s not uncommon to have 50 operators on hand.

So I implemented a “poor man’s multithreading” and fired up 30 separate instances of the CAPTCHA cracker console:

30 instances of the CAPTCHA cracker running simultaneously
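Running 30 separate console windows works, but the same effect could be had inside one process with a thread pool. To be clear, that’s a different technique to the one used here; the sketch below is Python, and `attempt` is a hypothetical function that performs one full registration and returns whether it succeeded.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_concurrently(attempt: Callable[[int], bool], total: int, workers: int = 30) -> Counter:
    """Run `total` registration attempts across `workers` threads and tally outcomes."""
    results: Counter = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(attempt, n) for n in range(total)]
        for future in as_completed(futures):
            results["ok" if future.result() else "failed"] += 1
    return results


# Stub attempt: pretend every tenth registration fails the CAPTCHA check.
print(run_concurrently(lambda n: n % 10 != 0, total=100))
```

Threads suit this job because each attempt spends nearly all of its time waiting on a human rather than using the CPU.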

I let this run for 20 minutes then analysed the results which, as we’d expect, show a much higher throughput:

CAPTCHAs cracked per minute (30 threads)

A total of 1,230 CAPTCHAs were sent off to Antigate and only 77 were not solved correctly hence causing the registration process to fail. That’s a 94% success rate:

Success rate of CAPTCHAs solved

But even though multithreaded, the process of solving the CAPTCHAs was still a huge bottleneck in automating registrations:

Stacked bar graph showing CAPTCHA cracking consuming a large amount of time

In fact the numbers broke down to 420ms from start to where the CAPTCHA image was ready for sending off then 26 seconds to actually get a response back with the solved CAPTCHA followed by 199ms to submit it to the registration form along with the other fields. Clearly CAPTCHA still puts a significant dent in the overall duration of the process; that’s nearly 98% of the total submission process it’s chewing up there.

But of course you can run almost limitless threads (depending on the available humans) and the bottom line is that I was able to break through the CAPTCHA process and automate registrations at a rate of one per every 0.98 seconds and with a 94% success rate. This has well and truly demonstrated that the intent of CAPTCHA can indeed be defeated by simply automating the humans.
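For anyone wanting to check the arithmetic, the headline figures fall straight out of the numbers above. Note the rate counts every attempt in the 20 minute run, failed ones included, and the 26 second solve time is the rounded figure quoted earlier.

```python
sent, failed = 1230, 77                  # CAPTCHAs submitted and incorrectly solved
run_seconds = 20 * 60                    # the test ran for 20 minutes
prepare_ms, solve_ms, submit_ms = 420, 26_000, 199

success_rate = (sent - failed) / sent
seconds_per_registration = run_seconds / sent
solve_share = solve_ms / (prepare_ms + solve_ms + submit_ms)

print(f"success rate: {success_rate:.1%}")
print(f"one registration every {seconds_per_registration:.2f}s")
print(f"time spent waiting on the solve: {solve_share:.1%}")
```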


I must admit, I do feel a bit sorry for the folks sitting there endlessly solving a never ending stream of CAPTCHAs; frankly, just one drives me a bit nuts! But what must have been even worse – and I need to take some blame here – is that while testing I kept submitting the same CAPTCHA over and over and over again. I can picture the poor operator sitting there thinking “WTF is this guy doing already?!” Then again, maybe they made some quick bucks because recognising the same pattern time and again becomes more efficient.

When I first got the script running, I just couldn’t help but fire it up over and over again. Frankly, I found it a bit mindboggling to think that each time I ran it, that painfully obtuse little CAPTCHA was flying around the world and being solved by someone for whom $0.001 was a worthwhile effort and the result efficiently delivered back to me within a matter of seconds. There’s something beautiful about the efficiency with which that happens.

Of course the other question all this raises is the legality of a service such as Antigate. On the one hand, they’re just converting random bitmaps to text which in and of itself, is probably no big deal. But it may also be no big deal in the same way that Napster and Megaupload made it possible to share files; it could well come down to the implied (or assumed) intent of the service. At the end of the day, resolves to an IP in Florida so assumedly if they were coming foul enough of the law, action such as we saw with Megaupload last week would not be too difficult, I mean it’s not like they’re secreted away in deepest darkest Eastern Europe or anything.

The other thing worth noting is that Antigate aren't the only guys out there providing this service; Death By CAPTCHA offers a very similar service as do Bypass CAPTCHA and Beat CAPTCHAs; this isn’t exactly groundbreaking territory. The availability of all these services could make it very easy to stand up a “clustered humans” model whereby the process I went through above is repeated simultaneously across multiple services, hence dramatically increasing the throughput.
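I haven’t built the “clustered humans” model, but as a thought experiment it might look like the Python sketch below: images are dealt out round-robin to whichever services are configured and solved concurrently. The solver callables are hypothetical wrappers; nothing here reflects any real service’s API.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from typing import Callable, Dict, List, Tuple

Solver = Callable[[bytes], str]


def solve_across_services(images: List[bytes], solvers: Dict[str, Solver]) -> List[Tuple[str, str]]:
    """Deal images out round-robin to the services and solve them concurrently.

    Returns (service name, solved text) pairs in the same order as `images`.
    """
    if not solvers:
        raise ValueError("at least one solver is required")
    assignments = list(zip(images, cycle(solvers.items())))
    with ThreadPoolExecutor(max_workers=min(32, max(1, len(images)))) as pool:
        futures = [(name, pool.submit(solve, image)) for image, (name, solve) in assignments]
        return [(name, future.result()) for name, future in futures]


# Stub solvers that just tag the input, standing in for real services.
stub_solvers = {
    "service_a": lambda image: "a:" + image.decode(),
    "service_b": lambda image: "b:" + image.decode(),
}
print(solve_across_services([b"1", b"2", b"3"], stub_solvers))
```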

Now of course none of this is actually breaking the CAPTCHA implementation; the sanctity of the squiggly words has been retained and indeed it has taken real live humans to resolve them into plain text. But what this exercise does show is that the assertion that CAPTCHA prevents automation is just plain wrong; all it takes is for part of the automation to be moved from computers to humans. Consider this against Wikipedia’s definition and it would be fair to say that this exercise has undermined the very security premise on which CAPTCHA is built:

The basis of the CAPTCHA system is to prevent automated access to a system by computer programs or "bots"

It’s an odd position to wrap the post up on; I mean we’re so accustomed to putting our emotionless, highly efficient PCs to work to save us humans from labour intensive exercises. But what this post shows us is that sometimes we need to invert the process and instead automate the humans to the extent where they can perform at high levels of efficiency. It just takes some clever orchestration and enough humans that are willing to do the work cheaply enough to make the exercise financially viable.


Hi, I'm Troy Hunt, I write this blog, create courses for Pluralsight and am a Microsoft Regional Director and MVP who travels the world speaking at events and training technology professionals