Archive for the 'Security' Category

3D CAPTCHA

In a previous post, I talked about the limitations of CAPTCHA systems and proposed a partially-automated Turing test to keep non-humans out.

I had a few discussions about this with a friend and he was more interested in the CG Kittens idea I very briefly alluded to in the main text. To summarise, the idea was that if the kittens in KittenAuth were 3D models rendered on the fly, you’d get an essentially unlimited number of images.

Well, someone has gone and done something very similar. On a small site called SpamFizzle, there’s a description of a 3D CAPTCHA design that renders simple objects and asks the user questions about them. It looks like a great idea to me. It could require some fine-tuning, but I think the premise is sound.

I’d love to credit the author some more, but there are no details on that page. I presume it’s the same author as the only other page on the site, Michael G. Kaplan.

Regardless, it’s a great idea and worth a read if you’re interested.

Translink Fail

The Queensland Government recently introduced the Go card to provide a single intelligent ticketing mechanism for (almost) all public transport in South East Queensland.

The technology was developed by Cubic Transportation Systems and similar cards are in use all over the world.  The idea is when you get onto a bus or a train or anything, you touch your Go card to the sensor.  When you get off, you touch it again and the appropriate amount of money gets debited from your card balance.  Presuming it works, it’s a sensible system in my opinion.

Despite catching a bus to (not from) work nearly every working day, I had originally avoided the new system for a few reasons.  The main one was that it provided no financial benefit to me.  There’s a refundable deposit that’s payable when you buy a card, and the cost of an individual one-way ticket was the same whether you used the card or paid cash on the bus.  Discounts only came when you used it more than 6 times in a week.  I very, very rarely travel by bus more than half a dozen times a week.  In early August however, the fares will come down for the Go card only.  This makes it more attractive, so I went to purchase one.

The TransLink website provides an online web ticketing service that lets you purchase a card online.  Presumably they send it out to you but I didn’t get that far because frankly, I was too scared.  Let me show you.

After a couple of short screens asking you about the type of card you want, you come across this screen:

TransLink Online Web Ticketing - First Screen

Notice the “Billing Account Question” at the bottom.  There’s no more information on what this is for, but I presume it’s some kind of verification question you have to answer in order to make payments or maybe changes to your billing details.  The default question is, “What is my name?”.  That’s probably the worst security question I’ve ever heard! Ok, I’m generous, so I’ll give them the benefit of the doubt here and assume that this isn’t used for anything important.  You can change it anyway, and if you’re sensible, you probably will.

Let’s look at the next screen:

TransLink Online Web Ticketing - Second Screen

The first thing I noticed was that there was another “Cardholder Question”.  Is this different from the other one?  Again, there’s no help available to tell you what it’s for.  At least the question is slightly more difficult to guess this time.  I wasn’t terribly concerned at this point, so I continued.

Here’s the next screen:

TransLink Online Web Ticketing - Third Screen

Now I’m quite concerned.  Firstly, it appears that despite this being a Queensland Government website, I’m suddenly being charged in pounds.  On one of the first screens, I was told that the charge was $5 so I could probably assume that they just got the currency symbol wrong, but this is a big deal.  What if I am going to end up paying the equivalent of just over $10? I had a look at the address bar to make sure I was still in the right place, and yes, it’s an Australian domain.  I’m growing more and more reluctant to sign up to this thing.  Of course by this stage, I’ve already given them my credit card details, and who knows whether they’ve been stored.

So next, I clicked on the terms and conditions link at the bottom of the page.  Here’s what the pop-up window said:

TransLink Online Web Ticketing - Terms and Conditions

So that’s it.  I’m done.  No way I’m going to buy online using a credit card from a site with that many problems. The other thing that the terms and conditions error showed me was that they appear to be using Lotus-Domino version 4.6.7a.  The current stable version is version 8.  And does that “a” indicate an alpha version?  The Wikipedia page on Lotus Domino doesn’t even recognise the software before version 5, and the page on Lotus Notes suggests that version 4.6.7 was released sometime prior to 1999.  I’d hate to think what kind of exploits could be carried out on that server.  Colour me scared.

Now, I’m sure I could have continued on my merry way, bought the card, and everything would have worked out fine, but I wasn’t convinced that the transaction would work or even that my information was safe.  SSL or no, the currency problems and the information gathered from that error page just scare me too much.

To be honest, I’m not sure I’m comfortable buying the card at all any more.  The cards have to be registered, so I assume I have to give them some kind of personal information.  With web software that old, I simply can’t trust that it’s safe.

I certainly hope they sort all this out soon if they plan to decommission their other ticketing options.

Damo

Most Password Policies Are Bad

I want to preface this post with a couple of disclaimers.  When I talk about passwords in this post, I’m talking about the ones that matter.  I’m talking about your Windows Login password or your Internet Banking password, not the password you use for Facebook or Digg. Let’s be honest, nobody really cares enough about those accounts to break into them.  Also, I’m not going to address how the password is stored or the physical security of the resources.  Ideally, there’d be more than just a password.  However, for the purposes of this post, I’m assuming that the password is the only way in.

Now to my blunt opinion.  Most password policies are bad.  They fail spectacularly to solve the problem they’re meant to solve.

Usually, password policies (particularly in workplaces or universities) enforce a few rules.  Some of these rules are useful and some aren’t.  So-called “standards” or “industry norms” are followed, but there seems to be little thought put into the password policy.

A large number of password policies I’ve come across follow these two rules:

  • The password must be technically complex, usually judged by a basic algorithm; and
  • The password must be changed frequently.

While they both seem sensible on the surface, I ask you to actually look at them in terms of the problem they’re supposed to solve.  It’s probably not much of a spoiler, but I’ll let you know that I have a problem with both of these rules.  They don’t effectively solve the problem they’re meant to solve.

First, let’s look at the problem a password is meant to solve.  Obviously, a password is meant to prevent unauthorised access to a particular resource; usually a computer, a network, or some software.  The assumption is that unauthorised access is bad and must be prevented, and don’t worry, I’m not going to argue with that.

So, looking at the first password policy rule listed above, does having a technically complex password solve the problem of unauthorised access?

The problem is that true complexity is very difficult to judge with software.  If a judgement is made on a character-by-character basis, it is essentially useless.  A password like Pa$$w0rD for example, will almost always be judged as technically complex, but would be relatively easy for a sensible password cracking algorithm to break.  It’s not enough to assume a password is good because it meets your easily-testable rules.  I submit that the only good way to test whether a password is secure is to try your hardest to break it.

Enforcing a rule that makes sure a person has a certain number of characters and that they belong to several different character groups (uppercase, lowercase, numbers, symbols) means that a brute-force attack would probably be infeasible.  But as a software developer, if I really wanted to find someone’s password, there’s no way I’d be doing a blind brute-force attack.  I’d start with a dictionary of common words and I’d add things like the person’s date of birth, their partner’s name, their children’s names, their pets’ names - essentially any personal information that I could find.  I’d write some code to try things in a sensible way using these words prefixed or suffixed with numbers or other characters.  I’d try changing the letter O to the digit 0, i to 1, a to @.  Of course in the end, I’d probably just find a tool that did all this for me, but I wouldn’t just go trying random characters.

So does the first rule meet our goals?  Does enforcing 8 characters including three different groups guarantee a good password?  No it doesn’t.  You can set a terrible password and still follow the rules.
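To make the gap concrete, here is a minimal sketch that compares a character-group policy check with a crude guessability check.  The substitution table, the tiny word list, and both function names are assumptions made up for illustration; a real cracking tool would use far larger dictionaries and many more mangling rules.

```python
# Hypothetical substitution table and word list, for illustration only.
LEET = str.maketrans({"$": "s", "0": "o", "1": "i", "@": "a", "3": "e", "5": "s"})
COMMON_WORDS = {"password", "smith", "january", "letmein"}


def meets_policy(pw: str) -> bool:
    """Naive policy: at least 8 characters drawn from at least three groups."""
    groups = [
        any(c.islower() for c in pw),
        any(c.isupper() for c in pw),
        any(c.isdigit() for c in pw),
        any(not c.isalnum() for c in pw),
    ]
    return len(pw) >= 8 and sum(groups) >= 3


def looks_guessable(pw: str, words=COMMON_WORDS) -> bool:
    """Strip digit padding, undo common substitutions, compare to known words."""
    stripped = pw.strip("0123456789")        # drop numeric prefix/suffix
    core = stripped.lower().translate(LEET)  # e.g. "pa$$w0rd" becomes "password"
    return core in words


for candidate in ("Pa$$w0rD", "Smith111", "ImG2t2RthP@ss"):
    print(candidate, meets_policy(candidate), looks_guessable(candidate))
```

All three candidates pass the policy check, but only the sentence-derived one survives even this very small guessability test.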

Onto the second rule - regularly changing the password.  Let me first say that this is a rule I hate.  Not only does it not meet the goal of protecting an account, it actually makes things worse when combined with the first rule.  Let me explain.

The reaction most users have when you tell them their password has to be (for example) at least 8 characters long and contain uppercase, lowercase, and numbers is not entirely positive.  It’s not always easy to think of a password that meets all of these requirements as well as the unspoken one; you have to remember it.  For obvious reasons, unless they’re instructed otherwise most people will choose something easy to remember like their surname followed by their birthday.  It meets the requirements and they can remember it.  To combat this, sensible administrators will explain how important it is that the password can’t be guessed and encourage another method of choosing a password.  They might suggest turning a sentence into a string of characters.  For example, the sentence “I’m going to try to remember this password” could become “ImG2t2RthP@ss”.  That’s a pretty good password - it meets all the rules, it looks very random, it will survive a dictionary attack, and most importantly it can be remembered.  Basically, it’s going to take a very long brute-force attack to guess.

Now, what if they know they’ll have to choose a new one to remember every month? Are they going to pick a hard password then?  I’d suggest that it’s far less likely that they’ll go through this process of turning a sentence into a password every month.  Even though it’s easier to remember than a random string of characters, it won’t stick instantly.  It might take them a few days before they can type it without thinking.  And if they have to do this every 30 days, it becomes that much harder to properly cement it in.

The user is thinking, “why do I have to change it?  My last password was good enough!”, and you know what?  They’re right.

The common argument is that a password should be changed frequently in case it gets compromised.  If someone discovers the password, they’ll only be able to access the system until it’s changed.  While this may be technically true, what does this really accomplish? How long does it really take for someone to do whatever they wanted to do by gaining access to your computer?  The fact is, as soon as someone else is able to access something they shouldn’t access using your password then that’s it - mission failed.  Are you really going to hang the strength of your security on the hope that the attacker won’t have enough time to do what they want to do before the password changes again?  It’s made even worse by the fact that the password is less likely to be a good one.  If I discovered that your current password is “January101182” or “Smith111”, it probably won’t take me long to work out your next one.
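As a rough illustration of how predictable a rotated password can be, the sketch below guesses likely successors by bumping a trailing number or advancing a leading month name.  The function name and the two rules are assumptions chosen for the example, not a description of any particular cracking tool.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def likely_successors(old: str) -> list:
    """Guess what a user might change `old` to at the next forced rotation."""
    guesses = []
    # Rule 1: increment a trailing number, keeping its width.
    match = re.fullmatch(r"(.*?)(\d+)", old)
    if match:
        prefix, digits = match.groups()
        guesses.append(prefix + str(int(digits) + 1).zfill(len(digits)))
    # Rule 2: advance a leading month name to the next month.
    for i, month in enumerate(MONTHS):
        if old.startswith(month):
            guesses.append(MONTHS[(i + 1) % 12] + old[len(month):])
    return guesses


print(likely_successors("Smith111"))
print(likely_successors("January101182"))
```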

To summarise, a password policy like the one in the bullets above that enforces some level of technical complexity and makes the user choose another one every month fails in a couple of ways.

Firstly, enforcing technical complexity is the ultimate false sense of security.  It doesn’t force a user into choosing a good password.  In fact, combined with a rule enforcing frequent password changes, it encourages bad passwords.  Also, the warm fuzzy feeling the admin gets from knowing a password is at most 30 days old is fool’s gold.  The damage is done as soon as the password is discovered, and it’s more likely to be discovered with a 30-day turnaround because the password will be worse.

I know this is an essay already, but it would be unfair of me to end after attacking common practice without offering an alternative.

So here’s what I suggest:

  • Passwords should be technically complex, but they must not be easy to guess; and
  • Passwords should be changed at most once a year unless there’s a suspicion they’ve been compromised.

How do you achieve this?

Instruct each user to use a method like the one suggested earlier; compressing a sentence to choose a password.  Make it hard - 10 or 12 characters in all four categories.  They won’t have to change it regularly so who cares if they take a while to remember it.  If they absolutely need to, let them write it down and keep it in their wallet until they remember it.  If they lose their wallet, then change the password.

Of course as I mentioned earlier, the only real way to test the strength of a password is to try to crack it.  So do that.  Set up a simple system that stores a hash of the person’s password, and try to break in using whatever means necessary.  In particular, look for personal information to work with.  If it’s too easy, you should be able to break in quickly, and if you do, make them change the password to something harder.
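A minimal sketch of that kind of audit might look like the following.  It assumes the password is stored as a salted PBKDF2 hash (via Python’s standard `hashlib.pbkdf2_hmac`), and the candidate generator, the suffix list, and the low iteration count are all illustrative assumptions; a real audit would use a proper cracking tool and a much bigger word list.

```python
import hashlib
import hmac
import itertools


def hash_password(password: str, salt: bytes, iterations: int = 1000) -> bytes:
    """Salted PBKDF2-SHA256; the iteration count is kept low for the demo only."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def candidates(personal_words, suffixes=("", "1", "123", "111")):
    """Yield simple guesses built from personal information plus common suffixes."""
    for word, suffix in itertools.product(personal_words, suffixes):
        for base in (word.lower(), word.capitalize()):
            yield base + suffix


def audit(stored_hash: bytes, salt: bytes, personal_words):
    """Return the first guess that matches the stored hash, or None."""
    for guess in candidates(personal_words):
        if hmac.compare_digest(hash_password(guess, salt), stored_hash):
            return guess
    return None


salt = b"demo-salt"  # hypothetical salt for the example
weak = hash_password("Smith111", salt)
strong = hash_password("ImG2t2RthP@ss", salt)
print(audit(weak, salt, ["smith", "rex"]))
print(audit(strong, salt, ["smith", "rex"]))
```

If the audit returns a guess, that user gets asked to pick something harder.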

So there you have it.  My opinion on why most password policies are bad.  Sometimes you actually have to revisit the problem you’re trying to solve rather than just follow so-called tried-and-true policies.

-Damo

CAPTCHA is Dead, Long Live PAPTCHA?

Slashdot today carries a link to a story claiming that the CAPTCHA algorithm for Hotmail (or Windows Live Hotmail or whatever it’s called now) has been defeated by a spambot and the exploits have started.  So that’s Gmail, Yahoo Mail, and now Hotmail.

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a great idea, but if it doesn’t work, then it doesn’t work.

CAPTCHAs were developed to tell humans apart from software.  They’re essentially a Turing Test across a very limited domain, and because of the limited domain, they’re much easier to attack.  In the case of a standard warped-text CAPTCHA, the attacker knows that the challenge will be an image with a certain number of letters and/or numbers, and that it will be warped in one or more ways.  The software can be written with this in mind.  Additionally, even if there is only a minuscule success rate, it’s often worthwhile for a spammer, particularly if attempts can be automated and run several times a second.

So what’s the solution?

Slashdot made a tongue-in-cheek reference to Kitten Auth, suggested in 2006.  It may have been a playful suggestion, but I think they’re on the right track.  Kitten Auth basically presents the user with a number of pictures of cute fluffy animals, and tells the user to select all the kittens.  The premise is the same as the text-based CAPTCHAs - easy for humans, hard for computers - but it doesn’t use text, making OCR useless.

Something like Kitten Auth could work as long as there’s no predictability.  If the same images are repeatedly used, a brute force attack would work.  If you needed to select three kittens out of nine pictures, all you need is one random success and bam, you have copies of three images that are kittens.  Given enough time, the software could learn enough images to be viable as a solution.
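For a feel of the odds, here is a quick calculation.  It assumes exactly three of the nine pictures are kittens and that the bot knows to pick exactly three; the function name is made up for the example.

```python
from math import comb


def random_guess_odds(pictures: int = 9, kittens: int = 3) -> float:
    """Chance that one blind pick of `kittens` pictures is exactly the right set."""
    return 1 / comb(pictures, kittens)


odds = random_guess_odds()
print(f"1 in {comb(9, 3)} per attempt ({odds:.2%})")
```

At several attempts a second, a one-in-84 chance doesn’t hold out for long, and every success hands the attacker three labelled kitten images.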

Alternatively, if OCR can be trained to learn letters and numbers that are very warped and modified, then why not pictures of kittens?  It’s harder, sure, but if we mere mortals can tell a kitten apart from a possum, then why not a computer? These spammers and malware authors are pretty determined you know.

So what else?

Maybe the problem with CAPTCHAs is the “CA” part.  Completely Automated.  What about PAPTCHA? Partially Automated. Sure, it ruins the contrived acronym, but it might be more effective.

Arguably, Kitten Auth is already a PAPTCHA.  The pictures of kittens can’t really be completely automated unless there are 3D models of kittens rendered from different angles with different lighting each time… hmm… that’s an idea… but I digress.

If Microsoft and Google and Yahoo were to put some effort into changing their “PTCHA” regularly, by real people, maybe there’s a solution.

Here’s how it could work:

  • Twenty people, armed with cameras, walk the streets for a few hours taking photos of random objects or scenery.
  • They get back to the office and upload the photos to today’s collection.
  • They link each photo to some standard questions (e.g. “what is the main object in this photo?”) and provide acceptable responses.
  • They provide additional specific questions for each photo (e.g. “How many white horses are there in the field?”) and provide acceptable responses.
  • One or more other staff members look at the photo and each question for quality control.  They can add more acceptable answers, remove them, or reject photos or questions outright.
  • Photos are retired after a time to prevent them being learned.
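The workflow above could be modelled with something like the sketch below.  The class names, fields, and the 30-day retirement window are assumptions for illustration only; the steps in the list don’t prescribe any particular data model.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Question:
    text: str
    accepted: set  # acceptable answers, stored in lower case

    def check(self, response: str) -> bool:
        """Compare a user's response against the acceptable answers."""
        return response.strip().lower() in self.accepted


@dataclass
class Photo:
    photo_id: str
    uploaded: date
    questions: list = field(default_factory=list)
    approved: bool = False  # set by the quality-control reviewer

    def is_active(self, today: date, max_age_days: int = 30) -> bool:
        """A photo is usable only once approved and before it is retired."""
        return self.approved and today - self.uploaded <= timedelta(days=max_age_days)


photo = Photo("field-001", date(2008, 4, 1))
photo.questions.append(
    Question("How many white horses are there in the field?", {"3", "three"})
)
photo.approved = True
print(photo.questions[0].check(" Three "), photo.is_active(date(2008, 4, 20)))
```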

As a very rough estimate, I’d expect that a person would be able to add at least fifty photos with ten questions each every day.  With 20 people, that equals 10,000 new PTCHAs every day - 50,000 per working week. Surely that’d be enough.  Is 20 people too many?  Even with five people you’d have 12,500 new challenges every week.  If you expire the questions after a month, you’d still have an incredibly large number to choose from.
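The arithmetic behind those figures, as a small sketch (the function name and the five-day working week are assumptions):

```python
def weekly_challenges(staff: int, photos_per_day: int = 50,
                      questions_per_photo: int = 10, working_days: int = 5) -> int:
    """New photo/question challenges produced per working week."""
    return staff * photos_per_day * questions_per_photo * working_days


print(weekly_challenges(20))      # twenty people
print(weekly_challenges(5))       # five people
print(4 * weekly_challenges(20))  # pool size if challenges expire after four weeks
```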

Current CAPTCHAs effectively have an infinite number of possibilities, however they’re still in a narrow domain.  By expanding the domain to include any question about any photo, there’s no pattern to learn - no possible algorithm to solve the problem.

Is it foolproof?  Definitely not.  However, I’d suggest that implemented properly (and that means a lot of QA), it would be a lot harder to break than current CAPTCHA methods.

There could be a business in this you know… I’d be interested to know what you think!

Damo

Edit: I’ve been having a discussion with a friend of mine who has outlined exactly why 50,000 new challenges per week is not enough.  In short, if x people are creating these challenges, then some fraction of x can be employed to decipher them (answering is quicker than asking).  The answers get added to a massive database along with copies of the images, and there’ll be enough solutions saved to give some malicious code a decent success rate.  If the image and question match one in the database, then the answer will be there.

Repetition of challenges is therefore a significant problem.  A challenge that presents an “image and question” that is repeated every 200,000 requests (4 weeks of 50,000 per week) is far too repetitive.  If the malicious code runs one request every fifteen minutes on 1,000 nodes, you’d have seen every challenge in just over 2 days.
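Here is the same estimate as a calculation.  It assumes the best case for the attacker, where every request returns a challenge that hasn’t been seen before; if challenges are served at random, seeing every one of them takes longer, although a large fraction of the pool is still covered quickly.

```python
def days_to_see_pool(pool_size: int, nodes: int, minutes_between_requests: float) -> float:
    """Days needed to request `pool_size` challenges, assuming no repeats are served."""
    requests_per_day = nodes * (24 * 60 / minutes_between_requests)
    return pool_size / requests_per_day


print(round(days_to_see_pool(200_000, nodes=1_000, minutes_between_requests=15), 2))
```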

So to overcome this, here are some ideas:

  • Use existing CAPTCHA technology such as warping the question text and putting it directly on the photo in a semi-random place.  You’d get no exact repeats.  The obvious problem is that this may still allow a malicious program to recognise sections of the photo that haven’t been altered.  With every photo and answer saved, there’s still a one in ten chance (given 10 questions per photo) of getting the question right.  Very unacceptable.
  • Warp not only the text, but the image as well.  Obviously it’d still need to be recognisable, so overlaying a random, semitransparent pattern or something might be all you could do.  It might be enough to slow down matching of the image though.
  • Include a bevy of questions that bear no relation to the image.  These could be added to any of the images.  For example, you could have a picture of a field of horses which renders with the question, “How many legs are most people born with?”

So now I have a system where a modified image is rendered with an overlaid warped-text question which may or may not have anything to do with the image.

Of course all I’m really doing is adding complexity, but as long as it’s complex enough to withstand attacks for the length of time it’s used (one month in my example), it should work.

My other suggestion, the CG kittens, got more interest.  In this case, there would be essentially no repeated images.  You’d probably only need a handful of animal models with a few variables set at random to make it feasible.  Perhaps fur colour, lighting, camera position, and some posture or face variables.
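As a very rough sketch of what “a few variables set at random” could mean, the snippet below draws one set of scene parameters.  The model names, colours, and angle ranges are invented for illustration, and the actual rendering step is left out entirely.

```python
import random

# Hypothetical model and colour lists; a real system would load actual 3D assets.
MODELS = ["kitten", "puppy", "rabbit", "possum"]
FUR_COLOURS = ["black", "white", "ginger", "tabby", "grey"]


def random_scene(rng: random.Random) -> dict:
    """Pick one random combination of model, colour, lighting, camera and pose."""
    return {
        "model": rng.choice(MODELS),
        "fur_colour": rng.choice(FUR_COLOURS),
        "light_azimuth_deg": rng.uniform(0, 360),
        "camera_azimuth_deg": rng.uniform(0, 360),
        "camera_elevation_deg": rng.uniform(-10, 60),
        "head_tilt_deg": rng.uniform(-30, 30),
    }


print(random_scene(random.Random()))
```

Because the angles are continuous, two draws are extremely unlikely to produce the same image, even with only a handful of models.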

Teach your staff about BCC

Every now and then I’ll get an email from someone that’s been addressed to a number of people. I can tell how many and who they are because all of our email addresses are listed in the “To:” header. My email address has been sent to a list of people I don’t know.

There are a couple of reasons I don’t want my email address broadcast to the world.

Spam et al.

I often don’t know many of the other people on the list, and more to the point, I don’t know anything about their computers. It’s possible, probably likely, that in an email addressed to 20 people, at least one of them has a fairly insecure computer and probably has at least one virus or a trojan. When the insecure computer receives the email, my email address is going to be visible to this malware.

In short, I can have the best and most robust security on my computer, and I can ensure my email address is never published on the web, but all of this is useless if a single person sends my email address to a compromised machine.

Email addresses can be personal

I may be in the minority here, but I maintain a large number of email addresses. Having my own domains means that I can set up separate email addresses for job applications, friends and family, work, web enquiries, and so on. Right now, off the top of my head, I can think of about twenty email addresses that I use regularly. Most are simply forwarders, but they allow me to categorise incoming emails efficiently.

The other thing this lets me do is control communication. If I am wary about giving a company or a person my email address, I’ll create a new one. If I start getting emails I don’t want, or if for some reason I don’t want them to contact me any more, I can delete the email address.

Now you can see the dilemma. If I’ve given a particular email address to one company and they broadcast it to other people in a group email, I lose control.

Easily Fixed

If you’re reading this blog, I’d be surprised if you didn’t know about BCC, but I’ll summarise just in case.

When sending an email, you can put recipients’ email addresses into the “To:” field, the “CC:” field, or the “BCC:” field. “To:” and “CC:” behave the same, but “CC:” indicates that the person is being given a “carbon copy” - a legacy name from the paper days.

“BCC:” stands for “blind carbon copy”. These people will still receive the email, but the email addresses in this section will not be included in the header. They will be kept private.

The problem of sending everyone’s email addresses out with the email is obviously easily fixed. Just put all the email addresses in the “BCC:” section. For emails amongst groups of friends and family, it’s often not a big deal, but in business it’s frankly unprofessional.
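For anyone sending group emails from a script rather than a mail client, the same idea looks roughly like this with Python’s standard `email` and `smtplib` modules.  The addresses and the mail host are placeholders, and addressing the visible “To:” header to the sender is just one common convention, assumed here for the sketch.

```python
from email.message import EmailMessage


def build_group_email(sender: str, recipients, subject: str, body: str):
    """Return a message whose headers omit the recipient list, plus that list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = sender  # the visible header shows only the sender's own address
    msg["Subject"] = subject
    msg.set_content(body)
    # The real recipients travel in the SMTP envelope, not in the headers.
    return msg, list(recipients)


msg, envelope = build_group_email(
    "office@example.com",
    ["alice@example.org", "bob@example.net"],
    "Newsletter",
    "Hello everyone.",
)
# To send (hypothetical host):
#   import smtplib
#   with smtplib.SMTP("mail.example.com") as smtp:
#       smtp.send_message(msg, to_addrs=envelope)
print("alice@example.org" in msg.as_string())
```

Each recipient gets the email, but none of them can see who else received it.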

Design flaw

People still don’t know about BCC. I sometimes feel compelled to educate the sender of a group email about BCC and the usual response is surprise. They usually aren’t even aware of this function.

I think the problem is deeper than just a lack of education. There are fundamental design flaws here. Now, email is old - Wikipedia claims that it’s been around since about 1965. So I’m not going to suggest any fundamental technical changes. Such changes would be infeasible in a system that a) works, and b) is older than the Internet.

A significant part of the problem is that it’s called “BCC”. What the hell does that mean to the average person? Even expanding it to “blind carbon copy” doesn’t really help - it doesn’t describe its behaviour.

“Carbon copy” is relatively easy to understand. It’s at least reasonably clear that the people in this section will be getting a “copy” of the email - it’s not directly addressed to them, but they’ll see it anyway. But what does “blind” mean? That it will be invisible? It’ll be transmitted in Braille? People won’t know what BCC does until they’re told.

The other problem I can see is that “BCC:” isn’t presented as a default field in many email clients. The main email clients I use are Outlook and Gmail. In both cases, “BCC:” must be explicitly turned on.

Solutions

  1. Change the name.
    “BCC” doesn’t mean anything - even when it’s expanded to “blind carbon copy” it doesn’t mean anything.
    Obviously, this change can’t be fundamental, but it can be cosmetic. If an email client changed “BCC:” to “Discreetly To:” or something similar, it might help with people’s understanding.
  2. Change the behaviour.
    In an office environment, group emails to external domains should, by default, include everyone in the “BCC:” field rather than the “To:” field. If that’s too extreme, it should prompt the user, suggesting that perhaps they don’t want to share the list of email addresses with all the recipients. At the very least, hiding email addresses from the other recipients should be a very visible option.
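The prompting behaviour in the second suggestion could be as simple as the check sketched below.  The function name, the threshold of three external recipients, and the domain comparison are all assumptions; no existing mail client is being described.

```python
def should_suggest_bcc(to_addrs, own_domain: str, threshold: int = 3) -> bool:
    """True when enough visible recipients are outside the sender's own domain."""
    external = [
        addr for addr in to_addrs
        if addr.rsplit("@", 1)[-1].lower() != own_domain.lower()
    ]
    return len(external) >= threshold


recipients = ["a@client.com", "b@supplier.net", "c@partner.org", "d@example.com"]
if should_suggest_bcc(recipients, "example.com"):
    print("These recipients will see each other's addresses. Move them to BCC?")
```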

In the meantime? Teach your staff about BCC. Make sure they use it when it’s appropriate.

Damo