A Little Bit Different or “Google Got Bit”

When it comes to computers, there is no such thing as a little bit different. Consider the following. In ASCII, “0010 1111” is how a PC identifies a forward slash “/”. Each one and zero represents a bit. Eight bits, as you may know, make a byte. 0010 1110 is one little bit different from 0010 1111, but it is a period “.”. And 0010 0111, which is also only one bit different from a “/”, is an apostrophe “'”.
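To see how far a single bit can move a character, here is a minimal Python sketch that flips each of the eight bits of the ASCII forward slash in turn. The helper names `flip_bit` and `to_bits` are made up for this illustration.

```python
def flip_bit(ch: str, bit: int) -> str:
    """Return the character whose code differs from ch in exactly one bit."""
    return chr(ord(ch) ^ (1 << bit))


def to_bits(ch: str) -> str:
    """Format a character's code as two groups of four bits, e.g. '0010 1111'."""
    bits = format(ord(ch), "08b")
    return bits[:4] + " " + bits[4:]


if __name__ == "__main__":
    print("start", to_bits("/"), repr("/"))
    # Bit 0 is the least significant bit; bit 7 is the most significant.
    for bit in range(8):
        neighbor = flip_bit("/", bit)
        print("bit", bit, to_bits(neighbor), repr(neighbor))
```

Flipping bit 0 of “/” gives the period, and flipping bit 3 gives the apostrophe. Some of the other neighbors are not printable ASCII at all, which is why the loop prints them with `repr`.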

So why does a little bit matter? On Saturday, January 31, 2009 a mistake was made at Google that led them to flag every site on the Internet with the message "This site may harm your computer". Google blacklisted the Internet! The mistake was that any site with “/” in the URL (web address) was flagged as bad, and all URLs have a “/” in them. In this case, the least significant bit didn’t matter, as a period also appears in every URL, but if a different bit had been off, then only sites with an apostrophe in the URL would have been flagged. The problem had to do with an entry into a database that is used to keep track of the bad web sites.

We all make mistakes sometimes, but in this case it might make sense for Google to add a well-known database best practice to its QC process. Input validation is not a brand new concept when working with databases. The lack of input validation has led to many security problems, especially with databases. Entire companies have been breached due to a lack of input validation. Input validation is one of the methods used to prevent an attack called “SQL Injection”. What is a little scary here is that you have a database that is very important and appears to be missing some basic input validation. In this case, perhaps Google should validate that the value entered into the specific field that allowed the “/” to wreck everything is at least 2 or 3 characters long. Maybe there is a technical limitation, but if one character can flag every site on the Internet, it probably doesn’t make sense to allow a single character in that field.
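As a rough illustration of that kind of check, here is a minimal Python sketch. It is not how Google's system works: the function name `clean_pattern`, the three-character minimum, the substring matching, and the short list of known-good URLs are all assumptions made for this example. The second check, which refuses any entry that would match a known-good URL, goes a step beyond the length check suggested above.

```python
MIN_PATTERN_LENGTH = 3  # assumed threshold, not a documented value

# Hypothetical URLs that must never be flagged; a real list would be larger.
KNOWN_GOOD_URLS = (
    "http://www.example.com/",
    "http://www.example.org/index.html",
)


def clean_pattern(pattern: str) -> str:
    """Validate a blacklist entry and return it with surrounding whitespace removed.

    Raises ValueError if the entry is too short or would flag a known-good URL.
    """
    candidate = pattern.strip()
    if len(candidate) < MIN_PATTERN_LENGTH:
        raise ValueError(
            f"pattern {candidate!r} is shorter than {MIN_PATTERN_LENGTH} characters"
        )
    # Assumes entries are matched as plain substrings of the URL.
    for url in KNOWN_GOOD_URLS:
        if candidate in url:
            raise ValueError(f"pattern {candidate!r} would flag {url}")
    return candidate


if __name__ == "__main__":
    for entry in ("/", ".com", "malware.example/bad"):
        try:
            print("accepted:", clean_pattern(entry))
        except ValueError as err:
            print("rejected:", err)
```

With these assumptions, a lone “/” is rejected by the length check, and an overly broad entry such as “.com” is rejected because it would flag a known-good site.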

I’m going to guess that some people at Google have already started exploring input validation since that incident! Usually when something like this happens, the team responsible is going to have to provide management with a corrective action report that details how they will make sure that they don’t repeat the same mistake. I would expect that for this type of thing to happen again, it will take…

…a little bit more than one byte.

Randy Abrams
Director of Technical Education

Author, ESET

  • http://www.sbg.ro Bogdan

    Hi there,
    I think Google prepares something for the large audience because at their level you don’t do such thing not even as a mistake. From my point of view it’s better this way. I can make all my websites I have in portfolio to respect that rule and never show any slash ever, using frames or iframes or even dhtml. Easy :)

    Thanks for article.
