When it comes to computers, there is no such thing as a little bit different. Consider the following. “0010 1111” is how a PC identifies a forward slash “/”. Each one and zero represents a bit. Eight bits, as you may know, is a byte.  0010 1110 is one little bit different from 0010 1111, but it is a period “.” And 0010 0111, which is one bit different from a “/” is the right paren “)” and also is only one bit different than a forward slash.

So why does a little bit matter? On Saturday, January 31, 2009 a mistake was made at Google that lead them to flag every site on the internet with the message "This site may harm your computer". Google blacklisted the Internet! The mistake was that any site with “/” in the URL (web address) was flagged as bad and all URLs have a “/” in them. In this case, the least significant bit didn’t matter as a period also appears in every URL, but if it was a different bit then only sites with a right paren would have been flagged. The problem had to do with an entry into a database that is used to keep track of the bad web sites.

We all make mistakes sometimes, but in this case there it might make sense for Google to add a well known database best practice to the QC. Input validation is not a brand new concept when working with databases. The lack of input validation has lead to many security problems, especially with databases. Entire companies have been breached due to a lack of input validation. Input validation is at least one of the methods used to prevent an attack called “SQL Injection”. What is a little scary here is that you have a database that is very important and appears to be missing some basic input validation. In this case, perhaps Google should validate that the value entered into the specific field that allowed the “/” to wreck everything, is at least 2 or 3 or more characters long. Maybe there is a technical limitation, but if one character can flag every site on the Internet, it probably doesn’t make sense to allow a single character in that field.

I’m going to guess that some people at Google have already started exploring input validation since that incident! Usually when something like this happens, the team responsible is going to have to provide management with a corrective action report that details how they will make sure that they don’t repeat the same mistake. I would expect that for this type of thing to happen again, it will take…

…a little bit more than one byte.

Randy Abrams
Director of Technical Education