An unofficial blog that watches Google's attempts to move your operating system online since 2005. Not affiliated with Google.


February 12, 2008

Google Toolbar and 404 Error Pages

I find it very strange that people have abnormal reactions when Google does something. People have an incorrect perception of the "don't be evil" mantra and like to say that Google doesn't respect it every time Google does something debatable. I didn't hear too many people complaining that Internet Explorer replaces default 404 error pages with its own page, but when Google Toolbar does that, it suddenly hijacks web sites.

Let's take a look at a simple example of a site that doesn't have a custom 404 error page (they're very hard to find these days, so most sites won't fall into this category). If you try to go to news.speeple.com/sunflowers, here's what you see in IE7: a page with useful suggestions like "Retype the address" or "Go back to the previous page".



This is actually a page created by Internet Explorer and you can disable it in the advanced settings, by unchecking "Show friendly HTTP error pages". Here's the page returned by the server, which is displayed in most browsers (Firefox, Opera, etc.):



The latest version of Google Toolbar has a feature, disabled by default, that replaces IE's error pages with more useful suggestions: a link to the site's homepage or subdomain and some search queries that could help you locate the right page. The idea is that you probably clicked on a bad link or the page was relocated without using a redirect. Google's query segmentation is not perfect, but it usually does a pretty good job of transforming a URL into a useful query. To obtain the suggestions, the toolbar sends the URL to Google's servers, so this feature has privacy implications. More exactly, the suggestion page is obtained from:

http://linkhelp.clients.google.com/tbproxy/lh/fixurl?sourceid=navclient&hl=en&sd=com&error=http404&url=http://news.speeple.com/sunflowers
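
To make the privacy point concrete, here is a minimal Python sketch of how such a request URL could be assembled. The endpoint and the parameter names are copied from the URL above; the function name `build_fixurl`, its arguments and the guessed meanings of `hl` and `sd` are assumptions made for illustration, not a description of the toolbar's actual code.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are taken from the URL quoted in the post.
FIXURL_ENDPOINT = "http://linkhelp.clients.google.com/tbproxy/lh/fixurl"


def build_fixurl(broken_url: str, language: str = "en", domain: str = "com") -> str:
    """Return the suggestion-page URL for an address that failed to load.

    Hypothetical helper: the real toolbar may build the request differently.
    """
    params = {
        "sourceid": "navclient",
        "hl": language,       # assumed to be the interface language
        "sd": domain,         # meaning not documented in the post; assumed
        "error": "http404",
        "url": broken_url,    # the full address the user tried to visit
    }
    # safe=":/" leaves the target address unescaped, as in the post's example.
    return FIXURL_ENDPOINT + "?" + urlencode(params, safe=":/")


print(build_fixurl("http://news.speeple.com/sunflowers"))
```

Note that the complete address of the missing page travels in the `url` parameter, which is why the feature has privacy implications.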


Google Toolbar only displays that page for default error pages (those smaller than 512 bytes), DNS errors and connection failures. The feature can be enabled from Google Toolbar's settings by checking "Browse by name in the address bar", an option that also performs searches when you enter keywords in the address bar.
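
The rule can be sketched as a small Python predicate. This is only an illustration of the behavior described in this post: the 512-byte limit comes from the text, while the function name, the use of `None` for a request that never got a response, and the restriction to status 404 are assumptions.

```python
from typing import Optional

# Size limit mentioned in the post; smaller error bodies count as "default" pages.
DEFAULT_PAGE_LIMIT = 512  # bytes


def should_show_suggestions(status: Optional[int], body: bytes = b"") -> bool:
    """Decide whether a suggestion page would replace the server's response.

    status is None when no HTTP response arrived at all
    (DNS error or connection failure).
    """
    if status is None:
        return True
    # A short 404 body is treated as the server's default error page.
    return status == 404 and len(body) < DEFAULT_PAGE_LIMIT


default_page = b"Not Found\n\nThe requested URL was not found on this server."
custom_page = b"<html>" + b"x" * 600 + b"</html>"

print(should_show_suggestions(404, default_page))  # short default page
print(should_show_suggestions(404, custom_page))   # custom page, left alone
print(should_show_suggestions(None))               # DNS or connection failure
```

Under this rule a webmaster only needs an error page of 512 bytes or more to keep it visible to users.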

So which of the three pages is more helpful for someone who ends up on a nonexistent page of a site that didn't bother to create a custom 404 error page?

Related:
Matt Cutts' reaction
Google tries to fix broken URLs
Browsing the web using Google Cache

27 comments:

  1. It could be useful for the fact of Google Cache. This is a bit on the invasive side, though.

  2. Hey
    I really like your blog but I think you might be missing a minor point here.
    (I don't have a windows install so I might be wrong. If so, I am very sorry and please delete this entry)
    IE7 will just display some tips on what to do (I suppose the page is stored locally in some DLL), while Google will try to get you to search. So Google is trying to push its search box everywhere. I think if Google had just replaced the text with a little more helpful information, cut out its search box and stopped trying to put its logo everywhere, people would not be so concerned. Further, is this a standard (locally saved) page, or what does Google transfer to display these links? And does it log the clicks somehow?

  3. I don't think the purpose is to search, but searching is one of the ways to find that page, in case it exists. The first option is to go to the site's homepage and find the page from there. If the site has a sitemap, Google also links to it.

    Unfortunately, the page is not generated locally because segmenting the URL is not an easy task and because the suggestions are dependent on the URL. So you're sent to a web page:

    http://linkhelp.clients.google.com/tbproxy/lh/fixurl?error=http404&url=URL

    You can test the link above with different URLs.

    Maybe Google should provide a separate option for this feature and include more explanations. Overall, I think it's useful and you'll not see the Google-generated page very often.

  4. But... what happens if a webmaster has set up a particular page to display in case of a 404 error ? ie :
    http://pages.ebay.com/sefgzergheqarg
    or
    http://www.amazon.com/fzgegze
    ?
    That's when Google's option is a problem (if it DOES redirect to a Google page in that case)

  5. If the webmaster has set up a custom error page, you'll see that error page. Google Toolbar follows the same procedure as Internet Explorer: if the error page returned by the server has less than 512 bytes, it's a default error page and they replace it. Otherwise, the user sees the custom error page.

    Most sites have a custom error page (including Amazon, eBay, Google, Yahoo, Facebook, etc.). By the way, do you know an important web site that doesn't have custom HTTP error pages?

  6. I found some sites that don't return custom error pages:

    http://www.hi5.com/hjhkhjh/jkljlj
    http://www.myspace.com/jkljl/jkjkjlj (the IIS error page is bigger than 512 bytes)
    http://rapidshare.com/kjljlkj/jkjljkl
    http://www.baidu.com/kjljlkj/jkjljkl
    http://www.imageshack.us/kjljlkj/

  7. Are you sure you are not from Google?
    Your views are always too biased.

  8. The only point of this post was to tell that if you think Google Toolbar hijacks 404 error pages, then Internet Explorer also hijacks them (both replace those pages with something else). I don't know if I'm biased, but I've always tried to tell what I think:

    * Google is your default search
    * Google forces you to install Google Pack
    * Froogle Checkout
    * Search, no longer the main feature of Google Desktop
    etc.

  9. The "Google hijacking" and the "IE hijacking" are of very different nature. Google is collecting valuable information from their users. IE offers a local feature. There is no "spying" from MS.

  10. I don't think the purpose is to search. We get to the pages we are looking for by going to the site's home page.
    I am a newbie and I am not very good at finding pages. If a link is broken, then there is no way I can find what I am looking for.

  11. Important update. The feature is not enabled by default. I've uninstalled/reinstalled the toolbar and the feature was disabled. The custom error pages are part of the "browse by name", although Google doesn't explicitly mention this.

  12. Yet another example of Google changing site content, less than 512 bytes or not, it is not what the site owner wants. This shows no respect by Google for content producers as usual. Not quite as evil as adding links to a page which clearly is facilitating the creation of an unauthorized derivative work. Hopefully one day someone will stand up to their masses of lawyers and start wiping their ass in court.

  13. As already mentioned, Internet Explorer (the browser used by more than 70% of the people) doesn't respect the webmasters either. If webmasters cared about their users, they would create custom error pages with alternate links, site map, search box etc. By displaying:

    "Not Found

    The requested URL was not found on this server."

    you're not very helpful.

  14. >>>Important update. The feature is not enabled by default.

    I just installed it and it was enabled by default.

  15. @Justin:
    That's strange. If I click on "Restore defaults", Browse by Name is disabled. The same happens when I uninstall/reinstall the toolbar. Maybe you've already had an old version and Google Toolbar 5 preserved the settings.

    --> Screenshot

  16. I took the url you posted in comment 3 and used a non-existent url with the base url = my website (my website gets no respect from Google anyhow).

    error=http404&url=www.egorg.com/library.html

    When the error appears, nothing having to do with my website appears. The search term offered isn't even from my base site; it's treated like an anagram.

    I know I'm in google's db. I even pay adwords to advertise the site. At least, please, offer the option of going to my home page.

  17. "I find it very strange that people have abnormal reactions when Google does something."

    Aside from the "Don't be evil" mantra, how many major corporations have actually created an aura of trust and corporate responsibility?

    Why are Google and MS treated differently when doing seemingly the same thing? Many users trust Google (right or wrong) and expect the worst from MS.

    People enjoy Google's customer-centric approach and are quick to keep the organization on its toes. The additional attention should be appreciated.

  18. Yep, what happens when a site already has its own customized 404 page?

  19. I have a problem on my 404 page. This is the only page where I have the Google search, and when a user lands on the 404 page the Google search box is pre-populated with the text "404". I read that this is a server setting somewhere, but now I can't find the article. Suggestions?

  20. I have a problem with the Google crawl: I can't find a 404 page on my site, but Google says I have one 404 page.....

  21. Google Error


    We're sorry...
    ... but your query looks similar to automated requests from a computer virus or spyware application. To protect our users, we can't process your request right now.

    We'll restore your access as quickly as possible, so try again soon. In the meantime, if you suspect that your computer or network has been infected, you might want to run a virus checker or spyware remover to make sure that your systems are free of viruses and other spurious software.

    If you're continually receiving this error, you may be able to resolve the problem by deleting your Google cookie and revisiting Google. For browser-specific instructions, please consult your browser's online support center.

    If your entire network is affected, more information is available in the Google Web Search Help Center.

    We apologize for the inconvenience, and hope we'll see you again on Google.
    To continue searching, please type the characters you see below:

  22. This comment has been removed by the author.

  23. Out of all the toolbars I have tried, I like Google's the best. I don't use any other search engine anyway. Really, why would I? Most webmasters concentrate their SEM on Google anyway. I understand all the babble about privacy, but if Google getting a little bit of data helps them make Google search better, then I am for it, plus I see my stock go up :)

    Besides, I am sure IE will have their own version soon enough once they realise they can profit somehow from it.

  24. This is an extremely helpful article. I'm in the process of upgrading the HTTP Error Handler plugin for WordPress (http://wordpress.org/extend/plugins/askapache-google-404/) and got some nice ideas on how to make it even more robust. As far as not being evil... everyone and their mothers are now doing this type of hijacking. It's becoming the new malware, since everyone from Yahoo, Bing, anti-virus companies, DNS companies, etc. is after search engine revenue.

    So far Google hasn't been too bad, we are all lucky they have such a responsible company.

    2c

  25. Yeah, I noticed all the broadband companies do that now! I'm in the UK and noticed Virgin substitutes their own search page if you get the address wrong! Easy htaccess, innit!

  26. If webmasters build websites properly in the first place with a customised 404 page, surely it would be better for all

  27. There is a problem in IE with a custom 404 page of over 512 characters when Google Toolbar is installed and "Provide suggestions on navigation errors" is turned on.
    Instead of redirecting to Google's own page, it seems that the toolbar strips 512 characters no matter what, thereby destroying the custom 404 page.

    http://codeigniter.com/forums/viewthread/222803/

