A small percentage of search engine users may view a website through the search engine’s saved copy of its pages, known as the cached version. The cached copy served to the user usually links to embedded objects on the original site: images, CSS stylesheets, JavaScript, etc. Organizations focused on web marketing activities, such as search engine optimization, will want to track all search engine activity, including cached page views.
Because the browser fetches those embedded objects from the original site, referrers from the search engine’s cached copy show up in the site’s web server log files, including the keywords and keyword phrases used to find the cached copy. In some cases, the user will click through to the original website, and the request for the live page is then logged with the cache URL as its referrer.
Cache views are more difficult for Web Analytics software to recognize, but it can be done.
A Web Analytics tool must dissect the search engine referring URL as in this example:
http://64.233.179.104/search?q=cache:l5D4yOKeZaYJ:www.antezeta.com/search-engines-site-localization-duplicate-content.html+google+dialect&hl=en&ct=clnk&cd=9
| Item | Description |
|---|---|
| http://64.233.179.104/ | A known Google IP address. |
| search | The Google service. Others you may see include translate_c. |
| q=cache:l5D4yOKeZaYJ: | Indicates a query for an item in the cache. The cache ID is a 12-character alphanumeric string. |
| www.antezeta.com | Domain containing the item matching the query terms. |
| search-engines-site-localization-duplicate-content.html | Object matching the query terms (HTML page, PDF…). |
| google dialect | Query words entered by the user (encoded in the URL as +google+dialect). |
| hl=en | Google Interface Human Language code (English) |
| ct=clnk | Not needed |
| cd=9 | Not needed |
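The breakdown in the table can be sketched in a few lines of Python. This is only an illustration, not the AWStats code: the function name `parse_cache_referrer` and the returned field names are hypothetical, and the pattern assumes the 12-character alphanumeric cache ID described in the table.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption from the table: "cache:" + 12 alphanumeric characters + ":" + page,
# optionally followed by the words the user searched for.
CACHE_QUERY = re.compile(r"^cache:([A-Za-z0-9]{12}):(\S+)\s*(.*)$")


def parse_cache_referrer(referrer):
    """Return the parts of a cache referrer, or None if it is not a cache view."""
    parts = urlsplit(referrer)
    query = parse_qs(parts.query)  # decodes %xx escapes and turns '+' into spaces
    q = query.get("q", [""])[0]
    match = CACHE_QUERY.match(q)
    if match is None:
        return None
    cache_id, page, words = match.groups()
    domain, _, path = page.partition("/")
    return {
        "host": parts.hostname,               # e.g. a known Google IP address
        "service": parts.path.lstrip("/"),    # "search", "translate_c", ...
        "cache_id": cache_id,
        "domain": domain,
        "page": path,
        "keywords": words.split(),
        "language": query.get("hl", [None])[0],
    }
```

The `ct` and `cd` parameters are simply ignored, matching the "Not needed" rows in the table.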
In some cases, a user may view a search engine’s cached copy of a page without entering search words in a search engine. How? Through a search engine browser toolbar. Such a referrer will look like this example:
http://72.14.207.104/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2005-50,GGLG:en&q=cache:http%3A%2F%2Fwww.antezeta.com%2Fawstats.html
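A toolbar referrer of this shape carries a full, percent-encoded page URL after `cache:` and no search words. A minimal Python sketch of how it could be recognized follows; the function name `parse_toolbar_cache_referrer` is hypothetical, and treating `sourceid=navclient` as the toolbar marker is an assumption based only on the example above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_toolbar_cache_referrer(referrer):
    """Return the cached page behind a toolbar-style referrer, or None."""
    query = parse_qs(urlsplit(referrer).query)  # also decodes %3A, %2F, ...
    q = query.get("q", [""])[0]
    if not q.startswith("cache:http"):
        return None
    page = urlsplit(q[len("cache:"):])
    return {
        "domain": page.hostname,
        "page": page.path.lstrip("/"),
        "keywords": [],  # no search words were entered
        # Assumption: sourceid=navclient marks toolbar traffic, as in the example.
        "toolbar": query.get("sourceid", [""])[0] == "navclient",
    }
```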
We have added logic to the AWStats Web Analytics application Search Engine Recognition Module to better recognize Search Engine Cache query terms, page views and click-throughs to a site.
- The list of Google service IPs has been extended. To do: find a definitive list.
- Introduced logic to parse search keywords. It currently works only for Google cache IDs that contain no digits. The main AWStats program will probably have to be modified to recognize alphanumeric cache IDs.
- Google Translate traffic is currently counted as Google Cache traffic. Ideally, the two would be reported separately. This, too, appears to require a change to the main AWStats program.
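To illustrate the separation described in the last item, here is a rough Python sketch (AWStats itself is written in Perl, so this is not the module's code). The host set contains only the two addresses that appear in the examples above and is not a definitive list; the function name and the returned labels are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Only the two addresses seen in the examples above; not a definitive list.
KNOWN_GOOGLE_HOSTS = {"64.233.179.104", "72.14.207.104"}


def classify_google_referrer(referrer):
    """Label a referrer as 'cache', 'translate' or 'other'; None if not a known host."""
    parts = urlsplit(referrer)
    if parts.hostname not in KNOWN_GOOGLE_HOSTS:
        return None
    service = parts.path.strip("/")
    if service == "translate_c":
        return "translate"
    q = parse_qs(parts.query).get("q", [""])[0]
    if service == "search" and q.startswith("cache:"):
        return "cache"
    return "other"
```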




