I spy with my little eye…
When Google Trends for Websites launched, Google Analytics Evangelist Avinash Kaushik noted that it (and Google Ad Planner) might have a searcher’s bias. Even if one accepts that this might have been the case in 2008, it is highly unlikely to be the case today, considering the many ways Google can track internet usage. The list that follows aims to illustrate this point. I put it together as a response to clients who were hesitant to use Google Analytics because, for whatever reason, they were reluctant to let Google know everything about the use of their websites.
The reality is that resistance is futile: through services like Google Search, the Google Toolbar and Google Public DNS, Google already knows a lot about most websites.
Some of the data collection points listed below are technical in nature – those with a marketing background might want to ask a web colleague with a technical background for clarification if needed.
- Google search. Google search is the only relevant search engine in most of the world, with a few notable exceptions (most particularly in the US, Russia, China, Japan and Korea). Google tracks the sites users select (just look in Google Webmaster Tools for the proof) and can track how much time passes before a user returns to refine their search should they decide they didn’t like the selected result.
- Google Analytics. Google has stated that over 10 million sites are using Google Analytics. Search engine Blekko found 12 million domains running Google Analytics, while BuiltWith found 15 million sites. Google Analytics not only tracks a specific site, it also tells Google which sites sent traffic to the tracked site! Officially, Google won’t use the data it collects from a site for other purposes unless the site owner opts in. Google Analytics users can publish their exact Google Analytics measured traffic values in Google’s Ad Planner by following a two-step process.
- Google AdSense. Google will have page view data for any page which includes Google AdSense code. Blekko found 434,879,314 URLs distributed over 3,718,507 domains.
- Google Web Fonts. Free Google-hosted fonts can be embedded into web pages, an elegant way around the limited choice of web-safe fonts. This provides Google with another data point, as Google can track which pages include Google Web Fonts. In February 2011 Google said that 800,000 sites used Google Web Fonts and that the number was growing 30% each month.
- Google code libraries hosting. Google hosts many popular code libraries, such as jQuery, which are included in dynamic web pages. While Google provides this as a public service to make the web faster, it also gives Google another data collection point.
- Google Public DNS. Google offers a free service to translate the internet addresses we use into numbers that computers understand. Generally this type of phone book service is provided by a user’s internet provider, but Google says theirs is faster. In February 2012 Google stated that they process 70 billion requests a day.
- Google Safe Browsing. This malware warning system is used by 600 million Chrome, Firefox, and Safari users. Apparently, as of API version 2, Google does NOT know which URLs are being viewed, as the URLs are “hashed”.
- Google Toolbar with extra features enabled, such as PageRank display. To display a page’s PageRank, Google needs to know which page is being viewed.
- Google +1 social button embedded in web pages. Blekko found 15 million pages on a total of 450,000 sites contained Google’s +1 social button.
- Google Reader RSS feed aggregator. Google knows which site feeds and which articles are read based on user clicks.
- Feedburner RSS feed analytics service. Google has data, albeit imperfect, on the number of RSS feed subscribers for sites using Feedburner.
- goo.gl URL shortener. Google has some click data for any link shortened using Google’s URL shortener service.
- Gmail web client. Google may know about links shared and clicked in email messages sent to Gmail and Google Apps users. I’ve made no attempt to investigate this.
- Google docs. Documents, spreadsheets and presentations may contain links which Google could monitor. I’ve made no attempt to investigate this.
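For readers with a technical background, the “hashed” URLs mentioned under Safe Browsing can be illustrated with a short sketch. As far as I understand it, the browser keeps a local list of short SHA-256 hash prefixes and only contacts Google when a visited URL's prefix matches one of them, so Google sees a prefix rather than the URL itself. The code below is a simplified illustration under that assumption, not the actual client: the function names and the example URLs are hypothetical, and the URL canonicalization a real browser performs is left out.

```python
import hashlib


def hash_prefix(url_expression: str, prefix_len: int = 4) -> bytes:
    """Return the first prefix_len bytes of the SHA-256 digest of a URL expression.

    Assumption: the input is already canonicalized (a real client does
    considerably more normalization before hashing).
    """
    digest = hashlib.sha256(url_expression.encode("utf-8")).digest()
    return digest[:prefix_len]


def needs_full_hash_check(url_expression: str, local_prefixes: set) -> bool:
    """True if the URL's hash prefix is in the locally stored list.

    Only in that case would the client ask the server for the full hashes
    behind the prefix; the server never receives the URL in clear text.
    """
    return hash_prefix(url_expression) in local_prefixes


# Hypothetical local list, as if downloaded earlier from the service.
local_prefixes = {hash_prefix("malware.example/bad.html")}

print(hash_prefix("malware.example/bad.html").hex())
print(needs_full_hash_check("malware.example/bad.html", local_prefixes))
```

The point of the design is that a 4-byte prefix is shared by a very large number of possible URLs, so a prefix on its own tells the server little about which page is actually being viewed.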