Web Analytics: Embedded JavaScript Page Tracking Code vs. Web Server Log Files

Web Analytics tracking choices with advantages and disadvantages

Basic Web Analytics tools usually fall into one of two categories:

  • Web server log file based
  • JavaScript embedded page tags

Both have advantages and disadvantages.

By default, server logs contain much richer data than that usually tracked by JavaScript page tracking. For organizations focused on search engine visibility, web server logs show which pages have been crawled by each search engine crawler – and how recently.
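As a rough illustration of the crawl-recency point, the sketch below scans log lines and records the most recent visit by each crawler to each page. It assumes the logs are in Apache "combined" format with English month names; the `CRAWLERS` table and the `SAMPLE` lines are hypothetical and deliberately tiny, so a real deployment would need a fuller list.

```python
import re
from datetime import datetime

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+ "([^"]*)" "([^"]*)"$'
)

# Hypothetical, incomplete map of user-agent substrings to search engines.
CRAWLERS = {"Googlebot": "Google", "bingbot": "Bing"}


def last_crawls(lines):
    """Return {(engine, path): datetime of the most recent crawl}."""
    latest = {}
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like combined-format entries
        _host, ts, _method, path, _status, _referer, agent = m.groups()
        when = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
        for token, engine in CRAWLERS.items():
            if token in agent:
                key = (engine, path)
                if key not in latest or when > latest[key]:
                    latest[key] = when
    return latest


# Made-up sample entries, for illustration only.
SAMPLE = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [11/Oct/2023:08:12:01 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '157.55.39.1 - - [10/Oct/2023:09:00:00 +0000] "GET /about.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.7 - - [12/Oct/2023:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows NT 10.0; rv:120.0) Gecko/20100101 Firefox/120.0"',
]

if __name__ == "__main__":
    for (engine, path), when in sorted(last_crawls(SAMPLE).items()):
        print(engine, path, when.isoformat())
```

JavaScript page tags cannot produce this kind of report, since crawlers generally never run the tracking code.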

  • JavaScript page tracking code does not trigger when a page is downloaded by automated robots. Proponents of JavaScript systems tout this as beneficial – their systems only track human activity. This is really just putting a brave face on a limitation. Better web log file analysis systems are able to separate human from non-human traffic.
  • JavaScript systems also fail when HTML programmers forget to embed tracking code in newly written pages. No tracking code, no data.
  • JavaScript systems can slow down your pages if the tracking server is bogged down, as has happened with Google’s free hosted analytics service.
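To give a feel for how a log based tool might separate human from non-human traffic, here is a minimal sketch using two common heuristics: a user-agent string that looks like a robot, and a host that has requested /robots.txt. The `BOT_HINT` pattern and the `split_traffic` function are illustrative assumptions, not the method of any particular product, and real tools use much larger signature lists.

```python
import re

# Assumed heuristic: substrings that often appear in robot user agents.
BOT_HINT = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)


def split_traffic(hits):
    """Split (host, path, user_agent) tuples into (human, automated) lists."""
    # Browsers rarely ask for robots.txt, so treat hosts that do as automated.
    robots_hosts = {host for host, path, _agent in hits if path == "/robots.txt"}
    human, automated = [], []
    for hit in hits:
        host, _path, agent = hit
        missing_agent = not agent.strip("-")  # empty or "-" user agent
        if host in robots_hosts or BOT_HINT.search(agent) or missing_agent:
            automated.append(hit)
        else:
            human.append(hit)
    return human, automated
```

The same log therefore yields both views: human page views for marketing reports, and robot activity for search engine visibility checks.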

With JavaScript based Web Analytics systems, tracking code must also be added to each link in order to track non-HTML media, such as word processor documents, PDF files and images.

  • Proponents of JavaScript based systems often say that their page tracking is more accurate because their code is always executed when a user views a page. Implicit in this claim is the idea that web caching prevents a call to the site’s web server, so the page view doesn’t get logged. This is a myth. Part of configuring a web server log based web analytics system is ensuring proper web server cache directives. Simply telling a user’s web browser to check the server to see if a page has been modified results in accurate web log file data with minimal extra overhead. It also gives the user a more accurate browsing experience, since stale pages are not shown.
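The revalidation idea can be sketched as follows. The response carries `Cache-Control: no-cache`, which lets the browser keep a copy but requires it to check with the server before reusing it; an unchanged page is answered with a short 304 response, and that request still lands in the server log. The `respond` function and its arguments are hypothetical names for illustration, and a real site would normally set these headers in the web server configuration rather than in application code.

```python
from email.utils import formatdate, parsedate_to_datetime


def respond(request_headers, last_modified_ts):
    """Return (status, headers) for a page last changed at last_modified_ts.

    last_modified_ts is seconds since the epoch; request_headers is a dict.
    """
    headers = {
        # The browser may store the page but must revalidate before reuse.
        "Cache-Control": "no-cache",
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
    }
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_ts = parsedate_to_datetime(since).timestamp()
        except (TypeError, ValueError):
            since_ts = None  # unparseable date: fall through to a full response
        if since_ts is not None and int(last_modified_ts) <= since_ts:
            # Page unchanged: send headers only, no body. The hit is still logged.
            return 304, headers
    return 200, headers
```

A first visit returns 200 with the full page; a repeat visit that sends back the `Last-Modified` value as `If-Modified-Since` returns 304 until the page changes.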

As some users (not too many, it is true) disable JavaScript, many JavaScript based systems also include <noscript> tag logic which will still capture minimal data.

IT Managers like JavaScript systems because web log files can be onerous to manage. The logs get very big very quickly for highly trafficked sites. A few corrupted lines in the middle of the file (usually due to attempts to exploit Windows buffer vulnerabilities) can stop lesser web log file analysis systems in their tracks. A junior systems administrator won’t usually have the necessary command line experience to find and remove the offending lines. Web Analytics systems are by their nature resource intensive, which makes them a great candidate for outsourcing.
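For administrators who are more comfortable with a script than with command line tools, a sketch like the one below can set aside the offending lines before the log is handed to an analysis tool. It assumes a common or combined format log; the `LOG_SHAPE` pattern and the `MAX_LINE` limit are illustrative choices, not fixed rules.

```python
import re

# Assumed shape of the start of a valid entry:
# host ident user [time] "request" status bytes
LOG_SHAPE = re.compile(rb'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?:\d+|-)')

MAX_LINE = 8192  # assumed upper bound; overflow attempts tend to be far longer


def partition_lines(raw_lines):
    """Split raw byte lines into (good, bad) lists."""
    good, bad = [], []
    for raw in raw_lines:
        line = raw.rstrip(b"\r\n")
        has_control_bytes = any(byte < 0x20 for byte in line)
        if len(line) <= MAX_LINE and not has_control_bytes and LOG_SHAPE.match(line):
            good.append(line)
        else:
            bad.append(line)  # keep for inspection rather than discarding
    return good, bad
```

Reading the file in binary mode, for example `partition_lines(open(path, "rb"))`, avoids decoding errors on the very lines that are corrupted. Keeping the rejected lines in a separate list also leaves a record of the attempted exploits.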

So what to do? Many organizations which have adopted one Web Analytics solution or another underutilize the tools due to a lack of trained personnel. A lack of training also leads to less than optimal implementation, which in turn produces poor and misleading data. Thus, the first priority should be for internal staff to get up to speed with Web Analytics basics before committing to one solution or another.

In general, open source tools are more primitive than their commercial counterparts. Commercial tools offer more intuitive reports and add functionality such as click stream analysis and data drill down capabilities. Yet sometimes the free tools provide more detail than commercial tools. In marketing, including search engine optimization activities, referral information is very important. Some solutions are limited to providing domain level referrer information – AWStats offers page level URL referrer reporting.
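The difference between domain level and page level referrer reporting is easy to see in a small sketch. The code below tallies the same referrer values both ways; it is only an illustration of the two levels of detail, not a description of how AWStats or any other tool works internally, and the `referrer_reports` name and `own_host` parameter are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def referrer_reports(referrers, own_host):
    """Return (by_domain, by_page) Counters for external referrers."""
    by_domain, by_page = Counter(), Counter()
    for ref in referrers:
        parts = urlsplit(ref)
        host = parts.hostname or ""
        if not host or host == own_host:
            continue  # skip missing ("-") and internal referrers
        by_domain[host] += 1
        # Query strings are dropped here so that variants of a page group together.
        by_page[host + (parts.path or "/")] += 1
    return by_domain, by_page
```

The domain report only says that a site sent visitors; the page report shows which article or listing on that site carried the link, which is what a marketer usually needs to follow up.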

To avoid a mismatch between a tool and an organization’s needs, a business should consider getting up to speed with Web Analytics before investing in a software solution. Deployment of a free tool, such as AWStats (web server log analysis) or Google Analytics (JavaScript page tags), or both, can be a good start. We have written a two part guide to getting started with AWStats for O’Reilly. Businesses can also consider retaining a vendor neutral consultant to help them in a solution selection process.

About Sean Carlos

Sean Carlos is a digital marketing consultant & teacher, assisting companies with their Search (SEO + SEA = SEM), Social Media & Digital Media Analytics strategies. Sean first worked with text indexing in 1990 in a project for the Los Angeles County Museum of Art. Since then he has worked for Hewlett-Packard Consulting and later as IT Manager of a real estate website before founding Antezeta in 2006. Sean is an official instructor of the Digital Analytics Association and collaborates with Bocconi University. He is Chairman of the SMX Search and Social Media Conference, 12 & 13 November in Milan. He is also a co-author of the Treccani encyclopedic dictionary of computer science, ICT & digital media. Born in Providence, RI, USA, Sean received Honors in Physics from Bates College, Maine. He speaks English, Italian and German.