The rel="canonical" link duplicate content panacea
As many readers probably know, Google and other search engines recently announced support for a rel="canonical" link attribute value. The new attribute value canonical (not a tag, mind you; link is the HTML tag) can be used by website developers to specify which of several essentially similar web pages is the definitive version.
An SEO problem known as duplicate content arises when websites use different URLs, generally through parameters, to provide slightly different versions of a page, such as a printer-friendly version, or to support web analytics campaign tracking. In order to give search users unique choices, search engines tend to choose the “best” URL for a page, filtering out similar versions.
As Google engineer Matt Cutts noted in his SMX West presentation, the canonical attribute value for the link tag should be considered a solution of last resort. It is not a duplicate content panacea. It will not solve the problem of incoming links targeting multiple pages instead of concentrating their search engine love on one page. Nor will it solve the problem of duplicate URLs in web analytics reports. In each case the best solution will vary, from using CSS to create printer-friendly pages (the URL stays the same) to using robots.txt or X-Robots-Tag HTTP header directives to keep search engine robots from crawling or indexing duplicate content in the first place.
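To make the parameter problem concrete, here is a minimal Python sketch of how a site might derive the definitive URL from its variants and emit the matching link element. The parameter names (print, utm_source and so on), the example.com URLs and both function names are assumptions chosen for illustration, not anything the search engines prescribe.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that change presentation or tracking only,
# not the content of the page. A real site would list its own.
NON_CANONICAL_PARAMS = {"print", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Return the URL with non-canonical parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NON_CANONICAL_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the link element that would go in the page's head section."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s">' % href


print(canonical_link_tag("http://www.example.com/widgets?id=7&print=1"))
```

Every variant of the page would carry the same link element, so the printer-friendly and campaign-tagged URLs all point back to one definitive address.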
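The robots.txt and X-Robots-Tag alternatives mentioned above can be sketched in Python as follows. This assumes, purely for illustration, that the printer-friendly versions live under a /print/ path; the rule set, the example.com URLs and the helper names may_crawl and duplicate_headers are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt keeping all robots out of the print versions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "*") -> bool:
    """Check whether the rules above allow the given robot to fetch the URL."""
    return parser.can_fetch(agent, url)


def duplicate_headers(is_duplicate: bool) -> dict:
    """Extra HTTP response headers asking robots not to index a duplicate page."""
    return {"X-Robots-Tag": "noindex"} if is_duplicate else {}


print(may_crawl("http://www.example.com/print/widgets"))
print(may_crawl("http://www.example.com/widgets"))
print(duplicate_headers(True))
```

Note that the two techniques do not combine on the same URL: a robot that is blocked by robots.txt never requests the page, so it never sees the X-Robots-Tag header.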
Let’s play Google says…
What I actually find troubling about the Google announcement is how it was interpreted by many web professionals, or perhaps aspiring web professionals.
I have the impression that every time Google makes an announcement, people turn off their critical thinking skills and blindly pronounce “Google says X, so X must be good”, much like the children’s game Simon Says.
They either miss a significant caveat – in this case the point that this is a solution of last resort – or don’t think through the wider implications.
Don’t bother rewriting those complex URLs… if you’re my competition
Two other recent examples illustrate the same pattern. Last September Google said, in essence, “don’t worry about using rewrite techniques to manage URL complexity; we’ve become quite good at managing it”. Google’s attempt to manage URL complexity is admirable. Yet site developers should really control how URLs end up in search engine results, both which URLs appear and the form they take. In addition to raising potential duplicate content issues, complex URLs tend to discourage click-through and thus hurt conversion.
Please use Flash indiscriminately
Another Google post that became a green light for many to indulge in self-harm concerned better search engine interpretation of Adobe’s Flash format.
This quickly made its way around the web as “Google sanctions Flash usage”. The reality is more complicated. There are many reasons to avoid Flash, and Google’s announcement changed few of them. Google needs to make the best of a bad situation. That doesn’t mean you should too.
Don’t worry about Yahoo!, Microsoft Bing or Ask
Even if the common interpretation of Google’s announcements were on the mark, people also seem to forget that Google isn’t the only game in town. Well, actually, in many markets it is, but developers should still make SEO decisions based on their applicability to any search engine, today and tomorrow. This means, for example, avoiding site-specific settings in Google Webmaster Tools and Yahoo Site Explorer. In the case of the rel="canonical" link, Yahoo, Bing and Ask have each announced theoretical support, but when Danny Sullivan asked at SMX West when this support would actually be in place, each was noticeably noncommittal.
Mind you, I actually don’t mind people stumbling when interpreting Google’s announcements. I might be out of a job otherwise.
- Flash is still a problem for SEO (and the web) despite Google announcement
- Search engine support of rel=”” link attributes – cheat sheet
- Keep sections of web pages out of Yahoo! with class=”robots-nocontent”
- Google’s authorship rel=”author” markup: unfairly promoting Google+?