.htaccess redirect to avoid duplicate content penalties
Last Updated on Saturday, 16 April 2011 12:17pm Written by spunky Saturday, 16 April 2011 11:34am
This might seem obvious, but it is seldom handled on newly launched sites. If your website can be reached at both yourdomain.com and www.yourdomain.com, Google can treat these as two completely separate sites and penalize you for duplicate content. You must redirect your visitors to either yourdomain.com or www.yourdomain.com, and not allow both.
It is important to address this problem as early as possible. Links to your site are created outside your control, and the search engines may already have indexed your website under both addresses, which is not easy to undo.
The solution is simple: force a 301 (permanent) redirect for all HTTP requests that arrive at the wrong hostname.
Redirect www.yourdomain.com to yourdomain.com
RewriteEngine On
RewriteBase /
RewriteCond %{HTTPS} off
RewriteCond %{HTTP_HOST} !^yourdomain\.com$ [NC]
RewriteRule ^(.*)$ http://yourdomain.com/$1 [R=301,L]
Redirect yourdomain.com to www.yourdomain.com
RewriteEngine On
RewriteBase /
RewriteCond %{HTTPS} off
RewriteCond %{HTTP_HOST} !^www\.yourdomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=301,L]
Break it down
Lines 1-2: Enable the rewrite engine and set the base URL for rewrites in the current directory (here, the site root).
Line 3: Restricts the rule to plain HTTP requests, so HTTPS requests are left alone. (edit: Thanks Al)
Line 4: Specifies the condition under which the following rule is triggered: HTTP_HOST (the hostname of the request) does not match (specified with “!”) the preferred hostname, anchored at the start (^) and the end ($). The final [NC] flag makes the hostname comparison case-insensitive.
Line 5: Describes the action performed when the above conditions are met. The first segment of the rule, ^(.*)$, is the important one: it captures the requested path, without the domain, as a variable ($1). The second segment defines the target of the rewrite rule, the final destination for the redirect, with $1 appended. The flags at the end determine how the redirect is performed: R=301 makes it a permanent redirect, and L marks this as the last rule, so no further rewrite rules are processed for the request.
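For reference, here is a sketch of the non-www rule set with the breakdown written as comments. The dots in the hostname pattern are escaped so they match only a literal dot, and the /about path in the example is made up for illustration; yourdomain.com is the same placeholder used throughout. Apache only accepts comments on their own lines, so each comment sits above the directive it describes.

```apache
# Lines 1-2: turn on the rewrite engine and set the base URL to the site root.
RewriteEngine On
RewriteBase /
# Line 3: only apply the rule to plain HTTP requests.
RewriteCond %{HTTPS} off
# Line 4: match any host other than yourdomain.com, ignoring case.
RewriteCond %{HTTP_HOST} !^yourdomain\.com$ [NC]
# Line 5: capture the requested path as $1 and send a permanent redirect.
# Illustrative request: http://www.yourdomain.com/about
# is redirected to:     http://yourdomain.com/about
RewriteRule ^(.*)$ http://yourdomain.com/$1 [R=301,L]
```

The www version is annotated the same way; only the hostname in lines 4 and 5 changes.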
Should probably slip a “RewriteCond %{HTTPS} off” in there.