5 Powerfully Awesome Htaccess Redirect Tricks [How To]

by Joseph McCullough August 31st, 2011 


So you already understand the importance of proper URL normalization within a site's structure. Great! … but where do you go from there? The htaccess file allows you to override certain decisions that the server makes in the background. What this means for SEOs is that we can manipulate the information the server sends back to the client in a way that search engines and users find superior.

Your Basic 301

301 is the HTTP status code for a permanent redirect. Thus, a 301 command invoked by htaccess lets search engines know that any authority/links accumulated by the old URL should be directed to the new URL forever. 301 redirects are the bread and butter of URL normalization.

The Syntax

The 301 redirect is extremely simple:

redirect 301 /relative/path/to/file.php http://www.yoursite.com/path/to/new/file.php

The Need for More Power

Redirecting pages using the standard 301 command shown above is extremely simple when you have a small number of pages to consider.

However, once the rules become more complex, such as a directory or site transfer, the redirect 301 command becomes extremely time-consuming.

Creating a redirect 301 for thousands of pages can easily take days of boring grunt work.

The solution? The Apache Rewrite tool.

A Word on Regular Expressions

While the following examples in the rest of the article may be slightly modified to fit your needs, I highly encourage you to learn regular expressions. They aren't nearly as difficult as they seem.

For a starting point, you can check out my introduction to regular expressions guide, and then grab some more tutorials from Regular-Expressions.info.

Additionally, you can view my more advanced guide to RegEx, covering back references, quantifiers, and anchors.

A Simple Example Using the Rewrite

Before you declare any RewriteRules, you need to tell .htaccess to turn on the Rewrite Engine.

Place the following near the top of your .htaccess file:

RewriteEngine On

The syntax for a RewriteRule is as follows:

RewriteRule url_pattern file_reference [FLAGS]

Let's define these:

  • URL Pattern: A Regular Expression pattern that will trigger the Rewrite
  • File Reference: The file that will be displayed
  • FLAGS: Optional enhancements such as redirection or case insensitive matching
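
To make the three parts concrete, here is a minimal sketch of a complete rule. The names old-page and new-page.php are made up for illustration; swap in your own.

```apache
# Turn the rewrite engine on once, before any rules.
RewriteEngine On

# url_pattern: ^old-page$    file_reference: new-page.php    FLAGS: [NC,L]
# NC = case-insensitive match, L = stop processing further rules in this pass.
RewriteRule ^old-page$ new-page.php [NC,L]
```

Note that multiple flags are separated by a comma with no space.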

For example, let's assume your old server only let you serve up plain html files. Now that you've moved to a host that doesn't use floppy disks, you've decided to implement some php.

You had a ton of html files that are now php files, and you don't want to incur any duplicate content issues or lose the link juice those html pages had garnered.

The solution:

RewriteBase /

RewriteRule ^(.*)\.html$ $1.php [R=301,L]

So now all requests to www.yoursite.com/whatever.html get redirected to www.yoursite.com/whatever.php

Explanation:

Let's start off with the URL pattern.

^(.*)\.html$

Now section by section:

^ The caret symbol matches the start of a string. In our case, the string is the URL. Rarely do you exclude the caret; without the caret, you can introduce ambiguities.

(.*) The dot operator for regular expressions matches any character. The star quantifier following represents "0 or more instances". Thus .* together yields "any character, 0 or more times". By '0 or more', I do not imply that the character has to be repeated: .* matches the strings aaaaa and 23e2323. Any character, any number of times.

The parentheses around the dot and * tell the engine to store (remember) these matched characters in what we call a back reference (a bit like a variable). This allows us to (re)use these characters later, which is extremely useful!

\.html$ We know that .html is an extension. However, as defined above, the dot operator stands for any character. So the pattern ".html" to a regular expression engine can be "9html" or "lhtml", anything! By using the escape character \, we tell the engine that we want to literally match a period.

The dollar sign matches the end of the string. The combination of the caret and dollar sign ensures our pattern matches the entire URL and not just a substring of the URL.
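
To see what the anchors and the escaped dot buy you, compare the rule with two looser variants. This is only a sketch: the file names in the comments are invented, and it assumes RewriteEngine On and RewriteBase / are already set, as in the example.

```apache
# Anchored and escaped: matches "about.html",
# but not "about.html.bak" and not "site9html".
RewriteRule ^(.*)\.html$ $1.php [R=301,L]

# No ^ or $: would also match "about.html.bak", because the
# pattern only has to appear somewhere in the URL.
# RewriteRule (.*)\.html $1.php [R=301,L]

# Unescaped dot: would also match "site9html", because the
# bare dot accepts any character in front of "html".
# RewriteRule ^(.*).html$ $1.php [R=301,L]
```

The two looser variants are commented out so that pasting the block does not change behavior.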

Now that we've finished the pattern, let's take a look at the file reference.

$1.php The parentheses in the URL pattern helped us capture all characters up until the .html extension. We are able to access these back references through the dollar sign.

$1 refers to the first back reference, $2 refers to the second, and so on.
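
A pattern with two pairs of parentheses shows how $1 and $2 line up. The directory layout here (articles, blog, and the year parameter) is hypothetical and only serves the illustration.

```apache
# /articles/2011/some-post.html  ->  /blog/some-post.php?year=2011
# $1 holds the four-digit year, $2 holds the post name.
RewriteRule ^articles/([0-9]{4})/([^/]+)\.html$ /blog/$2.php?year=$1 [R=301,L]
```

Back references are numbered by the order of their opening parenthesis, left to right, so they can be reused in any order in the file reference.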

Now files such as website.com/file.html will reference website.com/file.php

But in this example, we don't want to stop at the reference. If we left off the flags at the end, the rule would simply serve the contents of file.php while the URL still reads file.html. That's just duplicate content!

By placing the [R=301,L] flags at the end of the Rewrite Rule (note that there is no space after the comma), we give the signal to actually redirect to file.php and have that reside in the URL.
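
Putting the pieces together, the two behaviors can be compared side by side. This is a sketch of the example rule with and without the redirect flag; the second rule is commented out for comparison.

```apache
RewriteEngine On
RewriteBase /

# External redirect: the browser is sent to file.php with a 301
# status, and the address bar changes.
RewriteRule ^(.*)\.html$ $1.php [R=301,L]

# Internal rewrite (no R flag): file.php is served, but the
# address bar still shows file.html.
# RewriteRule ^(.*)\.html$ $1.php [L]
```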

Take a Deep Breath!

This might seem a bit complex at first, but after familiarizing yourself with basic regular expressions, you will easily understand rewrite rules such as this one.

More Rewrite Snippets

Force www in URLs:

RewriteCond %{HTTP_HOST} !^www\.yourwebsite\.com [NC]

RewriteRule ^(.*)$ http://www.yourwebsite.com/$1 [R=301,L]
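
The RewriteCond line restricts the rule that follows it: the redirect only fires when the requested host does not already start with www. If you would rather not hard-code the domain, one possible variant (an assumption on my part, not part of the snippet above) reuses whatever host the visitor requested:

```apache
# Only act when the host does not already begin with "www.".
RewriteCond %{HTTP_HOST} !^www\. [NC]

# Prepend "www." to the requested host and keep the path in $1.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
```

This version will also add www to any subdomain pointed at the same directory, so the hard-coded form is safer when you serve several hosts.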

URLs without extensions should be executed via php.

For example, website.com/work will load website.com/work.php, but the url will not have the extension (and extensionless URLs are sexy!)

RewriteRule ^$ index.php

RewriteRule ^((?!(\.|\.php)).)*$ $0.php
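
The second rule above appends .php to any URL that contains no dot, whether or not a matching file exists. A common alternative, sketched here as an assumption rather than as the author's method, checks the filesystem first:

```apache
# Skip real directories such as /images/.
RewriteCond %{REQUEST_FILENAME} !-d

# Only rewrite when <requested path>.php actually exists on disk.
RewriteCond %{REQUEST_FILENAME}.php -f

# Serve work.php for /work without changing the visible URL.
RewriteRule ^(.+)$ $1.php [L]
```

Requests for pages that do not exist then fall through to the normal 404 handling instead of being rewritten to a missing .php file.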

Redirect index.php to the root

Options +FollowSymLinks

DirectoryIndex index.php

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/

RewriteRule ^index\.php$ http://www.yourwebsite.com/ [R=301,L]

Using Rewrite with WordPress:

Let's say you're using WordPress and you want a pretty URL to represent listing books by author on a books page. WordPress executes its own rewrite rules that route everything back to index.php, so many of your rewrite rules will not work because you're probably trying to rewrite a URL that has already been rewritten!

The workaround is to use the pagename GET parameter of index.php.

RewriteRule ^author/(.+)$ index.php?pagename=books&author=$1

So the url will appear as www.buybooks.com/author/king, but the server will load the php file as if www.buybooks.com/index.php?pagename=books&author=king was entered. This means you have access to the values via $_GET on the books page in your theme! Pretty handy for a WP developer wanting to organize information.
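
Where the rule sits in the file matters. Here is a rough sketch, assuming a stock WordPress .htaccess: the custom rule goes above the block WordPress generates, so it runs before the catch-all rule that sends requests to index.php. The WordPress block shown is the typical default and may differ on your install, and the [L] flag on the custom rule is my addition.

```apache
RewriteEngine On
RewriteBase /

# Custom rule first, outside the WordPress markers.
RewriteRule ^author/(.+)$ index.php?pagename=books&author=$1 [L]

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

Keeping the rule outside the BEGIN/END markers also matters because WordPress can regenerate everything between them.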

The world of Regular Expressions is quite fascinating. Rewrites allow your sites to become both more organized AND more flexible!

Joseph McCullough

Joseph McCullough is the lead developer for Vert Studios, a web design company with personality based in Tyler, Texas.


4 Responses to “5 Powerfully Awesome Htaccess Redirect Tricks [How To]”

  1. Alex says:

    Awesome post, Joseph. Breaking down a real example helped me understand it much better. I would love to see more examples explained like this.

  2. Tom Parker says:

    Great info and power tips Joseph! My experience with .htaccess has been pretty basic so far, and some things I used by copy and paste, knowing they worked, but not knowing exactly how it worked.

    I'm bookmarking this for closer study and reference later. Thanks!

  3. Joseph, thanks for a very well-written and valuable post. I used to refer to http://www.webconfs.com/how-to-redirect-a-webpage.php, but now I'm bookmarking this post. Very clear and to the point. Thanks a bunch!
