htaccess, Apache and Rewrites
Posted by SEOWH Admin on 22 October 2014 03:26 PM
As a web designer or developer, it is important to know how to use the htaccess file to your advantage. It is a very powerful tool, and can even work as a deterrent for bandwidth thieves, exploits, and hackers. Below are some common examples of rules to consider when developing websites.
For any of these rules to work, you first need to turn on the rewrite engine.
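A minimal sketch of the two lines discussed next, assuming a typical shared host where these options are allowed in an htaccess file:

```apache
# Line 1: disable directory listings and allow symbolic links to be followed
Options -Indexes +FollowSymLinks
# Line 2: turn on the rewrite engine
RewriteEngine On
```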
The first line of code is a security feature that must be enabled for the rewrite rules to work. In most cases your host has already set this up, but it won't hurt to list it again here. It tells the server to disable directory listings and to follow symbolic links; for Windows users, a symbolic link is roughly the equivalent of a shortcut. You may not need this line, but it is important to understand why you might have to add it.
The second line is what actually turns the rewrite engine on. On its own it does nothing for us, though; it only enables the rules that follow.
There are a couple of reasons why you would use a basic redirect. Let's say you just re-structured and organized your files, but you wanted visitors who were still using the old filename to be able to access the new one. They might have bookmarked the page, or found it in a search engine.
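As a rough sketch of such a redirect, assuming the rewrite engine is already on and using the hypothetical file names old.html and new.html:

```apache
# Permanently redirect requests for old.html to new.html.
# The caret anchors the match at the start of the requested path,
# and the leading slash makes the target relative to the site root.
RewriteRule ^old\.html$ /new.html [R=301,L]
```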
The first part of the rule is looking for a request for
There is also a caret in front of
One more thing... about that forward slash in front of
It just depends on what you find easier. The base for rewriting URLs will always be the root of your website.
There is also another way to perform redirects without using mod_rewrite. We can use mod_alias instead:
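A sketch of two equivalent mod_alias forms, with hypothetical file names and www.example.com standing in for your own domain:

```apache
# Either line alone performs a permanent (301) redirect
Redirect 301 /old.html http://www.example.com/new.html
RedirectPermanent /old.html http://www.example.com/new.html
```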
Both lines of code work, but you only need to use one. We could also do a temporary redirect:
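The temporary variants can be sketched the same way, again with hypothetical file names and a placeholder domain:

```apache
# Any one of these lines performs a temporary (302) redirect
Redirect 302 /old.html http://www.example.com/new.html
RedirectTemp /old.html http://www.example.com/new.html
# With no status given, Redirect defaults to a temporary redirect
Redirect /old.html http://www.example.com/new.html
```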
All three lines would work, but we only need to use one. If you don't specify anything after the
What if you recently converted your site to PHP, but all of the old filenames were using the
Anytime a request for a file with the
Of course, you could always change the way Apache handled HTML files, letting them act like PHP files instead.
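Both approaches might look roughly like this. The handler name in the commented-out line is an assumption that only applies to some mod_php setups, so check with your host before relying on it:

```apache
# Approach 1: redirect every request for an .html file to its .php twin
RewriteRule ^(.*)\.html$ /$1.php [R=301,L]

# Approach 2 (assumption: PHP runs as an Apache module on your host):
# let Apache treat .html files as PHP instead of redirecting them
# AddHandler application/x-httpd-php .html
```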
But file extensions are so ugly! Maybe you want to give the illusion that your individual files are actually directories:
So we could have a bunch of files in our root directory that looked like this:
And they could instead look like this:
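One way to sketch this, assuming the rewrite engine is on and hypothetical files such as about.php and contact.php sit in the site root:

```apache
# Serve /about/ from /about.php without changing the visible URL.
# The condition makes sure the matching .php file really exists.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([A-Za-z0-9_-]+)/$ /$1.php [L]
```

There is no R flag here, so the rewrite happens internally and the visitor keeps seeing the directory-style URL.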
One of the first things you should decide when you create a new site, or at least early on, is whether you're going to have the 'WWW' in your domain or not. This is important not only aesthetically, but also for search engine optimization.
By forcing visitors and search engines to your preferred domain, you can guarantee that you won't end up with duplicate results or different page ranks for your domain with or without the 'WWW'. I personally think domains look naked without them, and there is a reason why we use the 'WWW' in the first place.
Not everyone enters the 'WWW' when they type in a domain. Some leave it out, while others always type it in. If you're familiar with the keyboard shortcuts built in to your browser, simply typing the domain without the TLD extension (
You might have been to a site before and noticed that if you don't type the 'WWW' in, you get an error page telling you that the page cannot be found. It all depends on how your host or server administrator has decided to set this up, but usually your domain will work with or without the 'WWW'.
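A rule that forces the 'WWW' might be sketched as follows, assuming the rewrite engine is already on and the site is served over plain HTTP:

```apache
# Line 1: only act when the hostname does not already start with www.
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Line 2: redirect to the same host and path with www. in front
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
```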
The first line is a rewrite condition that is looking at the current hostname, which can be something like
The second part of this line checks whether the hostname is missing the 'WWW' in front. If it is missing, processing moves to the second line, which redirects the request to the same domain with the 'WWW' added. If the 'WWW' is already there, the rule is skipped because the condition in the first line is not met.
A 301 code is a permanent redirect, and that is important because it also tells search engines that this isn't a temporary thing (which is what a 302 code would be).
Others prefer to remove the 'WWW', and for this rule, we're just doing the exact opposite of the previous example.
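The reverse rule could be sketched like this, under the same assumptions (engine on, plain HTTP):

```apache
# Capture whatever follows www. in the hostname as %1
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# Redirect to the bare hostname, keeping the requested path
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```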
The use of a trailing slash is also important to consider. By default, your web browser adds a trailing slash after a bare domain name, and Apache adds one when a directory is requested without it. It makes sense to have the trailing slash, except on files, because it means we're in a directory.
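A rule that adds the missing slash might be sketched as follows, assuming the rewrite engine is already on and the htaccess file sits in the site root:

```apache
# Line 1: skip requests for files that exist
RewriteCond %{REQUEST_FILENAME} !-f
# Line 2: skip URLs that already end in a slash
RewriteCond %{REQUEST_URI} !/$
# Line 3: otherwise redirect to the same URL with a slash appended
RewriteRule ^(.*)$ /$1/ [R=301,L]
```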
The first line is a special variant that says if a file is requested and it exists, then we should ignore this rule and not add a trailing slash to it. Remember, file names do not have a trailing slash after them, but directories do.
The second line checks to see if a trailing slash already exists, and if it does, then it can ignore this rule too because it does not need to add one.
The last line, of course, adds a trailing slash if the previous two conditions are not met, and we are using the
There are times when you do not want the trailing slash. If you want to do something like this, we can specify which directories to not apply this rule to.
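The same rule with an exclusion could look like this. The name "directory" is a placeholder for whichever directory you want to leave alone:

```apache
RewriteCond %{REQUEST_FILENAME} !-f
# New second line: leave /directory and everything inside it untouched
RewriteCond %{REQUEST_URI} !^/directory(/|$)
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ /$1/ [R=301,L]
```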
The second line is a new condition we have added to our rule. It says that if the requested URL contains a directory named "directory", then that directory, and everything inside of it, should not receive a trailing slash.
If your host has mod_dir enabled, make sure that you turn off the directory slash, which is enabled by default. This directive will add a trailing slash at the end of a directory regardless of the rules you set up. To disable this, add this to the top of your htaccess file:
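The mod_dir directive in question is DirectorySlash, and turning it off is a one-liner:

```apache
# Stop mod_dir from redirecting directory requests to add a trailing slash
DirectorySlash Off
```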
Be careful when turning off the trailing slash. Please read the following:
If you are adding a trailing slash to everything, then you don't even need to use this rewrite rule. If you want a specific directory to not receive a slash, though, you will need both the slash rule and this directive.
Just like with the 'WWW' example, some prefer to remove the trailing slash. It's a commonly debated question that you'll find around the Internet, but it just depends on what you prefer.
Your browser and even your server, by default, add a trailing slash to a directory. It is done for a reason. If you must strip the trailing slash, though, this is how you would do it:
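A cautious sketch, assuming the rewrite engine is on. Unlike a strict mirror of the add-slash rule, this version skips real directories so that it cannot fight with mod_dir and cause a redirect loop:

```apache
# Leave real directories alone
RewriteCond %{REQUEST_FILENAME} !-d
# Redirect anything else that ends in a slash to the slash-less URL
RewriteRule ^(.+)/$ /$1 [R=301,L]
```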
The explanation for this rule is the same as it is for when we want to add a trailing slash, just in reverse. We can also name specific directories that we don't want to apply this rule to.
Please see the note about mod_dir and the
If you've taken the time to convert your ugly URLs to pretty ones, you may not want people typing in query strings. This rule will strip any query string attached to the end of a URL.
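Such a rule might be sketched like this, assuming the rewrite engine is already on:

```apache
# Line 1: match only when the query string is not empty
RewriteCond %{QUERY_STRING} .
# Line 2: redirect to the same path; the trailing ? discards the query string
RewriteRule ^(.*)$ /$1? [R=301,L]
```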
The first line is checking to see if there is a query string. If it finds a query string, it removes it.
Sometimes a rogue question mark will remain at the end of your URL. It's certainly not pretty, and this rule will make sure you never see it again.
If you're using the previous rule to remove the query string entirely, you will want to use this rule after that one to remove the rogue question mark.
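One possible sketch of the question mark cleanup. It inspects the raw request line (for example "GET /page? HTTP/1.1") because an empty query string is otherwise invisible to the rewrite conditions:

```apache
# Match a request line whose path is followed by a bare ? and nothing else
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^?\s]*\?\sHTTP/
# Redirect to the same path without the question mark
RewriteRule ^(.*)$ /$1? [R=301,L]
```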
With more sites and freely available PHP software using rewritten URLs, a lot of people don't like to have any files in their URL. With the
The first line is looking at the request, and if it finds index.php, it moves to the second line, where it is stripped from the end. If you had other file extensions, like .html, we could add those to the condition as well:
We just wrap the file extensions in parentheses and separate them with a pipe.
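A sketch of the combined version, assuming the rewrite engine is on. Checking the raw request line keeps the rule from looping when Apache serves the index file internally:

```apache
# Line 1: the browser explicitly asked for index.php or index.html
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^?\s]*/index\.(php|html)[?\s]
# Line 2: redirect to the containing directory instead
RewriteRule ^(.*/)?index\.(php|html)$ /$1 [R=301,L]
```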
You might have been to a website where the URL looks something like this:
Blogs and CMS applications typically have an index file that handles all of the requests, which is why you see this. It's definitely worthless, and quite ugly. Not a problem though, we can easily remove that.
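A common way to sketch this, assuming the rewrite engine is on and the application expects the original path after index.php:

```apache
# Line 1: skip requests for files that exist
RewriteCond %{REQUEST_FILENAME} !-f
# Line 2: skip requests for directories that exist
RewriteCond %{REQUEST_FILENAME} !-d
# Line 3: hand everything else to index.php without changing the visible URL
RewriteRule ^(.*)$ /index.php/$1 [L]
```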
We have seen the first special variant before; the second one checks whether the request points to a directory that exists. If the request matches neither an existing file nor an existing directory, it is passed to our index.php file, without index.php showing in the URL.
This comes in handy if you have something like a secure order form or login area.
We can make sure people are using the secure version for these sections with the following rule. Just replace the word "directory" on the second line with the name of the directory that needs to use the secure version.
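As a sketch, assuming the rewrite engine is on and your host exposes the HTTPS variable to mod_rewrite:

```apache
# Line 1: only act when the request did not arrive over HTTPS
RewriteCond %{HTTPS} !=on
# Line 2: send /directory and everything inside it to the secure version
RewriteRule ^directory(/.*)?$ https://%{HTTP_HOST}/directory$1 [R=301,L]
```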
If visitors to your site are leaving a secure area and going back to non-secure portions of your site, it's a good idea to send them back to the standard HTTP protocol.
On the second line, enter the name of the directory whose visitors should not be sent to the non-secure version. It is most likely going to be the same as the directory you entered in the previous example.
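The reverse rule might be sketched like this, under the same assumptions and with "directory" again as a placeholder:

```apache
# Line 1: only act when the request arrived over HTTPS
RewriteCond %{HTTPS} =on
# Line 2: leave the secure directory alone
RewriteCond %{REQUEST_URI} !^/directory(/|$)
# Line 3: send everything else back to plain HTTP
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1 [R=301,L]
```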
If you don't want people looking at your .htaccess file, then this rule will take care of that.
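A sketch that should work on both Apache 2.4 and older 2.2 servers by testing which authorization module is loaded:

```apache
<Files ".htaccess">
    # Apache 2.4 and later
    <IfModule mod_authz_core.c>
        Require all denied
    </IfModule>
    # Apache 2.2 and earlier
    <IfModule !mod_authz_core.c>
        Order allow,deny
        Deny from all
    </IfModule>
</Files>
```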
If you need to enter an expiration time that isn't listed here, you can use Google to convert it to seconds for you.
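For reference, cache headers with expiration times in seconds might be sketched as follows. The file types and lifetimes are illustrative assumptions, and the block only takes effect if mod_headers is enabled:

```apache
<IfModule mod_headers.c>
    # Images: 30 days (30 x 86400 = 2592000 seconds)
    <FilesMatch "\.(jpg|jpeg|png|gif|ico)$">
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>
    # Stylesheets and scripts: 7 days (7 x 86400 = 604800 seconds)
    <FilesMatch "\.(css|js)$">
        Header set Cache-Control "max-age=604800, public"
    </FilesMatch>
</IfModule>
```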
There are times when you want to disable caching completely, and you would typically do this for dynamic pages that change a lot, like a blog article or forum post.
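One way to sketch this, assuming mod_headers is enabled and that your dynamic pages are served from .php or .html files:

```apache
<IfModule mod_headers.c>
    <FilesMatch "\.(php|html)$">
        # Tell browsers and proxies not to store or reuse the response
        Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
        Header set Pragma "no-cache"
    </FilesMatch>
</IfModule>
```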
Now that we've seen examples for individual rules, here is what your htaccess file might look like:
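A short combined sketch, not a complete file. It assumes a plain HTTP site that prefers the 'WWW' and routes requests through index.php; pick only the rules your site actually needs:

```apache
Options -Indexes +FollowSymLinks
RewriteEngine On

# Force the WWW
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

# Send everything that is not a real file or directory to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]

# Keep visitors out of the htaccess file itself (Apache 2.4 syntax)
<Files ".htaccess">
    Require all denied
</Files>
```

Redirects come before the internal rewrite so that visitors land on the preferred hostname before the request is handed to index.php.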