.htaccess stands for HyperText Access and is a configuration file supported by some web servers, most notably Apache. It allows site owners to manually configure how their site is seen or accessed, either at the root or at directory level. The file was originally used to set different levels of access for different directories of a site. Today these files are used for many more functions, from the original access controls to more sophisticated functions that rewrite URLs or handle errors.
All of the statements in the document have been tested and are correct at the time of writing. However, please check your site very carefully after using them, as we cannot guarantee they will be right in every scenario.
Where Should It Be Used?
On most smaller sites there will only be one .htaccess file, and it will be at the root of the domain. It typically contains a set of simple commands to allow access to the site and, in some cases, to rewrite the way URLs are shown to users and robots. Depending on the site and the CMS, this is also where you would normally find any URL redirects. If a specific IP address has been identified as misusing the site, you can use blocking functions to stop it accessing the site. The .htaccess file can also be used for cache control, which allows faster page load times where elements of the site can be saved locally in a user's browser.
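As a rough sketch of the last two uses mentioned above (blocking an IP address and cache control), something like the following could be added to the file. It assumes Apache 2.4 with mod_authz_core and mod_expires available; the IP address is a placeholder from the documentation range and the cache lifetimes are illustrative values only, not recommendations.

```apache
# Block a single misbehaving IP address (placeholder address, Apache 2.4 syntax)
<RequireAll>
    Require all granted
    Require not ip 203.0.113.42
</RequireAll>

# Let browsers cache static files locally (only applied if mod_expires is loaded)
<IfModule mod_expires.c>
    ExpiresActive On
    # Example lifetimes only; adjust to how often your files change
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType text/css "access plus 1 week"
</IfModule>
```

Whether these directives are honoured depends on what the server's main configuration allows .htaccess files to override, so check the result on a test site first.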
Best Practices For Updating
Updating the .htaccess file can be very easy or very tricky depending on your skill level and what you want to achieve. Although on the whole the functions in the file are simple to implement if you know what you are doing, incorrectly updating the file with even the slightest of errors can prevent your site from working correctly. Misplacing a character somewhere can break a redirect, prevent access to pages and directories or even block access to all users and robots from the site entirely.
Following a few best practice guidelines can help reduce the level of risk to your site and make it easier to continue managing the file in the best possible way.
- Backup before changes
It may seem like a very simple thing to suggest, but you would be surprised how often it is overlooked. If a change causes an error that negatively affects the site's performance, it is much easier to restore everything to normal from a working backup of the file.
- Comment the file
Over time it is likely that a number of different functions or changes will have been added, and it becomes easy to get lost within the file. This could inadvertently cause an error with a new change, or a misunderstanding when a different webmaster edits the file. To help avoid this, add a comment before each section of the file. To do this, simply put a # at the beginning of the line and describe what the function is for. Every line of a comment needs its own #, otherwise the server will try to read the line as a command. As an example:
# The function below is to redirect the non-www version of the site to the
# www version of the site
(redirect function code goes here)
The function will still work, and you (or another webmaster) can see at a glance why it is in place and what it is intended to do. You can see an example of how to redirect the site in the useful functions section of this post.
- Restrict Access
To better protect your site’s .htaccess file from malicious users or robots, it is important to restrict access to the file. This is simple to do, and it lets you sleep at night knowing that there is an additional layer of security over how your site is managed at the configuration level. You can see an example of how to restrict access in the useful functions section of this post.
- Test changes
This is another simple practice, but it is always worth being as careful as possible. Most larger sites have a test version of their site, and it is best to check that any changes to the file work as intended there before making them live. If you don’t have a test site, upload and check the file at a time when you know your site is not usually accessed. You can work this out by looking in Google Analytics and checking that there are no users on the site before making the change, so you don’t accidentally boot a user out.
Useful Functions
Below are a few of the most commonly used functions that can be implemented on your site to improve performance or manage access. If there are any which you use or feel are worth sharing, please feel free to add them in the comments at the end of the post.
Non-www to www Redirect
This is a very common function for redirecting your site at the top level. Sometimes a site can be accessed at https://yourwebsiteurl123.com/ and https://www.yourwebsiteurl123.com/ which can cause duplication and site management issues.
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ https://www.%{HTTP_HOST}/$1 [R=301,L]
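A variation on the rule above, offered only as a sketch: it also catches requests that arrive over plain HTTP and sends everything to one fixed hostname in a single redirect. The hostname is the same placeholder used in this post, and the sketch assumes the server itself terminates HTTPS. Behind a proxy or load balancer, %{HTTPS} may always report "off", which would cause a redirect loop.

```apache
RewriteEngine On
# Redirect if the request is not HTTPS, or if the host does not start with www.
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
# REQUEST_URI already begins with a slash; the query string is carried over automatically
RewriteRule ^ https://www.yourwebsiteurl123.com%{REQUEST_URI} [R=301,L]
```

Hardcoding the hostname avoids building the target from whatever Host header the visitor sent, at the cost of having to edit the rule if the domain changes.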
Clean up URLs to remove file extensions
If you want to make your URLs look a little nicer by removing the file extension from the end, you can use the function below. The rule lets the extensionless URL serve the matching .html file, but it does not change your links, so you should also ensure that all internal links within the site have the extension removed.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
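The rule above only makes the extensionless URL work; the old .html address still loads too, so both versions can be reached. One possible way to tidy that up, shown here as a sketch rather than a tested drop-in, is to redirect direct requests for .html URLs to the clean version. It would need to sit above the internal rewrite, and it assumes every .html page on the site should lose its extension.

```apache
RewriteEngine On
# Match only what the visitor actually requested (THE_REQUEST is the raw
# request line, e.g. "GET /page.html HTTP/1.1"), not the internally
# rewritten path, so the two rules do not loop
RewriteCond %{THE_REQUEST} \s/+(.+?)\.html[\s?] [NC]
# %1 is the requested path without the extension
RewriteRule ^ /%1 [R=301,L]
```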
Restrict access
To add an element of security to the .htaccess file you should add the following function. It will deny access to anyone attempting to reach the file externally. For example, if you add this to your file and then attempt to access it at https://www.yourwebsiteurl123.com/.htaccess you will be presented with a 403 Forbidden error. This means it's working!
<Files .htaccess>
order allow,deny
deny from all
</Files>
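The order and deny directives above use the older Apache 2.2 syntax, which Apache 2.4 only accepts when the compatibility module (mod_access_compat) is loaded. Assuming your server runs Apache 2.4, the equivalent would look something like this:

```apache
# Apache 2.4 syntax: refuse every request for the .htaccess file itself
<Files ".htaccess">
    Require all denied
</Files>
```

If you are not sure which version your host runs, it is worth asking before mixing the two styles in one file.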
Password protect directories or pages
To do this you need to create a .htpasswd file to store the access passwords for each user in a secure form. Once this is created, place the .htpasswd file in an appropriate location (ideally somewhere that cannot be reached through the browser) and add the function below to the .htaccess file in the directory you want to secure.
AuthType Basic
AuthName "My Protected Area"
AuthUserFile /path/to/.htpasswd
Require valid-user
To configure the .htpasswd file I recommend using a password generator to create a secure (hashed) version of the password; Apache's own htpasswd utility is one option. In the example below you will see my test username ‘test’ followed by a colon and then the secure version of the password ‘testpassword’. This is how each username and password should be stored, with one pair on each line:
test:$apr1$OnwiBYky$7BKeCkFf6fFE/HStx5D3P/
301 redirects
To redirect an old page to a new page on your (or an external) site you can use the following function. This will set the redirect up showing a 301 server status code and will ensure the appropriate page authority is passed from the old page to the new page.
Redirect 301 /oldpage.html https://www.yourwebsiteurl123.com/newpage.html
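Alternatively, with the rewrite engine switched on (RewriteEngine On), you can write these as a rewrite rule. Note that inside an .htaccess file the pattern is matched against the path without its leading slash, and the dot should be escaped so that it only matches a literal dot:
RewriteRule ^oldpage\.html$ https://www.yourwebsiteurl123.com/newpage.html [R=301,L]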
URL Error handling
Sometimes people type the wrong URL in or there is an error on the site that couldn’t be resolved at the time of the server request. It is important that the generic error page is not provided to the user as sometimes these can look poor and will probably end in the user leaving the site. If possible you should create an error page or pages to handle each error. Once these are created you can add the code below into your .htaccess file substituting the location in the example for the actual location of the pages. (As a rule you should also disallow these pages in your robots.txt file.)
ErrorDocument 400 /errors/badrequest.html
ErrorDocument 401 /errors/authreqd.html
ErrorDocument 403 /errors/forbid.html
ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /errors/serverr.html
Alternatives for other servers
Not all servers have an .htaccess file; some have a console to manage this sort of thing. For example, Microsoft servers have a configuration management console called IIS Manager, which is designed to manage ASP.NET configurations. At the time of writing, over 50% of websites are configured on an Apache server, so it's quite likely that you will come across an .htaccess file sooner or later. If you want to see what alternatives there are to the Apache server and learn about their characteristics or costs you can find details here.
As I mentioned earlier, if you have any other useful .htaccess functions that you want to share or any additional best practices that you think should be included, please feel free to add them in the comments.
Image Credit
Security concept by BigStock