1. What mod_rewrite Does Not Do:
A. Write a 'fake' URL in a browser.
B. Change anything except where the response comes from: either the browser is sent to a different location, or the information for the requested page is taken from a different location on the server.
2. What mod_rewrite Does:
A. Delivers a different result when a specific request is made.
Example:
Someone requests the page yoursite.com/stuff.html
You can use mod_rewrite to redirect the browser to any page of your choosing, say: yoursite.com/stuff.php
-or-
You can serve the information from any page of your choosing including: yoursite.com/stuff.php without changing the location in the browser.
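As a minimal sketch of the two options above, assuming the rules live in the .htaccess file in the site's top-level directory and that stuff.php exists there (only the file names come from the example, the rest is an assumption), you would pick one of these two rules:

```apache
RewriteEngine On

# Option 1: external redirect. The browser is told to request /stuff.php,
# so the location shown in the browser changes.
RewriteRule ^stuff\.html$ /stuff.php [R=302,L]

# Option 2: internal rewrite. The server quietly serves stuff.php,
# and the browser still shows /stuff.html.
# RewriteRule ^stuff\.html$ /stuff.php [L]
```

Note that in .htaccess the pattern is written without a leading slash, because the directory part of the path is stripped before the rule is tested.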
3. What has to be in place before you can use mod_rewrite:
A. The mod_rewrite module has to be installed and enabled on the server.
B. You must have AllowOverride set to FileInfo or higher (All, etc.) if your rules go in .htaccess.
C. You must be able to follow symbolic links, usually with Options +FollowSymLinks, either in httpd.conf (server configuration) or in the .htaccess file itself.
D. You must precede your rules and conditions with RewriteEngine on
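Put together, the checklist might look something like the sketch below. The directory path is a placeholder, the module path is the usual default but can differ between installs, and the two halves belong in two different files as the comments say:

```apache
# --- In httpd.conf (server configuration) ---
# Loads the module (item A). The path may differ on your server.
LoadModule rewrite_module modules/mod_rewrite.so

# "/var/www/html" is a placeholder for your own site directory.
<Directory "/var/www/html">
    # Item B: lets .htaccess files here use mod_rewrite directives
    AllowOverride FileInfo
    # Item C: lets the server follow symbolic links
    Options +FollowSymLinks
</Directory>

# --- In the .htaccess file, before any rules or conditions ---
# Item D
RewriteEngine On
```

If you would rather set Options +FollowSymLinks in the .htaccess file itself, AllowOverride also has to permit Options there, not just FileInfo.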
4. When using conditions:
A. Know that a condition (or group of conditions) only affects the rule immediately following it.
B. Know that conditions are only read after the rule's pattern matches the request. If the rule's pattern does not match, its conditions are never checked.
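Here is a small sketch of both points, assuming a site at www.example.com and made-up page names:

```apache
RewriteEngine On

# The condition applies to the rule directly below it, and nothing else.
# It is only checked once the request has matched ^old\.html$.
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^old\.html$ /new.html [R=301,L]

# This rule has no condition of its own, so it is applied for every host name.
RewriteRule ^other\.html$ /another.html [L]
```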
5. When creating variables in rules to be re-used later, they are always designated by ().
A. Variables in a rule can be used (“back-referenced”) either in the right side of the rule (new location), or in any conditions.
B. They are gathered in the order they appear, and are retrieved by preceding the number of the variable with a $.
Example:
RewriteRule ^(var1)/no-var/(var2)$ /to-use-variables-type-$1-and-$2
Result:
to-use-variables-type-var1-and-var2
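The same idea with real patterns might look like this. It is a sketch only: show.php, its parameter names, and the .php files are hypothetical, and the -f test (true when the file exists) is an extra not covered above.

```apache
RewriteEngine On

# $1 = the letters, $2 = the digits, numbered in the order the () appear.
# A request for news/page/42 is served by /show.php?section=news&id=42
RewriteRule ^([a-z]+)/page/([0-9]+)$ /show.php?section=$1&id=$2 [L]

# $1 from the rule is also available in that rule's condition:
# rewrite "about" to /about.php only when about.php really exists.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([a-z]+)$ /$1.php [L]
```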
6. When creating variables in conditions to be re-used later, they are always designated by ().
A. Variables from a condition can be used (“back-referenced”) on the right side of the rewrite rule, or in the test string of a later condition attached to the same rule.
B. They are gathered in the order they appear, and are retrieved by preceding the number of the variable with a %.
C. Only the variables created in the last-matched RewriteCond are available in the RewriteRule.
Example:
RewriteCond %{CONDITION_STUFF} ^(var1)/no-var/(var2)
RewriteRule ^no-var/no-var/no-var$ /to-use-variables-type-%1-and-%2
Result:
to-use-variables-type-var1-and-var2
* The exception to this is that you can also use the %{CONDITION_STUFF} server variable on the right side of the rule, but it must be written exactly as it is in the condition:
Example:
RewriteCond %{CONDITION_STUFF} ^(var1)/no-var/(var2)
RewriteRule ^no-var/no-var/no-var$ /%{CONDITION_STUFF}-to-use-variables-type-%1-and-%2
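With a real server variable in place of %{CONDITION_STUFF}, a sketch could look like the following. The example.com host, the /sites/ directory, and whoami.php are assumptions for illustration, and the host pattern assumes the Host header carries no port number.

```apache
RewriteEngine On

# %1 is the first () in the condition: the sub-domain name.
# On shop.example.com, a request for index.html is served
# from /sites/shop/index.html
RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.example\.com$ [NC]
RewriteRule ^index\.html$ /sites/%1/index.html [L]

# The server variable itself can also go on the right side of the rule,
# next to the variable taken from the condition.
RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.example\.com$ [NC]
RewriteRule ^whoami$ /whoami.php?host=%{HTTP_HOST}&name=%1 [L]
```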
7. When Using Regular Expressions and Conditions
With regular expressions and conditions it becomes much easier to create an infinite loop. Be careful, and test every rule before installing it in the main directory.
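To show how such a loop can happen, here is a sketch that assumes the rules sit in the .htaccess file of the top-level directory and that index.php is a hypothetical front controller:

```apache
RewriteEngine On

# Risky: the rewritten path "index.php/..." still matches ^(.*)$,
# so in .htaccess it gets rewritten again and again until the server gives up.
# RewriteRule ^(.*)$ /index.php/$1 [L]

# Safer: skip requests that already point at index.php.
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^(.*)$ /index.php/$1 [L]
```

The condition is the only difference between the two versions. It stops the rule from matching its own output.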
Regular Expressions
I guess I should start by describing a regular expression. (They aren't too scary once you get to know them.) A regular expression is basically a small piece of code that checks for patterns. The pattern can range from one that matches a single character to one that matches absolutely everything.
- There are some predefined 'terms' in regular expressions to make your life easier. (At least, they are supposed to make your life easier.) Here is a short list, with what each does in the mod_rewrite setting.
- [ ] enclose the expression or a portion of the expression. (Used for determining the characters, or range of characters, to be matched.)
- letter-letter (EG [a-z] matches any single lowercase alphabetical character in the range of a to z, so [c-e] will match any single character that is the lowercase letter c, d, or e.)
- LETTER-LETTER (EG [A-Z] matches any single capital alphabetical character in the range of A to Z, so [C-E] will match any single character that is the capital letter C, D, or E.)
- number-number (EG [0-9] matches any single number in the range of 0 to 9, so [4-6] would match any single number 4, 5, or 6.)
- character list (EG [dog123] matches any single character that is either d, o, g, 1, 2, or 3.)
- ^ has two purposes. When used at the start of the characters inside [ ], it designates 'not'. (EG [^0-9] would match any character that is not 0 to 9 and [^abc] would match any character that is not a lowercase a, b, or c.) When used at the beginning of a pattern in mod_rewrite, it designates the beginning of a 'line'.
It is very important to understand and remember [dog] does not match the word 'dog', it matches any individual lowercase letter d, o, or g anywhere in the comparison. In the same way, [^dog] does not exclude the word 'dog' from matching, it excludes the lowercase letter d, o, or g from matching individually.
- To match a 'word' or a group of characters in order, you do not need to use [] so ^dog$ would match the word dog, and not d, o, or g as a single character.
- . (a dot) matches any single character, except the ending of a line.
- ? matches 0 or 1 of the characters or set of characters in brackets or parentheses immediately before it. (EG a? would match the lowercase letter 'a' 0 or 1 time, (abc)? would match the phrase 'abc' 0 or 1 time, while [a-z]? would match any lowercase letter from 'a to z' 0 or 1 time.)
- + matches 1 or more of the characters or set of characters in brackets or parentheses immediately before it. (EG a+ would match the lowercase letter 'a' 1 or more times, (abc)+ would match the phrase 'abc' 1 or more times, while [a-z]+ would match 1 or more lowercase letters from 'a to z'.)
- * matches 0 or more of the characters or set of characters immediately before it. (EG a* would match the lowercase letter 'a' 0 or more times, (abc)* would match the phrase 'abc' 0 or more times, while [a-z]* would match 0 or more lowercase letters from 'a to z'.)
- These are the basic building blocks of regular expressions as used in .htaccess and associated with mod_rewrite. By themselves, they do little, but when you put them together, they become very powerful.
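To see a few of these building blocks inside actual rules, here is a rough sketch. All of the target file names and parameter names are made up for illustration.

```apache
RewriteEngine On

# ^dog$ matches the whole word "dog"; [dog] would match a single d, o, or g.
RewriteRule ^dog$ /pets.php?animal=dog [L]

# s? makes the "s" optional: matches "product" and "products".
RewriteRule ^products?$ /catalog.php [L]

# [a-z]+ is one or more lowercase letters; [0-9]* is zero or more digits.
# Matches "page", "page7", and "chapter12".
RewriteRule ^([a-z]+)([0-9]*)$ /view.php?name=$1&n=$2 [L]

# [^/]+ is one or more characters that are not a slash.
RewriteRule ^files/([^/]+)$ /download.php?file=$1 [L]
```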
- Along with regular expressions, mod_rewrite allows for the use of special characters. It's a good thing to understand what these are before you begin writing rules. (Mainly because you need one or more of them in almost every rule.)
- RewriteRule tells the server to interpret the following information as a rule.
- RewriteCond tells the server to interpret the following information as a condition of the rule that comes immediately after it.
- ^ defines the beginning of a 'line' (starting anchor). Remember, ^ also designates 'not' in a regular expression, so please don't get confused.
- ( ) creates a variable to be stored and possibly used later, and is also used to group text for use with the quantifiers ?, +, and * described above.
- $ defines the ending of a 'line' (ending anchor), and when followed by a number from 1 to 9, also references a variable defined in the RewriteRule pattern (used for variables on the right side of the equation or to match a variable from the rule in a condition, see example below).
- % references a variable defined in a preceding rewrite condition. (Used for variables on the right side of the equation, see example below.)
*note* - The right side of the equation is the part of a RewriteRule that follows the pattern (the pattern usually ends with $). It is the new location the request is sent to.
Examples: All variables are given a number according to the order they appear. The following rule and condition each have two variables, defined by parentheses, so to use them you would put them where you need them in the results:
(the '-' is for spacing only to make the line more readable, and is not necessary to use variables.)
RewriteRule ^(var1)/no-var/(var2)$ /to-use-variables-type-$1-and-$2
The final result would look like this:
to-use-variables-type-var1-and-var2
RewriteCond %{CONDITION_STUFF} ^(var1)/no-var/(var2)
RewriteRule ^no-var/no-var/no-var$ /to-use-variables-type-%1-and-%2
The final result would look like this:
to-use-variables-type-var1-and-var2
To use a combination of the Condition and Rule Variables
RewriteCond %{CONDITION_STUFF} ^(var1)/no-var/(var2)
RewriteRule ^(var1)/no-var/(var2)$ /to-use-variables-type-$1-and-%2-$2
The final result would look like this:
to-use-variables-type-var1-and-var2-var2
- The exception to the above examples is that you can also use the %{CONDITION_STUFF} server variables in the right side of a rule, but they must appear exactly as in the condition:
RewriteRule ^(var1)/no-var/(var2)$ /type-%{CONDITION_STUFF}
- | (bar) stands for 'or', normally used with alternate text or expressions grouped with parentheses. (EG (with|without) matches the string 'with' or the string 'without'. Keep in mind that since these are inside parentheses, the match is also stored as a variable.)
- \ is called an escaping character. It removes the function from a 'special character'. (EG if you needed to match index.php?, which has both a . (dot) and a ?, you would have to 'escape' the special characters . (dot) and ? with a \ to remove their 'special' value. It looks like this: index\.php\?)
- ! is like the ^ in a grouped regular expression and stands for Not, but can only be used at the beginning of the pattern in a rule or condition, not in the middle.
- A single - (dash) on the right side of the equation stands for No Rewrite. (It is often used in conjunction with a condition to check and see if a file or directory exists.)
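Here is a sketch of the last four characters at work. The rules are separate illustrations rather than a tuned ruleset, and the file names and www.example.com are placeholders:

```apache
RewriteEngine On

# - means "no rewrite": requests for files that really exist are left alone.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule .* - [L]

# | means "or", and the () also stores the match: $1 is "with" or "without".
RewriteRule ^coffee/(with|without)$ /coffee.php?sugar=$1 [L]

# \. matches a literal dot instead of "any single character".
RewriteRule ^index\.html$ /index.php [L]

# ! means "not": the rule only applies when the host does NOT start with www.
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```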
Mod_Rewrite Directives for URL Redirection
Flags in mod_rewrite are what give you control of the response sent by the server when a specific URL is requested. They are an integral part of the rule writing process, because they designate any special instructions that might be needed. (EG If I want to tell everyone a page has moved permanently, I can add R=301 to my rule and they will know.)
Flags follow the rule and are enclosed in [ ]. To use more than one flag on a rule, separate them with commas, EG [R=301,L]. (Not all flags are covered here, but the main, widely used ones are.)
- [R] stands for Redirect. The default is 302-Temporarily Moved, so a plain [R] works and sends a 302. For a redirect, the status can be set to any number from 300 to 399 by entering it as [R=301] or [R=YourNumberHere], but 301 (Permanently Moved) and 302 (Temporarily Moved) are the most common.
** Do not use this flag if you are trying to make a 'silent' redirect.
- [F] stands for Forbidden. Any URL or file that matches the rule (and condition(s) if present) will return a 403-Forbidden response to anyone who tries to access them. (Useful for files that you would like to keep private, or you do not want indexed prior to 'going live' with them.)
- [G] stands for Gone. (Similar to 404-Not Found, but it indicates that a resource was intentionally removed.) Not recommended for use unless you test the HTTP protocol level used by the client and return 410-Gone only to HTTP/1.1 or enhanced HTTP/1.0 clients. Older true HTTP/1.0 clients will treat 410-Gone as 400-Bad Request.
- [P] stands for Proxy. This creates a type of 'silent redirect' for files or pages that are not actually part of your site and can be used to serve pages from a different host, as though they were part of your site. (DO NOT mess with copyrighted material, some of us get very upset.)
- [NC] stands for No Case as applied to letters, so if you use this on a rule, MYsite.com will match mysite.com even though they are not the same case. (This can also be used with regular expressions, so instead of [a-zA-Z], you can use [a-z] and [NC] at the end of the rule for the same effect.)
- [QSA] stands for Query String Append. This means the 'query string' (the stuff after the ?) from the original URL (the one we are rewriting) is added to any query string on the new URL, instead of being replaced by it.
- [L] stands for Last rule. As soon as this flag is read, no following rules are processed in that pass through the rules. In .htaccess, an internally rewritten URL is then run through the rules again from the top, which is how loops can start. (Every rule should contain this flag, until you know exactly what you are doing.)
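One rule per flag, as a rough sketch. Every file name and the other.example.com host are placeholders, and the rules are meant to be read one at a time:

```apache
RewriteEngine On

# [R=301]: tell the browser the page has moved permanently.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]

# [F]: answer 403 Forbidden for anything under drafts/.
RewriteRule ^drafts/ - [F]

# [G]: answer 410 Gone for a page that was removed on purpose.
RewriteRule ^retired\.html$ - [G]

# [P]: fetch the page from another host and serve it as our own.
# mod_proxy has to be enabled as well for this flag to work.
RewriteRule ^feed/(.*)$ http://other.example.com/feed/$1 [P]

# [NC]: matches about, About, ABOUT, and so on.
RewriteRule ^about$ /about.html [NC,L]

# [QSA]: item/7?color=red is served by /item.php?id=7&color=red
RewriteRule ^item/([0-9]+)$ /item.php?id=$1 [QSA,L]
```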
- To put regular expressions and mod_rewrite special characters together, here are some examples of what they do:
Goal: to match any lowercase words, or group of letters:
Possible Matches:
lfie, page, site, or information
Expression:
[a-z]+
Explanation: [a-z] matches any single lowercase letter. + matches 1 or more of the previous character or set of characters. When you put the two together you have a regular expression that matches lowercase letters from a to z over and over, until it runs into a character that is not a lowercase letter.
Goal: to match any words, or groups of letters, and store them in a variable:
Possible Matches:
lfie, Page, site, or Information
Expression:
([a-z]+) [NC]
Explanation: Same as above with the addition of () and [NC]. In mod_rewrite, () creates a single variable out of the regular expression, so the word matched is now in a variable. [NC] stands for 'No Case' (from mod_rewrite) specifying that the regular expression or regular text strings match both upper and lowercase letters. With this expression you can match any single word.
Goal: to match any word, or group of letters, then any single number, and store them in separate variables:
Possible Matches:
lfie1, Page2, site6, or InforMation9
Expression:
([a-z]+)([0-9]) [NC]
Explanation: Same as above, except notice there is no + in the number expression. This way, only a single number at the end will match. The letters are placed into one variable, and the number is placed into another.
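Dropped into a full rule, the expression might be used like this. The ^ and $ anchors and the show.php target are additions for the sake of the sketch:

```apache
RewriteEngine On

# ([a-z]+) becomes $1 and ([0-9]) becomes $2.
# A request for Page2 is served by /show.php?word=Page&digit=2
RewriteRule ^([a-z]+)([0-9])$ /show.php?word=$1&digit=$2 [NC,L]
```

[NC] only affects what the pattern matches. The variable keeps the letters in whatever case they were requested.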
Goal: to match any word, or group of letters, then any single number, and store them in the same variable:
Possible Matches:
lfie1, Page2, site6, or InforMation9
Expression:
([a-z]+[0-9]) [NC]
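Explanation: Same as above, except both expressions are now inside a single set of (), so the letters and the number are stored together in one variable. Notice the + immediately follows the [a-z] (no space) and comes before the [0-9] (again no space), so the + affects the [a-z], but not the [0-9].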
Goal: to match any word, or group of letters, then any group of numbers, and store them in the same variable:
Possible Matches:
lfie11, Page2, site642, or InforMation9987653
Expression:
([a-z]+[0-9]+) [NC]
Explanation: Same as above, with the addition of a + immediately following the numerical expression, to match 1 or more numbers instead of only 1.
Goal: to match any word, or group of letters, any group of numbers, and any random letters and numbers, which might or might not be mixed together:
Possible Matches:
11, gPaE, s17ite642, or 2CreateInfo4UisCool
Expression:
([a-z0-9]+) [NC]
Explanation: The change here is to the regular expression grouping. Putting a-z and 0-9 in the same grouping followed by [NC] matches any combination of letters and numbers.
Goal: to match any word, or group of letters, then a single /, then any group of numbers, and store only the numbers in a variable.
Possible Matches:
lfie/10, gPaE/1, site/642, or CreateInfoUisCool/2474890
Expression:
[a-z]+/([0-9]+) [NC]
Explanation: Using the [a-z]+ without () matches the letters as usual. By putting the / outside of any expression, the only thing that will match is the exact character of /. Then using the ([0-9]+) again, stores any group of numbers in a variable.
Goal: to match anything before the / and store it in a variable, then match anything after the / and store it in a separate variable:
Possible Matches:
lfie/10.html, gP..aE/1page_two.file, si-te/642-your-site, or CreateInfo/245390.php
Expression:
([^/]+)/(.+)
Explanation: This uses two new forms of regular expressions, and is actually easier than it may seem. The ^ (not) character inside [ ] matches anything that is not a /, and the () again saves it in a variable. Then, using the same form as above, the single, exact character / is matched. Finally, the . (dot) character is used, because it matches any single character that is not the end of a line, and when combined with the + character, matches anything up to a line break. Once again, () are used to create the variable. *Also, notice that using 'catch-alls' eliminates the need for the [NC] 'flag' of mod_rewrite.
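The last two expressions could be wrapped into rules along these lines. Again this is only a sketch: the ^ and $ anchors, show.php, and browse.php are assumptions added to make complete rules.

```apache
RewriteEngine On

# Only the digits are stored: site/642 is served by /show.php?id=642
RewriteRule ^[a-z]+/([0-9]+)$ /show.php?id=$1 [NC,L]

# $1 is everything before the first /, $2 is everything after it:
# si-te/642-your-site is served by /browse.php?dir=si-te&rest=642-your-site
RewriteRule ^([^/]+)/(.+)$ /browse.php?dir=$1&rest=$2 [L]
```

Order matters here. The narrower rule comes first, because the catch-all below it would also match anything the first rule matches.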