Draft ShortURL readme 26/6/2005
Requirements: Apache server with the mod_rewrite URL Rewriting module enabled, or Microsoft IIS server with the 3rd party ISAPI_Rewrite filter installed.
Introduction
To make your PostNuke site more user-friendly and search-engine friendly, the URLs have to be simplified and shortened, with no long query strings appended and with a more informative structure. Enabling ShortURLs in the Settings panel in PostNuke admin means URLs will be rendered so that they look like static pages to humans and search engines alike, making them easier to digest for both and convenient for posting links in forums and the like.
The schema for regular modules is as follows:
Module/Function-parameter1:value1-param2:value2... -paramN:valueN.(p)htm(l)
with the virtual file appearing within a virtual directory with the name of the current
module, and the parameters from the Query tagged on in pairs grouped by colons and separated by
hyphens. The extension is used for convenience to distinguish it as a virtual file, with any
one of 3 types recognised; html, htm, and phtml, the latter used when you need to distinguish
real HTML content on your site.
For instance, to view a Personal Message with id 4:
Messages/display-msgid:4.phtml
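As an illustration of the generic schema only, the following sketch splits such a ShortURL back into its parts. The function name parse_generic_shorturl() is hypothetical and not part of PostNuke; on a real site this translation is done by the rewrite rules on the server. It assumes parameter values contain no hyphens, dots or slashes.

```php
<?php
// Hypothetical helper, not part of PostNuke: splits a generic ShortURL of the
// form Module/Function-param1:value1-param2:value2.(p)htm(l) into its parts.
function parse_generic_shorturl($path)
{
    $pattern = '#^([^/]+)/([^-./]+)((?:-[^-:./]+:[^-./]*)*)\.(?:html|htm|phtml)$#';
    if (!preg_match($pattern, $path, $m)) {
        return null; // does not follow the generic schema
    }
    $params = array();
    if (isset($m[3]) && $m[3] !== '') {
        // Drop the leading hyphen, then split into name:value pairs
        foreach (explode('-', substr($m[3], 1)) as $pair) {
            list($name, $value) = explode(':', $pair, 2);
            $params[$name] = $value;
        }
    }
    return array('module' => $m[1], 'func' => $m[2], 'params' => $params);
}

$parts = parse_generic_shorturl('Messages/display-msgid:4.phtml');
// $parts['module'] is 'Messages', $parts['func'] is 'display',
// and $parts['params'] is array('msgid' => '4')
```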
For News there's a special schema like this:
Category/Topic/ArticleXXX-title-of-story.(p)html
with the article appearing inside the virtual folders with its Category and Topic, much like we would file information, and the filename anchored by the name ArticleXXX, the XXX being the story ID, and appended with the title of the story. Having the news articles anchored by a keyword like Article is a convenient visual cue that helps with identifying it as a News item, rather than just having the Story title on its own.
The aim is to emulate the way we think and organise information, rather than
just having an index file in the root of the site with a long nonsensical
query string attached, which search engines either can't digest or give
poor rankings to. It aims to be as simple and clear as possible while still
conveying the necessary information to the server. Some short URL schemes
simply string together a series of virtual folders, like
component/option,com_newsfeeds/catid,5/Itemid,7/
a real news link from another system (Mambo), which is search-engine friendly,
but doesn't mean much to the user.
So if you had a news story in the category Computers and the topic Postnuke called "PostNuke Shorturls", instead of having
modules.php?op=modload&name=News&file=article&sid=123&mode=thread&order=0&thold=0
you have
Computers/Postnuke/Article123-PostNuke-Shorturls.html
This is a clear, concise and informative link that tells the user and search engine alike something about the link before going there. With the old ShortURL scheme, it was simply Article123.phtml, which while concise doesn't say much about the article. Search engines like Google do take URL keyword relevance into account.
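The News schema can be sketched as follows. This is only an illustration: news_shorturl() is a hypothetical name, and the way the title is turned into a filename here (runs of non-alphanumeric characters become single hyphens) is an assumption, not necessarily what the ShortURL filter itself does. Category and topic names are used as given.

```php
<?php
// Hypothetical helper, not part of PostNuke: builds a News ShortURL of the
// form Category/Topic/ArticleXXX-title-of-story.(p)html
function news_shorturl($category, $topic, $sid, $title, $ext = 'html')
{
    // Assumed title handling: collapse anything that is not a letter or
    // digit into a single hyphen and trim hyphens from both ends
    $slug = trim(preg_replace('/[^A-Za-z0-9]+/', '-', $title), '-');
    return $category . '/' . $topic . '/Article' . (int) $sid . '-' . $slug . '.' . $ext;
}

echo news_shorturl('Computers', 'Postnuke', 123, 'PostNuke Shorturls');
// prints Computers/Postnuke/Article123-PostNuke-Shorturls.html
```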
Another example: For a Search by Author, instead of
modules.php?op=modload&name=Search&file=index&action=search&overview=1&active_stories=1&stories_author=msandersen
we have
Search/author-msandersen.html
the link is identified as being in the Search module, with the search function "author" tagged with the name of the author, making up a simple filename.
Some of the ShortURLs for popular 3rd party modules like PostCalendar and PNphpBB2 have been customised, so instead of the horrendously long URL
index.php?module=PostCalendar&func=view&tplview=&viewtype=day&Date=20050405&pc_username=&pc_category=&pc_topic=&print=
which with the generic ShortURLs would be
PostCalendar/view-viewtype:day-Date:20050405.phtml
the customised URL becomes
Calendar/05-04-2005/day.phtml
As you can see, this is not only shorter, but far easier to understand. Here the function name "view" along with the parameter names have been removed altogether as they are superfluous. That is only possible with a per-module custom filter, since we know what the function and parameter names are. This does however require matching rules in the .htaccess file.
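To make the idea of a per-module filter concrete, here is a minimal sketch of the PostCalendar case. The function calendar_shorturl() is a hypothetical name used only for illustration; it assumes the Date parameter is always in YYYYMMDD form, as in the example above.

```php
<?php
// Hypothetical per-module filter for illustration: turns the PostCalendar
// Date (YYYYMMDD) and viewtype parameters into Calendar/DD-MM-YYYY/viewtype.ext
function calendar_shorturl($date, $viewtype, $ext = 'phtml')
{
    if (!preg_match('/^(\d{4})(\d{2})(\d{2})$/', $date, $m)) {
        return null; // unexpected date format, leave the link alone
    }
    // $m[1] = year, $m[2] = month, $m[3] = day
    return 'Calendar/' . $m[3] . '-' . $m[2] . '-' . $m[1] . '/' . $viewtype . '.' . $ext;
}

echo calendar_shorturl('20050405', 'day');
// prints Calendar/05-04-2005/day.phtml
```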
Enabling URL Rewriting
The server, however, must be able to understand these short URLs and translate them into their proper long form.
To enable Short URLs, first of all you must either be hosted on an Apache server with the URL rewriting module (mod_rewrite) enabled or have your own Microsoft IIS server with a 3rd party rewrite filter (ISAPI_Rewrite). Apache on Windows has mod_rewrite enabled by default, and many Linux servers do as well, such as Red Hat Linux. You may have to ask your host provider for details, and whether they will enable it if it isn't. It is unlikely an IIS server has the 3rd party filter installed, and the configuration file must be installed on the host.
For Apache, if it isn't already, rename the provided shorturl.htaccess file in your site's docs folder to .htaccess and place it in your site's root (main) folder. On Unix systems, dot-files are hidden files, so ensure you have enabled viewing of Hidden files in your file manager if you can't see it. If you already have an .htaccess file in your PostNuke root, you can combine them if they don't clash. Windows Explorer won't let you rename a file to start with a dot, as it considers it an empty filename with a long extension, so if you're hosted either upload the file first and rename it there with your FTP client or whatever is used to upload files, or open it in a simple text editor like Notepad and resave it without the prefix.
If you're hosted:
You can test what server you are hosted on with the phpinfo() PHP function, placed in a simple page:
<html>
<head>
<title>PHP Info</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<?php phpinfo(); ?>
</body>
</html>
You can upload this file as phpinfo.php to your site and access it. It will spew out a long detailed list of your PHP setup, including what server you are hosted on.
Go to 'Find' in your browser, usually under 'Edit',
and enter 'SERVER_SOFTWARE'. For hosting, it must be an Apache
server, unless you can persuade the host to install ISAPI_Rewrite with the
provided configuration file.
Search for 'mod_rewrite'. Sometimes it provides a list of loaded modules,
and if it's a Unix/Linux server, you may be able to tell if it has been compiled with the module.
See below on how to test if it's enabled and working.
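If you'd rather not search through the whole phpinfo() listing, the same two pieces of information can be read directly. This is a sketch only: rewrite_status() is a hypothetical name, and apache_get_modules() is typically only available when PHP runs as an Apache module, so the script falls back to a plain message otherwise.

```php
<?php
// Hypothetical helper for illustration: reports the server software and,
// where PHP can list Apache's modules, whether mod_rewrite is loaded.
function rewrite_status()
{
    $software = isset($_SERVER['SERVER_SOFTWARE']) ? $_SERVER['SERVER_SOFTWARE'] : 'unknown';
    if (!function_exists('apache_get_modules')) {
        // PHP is not running as an Apache module (or this is not Apache)
        return $software . ': module list not available';
    }
    $loaded = in_array('mod_rewrite', apache_get_modules());
    return $software . ($loaded ? ': mod_rewrite is loaded' : ': mod_rewrite is not loaded');
}

echo rewrite_status();
```

A "not available" result doesn't mean rewriting is off, only that PHP can't tell; the test described under Enabling Short URLs is still the definitive check.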
On some server setups, where the URL path does not match the physical path, it is necessary to edit the .htaccess file in a text editor to set the RewriteBase directive near the top:
# Uncomment (remove #) and set URI of your site if needed, path from site root
# eg http://www.example.com/nuke = /nuke   http://www.example.com/ = /
RewriteBase /nuke
As is explained, the RewriteBase is the path from the site domain name to your site root, without a slash on the end, eg /nuke. In the case of it being installed in the root of the site, this is simply /.
If you're retrofitting an old "classic" theme to support ShortURLs as described below, you may want to have your site set to a different theme, which is a general precaution when testing new themes. You can test the new theme by entering www.yoursite.com/index.php?theme=ThemeName in the URL bar, where ThemeName is the new theme, paying attention to correct case of the name. Unix servers are very particular, and using a lowercase letter where uppercase is required will get you an error.
If you have your own Apache server with mod_rewrite:
First ensure the mod_rewrite URL Rewriting module is enabled in Apache. Edit Apache's main configuration file, httpd.conf, in a text editor (not a word processor), and search for rewrite_module.
If you use Apache 1.3, make sure these two lines are uncommented (no hash # in front)
LoadModule rewrite_module modules/mod_rewrite.so
and a little further down,
AddModule mod_rewrite.c
Apache2 only has the first line.
The path must point to the server's modules directory, and the file mod_rewrite.so must be in there. Once you restart the server, mod_rewrite should be enabled.
It is recommended for performance that once you have confirmed the rewrite rules work in the .htaccess file as described in the next section, you move them into Apache's config file httpd.conf, as by necessity processing of .htaccess files comes a long way down the line, after all URLs have been converted to absolute file paths on the server. So to process the rewrite rules, the paths have to be retranslated into URLs, a relatively slow process.
Adding the rewrite rules to Apache's main config file:
Copy the .htaccess file to the Apache configuration
directory and rename it something like ShortURL.conf,
for instance /etc/apache/conf/ShortURL.conf or C:\Apache2\conf\ShortURL.conf
or whatever your Apache config directory is.
In an appropriate place in httpd.conf (not nested within another Directory directive), add:
############### M O D _ R E W R I T E ####################
<Directory "C:/Apache2/htdocs/nuke">
RewriteEngine On
RewriteRule ^$ index.php
Include conf/ShortURL.conf
</Directory>
RewriteLog "logs/Rewrite.log"
# RewriteLogLevel from 0 to 9, 0 is off
RewriteLogLevel 0
substituting the path to your Postnuke site, eg
<Directory "/var/www/vhosts/mysite/htdocs">
Note the forward slashes even for Windows paths. If the path does not begin with a slash ('/') then it is assumed to be relative to the Server Root as above. So, if the Server Root is /var/www/htdocs then the Included file here is /var/www/htdocs/conf/ShortURL.conf
The RewriteLog directive is useful for
debugging; it sets the path for a log file logging all rewriting actions.
Another example with an absolute path:
RewriteLog "/usr/local/var/apache/logs/rewrite.log"
The higher the log level, the more detailed the log. A level of 6 usually provides plenty of detail. Always turn it off (set to 0 or comment out) when not debugging, due to the overhead in creating the logs, which quickly will become large and unwieldy if left unchecked.
The sample RewriteRule above simply rewrites www.sitename.com/ to www.sitename.com/index.php.
Please note if you have set the RewriteEngine On directive in the main httpd.conf file, you should comment it out or remove it in the new ShortURL.conf file, as well as ensuring the RewriteOptions directive isn't set:
# RewriteEngine On
# RewriteOptions 'inherit'
If you're using Virtual hosts, you can set the directive
RewriteOptions 'inherit'
at the top of the virtual host section to inherit rules put in the main part of httpd.conf. Putting this directive in the main section will cause a server error, as there's nowhere to inherit from.
If you have your own IIS server:
Microsoft's IIS server doesn't have an equivalent to Apache's mod_rewrite module, but there are two 3rd party ISAPI filters available that provide cut-down versions of it. One is QwerkSoft's IIS Rewrite (www.qwerksoft.com/products/iisrewrite/), a commercial filter with Rewrite directives closely modeled on Apache's module, but with a lot of limitations.
The other is ISAPI_Rewrite (www.isapirewrite.com/), which comes in both a free "lite" version and a commercial version. The free version does not support per-server configuration or proxying, but is ideal if you only have one site. There are some important differences in the way its directives are written versus Apache's. For instance, it operates on the full Request_Uri, including the query string, and the rules have to match the whole URL string (Match algorithm), whereas Apache htaccess rules strip the folder prefix and query string, the latter accessed with the %{QUERY_STRING} server variable.
A ShortURL configuration file is only supplied for ISAPI_Rewrite, httpd.ini, found in the docs folder.
Enabling Short URLs
- Firstly check the rewrite engine is working by entering some simple URLs in your browser at your site root, like www.mysite.com/FAQ/index.phtml (whatever your site's URL) or News/index.phtml. If it directs you to the FAQ or main News section respectively, it's enabled and functioning. If it doesn't, you'll get a 404 Not Found error. In that case, contact your host and ask if they have an Apache server with the mod_rewrite module installed, if they allow .htaccess files, and if it can be enabled for you. Otherwise, you can go ahead and enable ShortURLs for your site under Xanthia admin, or AutoTheme admin for AutoThemes.
- Logged in as an Admin user, for Xanthia themes, go to Administration
-> Settings Admin panel and click the ShortURLs
tab. There, ensure Use Short URLs is selected with whichever
extension you choose from the menu, and click Save changes.
You may choose the extensions html or phtml.
The latter is convenient when you have existing HTML files on a site that
you wish to incorporate into your site with a wrapper like NukeWrapper.
AutoThemes have their own ShortURL administration setting, accessed through the AutoTheme admin panel -> extras. The extension is set by clicking the Configure link next to the ShortURL control.
For older pre-Xanthia themes (PN<0.75), which won't use Xanthia's or AutoTheme's Output Filter, some editing must be done to retrofit them. More on this later.
- ShortURLs should now be enabled, except for Admin and User links to prevent
accidentally locking yourself out. Hover over some links and see if they've
been converted to ShortURLs. That means the ShortURLs are working. Try clicking
on a few links to see if you are redirected to the correct page. That means
the .htaccess rules are being processed.
If the links are converted, but clicking them gets a 404 error, referring to the ShortURL and not the full URL, then the Apache URL rewriting module isn't enabled for your site. If hosted and using the .htaccess file, the server must be set to allow per-directory overrides (allowing users to set .htaccess files), and the file must be readable by the server. If the htaccess file doesn't work, contact your host and ask if they have an Apache server with the mod_rewrite module installed, and if so do they allow .htaccess files, or can it be enabled for you.
Short URLs
A list of some of the ShortURLs produced:
(X being a number, square brackets represent a value)
[module]/index.(p)html | old-style modules like News/index.phtml |
[module]/main.(p)html | new-style modules like PostCalendar/main.phtml |
Category/Topic/ArticleXX-title-of-story.(p)html | News story article, XX is the story ID eg Computers/PostNuke/Article123-Short-URLs.html |
News/TopicXX-[topic].(p)html | List all articles of Topic XX, eg News/Topic2-PostNuke.html |
News/CategoryXX-[category].(p)html | eg News/Category3-Computers.html |
News/PrintArticleXX.(p)html | Print news article XX |
News/SendArticleXX.(p)html | Email news article XX |
Sections/ArticleXX-pX-[title].(p)html | eg Sections/Article12-p1-Postnuke.html |
FAQ/CategoryXX-[Category]-ParentX-myfaq-[yes].(p)html | eg FAQ/Category1.html |
Search/author-[ArticleAuthor].(p)html | Show all stories by specific author, eg Search/author-msandersen.html |
Search/topicXX.(p)html | Show all stories on a specific topic |
Search/stories-topic[Topic]-cat[Category]-[StartNumber]-[TotalStories]-[Author].phtml | Specific listing of Stories, grouped by 10 eg Search/stories-topic2-cat4-11-146.phtml |
Calendar/[day]-[month]-[year]/(day|month|year|details).(p)html | For the PostCalendar module, view date in day/month/year, or detail view, eg Calendar/12-04-2005/day.html |
Calendar/AddEvent/[day]-[month]-[year].(p)html | PostCalendar events, where the date is the day you want to enter an event for, eg Calendar/AddEvent/12-04-2005.html |
forum/[type]-(t [topicID] or f [forumID] or c [categoryID]).(p)html | For PNphpBB2, eg forum/index-c1.html forum/viewforum-f1.html forum/viewtopic-t1.html |
pnEncyclopedia/VolXX/termXX.(p)html | eg pnEncyclopedia/Vol1/term2.phtml |
Converting Older Themes for ShortURLs
Since older themes don't make use of the Xanthia templating engine or its Output Filter, it is necessary to fix them by hand. The best solution is of course to port them to a Xanthia theme, but failing that a few things can be done to fix them up:
There are two issues:
- Ensuring all links are made root-relative so that virtual directories can be used in the ShortURLs without breaking any links.
Since PostNuke in the past has exclusively used relative links that assume the page sits in the site root, if you have a ShortURL like www.sitename.com/Example/index.html, then the browser would look for the links inside the nonexistent Example subfolder, and hence every link would break.
- Providing an alternative ShortURL filter function, since the theme cannot use Xanthia's or AutoTheme's. A file called shorturls.php is provided in the Docs folder, which can be placed in the theme folder, or anywhere else convenient like the main includes folder. Once the theme has been fixed to use it, it will parse the output of the current page to convert any links that haven't been fixed, ie those from older modules that don't use the PostNuke API or don't use root-relative image links etc.
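The first issue can be seen in a small sketch of how a browser resolves a link against the address of the current page. The function resolve_link() and the theme name MyTheme are hypothetical and only for illustration; real browsers also handle ../, query strings and so on, which this deliberately ignores.

```php
<?php
// Hypothetical illustration of browser behaviour: a root-relative link
// (starting with /) is used as is, anything else is looked up relative to
// the folder of the current page.
function resolve_link($link, $pagePath)
{
    if ($link !== '' && $link[0] === '/') {
        return $link; // root-relative: independent of the current page
    }
    // Keep everything up to and including the last slash of the page path
    $dir = substr($pagePath, 0, strrpos($pagePath, '/') + 1);
    return $dir . $link;
}

// Old-style relative link on a normal page: works
echo resolve_link('themes/MyTheme/images/logo.gif', '/index.php'), "\n";
// Same link on a ShortURL page: points into the nonexistent Example folder
echo resolve_link('themes/MyTheme/images/logo.gif', '/Example/index.html'), "\n";
// Root-relative link on a ShortURL page: still correct
echo resolve_link('/themes/MyTheme/images/logo.gif', '/Example/index.html'), "\n";
```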
A quick overview of older themes: They consist of a file called theme.php
which has a series of PHP functions called by the system to render the page.
The HTML is often embedded in PHP echo statements, making
them trickier to read and edit.
The structure of the theme is:
Section at the top outside any functions, setting up system variables, such as theme colours.
9 functions to render various parts of the theme:
themeheader() | Header (top) part of page, Left column and start of Center column. |
themefooter() | Center and Right columns and Footer of theme. |
themeindex() | News article index box for the main page, with a summary of the news article. |
themearticle() | The actual News article box, with full story. |
themesidebox() | The Left, Right, and Center blocks. |
OpenTable() and CloseTable() | A generic full-width container or frame for module output. |
OpenTable2() and CloseTable2() | A generic container or frame. |
These functions are now performed by templates in Xanthia themes.
For the first problem of paths, you need to fix every link
to be root-relative. Xanthia templates use a system variable to represent
the theme and image folder path, but older themes don't, and instead use a
hardcoded relative path from the site root.
For example, quoted in a PHP echo statement in an old theme:
echo "<img src=\"themes/$GLOBALS[thename]/images/pix-t.gif\" width=\"5\" height=\"1\" alt=\"\" border=\"0\">";
Note that all the double-quotes are "escaped" with a backward slash,
since they appear inside a quoted PHP string.
The paths need to be fixed to be root-relative, so near the top of the theme,
in the variable initialisation section before the themeheader function, define
$baseurl using the PostNuke API function pnGetBaseURI():
global $baseurl;
$baseurl = pnGetBaseURI();
Then add it to all the global statements in all the functions.
Then you'll have to prefix ALL the links with it followed
by a slash;
eg in the above example:
echo "<img src=\"$baseurl/themes/$GLOBALS[thename]/images/pix-t.gif\" width=\"5\" height=\"1\" alt=\"\" border=\"0\">";
In themes coded in HTML with embedded PHP, it will look like this:
<IMG src="<?PHP echo $baseurl."/themes/".$thename ?>/images/top_left2.gif" width="50" height="28" border="0">
(provided both $baseurl and $thename are made global at the top of the relevant function).
To enable the theme for ShortURLs, we'll set a ShortURL switch and include the ShortURL parser function in the variable initialisation section before the functions, start an Output Buffer (temporary storage before output) at the beginning of the themeheader function, and end it at the end of the themefooter function, sending the content of the buffer to the ShortURL parser, and outputting the modified content.
This is a simplified plan of the changes needed:
// Variable initialisation section before themeheader function:
global $index, $ShortURLs, $baseurl;
$thename = basename(dirname(__FILE__)); // name of theme set to theme folder name
$baseurl = pnGetBaseURI(); // root-relative URL to PN site root, eg /nuke
$ShortURLs = pnConfigGetVar('ShortURLsExt'); // Checks if ShortURLs are on
if ($ShortURLs) // Include ShortURL parser if they're on
    require_once("themes/$thename/shorturls.php");
/*******************************************************/
function themeheader() {
    global $thename, $index, $bgcolor1, $bgcolor2, $bgcolor3, $bgcolor4, $ShortURLs, $baseurl;
    ...
    /** All image links and local hyperlinks must be made root-relative **/
    echo "<img src=\"$baseurl/themes/$thename/images/image.gif\">";
    echo "<a href=\"$baseurl/index.php?name=Topics&file=index\">example link</a>";
    /** Before any links to be parsed **/
    if ($ShortURLs) ob_start(); // Buffering content for Short URL processing
    ...
    echo "<img src=\"$baseurl/themes/$thename/images/image.gif\">";
    ...
}
/*******************************************************/
function themefooter() {
    global $thename, $index, $bgcolor1, $bgcolor2, $bgcolor3, $bgcolor4, $ShortURLs, $baseurl;
    ...
    /** Fix links here too **/
    echo "<img src=\"$baseurl/themes/$thename/images/image.gif\">";
    /** End of themefooter **/
    if ($ShortURLs) {
        $obcontents = ob_get_contents(); // Get output buffer content and flush it
        ob_end_clean();
        echo shorturls($obcontents); // Parse buffer content and output result
    } // End ShortURLs parsing
}
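The buffering technique itself can be tried outside a theme with a few lines of plain PHP. In this sketch demo_filter() is a hypothetical stand-in for the shorturls() function in shorturls.php, doing a single fixed replacement rather than the real conversion; only ob_start(), ob_get_contents() and ob_end_clean() are the actual PHP functions the theme relies on.

```php
<?php
// Hypothetical stand-in for shorturls(): rewrites one known long link.
// The real parser in shorturls.php handles many more cases.
function demo_filter($html)
{
    return str_replace('index.php?name=News', 'News/index.html', $html);
}

ob_start();                                   // start holding back output
echo '<a href="/nuke/index.php?name=News">News</a>';
$buffer = ob_get_contents();                  // copy what has been echoed so far
ob_end_clean();                               // discard the buffer unsent
$result = demo_filter($buffer);               // convert links in the held-back page
echo $result;
// prints <a href="/nuke/News/index.html">News</a>
```

Nothing reaches the browser between ob_start() and ob_end_clean(), which is why the whole page, including output from older modules, can be passed through the filter in one go.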
Acknowledgements
A copy of this document can be found here.
Based on the work of Karateka (Sascha)
http://news.postnuke.com/index.php?name=News&file=article&sid=1804
and ColdRolledSteel
http://www.mtrad.com/SimpleURL.php
See also:
http://forums.postnuke.com/index.php?name=PNphpBB2&file=viewtopic&t=10769&start=0
for instructions on converting regular PostNuke 0.72x (legacy) themes.
Martin Stær Andersen
Last updated 2005/06/26