PostNuke ShortURLs ReadMe



Draft ShortURL readme 26/6/2005

Requirements: Apache server with the mod_rewrite URL rewriting module enabled, or Microsoft IIS server with the third-party ISAPI_Rewrite filter installed.

Introduction

To make your PostNuke site more user-friendly and search-engine friendly, URLs should be short and informative, with no long query strings appended. Enabling ShortURLs in the Settings panel in PostNuke admin renders URLs so that they look like static pages to humans and search engines alike, making them easier to digest for both and convenient for posting links in forums and the like.

The schema for regular modules is as follows:

Module/Function-param1:value1-param2:value2...-paramN:valueN.(p)htm(l)

The virtual file appears within a virtual directory named after the current module, and the parameters from the query string are tagged on as name:value pairs separated by hyphens. The extension marks it as a virtual file. Any one of 3 extensions is recognised: html, htm, and phtml, the last being useful when you need to distinguish ShortURLs from real HTML content on your site.
For instance, to view a Personal Message with id 4:

Messages/display-msgid:4.phtml
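
As a rough illustration of the generic schema, the following PHP sketch assembles such a URL from a module name, a function name and a list of parameters. The function build_short_url() is a hypothetical helper written for this readme, not part of PostNuke, and it assumes parameter names and values contain no hyphens or colons.

```php
<?php
// Hypothetical helper (not part of PostNuke): builds a generic ShortURL
// of the form Module/function-name1:value1-name2:value2.ext
function build_short_url($module, $func, array $params = array(), $ext = 'phtml')
{
    $file = $func;
    foreach ($params as $name => $value) {
        // Each parameter becomes a name:value pair, joined on with a hyphen.
        $file .= '-' . $name . ':' . $value;
    }
    return $module . '/' . $file . '.' . $ext;
}

// The Personal Message example above:
echo build_short_url('Messages', 'display', array('msgid' => 4)), "\n";
// prints Messages/display-msgid:4.phtml
```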

For News there's a special schema like this:

Category/Topic/ArticleXXX-title-of-story.(p)html

with the article appearing inside the virtual folders with its Category and Topic, much like we would file information, and the filename anchored by the name ArticleXXX, the XXX being the story ID, and appended with the title of the story. Having the news articles anchored by a keyword like Article is a convenient visual cue that helps with identifying it as a News item, rather than just having the Story title on its own.

The aim is to emulate the way we think and organise information, rather than having an index file in the root of the site with a long nonsensical query string attached, which search engines either can't digest or rank poorly. The scheme aims to be as simple and clear as possible while still conveying the necessary information to the server. Some short URL schemes simply string together a series of virtual folders, like
component/option,com_newsfeeds/catid,5/Itemid,7/
a real news link from another system (Mambo), which is search-engine friendly, but doesn't mean much to the user.

So if you had a news story in the category Computers and the topic Postnuke called "PostNuke Shorturls", instead of having

modules.php?op=modload&name=News&file=article&sid=123&mode=thread&order=0&thold=0

you have

Computers/Postnuke/Article123-PostNuke-Shorturls.html

This is a clear, concise and informative link that tells the user and search engine alike something about the link before going there. With the old ShortURL scheme, it was simply Article123.phtml, which, while concise, doesn't say much about the article. Search engines like Google do take URL keyword relevance into account.
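
The News schema can be sketched in PHP as below. Both functions are hypothetical helpers for illustration only, and the way the title is turned into a hyphenated string (every run of non-alphanumeric characters becomes one hyphen) is an assumption, not necessarily what the PostNuke filter does.

```php
<?php
// Hypothetical helper: turns a story title into a hyphenated string.
function title_to_slug($title)
{
    // Replace every run of non-alphanumeric characters with one hyphen,
    // then drop any hyphens left at the start or end.
    $slug = preg_replace('/[^A-Za-z0-9]+/', '-', $title);
    return trim($slug, '-');
}

// Hypothetical helper: Category/Topic/ArticleXXX-title-of-story.ext
function build_news_url($category, $topic, $sid, $title, $ext = 'html')
{
    return $category . '/' . $topic . '/Article' . (int) $sid
         . '-' . title_to_slug($title) . '.' . $ext;
}

echo build_news_url('Computers', 'Postnuke', 123, 'PostNuke Shorturls'), "\n";
// prints Computers/Postnuke/Article123-PostNuke-Shorturls.html
```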

Another example: For a Search by Author, instead of

modules.php?op=modload&name=Search&file=index&action=search&overview=1&active_stories=1&stories_author=msandersen

we have

Search/author-msandersen.html

Here the link is identified as belonging to the Search module, with the search function "author" followed by the name of the author, making up a simple filename.

Some of the ShortURLs for popular 3rd party modules like PostCalendar and PNphpBB2 have been customised, so instead of the horrendously long URL

index.php?module=PostCalendar&func=view&tplview=&viewtype=day&Date=20050405&pc_username=&pc_category=&pc_topic=&print=

which with the generic ShortURLs would be

PostCalendar/view-viewtype:day-Date:20050405.phtml

the customised URL becomes

Calendar/05-04-2005/day.phtml

As you can see, this is not only shorter, but far easier to understand. Here the function name "view" and the parameter names have been removed altogether, as they are superfluous. That is only possible with a per-module custom filter, since we know what the function and parameter names are. This does however require matching rules in the .htaccess file.
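
The date rearrangement in the customised PostCalendar URL can be sketched as follows. build_calendar_url() is a hypothetical helper, and it assumes the Date parameter is always given as YYYYMMDD, as in the example above.

```php
<?php
// Hypothetical helper: maps PostCalendar's Date=YYYYMMDD and viewtype
// onto the customised form Calendar/DD-MM-YYYY/viewtype.ext
function build_calendar_url($date, $viewtype, $ext = 'phtml')
{
    if (!preg_match('/^(\d{4})(\d{2})(\d{2})$/', $date, $m)) {
        return null; // not a YYYYMMDD date, so no short form is produced
    }
    // $m[1] = year, $m[2] = month, $m[3] = day
    return 'Calendar/' . $m[3] . '-' . $m[2] . '-' . $m[1]
         . '/' . $viewtype . '.' . $ext;
}

echo build_calendar_url('20050405', 'day'), "\n";
// prints Calendar/05-04-2005/day.phtml
```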

Enabling URL Rewriting

The server, however, must be able to understand these short URLs and translate them into their proper long form.
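
To show what that translation amounts to for the generic schema, here is a PHP sketch of the reverse mapping. It is only an illustration: on a real site the work is done by the server's rewrite rules, short_url_to_long() is a hypothetical function, and the long form index.php?module=...&func=... is assumed from the PostCalendar example above.

```php
<?php
// Hypothetical illustration of the translation the rewrite rules perform
// for generic ShortURLs. Assumes values contain no hyphens or colons.
function short_url_to_long($path)
{
    // Split into module folder, virtual filename and recognised extension.
    if (!preg_match('#^([^/]+)/(.+)\.(?:html|htm|phtml)$#', $path, $m)) {
        return null;
    }
    $parts = explode('-', $m[2]);
    // The first hyphen-separated part is the function name.
    $query = array('module' => $m[1], 'func' => array_shift($parts));
    foreach ($parts as $pair) {
        $kv = explode(':', $pair, 2);
        if (count($kv) !== 2) {
            return null; // not a name:value pair
        }
        $query[$kv[0]] = $kv[1];
    }
    return 'index.php?' . http_build_query($query, '', '&');
}

echo short_url_to_long('PostCalendar/view-viewtype:day-Date:20050405.phtml'), "\n";
// prints index.php?module=PostCalendar&func=view&viewtype=day&Date=20050405
```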

To enable Short URLs, you must either be hosted on an Apache server with the URL rewriting module (mod_rewrite) enabled, or have your own Microsoft IIS server with a third-party rewrite filter (ISAPI_Rewrite). Apache on Windows has mod_rewrite enabled by default, and many Linux servers do as well, such as Red Hat Linux. You may have to ask your host provider whether it is enabled, and whether they will enable it if it isn't. A hosted IIS server is unlikely to have the third-party filter installed, and the configuration file must be installed on the host.

For Apache, if it hasn't been done already, rename the provided shorturl.htaccess file in your site's docs folder to .htaccess and place it in your site's root (main) folder. On Unix systems, dot-files are hidden, so enable viewing of hidden files in your file manager if you can't see it. If you already have an .htaccess file in your PostNuke root, you can combine the two if they don't clash. Windows Explorer won't let you rename a file to start with a dot, as it considers it an empty filename with a long extension. If you're hosted, either upload the file first and rename it on the server with your FTP client (or whatever you use to upload files), or open it in a simple text editor like Notepad and save it under the name .htaccess, ie without the shorturl prefix.

If you're hosted:

You can check what server you are hosted on with the phpinfo() PHP function in a small PHP file:

<html>
  <head>
    <title>PHP Info</title>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <?php phpinfo();?>
  </body>
</html>

You can upload this file as phpinfo.php to your site and access it. It will spew out a long detailed list of your PHP setup, including what server you are hosted on.

Go to 'Find' in your browser, usually under 'Edit', and enter 'SERVER_SOFTWARE'. For hosting, it must be an Apache server, unless you can persuade the host to install ISAPI_Rewrite with the provided configuration file.
Search for 'mod_rewrite'. Sometimes it provides a list of loaded modules, and if it's a Unix/Linux server, you may be able to tell if it has been compiled with the module. See below on how to test if it's enabled and working.
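
If you'd rather not read through the whole phpinfo() listing, a shorter check along these lines may do. This is a sketch under assumptions: rewrite_support() is a hypothetical helper, and the PHP function apache_get_modules() is only available when PHP runs as an Apache module, so on other setups the module list is reported as unknown.

```php
<?php
// Hypothetical helper: summarises whether ShortURLs look feasible.
// $software is the SERVER_SOFTWARE string; $modules is a list of loaded
// Apache module names, or null when no list is available.
function rewrite_support($software, $modules = null)
{
    if (stripos($software, 'Apache') === false) {
        return 'not Apache';
    }
    if ($modules === null) {
        return 'Apache, mod_rewrite unknown';
    }
    return in_array('mod_rewrite', $modules)
        ? 'Apache with mod_rewrite'
        : 'Apache without mod_rewrite';
}

$software = isset($_SERVER['SERVER_SOFTWARE']) ? $_SERVER['SERVER_SOFTWARE'] : '';
// apache_get_modules() only exists when PHP runs as an Apache module.
$modules = function_exists('apache_get_modules') ? apache_get_modules() : null;
echo rewrite_support($software, $modules), "\n";
```

Even when the module shows up as loaded, the host may still disallow .htaccess overrides, so the browser test described further down remains the real proof.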

On some server setups, where the URL path does not match the physical path, it is necessary to edit the .htaccess file in a text editor to set the RewriteBase directive near the top:

# Uncomment (remove #) and set URI of your site if needed, path from site root
# eg http://www.example.com/nuke = /nuke     http://www.example.com/ = /
RewriteBase /nuke

As the comment explains, the RewriteBase is the path from the site domain name to your site root, without a trailing slash, eg /nuke. If PostNuke is installed in the root of the site, this is simply /.
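
The rule for working out the RewriteBase value can be sketched in PHP like this. rewrite_base() is a hypothetical helper for illustration, not something PostNuke provides.

```php
<?php
// Hypothetical helper: derives the RewriteBase value from a site URL.
function rewrite_base($siteUrl)
{
    // parse_url() returns null when the URL has no path part at all.
    $path = parse_url($siteUrl, PHP_URL_PATH);
    // Drop any trailing slash; an empty result means the site root.
    $path = rtrim((string) $path, '/');
    return $path === '' ? '/' : $path;
}

echo rewrite_base('http://www.example.com/nuke'), "\n"; // prints /nuke
echo rewrite_base('http://www.example.com/'), "\n";     // prints /
```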

If you're retrofitting an old "classic" theme to support ShortURLs as described below, you may want to have your site set to a different theme, which is a general precaution when testing new themes. You can test the new theme by entering www.yoursite.com/index.php?theme=ThemeName in the URL bar, where ThemeName is the new theme, paying attention to correct case of the name. Unix servers are very particular, and using a lowercase letter where uppercase is required will get you an error.

If you have your own Apache server with mod_rewrite:

First ensure the mod_rewrite URL rewriting module is enabled in Apache. Edit Apache's main configuration file, httpd.conf, in a text editor (not a word processor), and search for rewrite_module.

If you use Apache 1.3, make sure these two lines are uncommented (no hash # in front)

LoadModule rewrite_module modules/mod_rewrite.so

and a little further down,

AddModule mod_rewrite.c

Apache 2 only needs the first line.

The path must point to the server's modules directory, and the file mod_rewrite.so must be in there. Once you restart the server, mod_rewrite should be enabled.

It is recommended for performance that once you have confirmed the rewrite rules work in the .htaccess file as described in the next section, you move them into Apache's config file httpd.conf, as by necessity processing of .htaccess files comes a long way down the line, after all URLs have been converted to absolute file paths on the server. So to process the rewrite rules, the paths have to be retranslated into URLs, a relatively slow process.

Adding the rewrite rules to Apache's main config file:

Copy the .htaccess file to the Apache configuration directory and rename it something like ShortURL.conf,
for instance /etc/apache/conf/ShortURL.conf or C:\Apache2\conf\ShortURL.conf or whatever your Apache config directory is.

In an appropriate place in httpd.conf (not nested within another Directory directive), add:

############### M O D _ R E W R I T E ####################
<Directory "C:/Apache2/htdocs/nuke">
   RewriteEngine On
   RewriteRule ^$ index.php
   Include conf/ShortURL.conf
</Directory>
RewriteLog "logs/Rewrite.log"
# RewriteLogLevel from 0 to 9, 0 is off
RewriteLogLevel 0

substituting the path to your PostNuke site, eg

<Directory "/var/www/vhosts/mysite/htdocs">

Note the forward slashes even for Windows paths. If a path does not begin with a slash ('/'), it is taken to be relative to the Server Root, as in the Include line above. So, if the Server Root is /etc/apache, the included file here is /etc/apache/conf/ShortURL.conf

The RewriteLog directive is useful for debugging; it sets the path of a log file that records all rewriting actions.
Another example with an absolute path:

RewriteLog "/usr/local/var/apache/logs/rewrite.log"

The higher the log level, the more detailed the log. A level of 6 usually provides plenty of detail. Always turn it off (set to 0 or comment out) when not debugging, due to the overhead in creating the logs, which will quickly become large and unwieldy if left unchecked.

The sample RewriteRule above simply rewrites www.sitename.com/ to www.sitename.com/index.php.

Please note if you have set the RewriteEngine On directive in the main httpd.conf file, you should comment it out or remove it in the new ShortURL.conf file, as well as ensuring the RewriteOptions directive isn't set:

# RewriteEngine On 
# RewriteOptions 'inherit'

If you're using Virtual hosts, you can set the directive

RewriteOptions 'inherit'

at the top of the virtual host section to inherit rules put in the main part of httpd.conf. Putting this directive in the main section will cause a server error, as there's nowhere to inherit from.

If you have your own IIS server:

Microsoft's IIS server doesn't have an equivalent to Apache's mod_rewrite module, but there are two 3rd party ISAPI filters available that provide cut-down versions of it. One is QwerkSoft's IIS Rewrite (www.qwerksoft.com/products/iisrewrite/), a commercial filter with Rewrite directives closely modeled on Apache's module, but with a lot of limitations.

The other is ISAPI_Rewrite (www.isapirewrite.com/), which comes in both a free "lite" version and a commercial version. The free version does not support per-server configuration or proxying, but is ideal if you only have one site. There are some important differences in the way its directives are written versus Apache's. For instance, it operates on the full Request_Uri, including the query string, and the rules have to match the whole URL string (Match algorithm), whereas Apache .htaccess rules see the path with the folder prefix and query string stripped, the latter being accessed with the %{QUERY_STRING} server variable.

A ShortURL configuration file, httpd.ini, is supplied only for ISAPI_Rewrite and can be found in the docs folder.

Enabling Short URLs

  1. Firstly, check that the rewrite engine is working by entering some simple URLs in your browser at your site root, like www.mysite.com/FAQ/index.phtml (whatever your site's URL is) or News/index.phtml. If these take you to the FAQ or main News section respectively, it's enabled and functioning, and you can go ahead and enable ShortURLs for your site under Xanthia admin, or AutoTheme admin for AutoThemes. If you get a 404 Not Found error instead, contact your host and ask if they have an Apache server with the mod_rewrite module installed, if they allow .htaccess files, and if it can be enabled for you.
  2. Logged in as an Admin user, for Xanthia themes, go to Administration -> Settings Admin panel and click the ShortURLs tab. There, ensure Use Short URLs is selected with whichever extension you choose from the menu, and click Save changes. You may choose the extensions html or phtml. The latter is convenient when you have existing HTML files on a site that you wish to incorporate into your site with a wrapper like NukeWrapper.
    AutoThemes have their own ShortURL administration setting, accessed through the AutoTheme admin panel -> extras. The extension is set by clicking the Configure link next to the ShortURL control.
    Older pre-Xanthia themes (PN<0.75) can't use Xanthia's or AutoTheme's Output Filter, so some editing must be done to retrofit them. More on this later.
  3. ShortURLs should now be enabled, except for Admin and User links to prevent accidentally locking yourself out. Hover over some links and see if they've been converted to ShortURLs. That means the ShortURLs are working. Try clicking on a few links to see if you are redirected to the correct page. That means the .htaccess rules are being processed.

If the links are converted, but clicking them gets a 404 error referring to the ShortURL and not the full URL, then the Apache URL rewriting module isn't enabled for your site. If you're hosted and using the .htaccess file, the server must be set to allow per-directory overrides (allowing users to set .htaccess files), and the file must be readable by the server. If the .htaccess file doesn't work, contact your host and ask if they have an Apache server with the mod_rewrite module installed, and if so whether they allow .htaccess files or can enable them for you.

Short URLs

A list of some of the ShortURLs produced:

(X being a number, square brackets represent a value)

[module]/index.(p)html
    Old-style modules, eg News/index.phtml
[module]/main.(p)html
    New-style modules, eg PostCalendar/main.phtml
Category/Topic/ArticleXX-title-of-story.(p)html
    News story article, XX is the story ID, eg Computers/PostNuke/Article123-Short-URLs.html
News/TopicXX-[topic].(p)html
    List all articles of Topic XX, eg News/Topic2-PostNuke.html
News/CategoryXX-[category].(p)html
    eg News/Category3-Computers.html
News/PrintArticleXX.(p)html
    Print news article XX
News/SendArticleXX.(p)html
    Email news article XX
Sections/ArticleXX-pX-[title].(p)html
    eg Sections/Article12-p1-Postnuke.html
FAQ/CategoryXX-[Category]-ParentX-myfaq-[yes].(p)html
    eg FAQ/Category1.html
Search/author-[ArticleAuthor].(p)html
    Show all stories by a specific author, eg Search/author-msandersen.html
Search/topicXX.(p)html
    Show all stories on a specific topic
Search/stories-topic[Topic]-cat[Category]-[StartNumber]-[TotalStories]-[Author].phtml
    Specific listing of Stories, grouped by 10, eg Search/stories-topic2-cat4-11-146.phtml
Calendar/[day]-[month]-[year]/(day|month|year|details).(p)html
    For the PostCalendar module, view date in day/month/year, or detail view, eg Calendar/12-04-2005/day.html
Calendar/AddEvent/[day]-[month]-[year].(p)html
    PostCalendar events, where the date is the day you want to enter an event for, eg Calendar/AddEvent/12-04-2005.html
forum/[type]-(t [topicID] or f [forumID] or c [categoryID]).(p)html
    For PNphpBB2, eg forum/index-c1.html, forum/viewforum-f1.html, forum/viewtopic-t1.html
pnEncyclopedia/VolXX/termXX.(p)html
    eg pnEncyclopedia/Vol1/term2.phtml

Converting Older Themes for ShortURLs

Since older themes don't make use of the Xanthia templating engine or its Output Filter, it is necessary to fix them by hand. The best solution is of course to port them to a Xanthia theme, but failing that a few things can be done to fix them up:

There are two issues:

  1. Ensuring all links are made root-relative so that Virtual directories can be used in the ShortURLs without breaking any links.
    In the past PostNuke has exclusively used plain relative links, which only work because every page is served from the site root. If you have a ShortURL like www.sitename.com/Example/index.html, the browser would look for the links inside the nonexistent Example subfolder, and hence every link would break.
  2. Providing an alternative ShortURL filter function, since the theme cannot use Xanthia's or AutoTheme's. A file called shorturls.php is provided in the Docs folder, which can be placed in the theme folder, or anywhere else convenient like the main includes folder. Once the theme has been fixed to use it, it will parse the output of the current page to convert any links that haven't been fixed, ie those from older modules that don't use the PostNuke API or don't use root-relative image links etc.
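
To see why the first issue matters, the sketch below mimics, in a much simplified way, how a browser resolves a link found on a page. resolve_link() is a hypothetical function for illustration only, it ignores ../ segments and full http:// links, and MyTheme is a made-up theme name.

```php
<?php
// Hypothetical, simplified model of how a browser resolves a link
// found on the page at $pagePath.
function resolve_link($pagePath, $link)
{
    if (substr($link, 0, 1) === '/') {
        return $link; // root-relative: the page's own folder is irrelevant
    }
    // Plain relative link: resolved against the folder part of the page path.
    return substr($pagePath, 0, strrpos($pagePath, '/') + 1) . $link;
}

// A plain relative link works from the real index.php in the site root...
echo resolve_link('/index.php', 'themes/MyTheme/images/logo.gif'), "\n";
// prints /themes/MyTheme/images/logo.gif

// ...but breaks once the page appears to live in a virtual folder.
echo resolve_link('/Example/index.html', 'themes/MyTheme/images/logo.gif'), "\n";
// prints /Example/themes/MyTheme/images/logo.gif

// A root-relative link is unaffected.
echo resolve_link('/Example/index.html', '/themes/MyTheme/images/logo.gif'), "\n";
// prints /themes/MyTheme/images/logo.gif
```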

A quick overview of older themes: They consist of a file called theme.php which has a series of PHP functions called by the system to render the page. The HTML is often embedded in PHP echo statements, making them trickier to read and edit.
The structure of the theme is:

Section at the top outside any functions, setting up system variables, such as theme colours.
9 functions to render various parts of the theme:
themeheader() Header (top) part of page, Left column and start of Center column.
themefooter() Center and Right columns and Footer of theme.
themeindex() News article index box for the main page, with a summary of the news article.
themearticle() The actual News article box, with full story.
themesidebox() The Left, Right, and Center blocks.
OpenTable() and CloseTable() A generic full-width container or frame for module output.
OpenTable2() and CloseTable2() A generic container or frame.

These functions are now performed by templates in Xanthia themes.

 

For the first problem of paths, you need to fix every link to be root-relative. Xanthia templates use a system variable to represent the theme and image folder path, but older themes don't, and instead use a hardcoded relative path from the site root.
For example, quoted in a PHP echo statement in an old theme:

echo "<img src=\"themes/$GLOBALS[thename]/images/pix-t.gif\" width=\"5\" height=\"1\" alt=\"\" border=\"0\">";

Note that all the double-quotes are "escaped" with a backward slash, since they appear inside a quoted PHP string.
The paths need to be fixed to be root-relative, so near the top of the theme, in the variable initialisation section before the themeheader function, define $baseurl using the PostNuke API function pnGetBaseURI():

global $baseurl;
$baseurl = pnGetBaseURI();

Then add it to all the global statements in all the functions. Then you'll have to prefix ALL the links with it, followed by a slash,
eg in the above example:

echo "<img src=\"$baseurl/themes/$GLOBALS[thename]/images/pix-t.gif\" width=\"5\" height=\"1\" alt=\"\" border=\"0\">";

In themes coded in HTML with embedded PHP, it will look like this:

<IMG src="<?PHP echo $baseurl."/themes/".$thename ?>/images/top_left2.gif" width="50" height="28" border="0">

(provided both $baseurl and $thename are made global at the top of the relevant function).

 

To enable the theme for ShortURLs, we'll set a ShortURL switch and include the ShortURL parser function in the variable initialisation section before the functions. We then start an Output Buffer (temporary storage before output) at the beginning of the themeheader function and end it at the end of the themefooter function, sending the content of the buffer to the ShortURL parser and outputting the modified content.

This is a simplified plan of the changes needed:

// Variable initialisation section before themeheader function:
global $thename, $index, $ShortURLs, $baseurl;
$thename = basename(dirname(__FILE__));        // name of theme set to theme folder name
$baseurl = pnGetBaseURI();                     // root-relative URL to PN site root, eg /nuke
$ShortURLs = pnConfigGetVar('ShortURLsExt');   // Checks if ShortURLs are on
if ($ShortURLs)                                // Include ShortURL parser if they're on
    require_once("themes/$thename/shorturls.php");
/*******************************************************/
function themeheader() {
    global $thename, $index, $bgcolor1, $bgcolor2, $bgcolor3, $bgcolor4,
           $ShortURLs, $baseurl;
    ...
    /** All image links and local hyperlinks must be made root-relative **/
    echo "<img src=\"$baseurl/themes/$thename/images/image.gif\">";
    echo "<a href=\"$baseurl/index.php?name=Topics&file=index\">example link</a>";
    /** Before any links to be parsed **/
    if ($ShortURLs) ob_start();   // Buffering content for Short URL processing
    ...
    echo "<img src=\"$baseurl/themes/$thename/images/image.gif\">";
    ...
}
/*******************************************************/
function themefooter() {
    global $thename, $index, $bgcolor1, $bgcolor2, $bgcolor3, $bgcolor4,
           $ShortURLs, $baseurl;
    ...
    /** Fix links here too **/
    echo "<img src=\"$baseurl/themes/$thename/images/image.gif\">";
    /** End of themefooter **/
    if ($ShortURLs) {
        $obcontents = ob_get_contents();   // Get output buffer content
        ob_end_clean();                    // Discard the buffer without sending it
        echo shorturls($obcontents);       // Parse buffer content and output result
    } // End ShortURLs parsing
}
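
The buffering technique on its own can be tried outside PostNuke with a sketch like the one below. Note that demo_filter() is a hypothetical stand-in that rewrites one hardcoded link, not the real shorturls() parser from shorturls.php, and the short form Topics/index.html is assumed from the [module]/index pattern listed earlier.

```php
<?php
// Hypothetical stand-in for the real shorturls() parser: it only knows
// how to rewrite one hardcoded long link.
function demo_filter($html)
{
    return str_replace('index.php?name=Topics&file=index', 'Topics/index.html', $html);
}

function render_page_with_filter()
{
    ob_start();                        // start buffering, as in themeheader()
    echo '<a href="/nuke/index.php?name=Topics&file=index">Topics</a>';
    $contents = ob_get_contents();     // grab what has been echoed so far
    ob_end_clean();                    // discard the buffer without sending it
    return demo_filter($contents);     // hand the page to the filter
}

echo render_page_with_filter(), "\n";
// prints <a href="/nuke/Topics/index.html">Topics</a>
```

Nothing reaches the browser until the filtered content is echoed, which is why the buffer has to be started before the first link that needs converting.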

Acknowledgements

A copy of this document can be found here.

Based on the work of Karateka (Sascha)

http://news.postnuke.com/index.php?name=News&file=article&sid=1804

 

and ColdRolledSteel

http://www.mtrad.com/SimpleURL.php

 

See also:

http://forums.postnuke.com/index.php?name=PNphpBB2&file=viewtopic&t=10769&start=0

for instructions on converting regular PostNuke 0.72x (legacy) themes.

 

Martin Stær Andersen
Last updated 2005/06/26