author Arto Bendiken 2006-11-21 03:39:19 (GMT)
committer Arto Bendiken 2006-11-21 03:39:19 (GMT)
commit 45b69d54b734b23631e9d07e70d0455d72ee213f (patch)
tree c616b0b5d38c7655ba15cc1a79138e6f49283ef7
Imported the initial version of Boost, a module providing static page caching for Drupal.
-rw-r--r--  INSTALL.txt            44
-rw-r--r--  README.txt            137
-rw-r--r--  TODO.txt               33
-rw-r--r--  boost.admin.inc       133
-rw-r--r--  boost.api.inc         179
-rw-r--r--  boost.css               1
-rw-r--r--  boost.drush            52
-rw-r--r--  boost.helpers.inc      66
-rw-r--r--  boost.info              4
-rw-r--r--  boost.install          25
-rw-r--r--  boost.js                1
-rw-r--r--  boost.module          266
-rw-r--r--  htaccess/boosted.txt  105
-rw-r--r--  htaccess/default.txt   88
14 files changed, 1134 insertions, 0 deletions
diff --git a/INSTALL.txt b/INSTALL.txt
new file mode 100644
index 0000000..02688b4
--- /dev/null
+++ b/INSTALL.txt
@@ -0,0 +1,44 @@
+// $Id$
+
+REQUIREMENTS
+------------
+This version of Boost is designed for Drupal 4.7 running on a Unix platform.
+Drupal's clean URLs MUST be enabled and working properly.
+
+The `path' and `pathauto' modules are recommended.
+
+For the static files to be expired correctly, the Drupal cron job must be
+set up to run at least as often as the cache lifetime interval; otherwise
+stale files will linger past their lifetime.
+
+Since the static page caching is implemented with mod_rewrite directives,
+Apache version 1.3 or 2.x with mod_rewrite enabled is required (if Drupal's
+clean URLs work for you, you're fine; if not, get them working first).
+Other web servers, such as Lighttpd, are NOT supported at present.
+
+The `drush' module is required for (optional) command line usage.
+
+INSTALLATION
+------------
+1. Go to administer >> settings and ensure that Drupal's clean URLs are
+ enabled and working properly on your site.
+
+2. Copy all the module files into a subdirectory called modules/boost/
+ under your Drupal installation directory.
+
+3. Go to administer >> modules and enable the Boost module.
+
+4. Go to administer >> settings >> boost to review and change the
+ configuration options to your liking.
+
+5. Go to administer >> settings and enable static caching.
+
+6. Log out from Drupal (or use another browser) and browse around your site
+ as the anonymous user. Ensure that static files are indeed being
+ generated into the Boost cache directory.
+
+7. IMPORTANT: replace your .htaccess file in the Drupal installation
+ directory with the file from modules/boost/htaccess/boosted.txt.
+ (If you fail to do this, static page caching will NOT work!)
+
+8. (See README.txt for information on submitting bug reports.)
diff --git a/README.txt b/README.txt
new file mode 100644
index 0000000..721a840
--- /dev/null
+++ b/README.txt
@@ -0,0 +1,137 @@
+// $Id$
+
+NOTE: this module is currently in an alpha state. Come back in a bit unless
+you're an experienced user and don't mind figuring things out on your own.
+
+DESCRIPTION
+-----------
+This module provides static page caching for Drupal 4.7, enabling a
+potentially very significant performance and scalability boost for
+heavily-trafficked Drupal sites.
+
+For an introduction, read the original blog post at:
+ http://bendiken.net/2006/05/28/static-page-caching-for-drupal
+
+FEATURES
+--------
+* Maximally fast page serving for the anonymous visitors to your Drupal
+ site, reducing web server load and boosting your site's scalability.
+* On-demand page caching (static file created after first page request).
+* Full support for multi-site Drupal installations.
+* Command line administration support (requires the drush module).
+
+INSTALLATION
+------------
+Please refer to the accompanying file INSTALL.txt for installation
+requirements and instructions.
+
+HOW IT WORKS
+------------
+Once Boost has been installed and enabled, page requests by anonymous
+visitors will be cached as static HTML pages on the server's file system.
+Periodically (when the Drupal cron job runs), stale pages (i.e. files
+exceeding the maximum cache lifetime setting) are purged from the cache.
+A purged page is recreated the next time an anonymous visitor requests
+it.
+
+New rewrite rules are added to the .htaccess file supplied with Drupal,
+directing the web server to try and fulfill page requests by anonymous
+visitors first and foremost from the static page cache, and to only pass the
+request through to Drupal if the requested page is not cacheable, hasn't yet
+been cached, or the cached copy is stale.
+
+FILE SYSTEM CACHE
+-----------------
+The cached files are stored (by default) in the cache/ directory under your
+Drupal installation directory. The Drupal pages' URL paths are translated
+into file system names in the following manner:
+
+ http://mysite.com/
+ => cache/mysite.com/0/index.html
+
+ http://mysite.com/about
+ => cache/mysite.com/0/about.html
+
+ http://mysite.com/about/staff
+ => cache/mysite.com/0/about/staff.html
+
+ http://mysite.com/node/42
+ => cache/mysite.com/0/node/42.html
+
+You'll note that the directory path includes the Drupal site name, enabling
+support for multi-site Drupal installations. The zero that follows, on the
+other hand, denotes the user ID the content has been cached for -- in this
+case the anonymous user (which is the default, and only, choice available
+for the time being).
+
+DISPATCH MECHANISM
+------------------
+For each incoming page request, the new Apache mod_rewrite directives in
+.htaccess will check if a cached version of the requested page should be
+served as per the following simple rules:
+
+ 1. First, we check that the HTTP request method being used is GET.
+ POST requests are not cacheable, and are passed through to Drupal.
+
+ 2. Next, we make sure that the URL doesn't contain a query string (i.e.
+ the part after the `?' character, such as `?q=cats+and+dogs'). A query
+ string implies dynamic data, and any request that contains one will
+ be passed through to Drupal. (This also allows one to easily obtain the
+ current, non-cached version of a page by simply adding a bogus query
+ string to a URL path -- very useful for testing purposes.)
+
+ 3. Since only anonymous visitors can benefit from the static page cache at
+ present, we check that the page request doesn't include a cookie that
+ is set when a user logs in to the Drupal site. If the cookie is
+ present, we simply let Drupal handle the page request dynamically.
+
+ 4. Now, for the important bit: we check whether we actually have a cached
+ HTML file for the request URL path available in the file system cache.
+ If we do, we direct the web server to serve that file directly and to
+ terminate the request immediately after; in this case, Drupal (and
+ indeed PHP) is never invoked, meaning the page request will be served
+ by the web server itself at full speed.
+
+ 5. If, however, we couldn't locate a cached version of the page, we just
+ pass the request on to Drupal, which will serve it dynamically in the
+ normal manner.
+
+IMPORTANT NOTES
+---------------
+* If your Drupal URL paths contain non-ASCII characters, you may have to
+ tweak your locale settings on the server in order to ensure the URL paths
+ get correctly translated into directory paths on the file system.
+ Non-ASCII URL paths have currently not been tested at all and feedback on
+ them would be appreciated.
+
+LIMITATIONS
+-----------
+* Only anonymous visitors will be served cached versions of pages; logged-in
+ users will get dynamic content. This may somewhat limit the usefulness of
+ this module for those community sites that require user registration and
+ login for active participation.
+* Only content of the type `text/html' will get cached at present. RSS feeds
+ and URL paths that have some other content type (e.g. set by a third-party
+ module) will be silently ignored by Boost.
+* In contrast to Drupal's built-in caching, static caching will lose any
+ additional HTTP headers set for an HTML page by a module. This is unlikely
+ to be a problem except for some very specific modules and rare use cases.
+* Web server software other than Apache is not supported at the moment.
+ Adding Lighttpd support would be desirable but is not a high priority for
+ the author at present (see TODO.txt). (Note that while the LiteSpeed web
+ server has not been specifically tested by the author, it may, in fact,
+ work, since they claim to support .htaccess files and to have mod_rewrite
+ compatibility. Feedback on this would be appreciated.)
+* At the moment, Windows users are S.O.L. due to the use of symlinks and
+ Unix-specific shell commands. The author has no personal interest in
+ supporting Windows but will accept well-documented, non-detrimental
+ patches to that effect.
+
+BUG REPORTS
+-----------
+Post feature requests and bug reports to the issue tracking system at:
+ http://drupal.org/node/add/project_issue/boost
+
+CREDITS
+-------
+Developed and maintained by Arto Bendiken <http://bendiken.net/>
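The FILE SYSTEM CACHE and DISPATCH MECHANISM sections of README.txt above can be sketched in plain PHP. This is only an illustration of the five rules, assuming the default cache directory `cache`, the `.html` extension and the `DRUPAL_UID` cookie name defined in boost.module; the real dispatch is done by mod_rewrite in .htaccess, and the function names `sketch_cache_file()` and `sketch_dispatch()` are hypothetical.

```php
<?php
// Hypothetical stand-alone sketch; not part of the Boost module itself.

/**
 * Maps a URL path to a cache file name, following the README examples.
 */
function sketch_cache_file($host, $url_path, $cache_dir = 'cache', $uid = 0, $ext = '.html') {
  $path = trim($url_path, '/');
  if ($path === '') {
    $path = 'index'; // the front page is stored as index.html
  }
  return implode('/', array($cache_dir, $host, $uid, $path)) . $ext;
}

/**
 * Returns the cache file to serve, or NULL to pass the request to Drupal.
 *
 * $file_exists is injectable so the rules can be exercised without a
 * real cache directory (an assumption made for this sketch only).
 */
function sketch_dispatch($method, $host, $url_path, $query, array $cookies, $file_exists = 'file_exists') {
  if ($method !== 'GET') {
    return NULL; // rule 1: only GET requests are cacheable
  }
  if ($query !== '') {
    return NULL; // rule 2: a query string implies dynamic data
  }
  if (isset($cookies['DRUPAL_UID'])) {
    return NULL; // rule 3: logged-in users get dynamic pages
  }
  $file = sketch_cache_file($host, $url_path);
  // Rules 4 and 5: serve the file if it exists, else fall through.
  return call_user_func($file_exists, $file) ? $file : NULL;
}
```

For example, `sketch_cache_file('mysite.com', '/about/staff')` gives `cache/mysite.com/0/about/staff.html`, matching the mapping listed in the README.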
diff --git a/TODO.txt b/TODO.txt
new file mode 100644
index 0000000..f4b0756
--- /dev/null
+++ b/TODO.txt
@@ -0,0 +1,33 @@
+// $Id$
+
+This is a listing of known bugs, features that mostly work but are still
+somewhat in progress, features that are being considered or planned for
+implementation, and just miscellaneous far-out ideas that could, in
+principle, be implemented if one had the time and inclination to do so.
+
+(NOTE: there is no guarantee any of these items will, in fact, be
+implemented, nor should any possible scheduling indications be construed as
+promises under any circumstances. TANSTAAFL. If you absolutely need
+something implemented right now, please contact the developers to see if
+they're available for contract work, or if perhaps a modest donation could
+speed things along.)
+
+TODO: IN THE WORKS
+------------------
+* Finish up administration interface for pre-generating static files for all
+ pages on the Drupal site in one go.
+* Test interaction with other modules that also make use of ob_start().
+
+TODO: FUTURE IDEAS
+------------------
+* Add a node-specific cache lifetime setting.
+* Add admin-visible block for updating the cached copy of the current page.
+* Web servers other than Apache are not supported at the moment. This is due
+ to the way the cache dispatch is implemented using Apache mod_rewrite
+ directives in the .htaccess file. Lighttpd support would be desirable but
+ is not a high priority for the developer at present.
+* User-specific static page cache. Could conceivably be implemented based
+ on the existing user-cookie mechanism, though security would be a concern.
+* Don't delete the entire file system cache when Boost is disabled; just
+ rename the site's cache directory with the suffix `.disabled', speeding up
+ cache regeneration once the module is re-enabled?
diff --git a/boost.admin.inc b/boost.admin.inc
new file mode 100644
index 0000000..e3c9d52
--- /dev/null
+++ b/boost.admin.inc
@@ -0,0 +1,133 @@
+<?php
+// $Id$
+
+/**
+ * @file
+ * All the code for the Boost module's administrative interface.
+ */
+
+//////////////////////////////////////////////////////////////////////////////
+// DRUPAL SYSTEM SETTINGS
+
+/**
+ * Performs alterations to the system settings form before it is rendered.
+ *
+ * Called from hook_form_alter().
+ */
+function boost_system_settings_form($form = array()) {
+ $form['cache'] = array('#type' => 'hidden','#value' => CACHE_DISABLED);
+ $form['boost'] = array(
+ '#type' => 'radios',
+ '#title' => t('Static page cache'),
+ '#default_value' => variable_get('boost', CACHE_DISABLED),
+ '#options' => array(CACHE_DISABLED => t('Disabled'),
+ CACHE_ENABLED => t('Enabled')),
+ '#description' => t('Static page caching is a mechanism which stores dynamically generated web pages as HTML files in a special cache directory located under the Drupal installation directory. By caching a web page in this manner, the web server can serve it out in the fastest possible manner, without invoking PHP or Drupal at all. While this does provide a significant performance and scalability boost, you should note that it could have negative usability side-effects unless your site is targeted at an audience consisting mostly of "anonymous" visitors.'),
+ );
+ $lifetime = &$form['cache_lifetime'];
+ $lifetime['#description'] = t('On high-traffic sites it can become necessary to enforce a minimum cache lifetime. The minimum cache lifetime is the minimum amount of time that will go by before the cache is emptied and recreated. A larger minimum cache lifetime offers better performance, but users will not see new content for a longer period of time.');
+ unset($form['cache_lifetime']);
+ $form['cache_lifetime'] = &$lifetime;
+ $form['boost_file_path'] = array(
+ '#type' => 'textfield',
+ '#title' => t('Cache file path'),
+ '#default_value' => BOOST_FILE_PATH,
+ '#size' => 60,
+ '#maxlength' => 255,
+ '#required' => TRUE,
+ '#description' => t('A file system path where the cache files will be stored. This directory has to exist and be writable by Drupal. The default setting is to store the files in a directory named \'cache\' under the Drupal installation directory. If you change this, you must also change the URL rewrite rules in your web server configuration (.htaccess for Apache, lighttpd.conf for Lighttpd), or caching will not work.'),
+ );
+ return $form;
+}
+
+//////////////////////////////////////////////////////////////////////////////
+// BOOST SETTINGS
+
+/**
+ * Declares administrative settings for the Boost module.
+ *
+ * Called from hook_settings().
+ */
+function boost_settings_form($form = array()) {
+ //_boost_check_htaccess(); // TODO
+
+ $options = array(t('Cache every page except the listed pages.'), t('Cache only the listed pages.'));
+ $description = t("Enter one page per line as Drupal paths. The '*' character is a wildcard. Example paths are '%blog' for the blog page and %blog-wildcard for every personal blog. %front is the front page.", array('%blog' => theme('placeholder', 'blog'), '%blog-wildcard' => theme('placeholder', 'blog/*'), '%front' => theme('placeholder', '<front>')));
+ if (user_access('use PHP for block visibility')) {
+ $options[] = t('Cache pages for which the following PHP code returns <code>TRUE</code> (PHP-mode, experts only).');
+ $description .= t('If the PHP-mode is chosen, enter PHP code between %php. Note that executing incorrect PHP-code can severely break your Drupal site.', array('%php' => theme('placeholder', '<?php ?>')));
+ }
+ $form['cacheability'] = array(
+ '#type' => 'fieldset',
+ '#title' => t('Cacheability settings'),
+ '#collapsible' => FALSE,
+ );
+ $form['cacheability']['boost_cacheability_option'] = array(
+ '#type' => 'radios',
+ '#title' => t('Cache specific pages'),
+ '#options' => $options,
+ '#default_value' => BOOST_CACHEABILITY_OPTION,
+ );
+ $form['cacheability']['boost_cacheability_pages'] = array(
+ '#type' => 'textarea',
+ '#title' => t('Pages'),
+ '#default_value' => BOOST_CACHEABILITY_PAGES,
+ '#description' => $description,
+ );
+
+ // TODO
+ /*$form['throttle'] = array(
+ '#type' => 'fieldset',
+ '#title' => t('Throttle settings'),
+ '#collapsible' => FALSE,
+ );
+ $form['throttle']['boost_cron_limit'] = array(
+ '#type' => 'select',
+ '#title' => t('Pages to update per cron run'),
+ '#default_value' => BOOST_CRON_LIMIT,
+ '#options' => drupal_map_assoc(array(10, 20, 50, 100, 200, 500, 1000)),
+ '#description' => t('The maximum number of static pages that will be built or rebuilt in one cron run. Set this number lower if your cron is timing out or if PHP is running out of memory.'),
+ );*/
+
+ $form['advanced'] = array(
+ '#type' => 'fieldset',
+ '#title' => t('Advanced settings'),
+ '#collapsible' => TRUE,
+ '#collapsed' => TRUE,
+ );
+ $form['advanced']['boost_file_extension'] = array(
+ '#type' => 'textfield',
+ '#title' => t('Cache file extension'),
+ '#default_value' => BOOST_FILE_EXTENSION,
+ '#size' => 10,
+ '#maxlength' => 32,
+ '#required' => TRUE,
+ '#description' => t('The file extension to append to the file name of the generated cache files. Note that this setting is of no relevance to any public URLs, and it is strongly recommended to leave this as the default \'.html\' unless you know what you are doing. If you change this, you must also change the URL rewrite rules in your web server configuration (.htaccess for Apache, lighttpd.conf for Lighttpd), or caching will not work.'),
+ );
+ $form['advanced']['boost_fetch_method'] = array(
+ '#type' => 'select',
+ '#title' => t('Page fetch method'),
+ '#default_value' => BOOST_FETCH_METHOD,
+ '#options' => array('php' => t('PHP fopen() wrapper'), 'wget' => t('Wget shell command'), 'curl' => t('curl shell command')),
+ '#description' => t('The method used to retrieve the contents of the Drupal pages to be cached. The default should work in most cases.'),
+ );
+ $form['advanced']['boost_pre_process_function'] = array(
+ '#type' => 'textfield',
+ '#title' => t('Pre-process function'),
+ '#default_value' => BOOST_PRE_PROCESS_FUNCTION,
+ '#maxlength' => 255,
+ '#description' => t('The name of a PHP function used to pre-process the contents of each page before writing them out to static files. The function is called with the contents of the page passed as a string argument, and its return value is used as the data written out to the disk.'),
+ );
+ // TODO:
+ /*$form['advanced']['boost_post_update_command'] = array(
+ '#type' => 'textfield',
+ '#title' => t('Post-update shell command'),
+ '#default_value' => BOOST_POST_UPDATE_COMMAND,
+ '#maxlength' => 255,
+ '#description' => t('If you are synchronizing the generated static cache files to an external server through some means such as SFTP or rsync, you can enter a shell command to be executed following a successful cron-triggered cache update. Note that this is an advanced setting that should normally be left blank.'),
+ );*/
+
+ return $form;
+}
+
+//////////////////////////////////////////////////////////////////////////////
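The "Pre-process function" setting in boost.admin.inc above names a PHP function that receives the page contents as a string and returns the data to be written to disk. A minimal sketch of such a function follows; the name `example_boost_tidy_whitespace()` is hypothetical, and the whitespace clean-up is just one possible use.

```php
<?php
// Hypothetical pre-process callback; the name and behaviour are
// assumptions for illustration, not something shipped with Boost.

/**
 * Tidies whitespace in a page before it is written to the static cache.
 *
 * Note: this also touches whitespace inside <pre> blocks, which may or
 * may not be acceptable for a given site.
 */
function example_boost_tidy_whitespace($html) {
  // Strip trailing spaces and tabs from every line.
  $html = preg_replace('/[ \t]+$/m', '', $html);
  // Collapse runs of three or more newlines into a single blank line.
  return preg_replace('/\n{3,}/', "\n\n", $html);
}
```

To use something like this, the function would have to be defined in code that is loaded on every page request (for instance in a small custom module), since `boost_cache_set()` only calls the configured name if `function_exists()` succeeds. Because the Boost banner is appended before the callback runs, the callback also sees the banner comment.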
diff --git a/boost.api.inc b/boost.api.inc
new file mode 100644
index 0000000..2313a24
--- /dev/null
+++ b/boost.api.inc
@@ -0,0 +1,179 @@
+<?php
+// $Id$
+
+/**
+ * @file
+ * Implements the Boost API for static page caching.
+ */
+
+//////////////////////////////////////////////////////////////////////////////
+// BOOST API
+
+/**
+ * Determines whether a given Drupal page can be cached or not.
+ *
+ * To avoid potentially troublesome situations, the user login page is never
+ * cached, nor are any admin pages. At present, we also refuse to cache any
+ * RSS feeds provided by Drupal, since they would require special handling
+ * in the mod_rewrite ruleset as they shouldn't be sent out using the
+ * text/html content type.
+ */
+function boost_is_cacheable($path) {
+ $alias = drupal_get_path_alias($path);
+ $path = drupal_get_normal_path($path); // normalize path
+
+ // Never cache the basic user login/registration pages or any administration pages
+ if ($path == 'user' || preg_match('!^user/(login|register|password)!', $path) || preg_match('!^admin!', $path))
+ return FALSE;
+
+ // At present, RSS feeds are not cacheable due to content type restrictions
+ if ($path == 'rss.xml' || preg_match('!/feed$!', $path))
+ return FALSE;
+
+ // Don't cache comment reply pages
+ if (preg_match('!^comment/reply!', $path))
+ return FALSE;
+
+ // Match the user's cacheability settings against the path
+ if (BOOST_CACHEABILITY_OPTION == 2) {
+ $result = drupal_eval(BOOST_CACHEABILITY_PAGES);
+ return !empty($result);
+ }
+ $regexp = '/^('. preg_replace(array('/(\r\n?|\n)/', '/\\\\\*/', '/(^|\|)\\\\<front\\\\>($|\|)/'), array('|', '.*', '\1'. preg_quote(variable_get('site_frontpage', 'node'), '/') .'\2'), preg_quote(BOOST_CACHEABILITY_PAGES, '/')) .')$/';
+ return !(BOOST_CACHEABILITY_OPTION xor preg_match($regexp, $alias));
+}
+
+/**
+ * Deletes all static files currently in the cache.
+ */
+function boost_cache_clear_all() {
+ clearstatcache();
+ return _boost_rmdir_rf(boost_cache_directory());
+}
+
+/**
+ * Deletes all expired static files currently in the cache.
+ */
+function boost_cache_expire_all() {
+ clearstatcache();
+ _boost_rmdir_rf(boost_cache_directory(), 'boost_file_is_expired');
+ return TRUE;
+}
+
+/**
+ * Expires the static file cache for a given page, or multiple pages
+ * matching a wildcard.
+ */
+function boost_cache_expire($path, $wildcard = FALSE) {
+ // TODO: handle wildcard.
+
+ $alias = drupal_get_path_alias($path);
+ $path = drupal_get_normal_path($path); // normalize path
+
+ $filename = boost_file_path($path);
+ if (file_exists($filename))
+ @unlink($filename);
+
+ if ($alias != $path) {
+ $symlink = boost_file_path($alias);
+ if (is_link($symlink))
+ @unlink($symlink);
+ }
+
+ return TRUE;
+}
+
+/**
+ * Returns the cached contents of the specified page, if available.
+ */
+function boost_cache_get($path) {
+ $path = drupal_get_normal_path($path); // normalize path
+
+ $filename = boost_file_path($path);
+ if (file_exists($filename) && is_readable($filename))
+ return file_get_contents($filename);
+
+ return NULL;
+}
+
+
+/**
+ * Replaces the cached contents of the specified page, if stale.
+ */
+function boost_cache_set($path, $data = '') {
+ // Append the Boost footer with the current timestamp
+ $data = rtrim($data) . "\n" . str_replace('%date', date('Y-m-d H:i:s'), BOOST_BANNER);
+
+ // Execute the pre-process function if one has been defined
+ if (function_exists(BOOST_PRE_PROCESS_FUNCTION))
+ $data = call_user_func(BOOST_PRE_PROCESS_FUNCTION, $data);
+
+ $alias = drupal_get_path_alias($path);
+ $path = drupal_get_normal_path($path); // normalize path
+
+ // Create or update the static file as needed
+ $filename = boost_file_path($path);
+ _boost_mkdir_p(dirname($filename));
+ if (!file_exists($filename) || boost_file_is_expired($filename)) {
+ if (file_put_contents($filename, $data) === FALSE) {
+ watchdog('boost', t('Unable to write file: %file', array('%file' => $filename)), WATCHDOG_WARNING);
+ }
+ }
+
+ // If a URL alias is defined, create that as a symlink to the actual file
+ if ($alias != $path) {
+ $symlink = boost_file_path($alias);
+ _boost_mkdir_p(dirname($symlink));
+ if (!is_link($symlink) || realpath(readlink($symlink)) != realpath($filename)) {
+ if (file_exists($symlink))
+ @unlink($symlink);
+ if (!_boost_symlink($filename, $symlink)) {
+ watchdog('boost', t('Unable to create symlink: %link to %target', array('%link' => $symlink, '%target' => $filename)), WATCHDOG_WARNING);
+ }
+ }
+ }
+
+ return TRUE;
+}
+
+/**
+ * Returns the full directory path to the static file cache directory.
+ */
+function boost_cache_directory($user_id = 0, $host = NULL) {
+ global $user, $base_url;
+ $user_id = 0; //(!is_null($user_id) ? $user_id : BOOST_USER_ID);
+ $parts = parse_url($base_url);
+ $host = (!empty($host) ? $host : $parts['host']);
+
+ // FIXME: correctly handle Drupal subdirectory installations.
+ return implode('/', array(getcwd(), BOOST_FILE_PATH, $host, $user_id));
+}
+
+/**
+ * Returns the static file path for a Drupal page.
+ */
+function boost_file_path($path) {
+ if ($path == BOOST_FRONTPAGE)
+ $path = 'index'; // special handling for Drupal front page
+ return implode('/', array(boost_cache_directory(), $path)) . BOOST_FILE_EXTENSION;
+}
+
+/**
+ * Returns the age of a cached file, measured in seconds since it was last
+ * updated.
+ */
+function boost_file_get_age($filename) {
+ return time() - filemtime($filename);
+}
+
+/**
+ * Determines whether a cached file has expired, i.e. whether its age exceeds
+ * the maximum cache lifetime as defined by Drupal's system settings.
+ */
+function boost_file_is_expired($filename) {
+ if (is_link($filename))
+ return FALSE;
+ return boost_file_get_age($filename) > variable_get('cache_lifetime', 600);
+}
+
+//////////////////////////////////////////////////////////////////////////////
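The wildcard matching at the end of `boost_is_cacheable()` in boost.api.inc above packs the whole pattern list into one regular expression. The sketch below spells the same idea out step by step, without any Drupal calls. It is a simplified variant, not the module's code: the function names are hypothetical, lines are trimmed, blank lines are skipped, and the front page path is passed in as a parameter instead of being read from `site_frontpage`.

```php
<?php
// Hypothetical stand-alone sketch of the page-list matching.

/**
 * Checks a path against a newline-separated list of patterns, where '*'
 * is a wildcard and '<front>' stands for the front page path.
 */
function sketch_path_matches($patterns, $path, $frontpage = 'node') {
  $alternatives = array();
  foreach (preg_split('/\r\n?|\n/', $patterns) as $line) {
    $line = trim($line);
    if ($line === '') {
      continue; // ignore blank lines (a simplification)
    }
    if ($line === '<front>') {
      $line = $frontpage;
    }
    // Quote regex metacharacters, then turn the quoted '*' into '.*'.
    $alternatives[] = str_replace('\*', '.*', preg_quote($line, '/'));
  }
  if (!$alternatives) {
    return FALSE;
  }
  return (bool) preg_match('/^(' . implode('|', $alternatives) . ')$/', $path);
}

/**
 * Option 0 caches every page except the listed ones; option 1 caches
 * only the listed ones.
 */
function sketch_is_cacheable($option, $patterns, $path) {
  $matched = sketch_path_matches($patterns, $path);
  return $option == 1 ? $matched : !$matched;
}
```

With the pattern list `blog/*`, option 0 leaves `about` cacheable and excludes `blog/42`, while option 1 does the opposite.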
diff --git a/boost.css b/boost.css
new file mode 100644
index 0000000..a0749a0
--- /dev/null
+++ b/boost.css
@@ -0,0 +1 @@
+/* $Id$ */
diff --git a/boost.drush b/boost.drush
new file mode 100644
index 0000000..2d323ea
--- /dev/null
+++ b/boost.drush
@@ -0,0 +1,52 @@
+<?php
+// $Id$
+
+/**
+ * @file
+ * Actions for managing the static page cache provided by the Boost module.
+ */
+
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+ * Lists all files currently in the Boost static file system cache.
+ */
+function drush_boost_list() {
+ // TODO: implementation.
+}
+
+/**
+ * Expires all files, or all files matching a given path, from the static page cache.
+ */
+function drush_boost_expire($path = NULL) {
+ drush_op('boost_cache_expire', $path, TRUE);
+
+ if (DRUSH_VERBOSE) {
+ drush_print(empty($path) ? t('Boost static page cache fully cleared.') :
+ t("Boost cached pages like `%path' expired.", array('%path' => $path)));
+ }
+}
+
+/**
+ * Enables the Boost static page cache.
+ */
+function drush_boost_enable() {
+ drush_op('variable_set', 'boost', CACHE_ENABLED);
+
+ if (DRUSH_VERBOSE) {
+ drush_print(t('Boost static page cache enabled.'));
+ }
+}
+
+/**
+ * Disables the Boost static page cache.
+ */
+function drush_boost_disable() {
+ drush_op('variable_set', 'boost', CACHE_DISABLED);
+
+ if (DRUSH_VERBOSE) {
+ drush_print(t('Boost static page cache disabled.'));
+ }
+}
+
+//////////////////////////////////////////////////////////////////////////////
diff --git a/boost.helpers.inc b/boost.helpers.inc
new file mode 100644
index 0000000..80d2d16
--- /dev/null
+++ b/boost.helpers.inc
@@ -0,0 +1,66 @@
+<?php
+// $Id$
+
+/**
+ * @file
+ * Various helper functions for the Boost module, to make life a bit easier.
+ */
+
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+ * Recursive version of mkdir(), compatible with PHP4.
+ */
+function _boost_mkdir_p($pathname, $mode = 0775, $recursive = TRUE) {
+ if (is_dir($pathname)) return TRUE;
+ if ($recursive && !_boost_mkdir_p(dirname($pathname), $mode)) return FALSE;
+ if ($result = @mkdir($pathname, $mode))
+ @chmod($pathname, $mode);
+ return $result;
+}
+
+/**
+ * Recursive version of rmdir(); use with extreme caution.
+ */
+function _boost_rmdir_rf($dirname, $callback = NULL) {
+ foreach (glob($dirname . '/*', GLOB_NOSORT) as $file) {
+ if (is_dir($file)) {
+ _boost_rmdir_rf($file, $callback);
+ }
+ else if (is_file($file)) {
+ if (!$callback || (function_exists($callback) && $callback($file)))
+ @unlink($file);
+ }
+ }
+ return @rmdir($dirname);
+}
+
+/**
+ * Creates a symbolic link using a computed relative path where possible.
+ */
+function _boost_symlink($target, $link) {
+ if (!file_exists($target) || !file_exists(dirname($link)))
+ return FALSE;
+
+ $target = explode('/', realpath($target));
+ $link = explode('/', realpath(dirname($link)) . '/' . basename($link)); // the link itself may not exist yet
+
+ // Only bother creating a relative link if the paths are in the same
+ // top-level directory; otherwise just symlink to the absolute path.
+ if ($target[1] == $link[1]) {
+ // Remove the common path prefix
+ $cwd = array();
+ while (count($target) > 0 && count($link) > 0 && reset($target) == reset($link)) {
+ $cwd[] = array_shift($target);
+ array_shift($link);
+ }
+ // Compute the required relative path
+ if (count($link) > 1)
+ $target = array_merge(array_fill(0, count($link) - 1, '..'), $target);
+ $link = array_merge($cwd, $link);
+ }
+
+ return symlink(implode('/', $target), implode('/', $link));
+}
+
+//////////////////////////////////////////////////////////////////////////////
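The relative-path computation inside `_boost_symlink()` in boost.helpers.inc above can be isolated as a pure function, which makes it easier to reason about. This is a sketch under the assumption that both arguments are already absolute, normalized paths; the name `sketch_relative_target()` is hypothetical and no symlink is actually created.

```php
<?php
// Hypothetical stand-alone sketch; it only computes the link target.

/**
 * Returns the path of $target relative to the directory containing $link.
 */
function sketch_relative_target($target, $link) {
  $target = explode('/', $target);
  $link = explode('/', $link);

  // Drop the leading path components that both paths share.
  while ($target && $link && $target[0] === $link[0]) {
    array_shift($target);
    array_shift($link);
  }

  // What is left of $link is its remaining directories plus its own
  // file name, so one '..' is needed per remaining directory.
  $ups = count($link) > 1 ? array_fill(0, count($link) - 1, '..') : array();
  return implode('/', array_merge($ups, $target));
}
```

For a cached file `/var/www/cache/mysite.com/0/node/42.html` aliased as `/var/www/cache/mysite.com/0/about/staff.html`, the computed target is `../node/42.html`, so the cache directory can be moved as a whole without breaking the alias links.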
diff --git a/boost.info b/boost.info
new file mode 100644
index 0000000..b51760a
--- /dev/null
+++ b/boost.info
@@ -0,0 +1,4 @@
+; $Id$
+name = Boost
+description = Provides a performance and scalability boost through caching Drupal pages as static HTML files.
+package = Caching
diff --git a/boost.install b/boost.install
new file mode 100644
index 0000000..3696c9f
--- /dev/null
+++ b/boost.install
@@ -0,0 +1,25 @@
+<?php
+// $Id$
+
+/**
+ * @file
+ * Handles Boost module installation and upgrade tasks.
+ */
+
+//////////////////////////////////////////////////////////////////////////////
+
+/**
+ * Implementation of hook_install(). Disables Drupal's built-in page caching.
+ */
+function boost_install() {
+
+ // Forcibly disable Drupal's built-in SQL caching to prevent any conflicts of interest:
+ if (variable_get('cache', CACHE_DISABLED) != CACHE_DISABLED) {
+ variable_set('cache', CACHE_DISABLED);
+ drupal_set_message(t('Drupal standard page caching disabled by Boost.'));
+ }
+
+ drupal_set_message(t('Boost module successfully installed.'));
+}
+
+//////////////////////////////////////////////////////////////////////////////
diff --git a/boost.js b/boost.js
new file mode 100644
index 0000000..cfa1da3
--- /dev/null
+++ b/boost.js
@@ -0,0 +1 @@
+// $Id$
diff --git a/boost.module b/boost.module
new file mode 100644
index 0000000..0c13cb7
--- /dev/null
+++ b/boost.module
@@ -0,0 +1,266 @@
+<?php
+// $Id$
+
+/**
+ * @file
+ * Provides static page caching for Drupal.
+ */
+
+//////////////////////////////////////////////////////////////////////////////
+// BOOST SETTINGS
+
+define('BOOST_PATH', dirname(__FILE__));
+define('BOOST_FRONTPAGE', drupal_get_normal_path(variable_get('site_frontpage', 'node')));
+
+define('BOOST_ENABLED', variable_get('boost', CACHE_DISABLED));
+define('BOOST_FILE_PATH', variable_get('boost_file_path', 'cache'));
+define('BOOST_FILE_EXTENSION', variable_get('boost_file_extension', '.html'));
+define('BOOST_CACHEABILITY_OPTION', variable_get('boost_cacheability_option', 0));
+define('BOOST_CACHEABILITY_PAGES', variable_get('boost_cacheability_pages', ''));
+define('BOOST_FETCH_METHOD', variable_get('boost_fetch_method', 'php'));
+define('BOOST_PRE_PROCESS_FUNCTION', variable_get('boost_pre_process_function', ''));
+define('BOOST_POST_UPDATE_COMMAND', variable_get('boost_post_update_command', ''));
+define('BOOST_CRON_LIMIT', variable_get('boost_cron_limit', 100));
+
+// This cookie is set for all logged-in users, so that they can be excluded
+// from caching (or, in the future, get a user-specific cached page):
+define('BOOST_COOKIE', variable_get('boost_cookie', 'DRUPAL_UID'));
+
+// This line is appended to the generated static files; it is very useful
+// for troubleshooting (e.g. determining whether one got the dynamic or
+// static version):
+define('BOOST_BANNER', variable_get('boost_banner', "<!-- Page cached by Boost at %date -->\n"));
+
+// This is needed since the $user object is already destructed in _boost_ob_handler():
+define('BOOST_USER_ID', $GLOBALS['user']->uid);
+
+//////////////////////////////////////////////////////////////////////////////
+// BOOST INCLUDES
+
+require_once BOOST_PATH . '/boost.helpers.inc';
+require_once BOOST_PATH . '/boost.api.inc';
+
+//////////////////////////////////////////////////////////////////////////////
+// DRUPAL API HOOKS
+
+/**
+ * Implementation of hook_help(). Provides online user help.
+ */
+function boost_help($section) {
+ switch ($section) {
+ case 'admin/modules#name':
+ return t('boost');
+ case 'admin/modules#description':
+ return t('Provides a performance and scalability boost through caching Drupal pages as static HTML files.');
+ case 'admin/help#boost':
+ $file = drupal_get_path('module', 'boost') . '/README.txt';
+ if (file_exists($file))
+ return '<pre>' . implode("\n", array_slice(explode("\n", @file_get_contents($file)), 2)) . '</pre>';
+ break;
+ case 'admin/settings/boost':
+ return '<p>' . '</p>'; // TODO: add help text.
+ }
+}
+
+/**
+ * Implementation of hook_perm(). Defines user permissions.
+ */
+function boost_perm() {
+ return array('administer cache');
+}
+
+/**
+ * Implementation of hook_menu(). Defines menu items and page callbacks.
+ */
+function boost_menu($may_cache) {
+ $access = user_access('administer cache');
+ $items = array();
+ if ($may_cache) {
+ // TODO: define menu actions for cache administration.
+ }
+ return $items;
+}
+
+/**
+ * Implementation of hook_init(). Performs page setup tasks.
+ */
+function boost_init() {
+ // TODO: check interaction with other modules that use ob_start(); this
+ // may have to be moved to an earlier stage of the page request.
+ if (!variable_get('cache', CACHE_DISABLED) && BOOST_ENABLED) {
+ global $user;
+ if (empty($user->uid) && $_SERVER['REQUEST_METHOD'] == 'GET') {
+ if (boost_is_cacheable($_GET['q']))
+ ob_start('_boost_ob_handler');
+ }
+ }
+
+ // Executed when saving Drupal's settings:
+ if (!empty($_POST['edit']) && $_GET['q'] == 'admin/settings') {
+ // Forcibly disable Drupal's built-in SQL caching to prevent any conflicts of interest:
+ variable_set('cache', CACHE_DISABLED);
+
+ // TODO: handle 'offline' site maintenance settings.
+
+ $old = variable_get('boost', '');
+ if (!empty($_POST['edit']['boost'])) {
+ // Ensure the cache directory exists or can be created
+ file_check_directory($_POST['edit']['boost_file_path'], FILE_CREATE_DIRECTORY, 'boost_file_path');
+ }
+ else if (!empty($old)) { // the cache was previously enabled
+ if (boost_cache_expire_all())
+ drupal_set_message(t('Static cache files deleted.'));
+ }
+ }
+}
+
+/**
+ * Implementation of hook_form_alter(). Performs alterations before a form
+ * is rendered.
+ */
+function boost_form_alter($form_id, &$form) {
+ // Alter Drupal's settings form by hiding the default cache enabled/disabled control (which will now always default to CACHE_DISABLED) and adding our own control instead.
+ if ($form_id == 'system_settings_form') {
+ require_once BOOST_PATH . '/boost.admin.inc';
+ $form['cache'] = boost_system_settings_form($form['cache']);
+ }
+}
+
+/**
+ * Implementation of hook_cron(). Performs periodic actions.
+ */
+function boost_cron() {
+ if (!BOOST_ENABLED) return;
+
+ if (boost_cache_expire_all()) {
+ watchdog('boost', t('Expired stale files from static page cache.'), WATCHDOG_NOTICE);
+ }
+}
+
+/**
+ * Implementation of hook_comment(). Acts on comment modification.
+ */
+function boost_comment($comment, $op) {
+ if (!BOOST_ENABLED) return;
+
+ switch ($op) {
+ case 'insert':
+ case 'update':
+ // Expire the relevant node page from the static page cache to prevent serving stale content:
+ if (!empty($comment['nid']))
+ boost_cache_expire('node/' . $comment['nid'], TRUE);
+ break;
+ }
+}
+
+/**
+ * Implementation of hook_nodeapi(). Acts on nodes defined by other modules.
+ */
+function boost_nodeapi(&$node, $op, $teaser = NULL, $page = NULL) {
+ if (!BOOST_ENABLED) return;
+
+ switch ($op) {
+ case 'insert':
+ case 'update':
+ case 'delete':
+ // Expire all relevant node pages from the static page cache to prevent serving stale content:
+ if (!empty($node->nid))
+ boost_cache_expire('node/' . $node->nid, TRUE);
+ break;
+ }
+}
+
+/**
+ * Implementation of hook_taxonomy(). Acts on taxonomy changes.
+ */
+function boost_taxonomy($op, $type, $term = NULL) {
+ if (!BOOST_ENABLED) return;
+
+ switch ($op) {
+ case 'insert':
+ case 'update':
+ case 'delete':
+ // TODO: Expire all relevant taxonomy pages from the static page cache to prevent serving stale content.
+ break;
+ }
+}
+
+/**
+ * Implementation of hook_user(). Acts on user account actions.
+ */
+function boost_user($op, &$edit, &$account, $category = NULL) {
+ if (!BOOST_ENABLED) return;
+
+ global $user;
+ switch ($op) {
+ case 'login':
+ // Set a special cookie to prevent logged-in users from being served pages from the static page cache.
+ $expires = ini_get('session.cookie_lifetime');
+ $expires = (!empty($expires) && is_numeric($expires) ? time() + (int)$expires : 0);
+ setcookie(BOOST_COOKIE, $user->uid, $expires, ini_get('session.cookie_path'), ini_get('session.cookie_domain'), ini_get('session.cookie_secure') == '1');
+ break;
+ case 'logout':
+ setcookie(BOOST_COOKIE, FALSE, time() - 86400, ini_get('session.cookie_path'), ini_get('session.cookie_domain'), ini_get('session.cookie_secure') == '1');
+ break;
+ case 'insert':
+ // TODO: create user-specific cache directory.
+ break;
+ case 'delete':
+ // Expire the relevant user page from the static page cache to prevent serving stale content:
+ if (!empty($account->uid))
+ boost_cache_expire('user/' . $account->uid);
+ // TODO: recursively delete user-specific cache directory.
+ break;
+ }
+}
+
+/**
+ * Implementation of hook_settings(). Declares administrative settings for a module.
+ *
+ * @deprecated in Drupal 5.0.
+ */
+function boost_settings() {
+ require_once BOOST_PATH . '/boost.admin.inc';
+ return boost_settings_form();
+}
+
+//////////////////////////////////////////////////////////////////////////////
+// OUTPUT BUFFERING CALLBACK
+
+/**
+ * PHP output buffering callback.
+ *
+ * NOTE: objects have already been destructed so $user is not available.
+ */
+function _boost_ob_handler($buffer) {
+ // Ensure we're in the correct working directory, since some web servers (e.g. Apache) mess this up here.
+ chdir(dirname($_SERVER['SCRIPT_FILENAME']));
+
+ // Check the currently set content type; at present we can't deal with anything other than HTML.
+ if (_boost_get_content_type() == 'text/html') {
+ boost_cache_set($_GET['q'], $buffer);
+ }
+
+ // Allow the page request to finish up normally
+ return $buffer;
+}
+
+/**
+ * Determines the MIME content type of the current page response based on
+ * the currently set Content-Type HTTP header.
+ *
+ * This should normally return the string 'text/html' unless another module
+ * has overridden the content type.
+ */
+function _boost_get_content_type($default = NULL) {
+ static $regex = '/^Content-Type:\s*([\w\d\/\-]+)/i';
+
+ // The last Content-Type header is the one that counts:
+ $headers = preg_grep($regex, explode("\n", drupal_set_header()));
+ if (!empty($headers) && preg_match($regex, array_pop($headers), $matches))
+ return $matches[1]; // found it
+
+ return $default;
+}
+
+//////////////////////////////////////////////////////////////////////////////
diff --git a/htaccess/boosted.txt b/htaccess/boosted.txt
new file mode 100644
index 0000000..262e3f3
--- /dev/null
+++ b/htaccess/boosted.txt
@@ -0,0 +1,105 @@
+#
+# Apache/PHP/Drupal settings:
+#
+
+# Protect files and directories from prying eyes.
+<FilesMatch "(\.(engine|inc|install|module|sh|.*sql|theme|tpl(\.php)?|xtmpl)|code-style\.pl|Entries.*|Repository|Root)$">
+ Order deny,allow
+ Deny from all
+</FilesMatch>
+
+# Set some options.
+Options -Indexes
+Options +FollowSymLinks
+
+# Customized error messages.
+ErrorDocument 404 /index.php
+
+# Set the default handler.
+DirectoryIndex index.php
+
+# Override PHP settings. More in sites/default/settings.php
+# but the following cannot be changed at runtime.
+
+# PHP 4, Apache 1
+<IfModule mod_php4.c>
+ php_value magic_quotes_gpc 0
+ php_value register_globals 0
+ php_value session.auto_start 0
+</IfModule>
+
+# PHP 4, Apache 2
+<IfModule sapi_apache2.c>
+ php_value magic_quotes_gpc 0
+ php_value register_globals 0
+ php_value session.auto_start 0
+</IfModule>
+
+# PHP 5, Apache 1 and 2
+<IfModule mod_php5.c>
+ php_value magic_quotes_gpc 0
+ php_value register_globals 0
+ php_value session.auto_start 0
+</IfModule>
+
+# Reduce the time dynamically generated pages are cache-able.
+<IfModule mod_expires.c>
+ ExpiresByType text/html A1
+</IfModule>
+
+# Various rewrite rules.
+<IfModule mod_rewrite.c>
+ RewriteEngine on
+
+ # If your site can be accessed both with and without the prefix www.
+ # you can use one of the following settings to force users to use only one option:
+ #
+ # If you want the site to be accessed WITH the www. only, adapt and uncomment the following:
+ # RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
+ # RewriteRule .* http://www.example.com/ [L,R=301]
+ #
+ # If you want the site to be accessed only WITHOUT the www., adapt and uncomment the following:
+ # RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
+ # RewriteRule .* http://example.com/ [L,R=301]
+
+
+ # Modify the RewriteBase if you are using Drupal in a subdirectory and
+ # the rewrite rules are not working properly.
+ #RewriteBase /drupal
+
+ # Rewrite old-style URLs of the form 'node.php?id=x'.
+ #RewriteCond %{REQUEST_FILENAME} !-f
+ #RewriteCond %{REQUEST_FILENAME} !-d
+ #RewriteCond %{QUERY_STRING} ^id=([^&]+)$
+ #RewriteRule node.php index.php?q=node/view/%1 [L]
+
+ # Rewrite old-style URLs of the form 'module.php?mod=x'.
+ #RewriteCond %{REQUEST_FILENAME} !-f
+ #RewriteCond %{REQUEST_FILENAME} !-d
+ #RewriteCond %{QUERY_STRING} ^mod=([^&]+)$
+ #RewriteRule module.php index.php?q=%1 [L]
+
+ # Rewrite rules for static page caching provided by the Boost module
+ # BOOST START
+ RewriteCond %{REQUEST_URI} !^/cache
+ RewriteCond %{REQUEST_METHOD} ^GET$
+ RewriteCond %{QUERY_STRING} ^$
+ RewriteCond %{HTTP_COOKIE} !DRUPAL_UID
+ RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/0/%{REQUEST_URI} -d
+ RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/0/%{REQUEST_URI}/index.html -f
+ RewriteRule ^(.*)$ cache/%{HTTP_HOST}/0/$1/index.html [L]
+ RewriteCond %{REQUEST_URI} !^/cache
+ RewriteCond %{REQUEST_METHOD} ^GET$
+ RewriteCond %{QUERY_STRING} ^$
+ RewriteCond %{HTTP_COOKIE} !DRUPAL_UID
+ RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/0/%{REQUEST_URI}.html -f
+ RewriteRule ^(.*)$ cache/%{HTTP_HOST}/0/$1.html [L]
+ # BOOST END
+
+ # Rewrite current-style URLs of the form 'index.php?q=x'.
+ RewriteCond %{REQUEST_FILENAME} !-f
+ RewriteCond %{REQUEST_FILENAME} !-d
+ RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
+</IfModule>
+
+# $Id$
diff --git a/htaccess/default.txt b/htaccess/default.txt
new file mode 100644
index 0000000..ce35f6d
--- /dev/null
+++ b/htaccess/default.txt
@@ -0,0 +1,88 @@
+#
+# Apache/PHP/Drupal settings:
+#
+
+# Protect files and directories from prying eyes.
+<FilesMatch "(\.(engine|inc|install|module|sh|.*sql|theme|tpl(\.php)?|xtmpl)|code-style\.pl|Entries.*|Repository|Root)$">
+ Order deny,allow
+ Deny from all
+</FilesMatch>
+
+# Set some options.
+Options -Indexes
+Options +FollowSymLinks
+
+# Customized error messages.
+ErrorDocument 404 /index.php
+
+# Set the default handler.
+DirectoryIndex index.php
+
+# Override PHP settings. More in sites/default/settings.php
+# but the following cannot be changed at runtime.
+
+# PHP 4, Apache 1
+<IfModule mod_php4.c>
+ php_value magic_quotes_gpc 0
+ php_value register_globals 0
+ php_value session.auto_start 0
+</IfModule>
+
+# PHP 4, Apache 2
+<IfModule sapi_apache2.c>
+ php_value magic_quotes_gpc 0
+ php_value register_globals 0
+ php_value session.auto_start 0
+</IfModule>
+
+# PHP 5, Apache 1 and 2
+<IfModule mod_php5.c>
+ php_value magic_quotes_gpc 0
+ php_value register_globals 0
+ php_value session.auto_start 0
+</IfModule>
+
+# Reduce the time dynamically generated pages are cache-able.
+<IfModule mod_expires.c>
+ ExpiresByType text/html A1
+</IfModule>
+
+# Various rewrite rules.
+<IfModule mod_rewrite.c>
+ RewriteEngine on
+
+ # If your site can be accessed both with and without the prefix www.
+ # you can use one of the following settings to force users to use only one option:
+ #
+ # If you want the site to be accessed WITH the www. only, adapt and uncomment the following:
+ # RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
+ # RewriteRule .* http://www.example.com/ [L,R=301]
+ #
+ # If you want the site to be accessed only WITHOUT the www., adapt and uncomment the following:
+ # RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
+ # RewriteRule .* http://example.com/ [L,R=301]
+
+
+ # Modify the RewriteBase if you are using Drupal in a subdirectory and
+ # the rewrite rules are not working properly.
+ #RewriteBase /drupal
+
+ # Rewrite old-style URLs of the form 'node.php?id=x'.
+ #RewriteCond %{REQUEST_FILENAME} !-f
+ #RewriteCond %{REQUEST_FILENAME} !-d
+ #RewriteCond %{QUERY_STRING} ^id=([^&]+)$
+ #RewriteRule node.php index.php?q=node/view/%1 [L]
+
+ # Rewrite old-style URLs of the form 'module.php?mod=x'.
+ #RewriteCond %{REQUEST_FILENAME} !-f
+ #RewriteCond %{REQUEST_FILENAME} !-d
+ #RewriteCond %{QUERY_STRING} ^mod=([^&]+)$
+ #RewriteRule module.php index.php?q=%1 [L]
+
+ # Rewrite current-style URLs of the form 'index.php?q=x'.
+ RewriteCond %{REQUEST_FILENAME} !-f
+ RewriteCond %{REQUEST_FILENAME} !-d
+ RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
+</IfModule>
+
+# $Id$