# Purge
##### _The modular external cache invalidation framework._

The Purge module for Drupal 8 and Drupal 9 enables invalidation of content from external caches, reverse proxies and CDN platforms. The technology-agnostic plugin architecture allows for different server configurations and use cases. Last but not least, it enforces a separation of concerns and should be seen as a **middleware** solution.

##### Drush commands
The ``purge_drush`` module adds the following commands for Drush administration:

| Command | Alias | Description |
|---------------------------------|----------|--------------------------------------------------------------|
| **``cache:rebuild-external``** | ``cre`` | Invalidate 'everything' using the Purge framework. |
| **``p:debug-dis``** | ``pddis`` | Disable debugging for all of Purge's log channels. |
| **``p:debug-en``** | ``pden`` | Enable debugging for all of Purge's log channels. |
| **``p:diagnostics``** | ``pdia`` | Generate a diagnostic self-service report. |
| **``p:invalidate``** | ``pinv`` | Directly invalidate an item without going through the queue. |
| **``p:processor-add``** | ``pradd`` | Add a new processor. |
| **``p:processor-ls``** | ``prls`` | List all enabled processors. |
| **``p:processor-lsa``** | ``prlsa`` | List available processor plugin IDs that can be added. |
| **``p:processor-rm``** | ``prrm`` | Remove a processor. |
| **``p:purger-add``** | ``ppadd`` | Create a new purger instance. |
| **``p:purger-ls``** | ``ppls`` | List all configured purgers in order of execution. |
| **``p:purger-lsa``** | ``pplsa`` | List available plugin IDs for which purgers can be added. |
| **``p:purger-mvd``** | ``ppmvd`` | Move the given purger DOWN in the execution order. |
| **``p:purger-mvu``** | ``ppmvu`` | Move the given purger UP in the execution order. |
| **``p:purger-rm``** | ``pprm`` | Remove a purger instance. |
| **``p:queue-add``** | ``pqa`` | Add one or more items to the queue for later processing. |
| **``p:queue-browse``** | ``pqb`` | Inspect what is in the queue by paging through it. |
| **``p:queue-empty``** | ``pqe`` | Empty the entire queue. |
| **``p:queue-stats``** | ``pqs`` | View the queue statistics. |
| **``p:queue-volume``** | ``pqv`` | Count how many items are currently in the queue. |
| **``p:queue-work``** | ``pqw`` | Process one or more chunks of items from the queue. |
| **``p:queuer-add``** | ``puadd`` | Add a new queuer. |
| **``p:queuer-ls``** | ``puls`` | List all enabled queuers. |
| **``p:queuer-lsa``** | ``pulsa`` | List available queuer plugin IDs that can be added. |
| **``p:queuer-rm``** | ``purm`` | Remove a queuer. |
| **``p:types``** | ``ptyp`` | List all supported cache invalidation types. |

Several commands understand the ``--format`` parameter, allowing you to integrate the commands in external scripts with JSON or YAML output. See the respective ``drush help`` information for more command detail.

The framework explained
------------------------------------------------------------------------------
Purge isn't just a single API but is made up of several API pillars, all driven by plugins, allowing very flexible end-user setups. All of them are clearly defined to enforce a sustainable and maintainable framework over the longer term. This also allows everyone to build, improve and fix bugs in only the plugins they provide and therefore allows everyone to 'scale up' solving external cache invalidation in the best way possible.

#### Queuer
With Purge, end users can manually invalidate a page with a Drush command or, theoretically, via a "clear this page" button in the GUI. Caches are, however, meant to be transparent to end users and to be invalidated only when something actually changed, which requires external caches to be transparent as well.

When editing content of any kind, Drupal will transparently and efficiently invalidate cached pages in Drupal's own **anonymous page cache**.
When Drupal renders a page, it can list all the rendered items on the page in a special HTTP response header named ``X-Drupal-Cache-Tags``. For example, this allows all cached pages with the ``node:1`` Cache-Tag in their headers to be invalidated when that particular node (node/1) is changed.

Purge ships with the **Core tags queuer**, which replicates everything Drupal core invalidated onto Purge's queue. So, when Drupal clears rendered items from its own page cache, Purge will add an _invalidation_ object to its queue so that it gets cleared remotely as well.

#### Queue
Queueing is an inevitable and important part of Purge as it makes cache invalidation resilient, stable and accurate. Certain reverse cache systems can clear thousands of items in under a second, yet others - for instance CDNs - can demand multi-step purges that can easily take up to 30 minutes. Although the queue can technically be left out of the process entirely, it will be required in the majority of use cases.

###### Statistics tracker
The statistics tracker keeps track of queue activity by actively counting how many items the queue currently holds and how many have been deleted or released back to it. This data can be used to report progress on the queue and is easily retrieved; the data resets when the queue is emptied.

#### Invalidations
Invalidations are small value objects that **describe and track invalidations** on one or more external caching systems within the Purge pipeline. These objects float freely between **queue** and **purgers** but can also be created on the fly and in third-party code.

##### Invalidation types
Purge has to be crystal clear towards its purgers about what needs invalidation, and therefore has the concept of invalidation types. Individual purgers declare which types they support and can even declare their own types when that makes sense.
Since Drupal invalidates its own caches using cache tags, the ``tag`` type is the most important one to support in your architecture.

* **``domain``** Invalidates an entire domain name.
* **``everything``** Invalidates everything.
* **``path``** Invalidates by path, e.g. ``news/article-1``.
* **``regex``** Invalidates by regular expression, e.g. ``\.(jpg|jpeg|css|js)$``.
* **``tag``** Invalidates by Drupal cache tag, e.g. ``menu:footer``.
* **``url``** Invalidates by URL, e.g. ``http://site.com/node/1``.
* **``wildcardpath``** Invalidates by path, e.g. ``news/*``.
* **``wildcardurl``** Invalidates by URL, e.g. ``http://site.com/node/*``.

#### Purgers
Purgers do all the hard work of telling external systems what to invalidate and do this in the technically required way, for instance with external API calls, through telnet commands or with specially crafted HTTP requests.

Purge **doesn't ship any purger**, as this is context specific. You could for instance have multiple purgers enabled to clean both a local proxy and a CDN at the same time.

###### Capacity tracker
The capacity tracker is the central orchestrator between limited system resources and a never-ending queue of cache invalidation items.

The tracker actively tracks how many items are invalidated during Drupal's request lifetime and how much PHP execution time has been spent. With this information it can predict how much processing can happen during the rest of the request lifetime. It is able to predict this since the capacity tracker also collects timing estimates from the actual purgers. The intelligence it has is used by the queue service, and exceeding the limit isn't possible as the purgers service refuses to operate when the limits are near zero.

**Runtime measurement**
Purgers are required to provide timing estimates for a single invalidation; the capacity tracker operates based on this information.
Runtime measurement is a feature available to purgers (most use it) which performs live time tracking of invalidation processing and reports the gathered measurements back to the capacity tracker. When a single invalidation was exceptionally slow - let's say a server was under load - the capacity for this purger drops drastically, and every faster measurement collected after that only raises it again in slow, 10% upward adjustments. Combined with the capacity tracker, this provides the best balance between performance and safety.

#### Diagnostic checks
External cache invalidation usually depends on many parameters, for instance configuration settings such as hostname or CDN API keys. In order to prevent hard crashes during runtime that affect end-user workflow, Purge allows plugins to write preventive diagnostic checks that can check their configurations and anything else that affects runtime execution. These checks can block all purging but also raise warnings and other diagnostic information. End users can rely on Drupal's status report page, where these checks also bubble up.

#### Processors
With queuers adding ``tag`` invalidation objects to the queue, this still leaves the processing of them open. Since different use cases are possible, it is up to you to configure a stable processing policy that's suitable for your use case.

Possibilities:

* **``cron``** claims items from the queue & purges during cron.
* **``ajaxui``** AJAX-based progress bar working the queue after a piece of content has been updated.
* **``lateruntime``** purges items from the queue on every request (**SLOW**).

#### Tags Headers
By default, no HTTP response headers with cache tags are added when you install just ``purge``. Since there is no RFC coverage for this relatively new way of cache invalidation, every module providing a **purger** is expected to define its own header and _most importantly_: unset that header too.
This means that if your CDN supports it, it's expected that the CDN doesn't render the tags header to end users, since you likely don't want to leak it. These plugins are very simple and rely basically only on annotations. If you need to support a reverse caching layer that isn't supported yet, the ``purge_purger_http`` project provides you with a ``Purge-Cache-Tags`` header.

API examples
------------------------------------------------------------------------------

#### Queueing
Adding invalidations to the queue is the simplest use case and requires a queuer object so that the queue knows who is adding the given items.

```
$purgeInvalidationFactory = \Drupal::service('purge.invalidation.factory');
$purgeQueuers = \Drupal::service('purge.queuers');
$purgeQueue = \Drupal::service('purge.queue');

// The queuer identifies who is adding the items to the queue.
$queuer = $purgeQueuers->get('myqueuer');
$invalidations = [
  $purgeInvalidationFactory->get('tag', 'node:1'),
  $purgeInvalidationFactory->get('tag', 'node:2'),
  $purgeInvalidationFactory->get('path', 'contact'),
  $purgeInvalidationFactory->get('wildcardpath', 'news/*'),
];

$purgeQueue->add($queuer, $invalidations);
```

What happens now depends on the **processors you configured**, as some might purge very quickly after adding items to the queue whereas others might need a time-based delay before this occurs. Items enter the queue in state ``FRESH`` and normally leave the processor in the states ``SUCCEEDED``, ``FAILED``, ``PROCESSING`` or, when not a single plugin supported them, ``NOT_SUPPORTED``. Items that don't succeed cycle back to the queue until it gets manually cleared.

#### Invalidation without queue
Processing invalidations without going through the queue is possible, but it is not the recommended workflow when your invalidations must not fail. All it takes is to instantiate invalidation objects and to feed them to the purgers service.
```
use Drupal\purge\Plugin\Purge\Purger\Exception\CapacityException;
use Drupal\purge\Plugin\Purge\Purger\Exception\DiagnosticsException;
use Drupal\purge\Plugin\Purge\Purger\Exception\LockException;

$purgeInvalidationFactory = \Drupal::service('purge.invalidation.factory');
$purgeProcessors = \Drupal::service('purge.processors');
$purgePurgers = \Drupal::service('purge.purgers');

$processor = $purgeProcessors->get('myprocessor');
$invalidations = [
  $purgeInvalidationFactory->get('tag', 'node:1'),
  $purgeInvalidationFactory->get('tag', 'node:2'),
  $purgeInvalidationFactory->get('path', 'contact'),
  $purgeInvalidationFactory->get('wildcardpath', 'news/*'),
];

try {
  $purgePurgers->invalidate($processor, $invalidations);
}
catch (DiagnosticsException $e) {
  // Diagnostic exceptions happen when the system cannot purge.
}
catch (CapacityException $e) {
  // Capacity exceptions happen when too much was purged during this request.
}
catch (LockException $e) {
  // Lock exceptions happen when another code path is currently processing.
}
```

When this code finishes successfully, the ``$invalidations`` array holds the same objects as before, but each object has changed its state. You can verify this by iterating over the objects and calling ``getState()`` or ``getStateString()`` on them (the latter is only intended for UI presentation):

```
foreach ($invalidations as $invalidation) {
  var_dump($invalidation->getStateString());
}
```

Which could then look like this:

```
string(6) "FAILED"
string(6) "FAILED"
string(9) "SUCCEEDED"
string(10) "PROCESSING"
```

The results reveal why you should **normally not invalidate without going through the queue**: items can fail or need to run again later to finish entirely. The most common use case for direct invalidation is manual UI purging.

#### Queue processing
Processing items from the queue is handled by processors, which users can add and configure to suit their setup.
In essence, processors invoke the following code to retrieve a dynamically calculated chunk of items from the queue and feed those to the purgers service:

```
use Drupal\purge\Plugin\Purge\Purger\Exception\CapacityException;
use Drupal\purge\Plugin\Purge\Purger\Exception\DiagnosticsException;
use Drupal\purge\Plugin\Purge\Purger\Exception\LockException;

$purgePurgers = \Drupal::service('purge.purgers');
$purgeProcessors = \Drupal::service('purge.processors');
$purgeQueue = \Drupal::service('purge.queue');

$claims = $purgeQueue->claim();
$processor = $purgeProcessors->get('myprocessor');
try {
  $purgePurgers->invalidate($processor, $claims);
}
catch (DiagnosticsException $e) {
  // Diagnostic exceptions happen when the system cannot purge.
}
catch (CapacityException $e) {
  // Capacity exceptions happen when too much was purged during this request.
}
catch (LockException $e) {
  // Lock exceptions happen when another code path is currently processing.
}
finally {
  $purgeQueue->handleResults($claims);
}
```
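
To make the claim, invalidate and handle-results cycle more tangible, here is a minimal, self-contained simulation in plain PHP. It does not use Purge's services or classes: ``ToyQueue``, ``toyPurge()`` and the rule that only ``tag`` items succeed are hypothetical stand-ins invented for this sketch, and only the state names are taken from this document. It merely illustrates the idea that items which did not succeed are released back to the queue for a later run.

```php
<?php

// Hypothetical stand-in for a queue; this is NOT Purge's purge.queue service.
final class ToyQueue {

  /** @var array<int, array{type: string, expression: string, state: string}> */
  private array $items = [];

  // New items always enter the queue in state FRESH.
  public function add(string $type, string $expression): void {
    $this->items[] = [
      'type' => $type,
      'expression' => $expression,
      'state' => 'FRESH',
    ];
  }

  // Hand out up to $limit items and take them off the queue.
  public function claim(int $limit): array {
    return array_splice($this->items, 0, $limit);
  }

  // Drop succeeded items, release everything else back to the queue.
  public function handleResults(array $claims): void {
    foreach ($claims as $item) {
      if ($item['state'] !== 'SUCCEEDED') {
        $this->items[] = $item;
      }
    }
  }

  public function count(): int {
    return count($this->items);
  }

}

// Hypothetical purger: pretends tags succeed and every other type fails.
function toyPurge(array $claims): array {
  foreach ($claims as &$item) {
    $item['state'] = $item['type'] === 'tag' ? 'SUCCEEDED' : 'FAILED';
  }
  unset($item);
  return $claims;
}

$queue = new ToyQueue();
$queue->add('tag', 'node:1');
$queue->add('tag', 'node:2');
$queue->add('path', 'contact');

$claims = toyPurge($queue->claim(10));
$queue->handleResults($claims);

// Only the failed 'path' item is left for a later run.
echo $queue->count(), PHP_EOL;
```

In this toy run the two ``tag`` items leave the queue for good, while the failed ``path`` item stays queued and would be claimed again on the next pass. A real setup would rely on the configured processors and purgers instead of these stand-ins.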