GENERAL INFORMATION
-------------------

This module implements version 2 of the Porter Stemmer algorithm to improve
English-language searching with Drupal's built-in Search module. Information
about the algorithm can be found at
http://snowball.tartarus.org/algorithms/english/stemmer.html

Stemming reduces a word to its basic root or stem (e.g. 'blogging' to 'blog') so
that variations on a word ('blogs', 'blogger', 'blogging', 'blog') are
considered equivalent when searching. This generally produces more relevant
search results.
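
To make the idea concrete, here is a minimal sketch in Python of how stemming
groups word variations under one search key. It is a toy suffix stripper
written only for illustration: the names toy_stem and group_by_stem and the
suffix list are assumptions, and this is not the Porter Stemmer algorithm or
this module's code.

```python
from collections import defaultdict

# Toy suffix list for illustration only; the real algorithm uses many more
# rules and conditions than this.
SUFFIXES = ("ing", "er", "s")
VOWELS = set("aeiou")


def toy_stem(word):
    """Return a crude stem by stripping one common suffix (hypothetical helper)."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Collapse a trailing doubled consonant, e.g. "blogg" becomes "blog".
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in VOWELS:
        word = word[:-1]
    return word


def group_by_stem(words):
    """Group words that share a stem, as a search index would."""
    groups = defaultdict(list)
    for word in words:
        groups[toy_stem(word)].append(word)
    return dict(groups)


print(group_by_stem(["blogs", "blogger", "blogging", "blog"]))
# prints {'blog': ['blogs', 'blogger', 'blogging', 'blog']}
```

Because all four words map to the same key, a search for any one of them can
match content containing the others.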

Note that a few parts of the Porter Stemmer algorithm work better for American
English than British English, so some British spellings will not be stemmed
correctly.

This module will use the PECL "stem" library's implementation of the Porter
Stemmer algorithm, if it is installed on your server. If the PECL "stem" library
is not available, the module uses its own PHP implementation of the
algorithm. The output is the same in either case. More information about the
PECL "stem" library: http://pecl.php.net/package/stem
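
The "use the native library if present, otherwise fall back" behavior
described above can be sketched as follows. This is an illustration in Python
of the general pattern only, not the module's PHP code: load_stemmer, the
module name hypothetical_native_stem, and its stem attribute are all
assumptions made up for this example.

```python
import importlib


def load_stemmer(fallback, module_name="hypothetical_native_stem"):
    """Return a native stem function if available, else the fallback.

    module_name is a made-up name; it stands in for an optional
    native extension that may or may not be installed.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Optional library is not installed: use the built-in implementation.
        return fallback
    # Library is installed: use its stem function if it provides one.
    return getattr(module, "stem", fallback)


# str.lower stands in for a real pure-language stemmer here.
stem = load_stemmer(fallback=str.lower)
print(stem("Blog"))
```

Callers only ever see one stem function, so the rest of the code does not need
to know which implementation was selected.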


INSTALLATION
------------

See the INSTALL.txt file for installation instructions.


TESTING
-------

The Porter Stemmer module includes tests for the stemming algorithm and
functionality.  To run the tests, enable the core Testing module, and then
navigate to Administer > Configuration > Development > Testing.

Each "Stemming output" test for the Porter Stemmer module includes approximately
2000 individual word stemming tests (which test the module against a standard
word list downloaded from the Snowball site referenced above).  Because of the
way SimpleTest displays output, you may run into browser timeouts or memory
issues if you try to run all 16 of the "Stemming output" tests in the same test
run.

Tests are provided both for the internal algorithm and the PECL library.

There are also functional tests and tests for some of the internal steps of the
stemming algorithm.