# Hebrew analyzer for Elasticsearch

Powered by HebMorph (https://github.com/synhershko/HebMorph) and licensed under the AGPL3

![](https://travis-ci.org/synhershko/elasticsearch-analysis-hebrew.svg?branch=master) [ ![Download](https://api.bintray.com/packages/synhershko/elasticsearch-analysis-hebrew/elasticsearch-analysis-hebrew-plugin/images/download.svg) ](https://bintray.com/synhershko/elasticsearch-analysis-hebrew/elasticsearch-analysis-hebrew-plugin/_latestVersion)

## Installation

First, install the plugin by running the command that matches your Elasticsearch version (commands for older versions are listed at the bottom):

```
./bin/elasticsearch-plugin install https://bintray.com/synhershko/elasticsearch-analysis-hebrew/download_file?file_path=elasticsearch-analysis-hebrew-5.3.0.zip
```

For earlier versions (2.x and before) the installation looks a bit different:

```
./bin/plugin install https://bintray.com/synhershko/elasticsearch-analysis-hebrew/download_file?file_path=elasticsearch-analysis-hebrew-2.4.2
```
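
The choice between the two launchers can be scripted. The following is only a rough sketch: the function name `hebrew_plugin_install_cmd` and the version cut-offs are assumptions that simply mirror the commands listed in this README, and the second argument is a placeholder for whichever download URL or file you use.

```shell
#!/bin/sh
# Sketch: print the install command that matches an Elasticsearch version.
# ASSUMPTION: the cut-offs mirror this README (5.x and later use
# elasticsearch-plugin, 2.x uses "bin/plugin install", 1.x and earlier use
# "bin/plugin --install ... --url"). Nothing is installed; the command is
# only printed.
hebrew_plugin_install_cmd() {
    es_version=$1                    # e.g. "5.3.0"
    artifact=$2                      # download URL or local file (placeholder)
    major=${es_version%%.*}          # "5.3.0" -> "5"
    if [ "$major" -ge 5 ]; then
        printf './bin/elasticsearch-plugin install %s\n' "$artifact"
    elif [ "$major" -ge 2 ]; then
        printf './bin/plugin install %s\n' "$artifact"
    else
        printf 'bin/plugin --install analysis-hebrew --url %s\n' "$artifact"
    fi
}
```

When running the printed command by hand, it is safer to wrap the download URL in single quotes, because some shells (zsh, for example) treat the `?` in the URL as a glob character.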

During installation, you may be prompted for additional permissions:

```
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.io.FilePermission /var/lib/hebmorph/dictionary.dict read
* java.io.FilePermission /var/lib/hspell-data-files read
* java.io.FilePermission /var/lib/hspell-data-files/* read
* java.lang.RuntimePermission accessClassInPackage.sun.reflect.generics.reflectiveObjects
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Continue with installation? [y/N]y
```

This is normal - please confirm by typing y and hitting Enter.

## Dictionaries

This plugin uses dictionary files for its operation. The open-source version uses hspell data files. In the 5.x versions, the dictionaries are bundled in the plugin download itself.

For earlier versions, you will need to obtain the Hebrew dictionary files yourself. The open-source hspell files can be downloaded here: https://github.com/synhershko/HebMorph/tree/master/hspell-data-files. Download the entire folder and copy it either into the plugin's folder (that is, `plugins/analysis-hebrew/hspell-data-files`) or to `/var/lib/hspell-data-files`.

Elasticsearch can also be configured to load the dictionary from another folder. To do so, add the following line to the `elasticsearch.yml` file:

```
    hebrew.dict.path: /PATH/TO/HSPELL/FOLDER
```

You will also need to edit `plugin-security.policy` accordingly.
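
As a rough illustration of what "accordingly" might involve, the sketch below prints the setting together with policy entries for a custom folder. It assumes the policy entries should mirror the `java.io.FilePermission` lines shown in the installation prompt; the function name `print_dict_config` and the example path are hypothetical, and the printed `permission` lines belong inside the `grant { ... };` block of the policy file.

```shell
#!/bin/sh
# Sketch: print the elasticsearch.yml setting and matching policy entries
# for a custom dictionary folder.
# ASSUMPTION: read permission on the folder and on the files directly inside
# it is sufficient, mirroring the permissions listed in the install prompt.
print_dict_config() {
    dict_dir=${1%/}    # drop one trailing slash, if present
    printf 'hebrew.dict.path: %s\n' "$dict_dir"
    printf 'permission java.io.FilePermission "%s", "read";\n' "$dict_dir"
    printf 'permission java.io.FilePermission "%s/*", "read";\n' "$dict_dir"
}

# Example with a hypothetical folder:
#   print_dict_config /opt/hspell-data-files
```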

The dictionary used by the commercial version follows a similar pattern.

You can confirm the installation by launching Elasticsearch and looking for the following lines in the logs:

```
[2017-03-22T15:43:05,927][INFO ][c.c.e.HebrewAnalysisPlugin] Defaulting to HSpell dictionary loader
[2017-03-22T15:43:07,751][INFO ][c.c.e.HebrewAnalysisPlugin] Trying to load hspell from path plugins/analysis-hebrew/hspell-data-files/
[2017-03-22T15:43:07,751][INFO ][c.c.e.HebrewAnalysisPlugin] Dictionary 'hspell' loaded successfully from path plugins/analysis-hebrew/hspell-data-files/
```

The easiest way to make sure the plugin is installed correctly is to request `/_hebrew/check-word/בדיקה` on your server (for example: browse to http://localhost:9200/_hebrew/check-word/בדיקה). If it loads, it means everything is set up and you are good to go.
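
Browsers percent-encode the Hebrew word in the URL automatically, but command-line HTTP clients may not. The sketch below builds an encoded URL for the check above; the helper names `percent_encode` and `check_word_url` are hypothetical, and the default host is the `localhost:9200` address used in this README.

```shell
#!/bin/sh
# Percent-encode every byte of the argument. This is valid for any UTF-8
# input, although it also encodes plain ASCII letters.
percent_encode() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n' | sed 's/\(..\)/%\1/g' | tr 'a-f' 'A-F'
}

# Build the check-word URL. $1 is the word, $2 is an optional base URL.
check_word_url() {
    printf '%s/_hebrew/check-word/%s\n' "${2:-http://localhost:9200}" "$(percent_encode "$1")"
}

# Usage (requires a running node with the plugin installed):
#   curl -s "$(check_word_url 'בדיקה')"
```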

## Commercial

HebMorph is released as open source, along with the hspell dictionary files. The commercial option grants you further support in making Hebrew search even better, and it comes with a proprietary dictionary. For more information, check out http://code972.com/hebmorph.

## Usage

Use "hebrew" as the analyzer name for fields containing Hebrew text.

Query using "hebrew_query" or "hebrew_query_light" to enable support for exact matches. The "hebrew_exact" analyzer is available for query_string / match queries that should be matched exactly, without lemma expansion.

Because Hebrew uses quote marks to mark acronyms, it is recommended to use the match family of queries rather than query_string; this is the official recommendation anyway. This plugin does not currently ship with a QueryParser implementation that can power query_string queries.

Here is sample Sense / Console syntax demonstrating usage of the analyzers in this plugin:

```
GET /_hebrew/check-word/בדיקה

PUT test-hebrew
{
    "mappings": {
        "test": {
            "properties": {
                "content": {
                    "type": "text",
                    "analyzer": "hebrew"
                }
            }
        }
    }
}

PUT test-hebrew/test/1
{
    "content": "בדיקות"
}

POST test-hebrew/_search
{
    "query": {
        "match": {
            "content": "בדיקה"
        }
    }
}
```
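
The sample above searches with the analyzer configured on the field. To try one of the query-time analyzers mentioned earlier, a match query can name the analyzer explicitly. The sketch below only builds the request body; the helper name `match_query_body` is hypothetical, the field and index names come from the sample above, and the arguments are assumed to need no JSON escaping.

```shell
#!/bin/sh
# Sketch: build a match query body that overrides the search-time analyzer.
# $1 = field name, $2 = query text, $3 = analyzer name
# (for example hebrew_query, hebrew_query_light or hebrew_exact).
# ASSUMPTION: the arguments contain no quotes or backslashes, since no JSON
# escaping is performed here.
match_query_body() {
    printf '{"query":{"match":{"%s":{"query":"%s","analyzer":"%s"}}}}\n' "$1" "$2" "$3"
}

# Usage (requires a running node and the test-hebrew index from the sample):
#   match_query_body content 'בדיקה' hebrew_exact |
#     curl -s -H 'Content-Type: application/json' \
#          -XPOST 'http://localhost:9200/test-hebrew/_search' -d @-
```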

## Older Versions

Elasticsearch versions 1.4.0 - 1.7.3:

```
    bin/plugin --install analysis-hebrew --url https://bintray.com/artifact/download/synhershko/elasticsearch-analysis-hebrew/elasticsearch-analysis-hebrew-1.7.zip
```

Even older versions:

    ~/elasticsearch-0.90.11$ bin/plugin --install analysis-hebrew --url https://bintray.com/artifact/download/synhershko/elasticsearch-analysis-hebrew/elasticsearch-analysis-hebrew-1.0.zip

    ~/elasticsearch-1.0.0$ bin/plugin --install analysis-hebrew --url https://bintray.com/artifact/download/synhershko/elasticsearch-analysis-hebrew/elasticsearch-analysis-hebrew-1.2.zip

    ~/elasticsearch-1.2.1$ bin/plugin --install analysis-hebrew --url https://bintray.com/artifact/download/synhershko/elasticsearch-analysis-hebrew/elasticsearch-analysis-hebrew-1.4.zip

    ~/elasticsearch-1.3.2$ bin/plugin --install analysis-hebrew --url https://bintray.com/artifact/download/synhershko/elasticsearch-analysis-hebrew/elasticsearch-analysis-hebrew-1.5.zip

## License

AGPL3, see LICENSE