This plugin enables email address tokenization.
Elasticsearch Version | Plugin Version |
---|---|
2.4.6 | 2.4.6 |
2.4.5 | 2.4.5 |
2.4.4 | 2.4.4 |
2.4.3 | 2.4.3 |
2.4.1 | 2.4.1 |
2.4.0 | 2.4.0 |
2.3.5 | 2.3.5 |
2.3.4 | 2.3.4 |
2.3.3 | 2.3.3 |
2.3.0 | 2.3.0 |
2.2.2 | 2.2.2 |
2.2.1 | 2.2.1 |
2.2.0 | 2.2.0 |
2.1.1 | 2.1.1 |
2.0.0 | 2.0.0 |
1.6.x, 1.7.x | 1.0.0 |
bin/plugin install https://github.com/jlinn/elasticsearch-analysis-email/releases/download/v2.4.6/elasticsearch-analysis-email-2.4.6.zip
part
: Defaults tonull
. If leftnull
, all email address parts will be tokenized. Options arewhole
,localpart
, anddomain
.tokenize_domain
: Defaults totrue
. Iftrue
, the domain will be further tokenized using a reverse path hierarchy tokenizer with the delimiter set to.
.split_on_plus
: Defaults totrue
. Iftrue
, the localpart of the email address will be split on the first instance of+
, and both the part preceding+
and the whole localpart will be used as tokens.split_localpart
: Defaults tonull
. This parameter expects an array of strings. If provided, the localpart will be split on each of the given strings.allow_malformed
: Defaults tofalse
. Iftrue
, malformed email addresses will not be rejected, but will be indexed without tokenization.
Index settings:
{
"settings": {
"analysis": {
"tokenizer": {
"email_domain": {
"type": "email",
"part": "domain"
}
},
"analyzer": {
"email_domain": {
"tokenizer": "email_domain"
}
}
}
}
}
Perform an analysis request:
curl 'http://localhost:9200/index_name/_analyze?analyzer=email_domain&pretty' -d 'foo+bar@email.com'
{
"tokens" : [ {
"token" : "email.com",
"start_offset" : 8,
"end_offset" : 17,
"type" : "domain",
"position" : 1
}, {
"token" : "com",
"start_offset" : 14,
"end_offset" : 17,
"type" : "domain",
"position" : 2
} ]
}