Matcher Engine

The tornado_engine_matcher crate contains the core functions of the Tornado Engine. It defines the logic for parsing Rules and Filters as well as for matching Events.

The Processing Tree

The engine logic is defined by a processing tree with three types of nodes:

  • Filter: A node that contains a filter definition and a set of child nodes
  • Iterator: A node that contains an iterator definition and a set of child nodes
  • Rule set: A leaf node that contains a set of Rules

A full example of a processing tree is:

root
  |- node_0
  |    |- rule_one
  |    \- rule_two
  |- node_1
  |    |- inner_node
  |    |    \- rule_one
  |    \- filter_two
  \- filter_one

All identifiers of the processing tree (i.e. rule names, filter names, and node names) can be composed only of letters, numbers and the "_" (underscore) character.

The configuration of the processing tree is stored on the file system as a small hierarchy of directories and JSON files. When the processing tree is loaded, the filter and rule names are automatically inferred from the filenames (excluding the .json extension), and the node names from the directory names.

In the tree above, the root node is of type Filter. In fact, it contains the definition of a filter named filter_one and has two child nodes called node_0 and node_1.

When the matcher receives an Event, it will first check if it matches the filter_one condition; if it does, the matcher will proceed to evaluate its child nodes. If, instead, the filter condition does not match, the process stops and those children are ignored.

A node's children are processed independently. Thus node_0 and node_1 will be processed in isolation and each of them will be unaware of the existence and outcome of the other one. This process logic is applied recursively to every node.

In the above processing tree, node_0 is a rule set, so when the node is processed, the matcher will evaluate an Event against each rule to determine which one matches and what Actions are generated.

On the contrary, node_1 is another Filter; in this case, the matcher will check if the event verifies the filter condition in order to decide whether to process its internal nodes.
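The traversal described above can be sketched in a few lines of Python. This is only an illustration of the recursion, not the engine's actual code: the Filter and RuleSet classes, the reached_rule_sets function, and the two filter conditions are hypothetical, Iterator nodes are left out, and each Filter node is labelled with its filter name rather than its directory name.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class RuleSet:
    """Leaf node: here it only carries rule names, for illustration."""
    name: str
    rules: List[str]


@dataclass
class Filter:
    """Inner node: a condition plus child nodes."""
    name: str
    condition: Callable[[dict], bool]
    children: List[Union["Filter", RuleSet]] = field(default_factory=list)
    active: bool = True


def reached_rule_sets(node, event):
    """Return the names of the rule sets that the event reaches."""
    if isinstance(node, RuleSet):
        return [node.name]
    if not (node.active and node.condition(event)):
        return []  # filter did not match: its children are ignored
    reached = []
    for child in node.children:  # children are visited independently
        reached.extend(reached_rule_sets(child, event))
    return reached


# The tree from the text; the two conditions are made up for the example.
tree = Filter("filter_one", lambda event: True, [
    RuleSet("node_0", ["rule_one", "rule_two"]),
    Filter("filter_two", lambda event: event["type"] == "email", [
        RuleSet("inner_node", ["rule_one"]),
    ]),
])

print(reached_rule_sets(tree, {"type": "email"}))
print(reached_rule_sets(tree, {"type": "trap"}))
```

With these made-up conditions, an event of type "email" reaches both rule sets, while any other event reaches only node_0.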

Structure of a Filter

A Filter contains these properties:

  • filter name: A string value representing a unique filter identifier. It can be composed only of letters, numbers and the "_" (underscore) character; it corresponds to the filename, stripped of its .json extension.
  • description: A string providing a high-level description of the filter.
  • active: A boolean value; if false, the filter's children will be ignored.
  • filter: A boolean operator that, when applied to an event, returns true or false. This operator determines whether an Event matches the Filter; consequently, it determines whether an Event will be processed by the filter's inner nodes.

Structure of a Rule

A Rule is composed of a set of properties, constraints and actions.

Basic Properties

  • rule name: A string value representing a unique rule identifier. It can be composed only of alphabetical characters, numbers and the "_" (underscore) character.
  • description: A string value providing a high-level description of the rule.
  • continue: A boolean value indicating whether to proceed with the event matching process if the current rule matches.
  • active: A boolean value; if false, the rule is ignored.

When the configuration is read from the file system, the rule name is automatically inferred from the filename by removing the extension and everything that precedes the first '_' (underscore) symbol. For example:

  • 0001_rule_one.json -> 0001 determines the execution order, "rule_one" is the rule name
  • 0010_rule_two.json -> 0010 determines the execution order, "rule_two" is the rule name
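As a rough illustration of this naming convention, the following Python sketch splits a filename into its ordering prefix and rule name. The function parse_rule_filename is hypothetical, and sorting the prefix as text is an assumption about how the execution order is derived.

```python
import re

# Identifiers may contain only letters, numbers and underscores.
IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def parse_rule_filename(filename):
    """Split '0001_rule_one.json' into ('0001', 'rule_one')."""
    if not filename.endswith(".json"):
        raise ValueError(f"not a JSON rule file: {filename!r}")
    stem = filename[: -len(".json")]
    order, separator, name = stem.partition("_")  # split at the first '_'
    if not separator or not IDENTIFIER.fullmatch(name):
        raise ValueError(f"cannot infer a rule name from {filename!r}")
    return order, name


filenames = ["0010_rule_two.json", "0001_rule_one.json"]
# Assumption: rules run in the textual order of their prefix.
ordered = sorted(parse_rule_filename(f) for f in filenames)
print(ordered)
```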

Constraints

The constraint section contains the tests that determine whether or not an event matches the rule. There are two types of constraints:

  • WHERE: A set of operators that when applied to an event returns true or false
  • WITH: A set of regular expressions that extract values from an Event and associate them with named variables

An event matches a rule if and only if the WHERE clause evaluates to true and all regular expressions in the WITH clause return non-empty values.

The following operators are available in the WHERE clause. Check also the examples in the remainder of this document to see how to use them.

  • 'contains': Evaluates whether the first argument contains the second one. It can be applied to strings, arrays, and maps. The operator can also be called with the alias 'contain'.
  • 'containsIgnoreCase': Evaluates whether the first argument contains, in a case-insensitive way, the string passed as second argument. This operator can also be called with the alias 'containIgnoreCase'.
  • 'equals': Compares any two values (including, but not limited to, arrays, maps) and returns whether or not they are equal. An alias for this operator is 'equal'.
  • 'equalsIgnoreCase': Compares two strings and returns whether or not they are equal in a case-insensitive way. The operator can also be called with the alias 'equalIgnoreCase'.
  • 'ge': Compares two values and returns whether the first value is greater than or equal to the second one. If one or both of the values do not exist, it returns false.
  • 'gt': Compares two values and returns whether the first value is greater than the second one. If one or both of the values do not exist, it returns false.
  • 'le': Compares two values and returns whether the first value is less than or equal to the second one. If one or both of the values do not exist, it returns false.
  • 'lt': Compares two values and returns whether the first value is less than the second one. If one or both of the values do not exist, it returns false.
  • 'ne': This is the negation of the 'equals' operator. Compares two values and returns whether or not they are different. It can also be called with the aliases 'notEquals' and 'notEqual'.
  • 'regex': Evaluates whether a field of an event matches a given regular expression.
  • 'AND': Receives an array of operator clauses and returns true if and only if all of them evaluate to true.
  • 'OR': Receives an array of operator clauses and returns true if at least one of the operators evaluates to true.
  • 'NOT': Receives one operator clause and returns true if the operator clause evaluates to false, while it returns false if the operator clause evaluates to true.

We use the Rust Regex library (see its GitHub project home page) to evaluate regular expressions provided by the WITH clause and by the regex operator. You can also refer to its dedicated documentation for details about its features and limitations.

Actions

An Action is an operation triggered when an Event matches a Rule.

Reading Event Fields

A Rule can access Event fields through the "${" and "}" delimiters. To do so, the following conventions are defined:

  • The '.' (dot) char is used to access inner fields.
  • Keys containing dots are escaped with leading and trailing double quotes.
  • Double quote chars are not accepted inside a key.

For example, given the incoming event:

{
  "type": "trap",
  "created_ms": 1554130814854,
  "payload": {
    "protocol": "UDP",
    "oids": {
      "key.with.dots": "38:10:38:30.98"
    }
  }
}

The rule can access the event's fields as follows:

  • ${event.type}: Returns trap
  • ${event.payload.protocol}: Returns UDP
  • ${event.payload.oids."key.with.dots"}: Returns 38:10:38:30.98
  • ${event.payload}: Returns the entire payload
  • ${event}: Returns the entire event
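The accessor conventions above can be mimicked with a short Python sketch. It is not Tornado's parser: the resolve function and the SEGMENT pattern are hypothetical, and raising an error for a missing field is an assumption made for the example.

```python
import re

# A path segment is either a double-quoted key or a run of characters
# that contains neither dots nor double quotes.
SEGMENT = re.compile(r'"([^"]*)"|([^."]+)')

EVENT = {
    "type": "trap",
    "created_ms": 1554130814854,
    "payload": {
        "protocol": "UDP",
        "oids": {"key.with.dots": "38:10:38:30.98"},
    },
}


def resolve(accessor, event):
    """Resolve an accessor such as '${event.payload.protocol}'."""
    if not (accessor.startswith("${") and accessor.endswith("}")):
        raise ValueError(f"not an accessor: {accessor!r}")
    keys = [quoted or bare for quoted, bare in SEGMENT.findall(accessor[2:-1])]
    if not keys or keys[0] != "event":
        raise ValueError(f"accessor must start with 'event': {accessor!r}")
    value = event
    for key in keys[1:]:
        value = value[key]  # raises KeyError when the field is missing
    return value


print(resolve("${event.type}", EVENT))
print(resolve('${event.payload.oids."key.with.dots"}', EVENT))
```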

String interpolation

An action payload can also contain text with placeholders that Tornado will replace at runtime. The values to be used for the substitution are extracted from the incoming Events following the conventions mentioned in the previous section; for example, using that Event definition, this string in the action payload

Received a ${event.type} with protocol ${event.payload.protocol}

produces:

Received a trap with protocol UDP

Note.

Only values of type String, Number, Boolean and null are valid. Consequently, the interpolation will fail, and the action will not be executed, if the value associated with the placeholder extracted from the Event is an Array, a Map, or undefined.
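A minimal Python sketch of this interpolation behaviour follows. The interpolate function is hypothetical, it only handles plain dotted keys (no quoted keys), and the way booleans and null are rendered as text is an assumption that the text above does not specify.

```python
import re

# Matches ${event}, ${event.type}, ${event.payload.protocol}, ...
PLACEHOLDER = re.compile(r"\$\{event((?:\.[A-Za-z0-9_]+)*)\}")

EVENT = {"type": "trap", "payload": {"protocol": "UDP", "oids": {}}}


def interpolate(template, event):
    """Replace every placeholder; fail on Array, Map or undefined values."""
    def substitute(match):
        value = event
        for key in match.group(1).split(".")[1:]:
            if not isinstance(value, dict) or key not in value:
                raise ValueError(f"undefined value for {match.group(0)}")
            value = value[key]
        if isinstance(value, (dict, list)):
            raise ValueError(f"{match.group(0)} is not a scalar value")
        if value is None:
            return "null"  # assumption: rendering of null
        if isinstance(value, bool):
            return "true" if value else "false"  # assumption: JSON style
        return str(value)

    return PLACEHOLDER.sub(substitute, template)


text = "Received a ${event.type} with protocol ${event.payload.protocol}"
print(interpolate(text, EVENT))
```

In this sketch a failed interpolation raises ValueError, which stands in for "the action will not be executed".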

Example of Filters

Using a Filter to Create Independent Pipelines

We can use Filters to organize coherent sets of Rules into isolated pipelines.

In this example we will see how to create two independent pipelines, one that receives only events with type 'email', and the other that receives only those with type 'trapd'.

Our configuration directory will look like this:

rules.d
  |- email
  |    |- ruleset
  |    |     |- ... (all rules about emails here)
  |    \- only_email_filter.json
  |- trapd
  |    |- ruleset
  |    |     |- ... (all rules about trapds here)
  |    \- only_trapd_filter.json
  \- filter_all.json

This processing tree has a root filter filter_all that matches all events. We have also defined two inner filters; the first, only_email_filter, only matches events of type 'email'. The other, only_trapd_filter, only matches events of type 'trapd'.

Therefore, with this configuration, the rules defined in email/ruleset receive only email events, while those in trapd/ruleset receive only trapd events.

This configuration can be further simplified by removing the filter_all.json file:

rules.d
  |- email
  |    |- ruleset
  |    |     |- ... (all rules about emails here)
  |    \- only_email_filter.json
  \- trapd
       |- ruleset
       |     |- ... (all rules about trapds here)
       \- only_trapd_filter.json

In this case, in fact, Tornado will generate an implicit filter for the root node and the runtime behavior will not change.

Below is the content of our JSON filter files.

Content of filter_all.json (if provided):

{
  "description": "This filter allows every event",
  "active": true
}

Content of only_email_filter.json:

{
  "description": "This filter allows events of type 'email'",
  "active": true,
  "filter": {
    "type": "equals",
    "first": "${event.type}",
    "second": "email"
  }
}

Content of only_trapd_filter.json:

{
  "description": "This filter allows events of type 'trapd'",
  "active": true,
  "filter": {
    "type": "equals",
    "first": "${event.type}",
    "second": "trapd"
  }
}

Examples of Rules and operators

The 'contains' Operator

The contains operator is used to check whether the first argument contains the second one.

It applies in three different situations:

  • The arguments are both strings: Returns true if the second string is a substring of the first one.
  • The first argument is an array: Returns true if the second argument is contained in the array.
  • The first argument is a map and the second is a string: Returns true if the second argument is an existing key in the map.

In any other case, it will return false.
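The three situations can be summarized with a small Python sketch. The contains function below is hypothetical and only mirrors the description above; Python lists and dicts stand in for arrays and maps.

```python
def contains(first, second):
    """Mirror the three documented cases of the 'contains' operator."""
    if isinstance(first, str) and isinstance(second, str):
        return second in first  # substring check
    if isinstance(first, list):
        return second in first  # element of the array
    if isinstance(first, dict) and isinstance(second, str):
        return second in first  # existing key of the map
    return False  # any other combination


print(contains("linux-server-01", "linux"))
print(contains(["linux", "windows"], "linux"))
print(contains({"hostname": "linux-server-01"}, "hostname"))
print(contains(42, "4"))
```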

Rule example:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "contains",
      "first": "${event.payload.hostname}",
      "second": "linux"
    },
    "WITH": {}
  },
  "actions": []
}

An event matches this rule if its payload has an entry with key hostname whose value is a string that contains linux.

A matching Event is:

{
  "type": "trap",
  "created_ms": 1554130814854,
  "payload": {
    "hostname": "linux-server-01"
  }
}

The 'containsIgnoreCase' Operator

The containsIgnoreCase operator is used to check whether the first argument contains the string passed as second argument, comparing them in a case-insensitive way.

It applies in three different situations:

  • The arguments are both strings: Returns true if the second string is a case-insensitive substring of the first one
  • The first argument is an array: Returns true if the array passed as first parameter contains a (string) element which is equal to the string passed as second argument, regardless of uppercase and lowercase letters
  • The first argument is a map: Returns true if the second argument matches an existing key of the map, ignoring case

In any other case, this operator will return false.
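A hedged Python sketch of the same logic follows. The contains_ignore_case function is hypothetical, and using str.lower() for the comparison is an assumption; the exact case-folding rules applied by the engine are not stated here.

```python
def contains_ignore_case(first, second):
    """Mirror the three documented cases of 'containsIgnoreCase'."""
    if not isinstance(second, str):
        return False
    needle = second.lower()
    if isinstance(first, str):
        return needle in first.lower()  # case-insensitive substring
    if isinstance(first, list):
        # only string elements can be equal to the second argument
        return any(isinstance(item, str) and item.lower() == needle
                   for item in first)
    if isinstance(first, dict):
        return any(isinstance(key, str) and key.lower() == needle
                   for key in first)
    return False


print(contains_ignore_case("LINUX-server-01", "Linux"))
print(contains_ignore_case(["Windows", "LiNuX"], "linux"))
print(contains_ignore_case({"HostName": "srv"}, "hostname"))
```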

Rule example:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "containsIgnoreCase",
      "first": "${event.payload.hostname}",
      "second": "Linux"
    },
    "WITH": {}
  },
  "actions": []
}

An event matches this rule if its payload has an entry with key "hostname" whose value is a string that contains "linux", ignoring the case of the strings.

A matching Event is:

{
  "type": "trap",
  "created_ms": 1554130814854,
  "payload": {
    "hostname": "LINUX-server-01"
  }
}

Additional values for hostname that match the rule include: linuX-SERVER-02, LInux-Host-12, Old-LiNuX-FileServer, and so on.

The 'equals', 'ge', 'gt', 'le', 'lt' and 'ne' Operators

The equals, ge, gt, le, lt, ne operators are used to compare two values.

All these operators can work with values of type Number, String, Bool, null and Array.

Warning!

Please be extremely careful when using these operators with numbers of type float. The representation of floating point numbers is often slightly imprecise and can lead to unexpected results (for example, see: https://www.floating-point-gui.de/errors/comparison/).

Example:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "OR",
      "operators": [
        {
          "type": "equals",
          "first": "${event.payload.value}",
          "second": 1000
        },
        {
          "type": "AND",
          "operators": [
            {
              "type": "ge",
              "first": "${event.payload.value}",
              "second": 100
            },
            {
              "type": "le",
              "first": "${event.payload.value}",
              "second": 200
            },
            {
              "type": "ne",
              "first": "${event.payload.value}",
              "second": 150
            },
            {
              "type": "notEquals",
              "first": "${event.payload.value}",
              "second": 160
            }
          ]
        },
        {
          "type": "lt",
          "first": "${event.payload.value}",
          "second": 0
        },
        {
          "type": "gt",
          "first": "${event.payload.value}",
          "second": 2000
        }
      ]
    },
    "WITH": {}
  },
  "actions": []
}

An event matches this rule if event.payload.value exists and one or more of the following conditions hold:

  • It is equal to 1000
  • It is between 100 (inclusive) and 200 (inclusive), but not equal to 150 or to 160
  • It is less than 0 (exclusive)
  • It is greater than 2000 (exclusive)

A matching Event is:

{
  "type": "email",
  "created_ms": 1554130814854,
  "payload": {
    "value": 110
  }
}

Here are some examples showing how these operators behave:

  • [{"id":557}, {"one":"two"}] lt 3: false (cannot compare different types, e.g. here the first is an array and the second is a number)
  • {id: "one"} lt {id: "two"}: false (maps cannot be compared)
  • [["id",557], ["one"]] gt [["id",555], ["two"]]: true (elements in the array are compared recursively from left to right: so here "id" is first compared to "id", then 557 to 555, returning true before attempting to match "one" and "two")
  • [["id",557]] gt [["id",555], ["two"]]: true (elements are compared even if the length of the arrays is not the same)
  • true gt false: true (the value 'true' is evaluated as 1, and the value 'false' as 0; consequently, the expression is equivalent to "1 gt 0" which is true)
  • "twelve" gt "two": false (strings are compared lexically, and 'e' comes before 'o', not after it)

The 'equalsIgnoreCase' Operator

The equalsIgnoreCase operator is used to check whether the strings passed as arguments are equal in a case-insensitive way.

It applies only if both the first and the second arguments are strings. In any other case, the operator will return false.

Rule example:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "equalsIgnoreCase",
      "first": "${event.payload.hostname}",
      "second": "Linux"
    },
    "WITH": {}
  },
  "actions": []
}

An event matches this rule if its payload has an entry with key "hostname" whose value is a string that is equal to "linux", ignoring the case of the strings.

A matching Event is:

{
  "type": "trap",
  "created_ms": 1554130814854,
  "payload": {
    "hostname": "LINUX"
  }
}

The 'regex' Operator

The regex operator is used to check if a string matches a regular expression. The evaluation is performed with the Rust Regex library (see its GitHub project home page).

Rule example:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "regex",
      "regex": "[a-fA-F0-9]",
      "target": "${event.type}"
    },
    "WITH": {}
  },
  "actions": []
}

An event matches this rule if its type matches the regular expression [a-fA-F0-9].

A matching Event is:

{
  "type": "trap0",
  "created_ms": 1554130814854,
  "payload": {}
}

The 'AND', 'OR', and 'NOT' Operators

The AND and OR operators work on a set of operators, while the NOT operator works on one single operator. They can be nested recursively to define complex matching rules.

As you would expect:

  • The AND operator evaluates to true if all inner operators match
  • The OR operator evaluates to true if at least one inner operator matches
  • The NOT operator evaluates to true if the inner operator does not match, and evaluates to false if the inner operator matches

Example:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "AND",
      "operators": [
        {
          "type": "equals",
          "first": "${event.type}",
          "second": "rsyslog"
        },
        {
          "type": "OR",
          "operators": [
            {
              "type": "equals",
              "first": "${event.payload.body}",
              "second": "something"
            },
            {
              "type": "equals",
              "first": "${event.payload.body}",
              "second": "other"
            }
          ]
        },
        {
          "type": "NOT",
          "operator": {
            "type": "equals",
            "first": "${event.payload.body}",
            "second": "forbidden"
          }
        }
      ]
    },
    "WITH": {}
  },
  "actions": []
}

An event matches this rule if:

  • Its type is "rsyslog"
  • AND its payload has an entry with key body whose value is either "something" OR "other"
  • AND the value of the body entry is NOT "forbidden"

A matching Event is:

{
  "type": "rsyslog",
  "created_ms": 1554130814854,
  "payload": {
    "body": "other"
  }
}
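The recursive nature of these operators can be illustrated with a small Python evaluator. It is a sketch under stated assumptions, not the engine's code: the evaluate and lookup functions are hypothetical, only 'equals' is implemented as a leaf operator, and a missing field is treated as None.

```python
def lookup(value, event):
    """Resolve '${event.a.b}' accessors; any other value is a literal."""
    if isinstance(value, str) and value.startswith("${event") and value.endswith("}"):
        current = event
        for key in value[2:-1].split(".")[1:]:
            if not isinstance(current, dict) or key not in current:
                return None  # assumption: a missing field resolves to None
            current = current[key]
        return current
    return value


def evaluate(operator, event):
    """Recursively evaluate an AND / OR / NOT / equals operator tree."""
    kind = operator["type"]
    if kind == "AND":
        return all(evaluate(inner, event) for inner in operator["operators"])
    if kind == "OR":
        return any(evaluate(inner, event) for inner in operator["operators"])
    if kind == "NOT":
        return not evaluate(operator["operator"], event)
    if kind == "equals":
        return lookup(operator["first"], event) == lookup(operator["second"], event)
    raise ValueError(f"operator not covered by this sketch: {kind}")


def body_is(text):
    return {"type": "equals", "first": "${event.payload.body}", "second": text}


WHERE = {"type": "AND", "operators": [
    {"type": "equals", "first": "${event.type}", "second": "rsyslog"},
    {"type": "OR", "operators": [body_is("something"), body_is("other")]},
    {"type": "NOT", "operator": body_is("forbidden")},
]}

print(evaluate(WHERE, {"type": "rsyslog", "payload": {"body": "other"}}))
```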

A 'Match all Events' Rule

If the WHERE clause is not specified, the Rule evaluates to true for each incoming event.

For example, this Rule generates an "archive" Action for each Event:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WITH": {}
  },
  "actions": [
    {
      "id": "archive",
      "payload": {
        "event": "${event}",
        "archive_type": "one"
      }
    }
  ]
}

The 'WITH' Clause

The WITH clause generates variables extracted from the Event based on regular expressions. These variables can then be used to populate an Action payload.

All variables declared by a Rule must be resolved, or else the Rule will not be matched.

Two simple rules restrict the access and use of the extracted variables:

  1. Because they are evaluated after the WHERE clause is parsed, any extracted variables declared inside the WITH clause are not accessible by the WHERE clause of the very same rule
  2. A rule can use extracted variables declared by other rules, even in its WHERE clause, provided that:
    • The two rules must belong to the same rule set
    • The rule attempting to use those variables should be executed after the one that declares them
    • The rule that declares the variables should also match the event

The syntax for accessing an extracted variable has the form:

_variables.[RULE_NAME.]VARIABLE_NAME

If the RULE_NAME is omitted, the current rule name is automatically selected.

Example:

{
  "description": "",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "equals",
      "first": "${event.type}",
      "second": "trap"
    },
    "WITH": {
      "sensor_description": {
        "from": "${event.payload.line_5}",
        "regex": {
          "match": "(.*)",
          "group_match_idx": 0
        }
      },
      "sensor_room": {
        "from": "${event.payload.line_6}",
        "regex": {
          "match": "(.*)",
          "group_match_idx": 0
        }
      }
    }
  },
  "actions": [
    {
      "id": "nagios",
      "payload": {
        "host": "bz-outsideserverroom-sensors",
        "service": "motion_sensor_port_4",
        "status": "Critical",
        "host_ip": "${event.payload.host_ip}",
        "room": "${_variables.sensor_room}",
        "message": "${_variables.sensor_description}"
      }
    }
  ]
}

This Rule matches only if the Event's type is "trap" and it is possible to extract the two variables "sensor_description" and "sensor_room" defined in the WITH clause.

An Event that matches this Rule is:

{
  "type": "trap",
  "created_ms": 1554130814854,
  "payload": {
    "host_ip": "10.65.5.31",
    "line_1": "netsensor-outside-serverroom.wp.lan",
    "line_2": "UDP: [10.62.5.31]:161->[10.62.5.115]",
    "line_3": "DISMAN-EVENT-MIB::sysUpTimeInstance 38:10:38:30.98",
    "line_4": "SNMPv2-MIB::snmpTrapOID.0 SNMPv2-SMI::enterprises.14848.0.5",
    "line_5": "SNMPv2-SMI::enterprises.14848.2.1.1.7.0 38:10:38:30.98",
    "line_6": "SNMPv2-SMI::enterprises.14848.2.1.1.2.0 \"Outside Server Room\""
  }
}

It will generate this Action:

{
  "id": "nagios",
  "payload": {
    "host": "bz-outsideserverroom-sensors",
    "service": "motion_sensor_port_4",
    "status": "Critical",
    "host_ip": "10.65.5.31",
    "room": "SNMPv2-SMI::enterprises.14848.2.1.1.2.0 \"Outside Server Room\"",
    "message": "SNMPv2-SMI::enterprises.14848.2.1.1.7.0 38:10:38:30.98"
  }
}

The 'WITH' Clause - Configuration details

As already seen in the previous section, the WITH clause generates variables extracted from the Event using regular expressions. There are multiple ways of configuring those regexes to obtain the desired result.

Common entries to all configurations:

  • from: An expression that determines to which value to apply the extractor regex;
  • modifiers_post: A list of String modifiers to post-process the extracted value. See following section for additional details.

In addition, three parameters combined will define the behavior of an extractor:

  • all_matches: whether the regex will loop through all the matches or only the first one will be considered. Accepted values are true and false. If omitted, it defaults to false

  • match, named_match or single_key_match: a string value representing the regex to be executed. In detail:

    • match is used in case of an index-based regex,
    • named_match is used when named groups are present.
    • single_key_match is used to search in a map for a key that matches the regex. In case of a match, the extracted variable will be the value of the map associated with that key that matched the regex. This match will fail if more than one key matches the defined regex.

    Note that all these values are mutually exclusive.

  • group_match_idx: valid only in case of an index-based regex. It is a non-negative numeric value that indicates which group of the match has to be extracted; index 0 is the full match. If omitted, an array with all groups is returned.

To show how they work and what output they produce, from now on we'll use this hypothetical email body as input:

A critical event has been received:

STATUS: CRITICAL HOSTNAME: MYVALUE2 SERVICENAME: MYVALUE3
STATUS: OK HOSTNAME: MYHOST SERVICENAME: MYVALUE41231

Our objective is to extract from it information about the host status and name, and the service name. We show how using different extractors leads to different results.

Option 1

{
  "server_info": {
    "from": "${event.payload.email.body}",
    "regex": {
      "all_matches": false,
      "match": "STATUS:\\s+(.*)\\s+HOSTNAME:\\s+(.*)SERVICENAME:\\s+(.*)",
      "group_match_idx": 1
    }
  }
}

This extractor:

  • processes only the first match because all_matches is false
  • uses an index-based regex specified by match
  • returns the group of index 1

In this case the output will be the string "CRITICAL".

Please note that, if the group_match_idx was 0, it would have returned "STATUS: CRITICAL HOSTNAME: MYVALUE2 SERVICENAME: MYVALUE3" as in any regex the group with index 0 always represents the full match.

Option 2

{
  "server_info": {
    "from": "${event.payload.email.body}",
    "regex": {
      "all_matches": false,
      "match": "STATUS:\\s+(.*)\\s+HOSTNAME:\\s+(.*)SERVICENAME:\\s+(.*)"
    }
  }
}

This extractor:

  • processes only the first match because all_matches is false
  • uses an index-based regex specified by match
  • returns an array with all groups of the match because group_match_idx is omitted.

In this case the output will be an array of strings:

[
  "STATUS: CRITICAL HOSTNAME: MYVALUE2 SERVICENAME: MYVALUE3",
  "CRITICAL",
  "MYVALUE2",
  "MYVALUE3"
]

Option 3

{
  "server_info": {
    "from": "${event.payload.email.body}",
    "regex": {
      "all_matches": true,
      "match": "STATUS:\\s+(.*)\\s+HOSTNAME:\\s+(.*)SERVICENAME:\\s+(.*)",
      "group_match_idx": 2
    }
  }
}

This extractor:

  • processes all matches because all_matches is true
  • uses an index-based regex specified by match
  • for each match, returns the group of index 2

In this case the output will be an array of strings:

[
  "MYVALUE2", <-- group of index 2 of the first match
  "MYHOST"    <-- group of index 2 of the second match
]

Option 4

{
  "server_info": {
    "from": "${event.payload.email.body}",
    "regex": {
      "all_matches": true,
      "match": "STATUS:\\s+(.*)\\s+HOSTNAME:\\s+(.*)SERVICENAME:\\s+(.*)"
    }
  }
}

This extractor:

  • processes all matches because all_matches is true
  • uses an index-based regex specified by match
  • for each match, returns an array with all groups of the match because group_match_idx is omitted.

In this case the output will be an array of arrays of strings:

[
  [
    "STATUS: CRITICAL HOSTNAME: MYVALUE2 SERVICENAME: MYVALUE3",
    "CRITICAL",
    "MYVALUE2",
    "MYVALUE3"
  ],
  [
    "STATUS: OK HOSTNAME: MYHOST SERVICENAME: MYVALUE41231",
    "OK",
    "MYHOST",
    "MYVALUE41231"
  ]
]

The inner array, in position 0, contains all the groups of the first match while the one in position 1 contains the groups of the second match.
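The four index-based options can be reproduced with a short Python sketch. It uses Python's re module as a stand-in for the Rust Regex library, so it only approximates the engine; the extract function and its parameters are hypothetical names that mirror all_matches and group_match_idx.

```python
import re

BODY = (
    "A critical event has been received:\n"
    "\n"
    "STATUS: CRITICAL HOSTNAME: MYVALUE2 SERVICENAME: MYVALUE3\n"
    "STATUS: OK HOSTNAME: MYHOST SERVICENAME: MYVALUE41231\n"
)
PATTERN = r"STATUS:\s+(.*)\s+HOSTNAME:\s+(.*)SERVICENAME:\s+(.*)"


def extract(text, pattern, all_matches=False, group_match_idx=None):
    """Return one group or all groups, for the first match or for every match."""
    def pick(match):
        if group_match_idx is None:
            # group 0 (the full match) followed by the capture groups
            return [match.group(0), *match.groups()]
        return match.group(group_match_idx)

    matches = list(re.finditer(pattern, text))
    if not matches:
        return None  # nothing extracted: the rule would not match
    return [pick(m) for m in matches] if all_matches else pick(matches[0])


print(extract(BODY, PATTERN, group_match_idx=1))                    # Option 1
print(extract(BODY, PATTERN))                                       # Option 2
print(extract(BODY, PATTERN, all_matches=True, group_match_idx=1))  # like Option 3
print(extract(BODY, PATTERN, all_matches=True))                     # Option 4
```

Note that, at least with Python's re module, the second group of this pattern also captures the space that precedes SERVICENAME, because the pattern has no \s+ in front of it.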

Option 5

{
  "server_info": {
    "from": "${event.payload.email.body}",
    "regex": {
      "named_match": "STATUS:\\s+(?P<STATUS>.*)\\s+HOSTNAME:\\s+(?P<HOSTNAME>.*)SERVICENAME:\\s+(?P<SERVICENAME>.*)"
    }
  }
}

This extractor:

  • processes only the first match because all_matches is omitted
  • uses a regex with named groups specified by named_match

In this case the output is an object where the group names are the property keys:

{
  "STATUS": "CRITICAL",
  "HOSTNAME": "MYVALUE2",
  "SERVICENAME": "MYVALUE3"
}

Option 6

{
  "server_info": {
    "from": "${event.payload.email.body}",
    "regex": {
      "all_matches": true,
      "named_match": "STATUS:\\s+(?P<STATUS>.*)\\s+HOSTNAME:\\s+(?P<HOSTNAME>.*)SERVICENAME:\\s+(?P<SERVICENAME>.*)"
    }
  }
}

This extractor:

  • processes all matches because all_matches is true
  • uses a regex with named groups specified by named_match

In this case the output is an array that contains one object for each match:

[
  {
    "STATUS": "CRITICAL",
    "HOSTNAME": "MYVALUE2",
    "SERVICENAME": "MYVALUE3"
  },
  {
    "STATUS": "OK",
    "HOSTNAME": "MYHOST",
    "SERVICENAME": "MYVALUE41231"
  }
]
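The named-group variants (Options 5 and 6) can be sketched in the same way. Again, Python's re module stands in for the Rust Regex library, and the extract_named function is a hypothetical name.

```python
import re

BODY = (
    "A critical event has been received:\n"
    "\n"
    "STATUS: CRITICAL HOSTNAME: MYVALUE2 SERVICENAME: MYVALUE3\n"
    "STATUS: OK HOSTNAME: MYHOST SERVICENAME: MYVALUE41231\n"
)
NAMED = (
    r"STATUS:\s+(?P<STATUS>.*)\s+HOSTNAME:\s+(?P<HOSTNAME>.*)"
    r"SERVICENAME:\s+(?P<SERVICENAME>.*)"
)


def extract_named(text, pattern, all_matches=False):
    """Return one object per match, keyed by the group names."""
    objects = [match.groupdict() for match in re.finditer(pattern, text)]
    if not objects:
        return None  # nothing extracted: the rule would not match
    return objects if all_matches else objects[0]


print(extract_named(BODY, NAMED))                    # Option 5
print(extract_named(BODY, NAMED, all_matches=True))  # Option 6
```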

The 'WITH' Clause - Post Modifiers

The WITH clause can include a list of String modifiers to post-process the extracted value. The available modifiers are:

  • Lowercase: it converts the resulting String to lower case. Syntax:
       {
           "type": "Lowercase"
       }
  • Map: it maps a string to another string value. Syntax:
      {
            "type": "Map",
            "mapping": {
              "Critical": "2",
              "Warning": "1",
              "Clear": "0",
              "Major": "2",
              "Minor": "1"
            },
            "default_value": "3"
      }
    The default_value is optional; when provided, it is used to map values that do not have a corresponding key in the mapping field. When not provided, the extractor will fail if a specific mapping is not found.
  • ReplaceAll: it returns a new string with all matches of a substring replaced by the new text; the find property is parsed as a regex if is_regex is true, otherwise it is evaluated as a static string. Syntax:
       {
           "type": "ReplaceAll",
           "find": "the string to be found",
           "replace": "to be replaced with",
           "is_regex": false 
       }
    In addition, when is_regex is true, it is possible to interpolate the regex captured groups in the replace string, using the $<position> syntax, for example:
     {
         "type": "ReplaceAll",
         "find": "(?P<lastname>[^,\\s]+),\\s+(?P<firstname>\\S+)",
         "replace": "firstname: $2, lastname: $1",
         "is_regex": true 
     }
    Valid forms of the replace field are:
    • extract from event: ${event.payload.hostname_ext}
    • use named groups from regex: $digits and other
    • use group positions from regex: $1 and other
  • ToNumber: it transforms the resulting String into a number. Syntax:
       {
           "type": "ToNumber"
       }
  • Trim: it trims the resulting String. Syntax:
       {
           "type": "Trim"
       }
  • DateAndTime: it converts a timestamp (autodetecting whether it is in seconds, milliseconds or nanoseconds) to an RFC3339 standard datetime. For example, the timestamp 1698933188760, with the Europe/Rome timezone, becomes the string 2023-11-02 14:53:08+01:00. Syntax:
       {
           "type": "DateAndTime",
           "timezone": "Europe/Rome" 
       }

A full example of a WITH clause using modifiers is:

{
  "server_info": {
    "from": "${event.payload.email.body}",
    "regex": {
      "all_matches": false,
      "match": "STATUS:\\s+(.*)\\s+HOSTNAME:\\s+(.*)SERVICENAME:\\s+(.*)",
      "group_match_idx": 1
    },
    "modifiers_post": [
      {
        "type": "Lowercase"
      },
      {
        "type": "ReplaceAll",
        "find": "to be found",
        "replace": "to be replaced with",
        "is_regex": false
      },
      {
        "type": "Trim"
      }
    ]
  }
}

This extractor has three modifiers that will be applied to the extracted value. The modifiers are applied in the order they are declared, so the extracted string will first be converted to lowercase, then have some text replaced, and finally be trimmed.
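To make the ordering concrete, here is a minimal Python sketch of a modifier chain. The apply_modifiers function is hypothetical and covers only Lowercase, Trim, Map and the static-string form of ReplaceAll; it is meant to show why the declaration order matters, not to reproduce the engine.

```python
def apply_modifier(value, modifier):
    """Apply one post modifier to an extracted string."""
    kind = modifier["type"]
    if kind == "Lowercase":
        return value.lower()
    if kind == "Trim":
        return value.strip()
    if kind == "Map":
        if value in modifier["mapping"]:
            return modifier["mapping"][value]
        if "default_value" in modifier:
            return modifier["default_value"]
        raise ValueError(f"no mapping found for {value!r}")  # extractor fails
    if kind == "ReplaceAll" and not modifier.get("is_regex", False):
        return value.replace(modifier["find"], modifier["replace"])
    raise ValueError(f"modifier not covered by this sketch: {kind}")


def apply_modifiers(value, modifiers):
    """Apply the modifiers in the order in which they are declared."""
    for modifier in modifiers:
        value = apply_modifier(value, modifier)
    return value


SEVERITY = {"type": "Map", "mapping": {"Critical": "2", "Warning": "1"},
            "default_value": "3"}

print(apply_modifiers("Critical", [SEVERITY]))
# Lowercasing first means the "Critical" key no longer matches.
print(apply_modifiers("Critical", [{"type": "Lowercase"}, SEVERITY]))
```

In the second call the value is lowercased before the Map modifier runs, so the mapping falls back to the default value.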

Complete Rule Example 1

An example of a valid Rule in a JSON file is:

{
  "description": "This matches all emails containing a temperature measurement.",
  "continue": true,
  "active": true,
  "constraint": {
    "WHERE": {
      "type": "AND",
      "operators": [
        {
          "type": "equals",
          "first": "${event.type}",
          "second": "email"
        }
      ]
    },
    "WITH": {
      "temperature": {
        "from": "${event.payload.body}",
        "regex": {
          "match": "[0-9]+\\sDegrees",
          "group_match_idx": 0
        }
      }
    }
  },
  "actions": [
    {
      "id": "Logger",
      "payload": {
        "type": "${event.type}",
        "subject": "${event.payload.subject}",
        "temperature": "The temperature is: ${_variables.temperature} degrees"
      }
    }
  ]
}

This creates a Rule with the following characteristics:

  • Its unique name is 'emails_with_temperature', which Tornado infers from the name of the file that contains the rule. There cannot be two rules with the same name.
  • An Event matches this Rule if, as specified in the WHERE clause, it has type "email", and as requested by the WITH clause, it is possible to extract the "temperature" variable from the "event.payload.body" with a non-null value.
  • If an Event meets the previously stated requirements, the matcher produces an Action with id "Logger" and a payload with the three entries type, subject and temperature.