Basics of regexp processing in traps
Essentially, everything passed as SNMP variables along with an SNMP trap is examined as text. A regexp is a sequence of patterns, built from ordinary and special characters, that are tried in a known order. As soon as the regexp matches, processing stops and success is returned. There are two regexp fields in a trap definition:
- if both “accepted” and “ignored” patterns are empty (not set, the default), the trap is accepted
- if the “ignored” pattern is set and matches, the trap is ignored, regardless of whether “accepted” matches
- if the “accepted” pattern is set and matches, the trap is accepted
- otherwise, the trap is accepted if “accepted” is empty, and ignored if it is set but does not match
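The rules above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function name trap_is_accepted and its parameters are hypothetical, and Python's re module is assumed to behave closely enough to the regexp dialect used for traps.

```python
import re

def trap_is_accepted(payload: str, accepted: str = "", ignored: str = "") -> bool:
    """Hypothetical helper: apply the accepted/ignored rules to a trap payload."""
    # A set "ignored" pattern that matches wins over everything else.
    if ignored and re.search(ignored, payload):
        return False
    # With no "accepted" pattern set, the trap is accepted.
    if not accepted:
        return True
    # Otherwise the trap is accepted only if "accepted" matches.
    return re.search(accepted, payload) is not None

payload = "linkDown ifIndex=3"
print(trap_is_accepted(payload))                                       # both patterns empty
print(trap_is_accepted(payload, accepted="linkUp"))                    # "accepted" set, no match
print(trap_is_accepted(payload, accepted="link", ignored="ifIndex=3")) # "ignored" matches
```

Checking “ignored” first keeps the precedence explicit: a matching “ignored” pattern rejects the trap even when “accepted” would match too.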
Regexp structure basics
A regexp pattern consists of basic patterns tried from left to right. A basic pattern can be a single character, a group, or a character class, with an optional repetition modifier. The following characters are special in a regexp:
\ . ? + * [ ] ( | ) { } ^ $
Everything else is compared as is. So, if the regexp is
abc123
it means: match if the string abc123 is found exactly, anywhere in the variable. The special character \ (“backslash”) means: treat the character immediately following it as a non-special character. So, if you need to match the string “$123.00”, the pattern will look like
\$123\.00
Dot . means “match any single character”. So, if the pattern is
123.45
then any of the following will match it:
123.45
123-45
123+45
123x45
and so on. Square brackets [ ] define a character class: the class matches any single character from those listed within the brackets. To define ranges of characters, the minus sign is used: a minus sign between two characters means “any character in the ASCII table from the left one to the right one”. If you need to allow the minus sign itself, place it first. So,
[a-d0-9z]
will match any single character from: the range a to d (a, b, c or d), any decimal digit, and the letter z. Caret ^, if placed as the first character in a character class, means “anything but”: the class then matches any character not listed, i.e.
[^0-9]
will match any character that is not a decimal digit. Parentheses ( ) and the vertical line (“pipe”) | are used to try variants (“groups”). The variants enclosed in parentheses and separated by vertical lines are tried left to right; if any variant matches, the entire group matches. So, the pattern
(true|false|any) view
will match any one of the following:
true view
false view
any view
anywhere in the payload. Groups may be nested, and there can be an arbitrary number of groups in a regexp. Asterisk *, question mark ?, plus sign +, or comma-separated integers within braces {M,N} define repetitions:
- Asterisk * means zero or more repetitions of the pattern to its left
- Question mark ? means zero or exactly one occurrence of the pattern to its left
- Plus sign + means one or more repetitions of the pattern to its left
- Integers {M,N} in curly braces mean the pattern to the left must be repeated between M and N times (inclusive). If only the second integer is omitted ({M,}), it means “M or more times”; if the comma is omitted as well ({M}), it means “exactly M times”
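The repetition modifiers listed above can be tried out with a short Python sketch. It assumes that Python's re module treats these modifiers the same way as the regexp engine used for traps; the helper name found is hypothetical. The anchors ^ and $ pin the pattern to the whole string, so the repetition counts are visible.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if pattern matches anywhere in text."""
    return re.search(pattern, text) is not None

# * : zero or more repetitions of the pattern to its left
assert found(r"ab*c", "ac") and found(r"ab*c", "abbbc")

# ? : zero or exactly one occurrence
assert found(r"^colou?r$", "color") and found(r"^colou?r$", "colour")
assert not found(r"^colou?r$", "colouur")

# + : one or more repetitions
assert found(r"ab+c", "abc") and not found(r"ab+c", "ac")

# {M,N} : between M and N repetitions, inclusive
assert found(r"^a{2,3}$", "aa") and found(r"^a{2,3}$", "aaa")
assert not found(r"^a{2,3}$", "a") and not found(r"^a{2,3}$", "aaaa")

# {M,} : M or more repetitions
assert found(r"^a{2,}$", "aaaaa") and not found(r"^a{2,}$", "a")
```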
For example, the pattern
(ab|cd){2}[123]+
will match
abcdabab112
(the match is found in the tail abab112: two repetitions of the first group followed by three repetitions of the character class) but will not match
cdcd4
Note: the asterisk is “greedy”: it tries to match as many characters as possible, which can change what the rest of the pattern gets to match. Place a question mark immediately after the asterisk to turn “greediness” off and force the regexp to match as few characters as possible with the asterisk while still matching the entire regexp. Thus, the pattern
abc.*?def
means “match if abc is found, followed by def anywhere further”, and the match stops at the first def. Without the question mark, a typical backtracking regexp engine still finds a match, but the asterisk first consumes all characters up to the end and then backs off, so the match extends to the last def instead. Anchors, caret ^ and dollar sign $, specify the start and end of the string, respectively (thus the caret, when outside a character class, may only be placed as the first character of a regexp, and the dollar sign may only be the last). The regexp
^First
will only match if the payload starts with the string First, and
finish$
will only match if the payload ends with the string finish. If you need any of the special characters mentioned above to be treated as ordinary characters, put a backslash immediately in front of them. Note: if a regexp is malformed (inconsistent structure), it will always fail. So, concluding this short introduction, the pattern in the example above
(true|1|yes)
will only match if one of the strings true, 1 or yes is found anywhere in the payload.
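The patterns from this introduction can be checked with a small Python sketch. It assumes Python's re module is close enough to the trap regexp dialect for these basic constructs. The helper name matches is hypothetical, and returning False for a malformed pattern is only a way to mirror the “always fails” rule, not the product's actual code.

```python
import re

def matches(pattern: str, payload: str) -> bool:
    """Hypothetical helper: True if pattern is found anywhere in payload."""
    try:
        return re.search(pattern, payload) is not None
    except re.error:
        # Assumption: a malformed regexp is treated as "never matches".
        return False

# Backslash turns special characters into ordinary ones.
assert matches(r"\$123\.00", "total: $123.00")
assert not matches(r"\$123\.00", "total: $123x00")

# Dot matches any single character.
assert all(matches(r"123.45", s) for s in ("123.45", "123-45", "123+45", "123x45"))

# Character classes, including negation and a leading minus sign.
assert matches(r"^[a-d0-9z]$", "c") and not matches(r"^[a-d0-9z]$", "e")
assert matches(r"[^0-9]", "12a") and not matches(r"[^0-9]", "123")
assert matches(r"^[-0-9]+$", "-42")

# Groups with variants, and a group with a repetition count.
assert matches(r"(true|false|any) view", "show any view now")
assert not matches(r"(true|false|any) view", "some view")
assert matches(r"(ab|cd){2}[123]+", "abcdabab112")
assert not matches(r"(ab|cd){2}[123]+", "cdcd4")

# Anchors.
assert matches(r"^First", "First line") and not matches(r"^First", "The First")
assert matches(r"finish$", "to the finish") and not matches(r"finish$", "finish line")

# Non-greedy asterisk: the match ends at the first def.
assert re.search(r"abc.*?def", "abc 1 def 2 def").group() == "abc 1 def"

# The concluding pattern, plus a malformed variant (missing parenthesis).
assert matches(r"(true|1|yes)", "status=yes") and not matches(r"(true|1|yes)", "status=no")
assert not matches(r"(true|1|yes", "status=yes")
```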