Module rspamd_regexp
Rspamd regexp is a utility module that handles Rspamd's Perl-compatible regular expressions.
Example:
local rspamd_regexp = require "rspamd_regexp"
local re = rspamd_regexp.create_cached('/^\\s*some_string\\s*$/i')
re:match('some_string')
local re = rspamd_regexp.create_cached('/\\s+/i')
re:split('word word word') -- returns {'word', 'word', 'word'}
Brief content:
Functions:
Function | Description |
---|---|
rspamd_regexp.create(pattern[, flags]) | Creates new rspamd_regexp. |
rspamd_regexp.import_glob(glob_pattern[, flags]) | Creates new rspamd_regexp from glob. |
rspamd_regexp.import_plain(plain_string[, flags]) | Creates new rspamd_regexp from plain string (escaping specials). |
rspamd_regexp.get_cached(pattern) | Gets a cached, pre-compiled regexp created by either create or create_cached. |
rspamd_regexp.create_cached(pattern[, flags]) | Similar to create, but searches for the regexp in the cache first. |
Methods:
Method | Description |
---|---|
re:get_pattern() | Get a pattern for specified regexp object. |
re:set_limit(lim) | Set maximum size of text length to be matched with this regexp (if lim is less than or equal to zero, all texts are checked). |
re:set_max_hits(lim) | Set maximum number of hits returned by a regexp. |
re:get_max_hits() | Get maximum number of hits returned by a regexp. |
re:search(line[, raw[, capture]]) | Search line in regular expression object. |
re:match(line[, raw_match]) | Matches line against the regular expression and returns true if line matches. |
re:matchn(line, max_matches[, raw_match]) | Matches line against the regular expression and returns the number of matches if line matches. |
re:split(line) | Split line using the specified regular expression. |
re:destroy() | Destroy regexp from caches if needed (the pointer is removed by garbage collector). |
Functions
The module rspamd_regexp defines the following functions.
Function rspamd_regexp.create(pattern[, flags])
Creates new rspamd_regexp
Parameters:
pattern {string}
: pattern to build regexp. If this pattern is enclosed in //, it is possible to specify flags after it
flags {string}
: optional flags to create regular expression
Returns:
{regexp}
: regexp argument that is not automatically destroyed
Example:
local regexp = require "rspamd_regexp"
local re = regexp.create('/^test.*[0-9]\\s*$/i')
Function rspamd_regexp.import_glob(glob_pattern[, flags])
Creates new rspamd_regexp from glob
Parameters:
glob_pattern {string}
: glob pattern to build regexp
flags {string}
: optional flags to create regular expression
Returns:
{regexp}
: regexp argument that is not automatically destroyed
Example:
local regexp = require "rspamd_regexp"
local re = regexp.import_glob('ab*', 'i')
Function rspamd_regexp.import_plain(plain_string[, flags])
Creates new rspamd_regexp from plain string (escaping specials)
Parameters:
plain_string {string}
: plain string to build regexp (special characters are escaped)
flags {string}
: optional flags to create regular expression
Returns:
{regexp}
: regexp argument that is not automatically destroyed
Example:
local regexp = require "rspamd_regexp"
local re = regexp.import_plain('exact_string_with*', 'i')
Function rspamd_regexp.get_cached(pattern)
This function gets a cached and pre-compiled regexp created by either the create or create_cached function. If no cached regexp is found then nil is returned.
Parameters:
pattern {string}
: regexp pattern
Returns:
{regexp}
: cached regexp structure or nil
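As a minimal sketch of the lookup described above, the snippet below checks the cache and compiles the expression only on a miss. The pattern is a hypothetical value chosen for illustration, and the code assumes it runs inside Rspamd's Lua environment, where rspamd_regexp is available.

```lua
local rspamd_regexp = require "rspamd_regexp"

-- Hypothetical pattern, used only for illustration
local pattern = '/^list-unsubscribe$/i'

-- get_cached returns nil when no regexp with this pattern is cached
local re = rspamd_regexp.get_cached(pattern)
if not re then
  -- Cache miss: compile the pattern and store it in the cache
  re = rspamd_regexp.create_cached(pattern)
end

-- A later lookup with the same pattern string should now find the cached object
local again = rspamd_regexp.get_cached(pattern)
```

Since create_cached already searches the cache first, the explicit get_cached call is mainly useful when a cache miss should be handled differently from a hit.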
Function rspamd_regexp.create_cached(pattern[, flags])
This function is similar to create, but it tries to search for regexp in the cache first.
Parameters:
pattern {string}
: pattern to build regexp. If this pattern is enclosed in //, it is possible to specify flags after it
flags {string}
: optional flags to create regular expression
Returns:
{regexp}
: regexp argument that is not automatically destroyed
Example:
local regexp = require "rspamd_regexp"
local re = regexp.create_cached('/^test.*[0-9]\\s*$/i')
...
-- This doesn't create new regexp object
local other_re = regexp.create_cached('/^test.*[0-9]\\s*$/i')
Methods
The module rspamd_regexp defines the following methods.
Method re:get_pattern()
Get a pattern for specified regexp object
Parameters:
No parameters
Returns:
{string}
: pattern line
Method re:set_limit(lim)
Set maximum size of text length to be matched with this regexp (if lim is less than or equal to zero, then all texts are checked)
Parameters:
lim {number}
: limit in bytes
Returns:
No return
Method re:set_max_hits(lim)
Set maximum number of hits returned by a regexp
Parameters:
lim {number}
: limit in hits count
Returns:
{number}
: old number of max hits
Method re:get_max_hits()
Get maximum number of hits returned by a regexp
Parameters:
No parameters
Returns:
{number}
: number of max hits
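The two hit-limit methods can be combined as in the following sketch, which changes the limit temporarily and then restores it. The pattern and the limit of 5 are arbitrary illustration values, and the code assumes Rspamd's Lua environment.

```lua
local rspamd_regexp = require "rspamd_regexp"

-- Arbitrary pattern for illustration
local re = rspamd_regexp.create('/[0-9]+/')

-- set_max_hits returns the previous limit, so it can be restored later
local old_limit = re:set_max_hits(5)
local current = re:get_max_hits() -- the limit that was just set

-- Restore the previous limit
re:set_max_hits(old_limit)
```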
Method re:search(line[, raw[, capture]])
Search line in regular expression object. If line matches then this function returns the table of captured strings. Otherwise, nil is returned. If raw is specified, then input is treated as raw data not encoded in UTF-8. If capture is true, then this function saves all captures to the table of values, so the first element is the whole matched string and the subsequent elements are ordered captures defined within pattern.
Parameters:
line {string}
: match the specified line against regexp object
raw {bool}
: raw regexp instead of utf8 one
capture {bool}
: perform subpatterns capturing
Returns:
{table or nil}
: table of strings or tables (if capture is true) or nil if not matched
Example:
local regexp = require "rspamd_regexp"
local re = regexp.create_cached('/^\\s*([0-9]+)\\s*$/')
-- returns nil
local m1 = re:search('blah')
local m2 = re:search(' 190 ')
-- prints ' 190 '
print(m2[1])
-- capture is true, so each match is a table: whole match first, then captures
local m3 = re:search(' 100500 ', false, true)
-- prints ' 100500 '
print(m3[1][1])
-- prints '100500' capture
print(m3[1][2])
Method re:match(line[, raw_match])
Matches line against the regular expression and returns true if line matches (partially or completely)
Parameters:
line {string}
: match the specified line against regexp object
raw_match {bool}
: raw regexp instead of utf8 one
Returns:
{bool}
: true if line matches
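The difference between a partial and a complete match can be illustrated with the following sketch. The patterns and the input line are hypothetical, and the code assumes Rspamd's Lua environment.

```lua
local rspamd_regexp = require "rspamd_regexp"

-- Unanchored pattern: a match anywhere in the line is enough
local loose = rspamd_regexp.create('/viagra/i')
-- Anchored pattern: the whole line has to match
local strict = rspamd_regexp.create('/^viagra$/i')

local line = 'Buy VIAGRA now'
local partial_ok = loose:match(line) -- partial, case-insensitive match
local full_ok = strict:match(line)   -- the anchors prevent a match here

if partial_ok and not full_ok then
  print('line contains the word but is not equal to it')
end
```

Anchoring the pattern with ^ and $ is the way to require a complete match, since re:match itself accepts partial ones.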
Method re:matchn(line, max_matches[, raw_match])
Matches line against the regular expression and returns the number of matches if line matches (partially or completely). This process stops when max_matches is reached. If max_matches is zero, then only a single match is counted, which is equivalent to re:match. If max_matches is negative, then all matches are considered.
Parameters:
line {string}
: match the specified line against regexp object
max_matches {number}
: maximum number of matches
raw_match {bool}
: raw regexp instead of utf8 one
Returns:
{number}
: number of matches found in the line argument
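The three max_matches modes described above can be compared side by side. This is a sketch with a hypothetical pattern and input, assuming Rspamd's Lua environment; the counts in the comments follow from the description of max_matches rather than from a recorded run.

```lua
local rspamd_regexp = require "rspamd_regexp"

-- Arbitrary pattern and input for illustration: three digit runs in the line
local re = rspamd_regexp.create('/[0-9]+/')
local line = 'a1 b22 c333'

-- Negative limit: every match is counted (3 digit runs here)
local all = re:matchn(line, -1)
-- Positive limit: counting stops once two matches are found
local capped = re:matchn(line, 2)
-- Zero: only a single match is counted, like re:match
local single = re:matchn(line, 0)

print(all, capped, single)
```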
Method re:split(line)
Split line using the specified regular expression. Breaks the string on the pattern, and returns an array of the tokens. If the pattern contains capturing parentheses, then the text for each of the substrings will also be returned. If the pattern does not match anywhere in the string, then the whole string is returned as the first token.
Parameters:
line {string/text}
: line to split
Returns:
{table}
: table of split line portions (if text was the input, then text is used for return parts)
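A minimal sketch of splitting on a separator pattern follows. The comma-separated input and the pattern are hypothetical, and the code assumes Rspamd's Lua environment.

```lua
local rspamd_regexp = require "rspamd_regexp"

-- Split on commas, swallowing optional whitespace around them
local re = rspamd_regexp.create('/\\s*,\\s*/')

-- Expected tokens: 'alpha', 'beta', 'gamma'
local parts = re:split('alpha, beta ,gamma')
for i, part in ipairs(parts) do
  print(i, part)
end

-- When the pattern matches nowhere, the whole string is the first token
local whole = re:split('no-separator-here')
```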
Method re:destroy()
Destroy regexp from caches if needed (the pointer is removed by garbage collector)
Parameters:
No parameters
Returns:
No return