mn8 Language Reference

Regexp


Description

Regexp represents a regular expression as used in the Perl5 language. The syntax of Perl5 regular expressions is summarized below, and mn8 supports all of it. For a definitive reference, consult the perlre man page that accompanies the Perl5 distribution.

Perl5 regular expressions consist of:

Quantified Atoms

            Pattern    Description
            -------    --------------------------------------------------------------------
            {n,m}      Matches at least n but not more than m times.
            {n,}       Matches at least n times.
            {n}        Matches exactly n times.
            *          Matches 0 or more times.
            +          Matches 1 or more times.
            ?          Matches 0 or 1 times.

By default, a quantified subpattern is greedy: it matches as many times as possible without causing the rest of the pattern to fail. To make a quantifier match the minimum number of times possible instead, still without causing the rest of the pattern to fail, place a "?" immediately after the quantifier. Perl5 extended regular expressions will be fully supported.

Quantified Atoms with Minimal Matching

            Pattern   Description
            -------   --------------------------------------------------------------------
            {n,m}?    Matches at least n but not more than m times.
            {n,}?     Matches at least n times.
            {n}?      Matches exactly n times.
            *?        Matches 0 or more times.
            +?        Matches 1 or more times.
            ??        Matches 0 or 1 times.
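
The difference between greedy and minimal matching can be illustrated with a small sketch. It is written in Python rather than mn8, on the assumption that Python's re module, which uses the same Perl-style quantifier syntax, behaves like the mn8 engine for these patterns. The sample strings are made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ takes as much as it can while still letting ">" match,
# so the match runs to the last ">" in the string.
greedy = re.search(r"<.+>", text).group(0)

# Minimal: .+? stops at the first position where ">" can match.
minimal = re.search(r"<.+?>", text).group(0)

# Bounded quantifiers follow the same rule.
bounded_greedy = re.search(r"a{2,4}", "aaaaa").group(0)
bounded_minimal = re.search(r"a{2,4}?", "aaaaa").group(0)

print(greedy)           # <b>bold</b> and <i>italic</i>
print(minimal)          # <b>
print(bounded_greedy)   # aaaa
print(bounded_minimal)  # aa
```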

Atoms

            Pattern   Description
            -------   --------------------------------------------------------------------
            .         Matches any character except \n.
            ^         Null token matching the beginning of a string or line (i.e. the
                      position right after a newline or right before the beginning of
                      a string).
            $         Null token matching the end of a string or line (i.e. the position
                      right before a newline or right after the end of a string).
            \b        Null token matching a word boundary (\w on one side and \W on the
                      other).
            \B        Null token matching a boundary that is not a word boundary.
            \A        Matches only at beginning of string.
            \Z        Matches only at end of string (or before newline at the end).
            \n        Newline.
            \r        Carriage return.
            \t        Tab.
            \f        Formfeed.
            \d        Digit [0-9].
            \D        Non-digit [^0-9].
            \w        Word character [0-9a-zA-Z_].
            \W        Non-word character [^0-9a-zA-Z_].
            \s        A whitespace character [ \t\n\r\f].
            \S        A non-whitespace character [^ \t\n\r\f].
            \xnn      Hexadecimal representation of character.
            \cD       Matches the corresponding control character.
            \nn,      Octal representation of character unless a backreference.
            \nnn
            \1,       Matches whatever the first, second, third, etc. parenthesized
            \2,       group matched. This is called a backreference. If there is no
            \3,       corresponding group, the number is interpreted as an octal
            ...       representation of a character.
            \0        Matches the null character.
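
A few of the atoms above are easier to follow with concrete input. The sketch below uses Python's re module as a stand-in, assuming that mn8 treats these particular atoms the same way. The input strings are illustrative only.

```python
import re

# \b is zero-width: the "cat" inside "concatenate" has word characters
# on both sides, so it is not at a word boundary and is skipped.
whole_word_starts = [m.start() for m in re.finditer(r"\bcat\b", "cat concatenate cat.")]

# \d and \s are shorthands for the digit and whitespace classes.
numbers = re.findall(r"\d+", "a1 b22\tc333")
fields = re.split(r"\s+", "a1 b22\tc333")

# \xnn names a character by its hexadecimal code (0x41 is "A").
hex_a = re.search(r"\x41", "xAx").group(0)

# \1 is a backreference: it matches whatever group 1 matched,
# which makes it possible to find a doubled word.
doubled = re.search(r"\b(\w+)\s+\1\b", "this is is a test").group(1)

print(whole_word_starts)  # [0, 16]
print(numbers)            # ['1', '22', '333']
print(fields)             # ['a1', 'b22', 'c333']
print(hex_a)              # A
print(doubled)            # is
```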

Perl5 Extended Regular Expressions

            Extended Pattern  Description
            ----------------  ------------------------------------------------------------
            (?#text)          An embedded comment causing text to be ignored.
            (?:regexp)        Groups things like "()" but does not cause the group match
                              to be saved.
            (?=regexp)        A zero-width positive lookahead assertion. For example,
                              \w+(?=\s) matches a word followed by whitespace, without
                              including whitespace in the match result.
            (?!regexp)        A zero-width negative lookahead assertion. For example,
                              foo(?!bar) matches any occurrence of "foo" that is not
                              followed by "bar". Remember that this is a zero-width
                              assertion, which means that a(?!b)d will match ad because a
                              is followed by a character that is not b (the d) and a d
                              follows the zero-width assertion.
            (?imsx)           One or more embedded pattern-match modifiers. 
                                - i enables case insensitivity, 
                                - m enables multiline treatment of the input,
                                - s enables single line treatment of the input,
                                - x enables extended whitespace comments.
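
The extended patterns in the table can be tried out with the examples it already mentions. The following sketch runs them through Python's re module, assuming mn8 handles these constructs the same way. The input strings are invented for illustration.

```python
import re

# (?=...) positive lookahead: whitespace must follow, but is not consumed.
words_before_space = re.findall(r"\w+(?=\s)", "one two three")

# (?!...) negative lookahead: only a "foo" not followed by "bar" matches.
foo_starts = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz foo")]

# Zero-width: once the lookahead succeeds, d must still match the next character.
ad_match = re.search(r"a(?!b)d", "ad")
abd_match = re.search(r"a(?!b)d", "abd")

# (?:...) groups without saving, so (c) is the only captured group.
groups = re.search(r"(?:ab)+(c)", "ababc").groups()

# (?#...) is an embedded comment and does not affect matching.
commented = re.search(r"ab(?#ignored)c", "abc").group(0)

# Embedded modifiers i, m, s and x.
case_insensitive = re.search(r"(?i)hello", "HELLO").group(0)
line_starts = re.findall(r"(?m)^\w+", "one\ntwo")
dot_matches_newline = re.search(r"(?s)a.b", "a\nb").group(0)
verbose = re.search(r"(?x) \d+  # one or more digits", "abc 42").group(0)

print(words_before_space)  # ['one', 'two']
print(foo_starts)          # [7, 14]
print(ad_match.group(0), abd_match)  # ad None
print(groups)              # ('c',)
```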

Usage

Extract the sender's name and email address from a mail message.

            $text = "From: \"Szabo Csaba\" <csaba@nolimits.ro>
            To: zzoli <zzoli@datec.ro>
            Subject: hello
            Organization: noLimits Technologies

            This is a test mail."

            $fromExpr = Regexp.create("(?im)^From:(.+)")
            $from = $fromExpr.getMatches($text)/1
            $result = Regexp.getMatches("\"(.+)\"\\s+<(.+)>", $from)
            print $result/1/1
                Szabo Csaba
            print $result/1/2
                csaba@nolimits.ro
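
For readers who want to check the two patterns outside mn8, here is the same two-step extraction sketched in Python. This is only an analogue: re.search and group access stand in for getMatches and the "/" element access, and it assumes both engines interpret the patterns identically.

```python
import re

text = (
    'From: "Szabo Csaba" <csaba@nolimits.ro>\n'
    "To: zzoli <zzoli@datec.ro>\n"
    "Subject: hello\n"
    "Organization: noLimits Technologies\n"
    "\n"
    "This is a test mail."
)

# Step 1: capture the rest of the From: line ("." does not cross the newline).
from_line = re.search(r"(?im)^From:(.+)", text).group(1)

# Step 2: split the captured line into the quoted name and the bracketed address.
name, email = re.search(r'"(.+)"\s+<(.+)>', from_line).groups()

print(name)   # Szabo Csaba
print(email)  # csaba@nolimits.ro
```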

Version: 0.1
Authors: Remus Pereni (http://neuro.nolimits.ro)
Location:
Inherits: Concept, Expression

Constructor List

create (String $pattern)

Method List

Logical contains (String $str)
static Logical contains (String $pattern, String $str)
static Series getMatches (String $pattern, String $str)
Series getMatches (String $str)
String getPattern
static Logical matches (String $pattern, String $str)
Logical matches (String $str)
setPattern (String $pattern)
Methods inherited from: Concept
cloneConcept, extendsConcept, fromXML, getAllInheritedConcepts, getConceptAttribute, getConceptAttributeField, getConceptAttributeFields, getConceptAttributes, getConceptConstructors, getConceptElement, getConceptElementField, getConceptElementFields, getConceptElements, getConceptLabel, getConceptMethod, getConceptMethods, getConceptOperators, getConceptType, getErrorHandler, getInheritedConcepts, getResourceURI, hasConceptAttribute, hasConceptElement, hasConceptMethod, hasPath, isHidden, loadContent, setConceptLabel, setErrorHandler, setHidden, setShowEmpty, showEmpty, toTXT, toXML, setResourceURI
Methods inherited from: Expression
contains, contains, getMatches, getMatches, getPattern, matches, matches, setPattern

Detailed Constructor Info

create (String $pattern)
Parameters:
$pattern :The pattern with which this Regexp will be created.

Creates a new Regexp concept with the given $pattern.


Detailed Method Info

contains (String $str)
Parameters:
$str :Any string.
Returns: Logical

Returns true if the given string contains a match for this Regexp's pattern, false otherwise.

            $expr = Regexp.create("(?im)is (.+?).")
            print $expr.contains("this is a test")
            -- the result is --
            true
            

static contains (String $pattern, String $str)
Parameters:
$pattern :Any pattern.
$str :Any string.
Returns: Logical

Returns true if the given string contains a match for the specified pattern, false otherwise.

            print Regexp.contains("(?im)is (.+?).", "this is a test")
            -- the result is --
            true
            

static getMatches (String $pattern, String $str)
Parameters:
$pattern :Any pattern.
$str :Any string.
Returns: Series

Returns a series containing all matches of the specified pattern in the given string.

            print Regexp.getMatches("(?im)test, (.+?) test\\.", "this is a test, a regexp test.")
            -- the result is --
            a regexp
            

getMatches (String $str)
Parameters:
$str :Any string.
Returns: Series

Returns a series containing all matches of this Regexp's pattern in the given string.

            $expr = Regexp.create("(?im)test, (.+?) test\\.")
            print $expr.getMatches("this is a test, a regexp test.")
            -- the result is --
            a regexp
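
As a rough cross-check of the output above, the same pattern can be run through Python's re module. This is an assumption-laden analogue: Python stands in for the mn8 engine, and re.findall only approximates getMatches. With one parenthesized group, findall returns the text captured by that group for each match, which agrees with the printed result.

```python
import re

pattern = re.compile(r"(?im)test, (.+?) test\.")

# One entry per match; each entry is the text captured by group 1.
captured = pattern.findall("this is a test, a regexp test.")

print(captured)  # ['a regexp']
```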
            

getPattern
Returns: String

Returns the pattern of this Regexp concept.

            $expr = Regexp.create("(?im)test, (.+?) test\\.")
            print $expr.getPattern
            -- the result is --
            (?im)test, (.+?) test\.
            

static matches (String $pattern, String $str)
Parameters:
$pattern :Any pattern.
$str :Any string.
Returns: Logical

Returns true if the specified pattern matches the given string, false otherwise.

            print Regexp.matches("(?im)this (.+?) test\\.", "this is a test, a regexp test.")
            -- the result is --
            true
            

matches (String $str)
Parameters:
$str :Any string.
Returns: Logical

Returns true if this Regexp's pattern matches the given string, false otherwise.

            $expr = Regexp.create("(?im)this (.+?) test\\.")
            print $expr.matches("this is a test, a regexp test.")
            -- the result is --
            true
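
This reference does not spell out how matches differs from contains. In many Perl5-style libraries, a "contains" test succeeds when the pattern is found anywhere in the string, while a "matches" test requires the whole string to match. The sketch below illustrates that common distinction with Python's re.search and re.fullmatch. It is an assumption, not a confirmed fact, that mn8 draws the line in the same place.

```python
import re

pattern = re.compile(r"(?im)this (.+?) test\.")
sentence = "this is a test, a regexp test."
prefixed = "well, " + sentence

# "contains"-style test: the pattern may match anywhere in the string.
found_in_sentence = pattern.search(sentence) is not None
found_in_prefixed = pattern.search(prefixed) is not None

# "matches"-style test: the pattern must account for the entire string.
whole_sentence = pattern.fullmatch(sentence) is not None
whole_prefixed = pattern.fullmatch(prefixed) is not None

print(found_in_sentence, found_in_prefixed)  # True True
print(whole_sentence, whole_prefixed)        # True False
```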
            

setPattern (String $pattern)
Parameters:
$pattern :The new pattern to set.
Returns:

Sets the pattern of this Regexp concept to the given value.

            $expr = Regexp.create("(?im)test, (.+?) test\\.")
            print $expr.getPattern
            $expr.setPattern("(?ism)this (.*?) test\\.")
            print $expr.getPattern
            -- the result is --
            (?im)test, (.+?) test\.
            (?ism)this (.*?) test\.
            
