Lexical Scanner Generator used with Racc for Ruby
rex [options] grammarfile
-o  --output-file  filename   designate output filename
-s  --stub                    append stub main for debugging
-i  --ignorecase              ignore character case
-C  --check-only              only check syntax
    --independent             independent mode
-d  --debug                   print debug information
-h  --help                    print usage
    --version                 print version
    --copyright               print copyright
By default, the output file for foo.rex is foo.rex.rb.
To use the generated scanner, add the following line to your Ruby source file.
require 'foo.rex'
A definition consists of a header section, a rule section,
and a footer section. The rule section includes one or more clauses.
Each clause starts with a keyword.
Summary:
[Header section]
"class" Foo
["option"
  [options] ]
["inner"
  [methods] ]
["macro"
  [macro-name  regular-expression] ]
"rule"
  [start-state]  pattern  [actions]
"end"
[Footer section]
class Foo
macro
  BLANK         \s+
  DIGIT         \d+
rule
  {BLANK}
  {DIGIT}       { [:NUMBER, text.to_i] }
  .             { [text, text] }
end
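To make the sample concrete, here is a minimal hand-written sketch of the scanning behavior it describes, built on Ruby's standard StringScanner. It is not the code Rex generates; the method name sketch_scan and the constants are assumptions chosen for illustration.

```ruby
require 'strscan'

# Hypothetical helper (not Rex output): tokenizes a string the way the
# sample definition describes. Rules are tried in the order they are listed.
BLANK = /\s+/
DIGIT = /\d+/

def sketch_scan(str)
  ss = StringScanner.new(str)
  tokens = []
  until ss.eos?
    if ss.scan(BLANK)
      # {BLANK} has no action, so the matched text is discarded
    elsif (text = ss.scan(DIGIT))
      tokens << [:NUMBER, text.to_i]
    elsif (text = ss.scan(/./))
      tokens << [text, text]
    else
      raise "cannot match: #{ss.rest.inspect}"
    end
  end
  tokens
end

p sketch_scan("12 + 34")
# prints [[:NUMBER, 12], ["+", "+"], [:NUMBER, 34]]
```

Each token is a two-element array of type and value, which is the shape a Racc parser expects.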
Everything written before the rule section forms the header section and is
copied to the beginning of the output file.
Everything written after the rule section forms the footer section and is
copied to the end of the output file.
The rule section starts at the line beginning with the "class" keyword
and ends at the line beginning with the "end" keyword.
The class name is specified after the "class" keyword.
If the name is qualified with a module name, the class is defined inside that module.
The generated class inherits from Racc::Parser, unless the "independent" option is set.
class Foo
class Bar::Foo
This section begins with the "option" keyword.
"ignorecase"   ignore character case when matching patterns
"stub"         append a stub main for debugging
"independent"  independent mode; the generated class does not inherit from Racc::Parser
This section begins with the "inner" keyword.
The code written here is copied into the body of the generated scanner class,
so methods defined here become methods of that class.
This section begins with the "macro" keyword.
Each macro assigns a name to one regular expression.
To include a space character (0x20) in the expression, escape it with a backslash (\ ).
DIGIT \d+
IDENT [a-zA-Z_][a-zA-Z0-9_]*
BLANK [\ \t]+
REMIN \/\*
REMOUT \*\/
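The effect of a macro can be pictured as textual substitution of {NAME} inside a pattern. The sketch below illustrates that idea only; expand_macros and MACROS are hypothetical names, and Rex's actual expansion code may work differently.

```ruby
# Hypothetical sketch of macro expansion, using the macros listed above.
MACROS = {
  'DIGIT'  => '\d+',
  'IDENT'  => '[a-zA-Z_][a-zA-Z0-9_]*',
  'BLANK'  => '[\ \t]+',
  'REMIN'  => '\/\*',
  'REMOUT' => '\*\/'
}

def expand_macros(pattern, macros = MACROS)
  # Only names starting with a letter or underscore are treated as macros,
  # so quantifiers such as {2} are left untouched.
  expanded = pattern.gsub(/\{([A-Za-z_]\w*)\}/) { macros.fetch(Regexp.last_match(1)) }
  Regexp.new(expanded)
end

number = expand_macros('-?{DIGIT}')
p number.source           # prints "-?\\d+"
p number.match?('-42')    # prints true
```

An unknown macro name raises KeyError here, which is one possible design choice for catching typos early.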
This section begins with the "rule" keyword.
[state] pattern [actions]
A start state is written as a Ruby symbol, that is, an identifier prefixed with ":".
If uppercase letters follow the ":", the state becomes an exclusive start state.
If lowercase letters follow the ":", the state becomes an inclusive start state.
The initial value and the default value of a start state are nil.
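The difference between the two kinds of start state can be sketched as a rule-selection predicate. This assumes the usual lex convention (an exclusive state activates only its own rules, while an inclusive state also keeps the rules without a start state) and assumes the first letter after ":" decides the kind. Both helper names are hypothetical and are not Rex internals.

```ruby
# Hypothetical helpers illustrating start-state selection (lex-style
# semantics assumed; not taken from the Rex source).
def exclusive_state?(state)
  # nil.to_s is "", so the default state is never exclusive
  state.to_s.match?(/\A[A-Z]/)
end

def rule_active?(rule_state, current_state)
  # A rule tagged with the current state is always active
  return true if rule_state == current_state
  # Rules without a start state stay active only in inclusive states
  rule_state.nil? && !exclusive_state?(current_state)
end

p rule_active?(nil, :REM)   # prints false
p rule_active?(nil, :rem)   # prints true
```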
The pattern is a regular expression that specifies the character string to match.
A regular expression may refer to a macro by enclosing the macro name
in curly braces { }.
Use a macro when the regular expression needs to include whitespace.
An action is executed when the pattern is matched.
The action defines the process for creating the appropriate token.
A token is a two-element array containing a type and a value, or is nil.
The following elements can be used to create a token.
lineno   Line number      ( Read Only )
text     Matched string   ( Read Only )
state    Start state      ( Read/Write )
The action is a block of Ruby code enclosed by { }.
Do not use statements that leave the block or otherwise change the control flow
( return, exit, next, break, ... ).
If the action is omitted, the matched character string is discarded,
and the process advances to the next scan.
      {REMIN}                 { self.state = :REM ; [:REM_IN, text] }
:REM  {REMOUT}                { self.state = nil ; [:REM_OUT, text] }
:REM  (.+)(?={REMOUT})        { [:COMMENT, text] }
      {BLANK}
      -?{DIGIT}               { [:NUMBER, text.to_i] }
      {WORD}                  { [:word, text] }
      .                       { [text, text] }
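The rules above can be traced with a hand-written sketch built on StringScanner. This is an illustration under assumptions, not Rex output: the class name CommentScannerSketch and its methods are hypothetical, and because WORD is not defined in this document, the pattern [a-zA-Z_]\w* is assumed for it.

```ruby
require 'strscan'

# Hypothetical equivalent of the rules above, with :REM treated as an
# exclusive start state.
class CommentScannerSketch
  REMIN  = /\/\*/
  REMOUT = /\*\//
  BLANK  = /[ \t]+/
  DIGIT  = /\d+/
  WORD   = /[a-zA-Z_]\w*/   # assumed definition

  attr_accessor :state

  def tokenize(str)
    ss = StringScanner.new(str)
    self.state = nil
    tokens = []
    until ss.eos?
      token = state == :REM ? scan_in_comment(ss) : scan_default(ss)
      tokens << token if token
    end
    tokens
  end

  private

  # Rules without a start state, tried in the listed order
  def scan_default(ss)
    if (text = ss.scan(REMIN))
      self.state = :REM
      [:REM_IN, text]
    elsif ss.scan(BLANK)
      nil                                  # no action: the match is discarded
    elsif (text = ss.scan(/-?#{DIGIT}/))
      [:NUMBER, text.to_i]
    elsif (text = ss.scan(WORD))
      [:word, text]
    elsif (text = ss.scan(/./))
      [text, text]
    else
      raise "cannot match: #{ss.rest.inspect}"
    end
  end

  # Rules prefixed with :REM
  def scan_in_comment(ss)
    if (text = ss.scan(REMOUT))
      self.state = nil
      [:REM_OUT, text]
    elsif (text = ss.scan(/(.+)(?=#{REMOUT})/))
      [:COMMENT, text]
    else
      raise "cannot match: #{ss.rest.inspect}"
    end
  end
end

p CommentScannerSketch.new.tokenize("x = /* hi */ -5")
# prints [[:word, "x"], ["=", "="], [:REM_IN, "/*"], [:COMMENT, " hi "], [:REM_OUT, "*/"], [:NUMBER, -5]]
```

The lookahead (?={REMOUT}) keeps the closing delimiter out of the :COMMENT token, so the next scan can match it and reset the state to nil.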
Any text following a "#" to the end of the line becomes a comment.
Initializes the scanner with the string argument str.
This method is meant to be redefined by the user as needed.
Scans a string written in the defined grammar.
The resulting tokens are stored internally.
Reads and scans a file written in the defined grammar.
The resulting tokens are stored internally.
Returns the internally stored tokens one at a time.
Returns nil when no tokens remain.
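The store-then-extract behavior described above can be pictured with a small queue. The class and method names below (TokenQueueSketch, scan_string, next_token) are assumptions chosen for illustration, and the whitespace split merely stands in for real scanning.

```ruby
# Hypothetical sketch of an internal token store; not the Rex API.
class TokenQueueSketch
  def initialize
    @tokens = []
  end

  # Stand-in for the scanning step: every whitespace-separated word
  # becomes one [type, value] token kept in the internal array.
  def scan_string(str)
    @tokens = str.split.map { |word| [:WORD, word] }
    self
  end

  # Hands out the stored tokens one by one; Array#shift returns nil
  # once the array is empty.
  def next_token
    @tokens.shift
  end
end

queue = TokenQueueSketch.new.scan_string("a b")
while (token = queue.next_token)
  p token
end
# prints [:WORD, "a"] and then [:WORD, "b"]
```

Returning nil at the end gives the caller a simple loop condition for draining the tokens.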
This specification is provisional and may be changed without prior notice.