android13/external/antlr/runtime/Ruby/lib/antlr3/streams.rb

#!/usr/bin/ruby
# encoding: utf-8

=begin LICENSE

[The "BSD licence"]
Copyright (c) 2009-2010 Kyle Yetter
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=end

module ANTLR3


=begin rdoc ANTLR3::Stream

= ANTLR3 Streams

This documentation first covers the general concept of streams as used by ANTLR
recognizers, and then discusses the specific <tt>ANTLR3::Stream</tt> module.

== ANTLR Stream Classes

ANTLR recognizers need a way to walk through input data in a serialized IO-style
fashion. They also need some book-keeping about the input to provide useful
information to developers, such as current line number and column. Furthermore,
to implement backtracking and various error recovery techniques, recognizers
need a way to record various locations in the input at a number of points in the
recognition process so the input state may be restored back to a prior state.

ANTLR bundles all of this functionality into a number of Stream classes, each
designed to be used by recognizers for a specific recognition task. Most of the
Stream hierarchy is implemented in antlr3/streams.rb, which is loaded by default
when 'antlr3' is required.

---

Here's a brief overview of the various stream classes and their respective
purpose:

StringStream::
  Similar to StringIO from the standard Ruby library, StringStream wraps raw
  String data in a Stream interface for use by ANTLR lexers.
FileStream::
  A subclass of StringStream, FileStream simply wraps data read from an IO or
  File object for use by lexers.
CommonTokenStream::
  The job of a TokenStream is to read lexer output and then provide ANTLR
  parsers with the means to sequentially walk through a series of tokens.
  CommonTokenStream is the default TokenStream implementation.
TokenRewriteStream::
  A subclass of CommonTokenStream, TokenRewriteStreams provide rewriting-parsers
  the ability to produce new output text from an input token-sequence by
  managing rewrite "programs" on top of the stream.
CommonTreeNodeStream::
  In a similar fashion to CommonTokenStream, CommonTreeNodeStream feeds tokens
  to recognizers in a sequential fashion. However, the stream object serializes
  an Abstract Syntax Tree into a flat, one-dimensional sequence, but preserves
  the two-dimensional shape of the tree using special UP and DOWN tokens. The
  sequence is primarily used by ANTLR Tree Parsers. *note* -- this is not
  defined in antlr3/streams.rb, but in antlr3/tree.rb
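
The way a tree can be flattened while keeping its shape may be easier to see in
a small sketch. The code below does not use the ANTLR3 tree classes; SketchNode
and flatten_tree are hypothetical names invented for this example, and the
symbols :DOWN and :UP merely stand in for the special navigation tokens.

```ruby
# Hypothetical tree node, used only for this illustration.
SketchNode = Struct.new(:text, :children)

# Serialize a tree into a flat sequence. A node with children is followed by
# :DOWN, then its serialized children, then :UP.
def flatten_tree(node, out = [])
  out << node.text
  unless node.children.empty?
    out << :DOWN
    node.children.each { |child| flatten_tree(child, out) }
    out << :UP
  end
  out
end

# the tree for (+ 3 (* 4 5))
tree = SketchNode.new("+", [
  SketchNode.new("3", []),
  SketchNode.new("*", [SketchNode.new("4", []), SketchNode.new("5", [])])
])

flatten_tree(tree)
# => ["+", :DOWN, "3", "*", :DOWN, "4", "5", :UP, :UP]
```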

---

The next few sections cover the most significant methods of all stream classes.

=== consume / look / peek

<tt>stream.consume</tt> is used to advance a stream one unit. StringStreams are
advanced by one character and TokenStreams are advanced by one token.

<tt>stream.peek(k = 1)</tt> is used to quickly retrieve the object of interest
to a recognizer at look-ahead position specified by <tt>k</tt>. For
<b>StringStreams</b>, this is the <i>integer value of the character</i>
<tt>k</tt> characters ahead of the stream cursor. For <b>TokenStreams</b>, this
is the <i>integer token type of the token</i> <tt>k</tt> tokens ahead of the
stream cursor.

<tt>stream.look(k = 1)</tt> is used to retrieve the full object of interest at
look-ahead position specified by <tt>k</tt>. While <tt>peek</tt> provides the
<i>bare-minimum lightweight information</i> that the recognizer needs,
<tt>look</tt> provides the <i>full object of concern</i> in the stream. For
<b>StringStreams</b>, this is a <i>string object containing the single
character</i> <tt>k</tt> characters ahead of the stream cursor. For
<b>TokenStreams</b>, this is the <i>full token structure</i> <tt>k</tt> tokens
ahead of the stream cursor.

<b>Note:</b> in most ANTLR runtime APIs for other languages, <tt>peek</tt> is
implemented by some method with a name like <tt>LA(k)</tt> and <tt>look</tt> is
implemented by some method with a name like <tt>LT(k)</tt>. When writing this
Ruby runtime API, I found this naming practice confusing, ambiguous, and
un-Ruby-like. Thus, I chose <tt>peek</tt> and <tt>look</tt> to represent a
quick-look (peek) and a full-fledged look-ahead operation (look). If this causes
confusion or any sort of compatibility strife for developers using this
implementation, all apologies.
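
As a rough illustration of how the three operations differ, the sketch below
uses a hypothetical stand-in class, SketchCharStream, instead of the real
StringStream, so it runs without the antlr3 library. The class name, the use of
-1 as the end-of-input value, and the restriction to positive <tt>k</tt> are
all assumptions made for this example.

```ruby
# Hypothetical stand-in for a character stream; not part of ANTLR3.
class SketchCharStream
  EOF = -1  # assumed end-of-input sentinel for this sketch

  def initialize(text)
    @chars    = text.chars        # one-character strings, used by look
    @codes    = text.codepoints   # integer character values, used by peek
    @position = 0
  end

  # integer value of the character k positions ahead (k = 1 is the current one)
  def peek(k = 1)
    @codes[@position + k - 1] || EOF
  end

  # the same character as a one-character String (nil past the end)
  def look(k = 1)
    @chars[@position + k - 1]
  end

  # advance by one character and return the integer value that was consumed
  def consume
    code = peek
    @position += 1 if @position < @codes.length
    code
  end
end

stream = SketchCharStream.new("ab")
stream.peek      # => 97
stream.look      # => "a"
stream.look(2)   # => "b"
stream.consume   # => 97
stream.peek      # => 98
```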

=== mark / rewind / release

<tt>marker = stream.mark</tt> causes the stream to record important information
about the current stream state, place the data in an internal memory table, and
return a memento, <tt>marker</tt>. The marker object is typically an integer key
to the stream's internal memory table.

Used in tandem with <tt>stream.rewind(marker = last_marker)</tt>, the marker can
be used to restore the stream to an earlier state. This is used by recognizers
to perform tasks such as backtracking and error recovery.

<tt>stream.release(marker = last_marker)</tt> can be used to release an existing
state marker from the memory table.
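
The memento idea can be sketched with a very small class. SketchMarkers below
is a hypothetical illustration, not the implementation used by any ANTLR3
stream: it saves only a position, while real streams may record more state
(such as line and column).

```ruby
# Hypothetical marker table, written only to illustrate the memento idea.
class SketchMarkers
  attr_reader :position

  def initialize
    @position = 0
    @markers  = []   # internal memory table; a marker is an index into it
  end

  def advance(n = 1)
    @position += n
  end

  # record the current state and return an integer key for it
  def mark
    @markers << @position
    @markers.length - 1
  end

  # restore the state saved under +marker+ (the most recent one by default)
  def rewind(marker = @markers.length - 1)
    saved = @markers[marker]
    @position = saved if saved
    self
  end

  # forget +marker+ and every marker created after it
  def release(marker = @markers.length - 1)
    @markers.slice!(marker..-1) if marker.between?(0, @markers.length - 1)
    self
  end

  def mark_depth
    @markers.length
  end
end

stream = SketchMarkers.new
stream.advance(3)
marker = stream.mark     # remember position 3
stream.advance(4)        # speculative work moves the cursor to 7
stream.rewind(marker)    # back to position 3
stream.release(marker)   # the saved state is no longer needed
```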

=== seek

<tt>stream.seek(position)</tt> moves the stream cursor to an absolute position
within the stream, much like Ruby's <tt>IO#seek</tt>. However, unlike
<tt>IO#seek</tt>, which also supports relative offsets, ANTLR streams currently
always use absolute position seeking.

== The Stream Module

<tt>ANTLR3::Stream</tt> is an abstract-ish base mixin for all IO-like stream
classes used by ANTLR recognizers.

The module doesn't do much on its own besides define arguably annoying
``abstract'' pseudo-methods that demand implementation when it is mixed in to a
class that wants to be a Stream. Right now this exists as an artifact of porting
the ANTLR Java/Python runtime library to Ruby. In Java, of course, this is
represented as an interface. In Ruby, however, objects are duck-typed and
interfaces aren't that useful as programmatic entities -- in fact, it's mildly
wasteful to have a module like this hanging out. Thus, I may axe it.

When mixed in, it does give the class #size and #source_name attribute
methods.

Except in a small handful of places, most of the ANTLR runtime library uses
duck-typing and not type checking on objects. This means that the methods which
manipulate stream objects don't usually bother checking that the object is a
Stream and assume that the object implements the proper stream interface. Thus,
it is not strictly necessary that custom stream objects include ANTLR3::Stream,
though it isn't a bad idea.

=end

module Stream
  include ANTLR3::Constants
  extend ClassMacros

  ##
  # :method: consume
  # used to advance a stream one unit (such as character or token)
  abstract :consume

  ##
  # :method: peek( k = 1 )
  # used to quickly retrieve the object of interest to a recognizer at lookahead
  # position specified by <tt>k</tt> (such as integer value of a character or an
  # integer token type)
  abstract :peek

  ##
  # :method: look( k = 1 )
  # used to retrieve the full object of interest at lookahead position specified
  # by <tt>k</tt> (such as a character string or a token structure)
  abstract :look

  ##
  # :method: mark
  # saves the current position for the purposes of backtracking and
  # returns a value to pass to #rewind at a later time
  abstract :mark

  ##
  # :method: index
  # returns the current position of the stream
  abstract :index

  ##
  # :method: rewind( marker = last_marker )
  # restores the stream position using the state information previously saved
  # by the given marker
  abstract :rewind

  ##
  # :method: release( marker = last_marker )
  # clears the saved state information associated with the given marker value
  abstract :release

  ##
  # :method: seek( position )
  # moves the stream to the absolute index given by +position+
  abstract :seek

  ##
  # the total number of symbols in the stream
  attr_reader :size

  ##
  # indicates an identifying name for the stream -- usually the file path of the input
  attr_accessor :source_name
end

=begin rdoc ANTLR3::CharacterStream

CharacterStream further extends the abstract-ish base mixin Stream to add
methods specific to navigating character-based input data. Thus, it serves as an
imitation of the Java interface for text-based streams, which are primarily
used by lexers.

It adds the ``abstract'' method, <tt>substring(start, stop)</tt>, which must be
implemented to return a slice of the input string from position <tt>start</tt>
to position <tt>stop</tt>. It also adds attribute accessor methods <tt>line</tt>
and <tt>column</tt>, which are expected to indicate the current line number and
position within the current line, respectively.

== A Word About <tt>line</tt> and <tt>column</tt> attributes

Presumably, the concepts of <tt>line</tt> and <tt>column</tt> attributes of text
are familiar to most developers. Line numbers of text are indexed from number 1
up (not 0). Column numbers are indexed from 0 up. Thus, examining sample text:

  Hey this is the first line.
  Oh, and this is the second line.

Line 1 is the string "Hey this is the first line.\\n". If a character stream is at
line 2, character 0, the stream cursor is sitting between the characters "\\n"
and "O".
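
The indexing convention can be checked with a few lines of plain Ruby. This is
only a sketch of the bookkeeping described above; the variable names are
invented for the example and no ANTLR3 classes are involved.

```ruby
# Walk the sample text and record [char, line, column] before each character,
# using the conventions above: lines start at 1, columns start at 0.
text = "Hey this is the first line.\nOh, and this is the second line.\n"

line, column = 1, 0
locations = []
text.each_char do |char|
  locations << [char, line, column]
  if char == "\n"
    line  += 1    # a newline ends the current line...
    column = 0    # ...and the next character starts at column 0
  else
    column += 1
  end
end

newline_at = text.index("\n")
locations[newline_at]       # => ["\n", 1, 27]
locations[newline_at + 1]   # => ["O", 2, 0]
```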

*Note:* most ANTLR runtime APIs for other languages refer to <tt>column</tt>
with the more precise but lengthy name <tt>charPositionInLine</tt>. I preferred
to keep it simple and familiar in this Ruby runtime API.

=end

module CharacterStream
  include Stream
  extend ClassMacros
  include Constants

  ##
  # :method: substring(start,stop)
  abstract :substring

  attr_accessor :line
  attr_accessor :column
end


=begin rdoc ANTLR3::TokenStream

TokenStream further extends the abstract-ish base mixin Stream to add methods
specific to navigating token sequences. Thus, it serves as an imitation of the
Java interface for token-based streams, which are used by many different
components in ANTLR, including parsers and tree parsers.

== Token Streams

Token streams wrap a sequence of token objects produced by some token source,
usually a lexer. They provide the operations required by higher-level
recognizers, such as parsers and tree parsers, for navigating through the
sequence of tokens. Unlike simple character-based streams, such as StringStream,
token-based streams have an additional level of complexity because they must
manage the task of "tuning" to a specific token channel.

One of the main advantages of ANTLR-based recognition is the token
<i>channel</i> feature, which allows you to hold on to all tokens of interest
while only presenting a specific set of interesting tokens to a parser. For
example, if you need to hide whitespace and comments from a parser, but hang on
to them for some other purpose, you have the lexer assign the comments and
whitespace to channel value HIDDEN as it creates the tokens.

When you create a token stream, you can tune it to some specific channel value.
Then, all <tt>peek</tt>, <tt>look</tt>, and <tt>consume</tt> operations only
yield tokens that have the same value for <tt>channel</tt>. The stream skips
over any non-matching tokens in between.
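
A minimal sketch of channel tuning follows. SketchToken and look_on_channel
are hypothetical names invented for this example, and the channel values
<tt>:default</tt> and <tt>:hidden</tt> are assumptions; the real token and
stream classes carry more state than this.

```ruby
# Hypothetical token record; real ANTLR3 tokens carry more fields.
SketchToken = Struct.new(:type, :text, :channel)

tokens = [
  SketchToken.new(:INT,  "35", :default),
  SketchToken.new(:WS,   " ",  :hidden),
  SketchToken.new(:MULT, "*",  :default),
  SketchToken.new(:WS,   " ",  :hidden),
  SketchToken.new(:INT,  "4",  :default)
]

# Return the k-th token on +channel+ at or after +position+,
# stepping over every token that belongs to some other channel.
def look_on_channel(tokens, position, channel, k = 1)
  tokens.drop(position).select { |t| t.channel == channel }[k - 1]
end

look_on_channel(tokens, 0, :default, 2).text   # => "*"
look_on_channel(tokens, 0, :hidden).type       # => :WS
tokens.length                                  # => 5
```

All five tokens stay in the buffer; tuning only changes which of them the
look-ahead operations report.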

== The TokenStream Interface

In addition to the abstract methods and attribute methods provided by the base
Stream module, TokenStream adds a number of additional method implementation
requirements and attributes.

=end

module TokenStream
  include Stream
  extend ClassMacros

  ##
  # expected to return the token source object (such as a lexer) from which
  # all tokens in the stream were retrieved
  attr_reader :token_source

  ##
  # expected to return the value of the last marker produced by a call to
  # <tt>stream.mark</tt>
  attr_reader :last_marker

  ##
  # expected to return the integer index of the stream cursor
  attr_reader :position

  ##
  # the integer channel value to which the stream is ``tuned''
  attr_accessor :channel

  ##
  # :method: to_s(start=0,stop=tokens.length-1)
  # should take the tokens between start and stop in the sequence, extract their text
  # and return the concatenation of all the text chunks
  abstract :to_s

  ##
  # :method: at( i )
  # return the stream symbol at index +i+
  abstract :at
end

=begin rdoc ANTLR3::StringStream

A StringStream's purpose is to wrap the basic, naked text input of a recognition
system. Like all other stream types, it provides serial navigation of the input;
a recognizer can arbitrarily step forward and backward through the stream's
symbols as it requires. StringStream and its subclasses are the main way to
feed text input into an ANTLR Lexer for token processing.

The stream's symbols of interest, of course, are character values. Thus, the
#peek method returns the integer character value at look-ahead position
<tt>k</tt> and the #look method returns the character value as a +String+. They
also track various pieces of information such as the line and column numbers at
the current position.

=== Note About Text Encoding

This version of the runtime library primarily targets Ruby version 1.8, which
does not have strong built-in support for multi-byte character encodings. Thus,
characters are assumed to be represented by a single byte -- an integer between
0 and 255. Ruby 1.9 does provide built-in encoding support for multi-byte
characters, but currently this library does not provide any streams to handle
non-ASCII encoding. However, encoding-savvy recognition code is a future
development goal for this project.
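
The difference between a byte-oriented and a code-point-oriented view of the
same text can be seen with plain Ruby strings on an encoding-aware Ruby (1.9 or
later). The snippet is only an illustration and uses no ANTLR3 classes; the
sample word is arbitrary.

```ruby
text = "caf\u00e9"   # "cafe" with an acute accent on the final letter

# As UTF-8 bytes, the accented letter occupies two integers...
text.encode(Encoding::UTF_8).bytes   # => [99, 97, 102, 195, 169]

# ...while as code points it is a single integer above the ASCII range.
text.codepoints                      # => [99, 97, 102, 233]
```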

=end

class StringStream
  NEWLINE = ?\n.ord

  include CharacterStream

  # current integer character index of the stream
  attr_reader :position

  # the current line number of the input, indexed upward from 1
  attr_reader :line

  # the current character position within the current line, indexed upward from 0
  attr_reader :column

  # the name associated with the stream -- usually a file name
  # (+nil+ unless a :file or :name option is given)
  attr_accessor :name

  # the entire string that is wrapped by the stream
  attr_reader :data
  attr_reader :string

  if defined?( ::Encoding )   # Ruby 1.9 and later (encoding-aware strings)

    # creates a new StringStream object where +data+ is the string data to stream.
    # accepts the following options in a symbol-to-value hash:
    #
    # [:file or :name] the (file) name to associate with the stream; default: +nil+
    # [:position] the initial character index; default: +0+
    # [:line] the initial line number; default: +1+
    # [:column] the initial column number; default: +0+
    #
    def initialize( data, options = {} )      # for 1.9
      @string   = data.to_s.encode( Encoding::UTF_8 ).freeze
      @data     = @string.codepoints.to_a.freeze
      @position = options.fetch :position, 0
      @line     = options.fetch :line, 1
      @column   = options.fetch :column, 0
      @markers  = []
      @name   ||= options[ :file ] || options[ :name ] # || '(string)'
      mark
    end

    #
    # identical to #peek, except it returns the character value as a String
    #
    def look( k = 1 )               # for 1.9
      k == 0 and return nil
      k += 1 if k < 0

      index = @position + k - 1
      index < 0 and return nil

      @string[ index ]
    end

  else

    # creates a new StringStream object where +data+ is the string data to stream.
    # accepts the following options in a symbol-to-value hash:
    #
    # [:file or :name] the (file) name to associate with the stream; default: +nil+
    # [:position] the initial character index; default: +0+
    # [:line] the initial line number; default: +1+
    # [:column] the initial column number; default: +0+
    #
    def initialize( data, options = {} )    # for 1.8
      @data = data.to_s
      @data.equal?( data ) and @data = @data.clone
      @data.freeze
      @string = @data
      @position = options.fetch :position, 0
      @line = options.fetch :line, 1
      @column = options.fetch :column, 0
      @markers = []
      @name ||= options[ :file ] || options[ :name ] # || '(string)'
      mark
    end

    #
    # identical to #peek, except it returns the character value as a String
    #
    def look( k = 1 )                        # for 1.8
      k == 0 and return nil
      k += 1 if k < 0

      index = @position + k - 1
      index < 0 and return nil

      c = @data[ index ] and c.chr
    end

  end

  def size
    @data.length
  end

  alias length size

  #
  # rewinds the stream back to the start and clears out any existing marker entries
  #
  def reset
    initial_location = @markers.first
    @position, @line, @column = initial_location
    @markers.clear
    @markers << initial_location
    return self
  end

  #
  # advance the stream by one character; returns the character consumed
  #
  def consume
    c = @data[ @position ] || EOF
    if @position < @data.length
      @column += 1
      if c == NEWLINE
        @line += 1
        @column = 0
      end
      @position += 1
    end
    return( c )
  end

  #
  # return the character at look-ahead distance +k+ as an integer. <tt>k = 1</tt> represents
  # the current character. +k+ greater than 1 represents upcoming characters. A negative
  # value of +k+ returns previous characters consumed, where <tt>k = -1</tt> is the last
  # character consumed. <tt>k = 0</tt> has undefined behavior and returns +nil+
  #
  def peek( k = 1 )
    k == 0 and return nil
    k += 1 if k < 0
    index = @position + k - 1
    index < 0 and return nil
    @data[ index ] or EOF
  end

  #
  # return a substring around the stream cursor at a distance +k+
  # if <tt>k >= 0</tt>, return the next k characters
  # if <tt>k < 0</tt>, return the previous <tt>|k|</tt> characters
  #
  def through( k )
    if k >= 0 then @string[ @position, k ] else
      start = ( @position + k ).at_least( 0 ) # start cannot be negative or index will wrap around
      @string[ start ... @position ]
    end
  end

  # operator style look-ahead
  alias >> look

  # operator style look-behind
  def <<( k )
    self >> -k
  end

  alias index position
  alias character_index position

  alias source_name name

  #
  # Returns true if the stream appears to be at the beginning of a new line.
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def beginning_of_line?
    @position.zero? or @data[ @position - 1 ] == NEWLINE
  end

  #
  # Returns true if the stream appears to be at the end of a line.
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def end_of_line?
    @data[ @position ] == NEWLINE #if @position < @data.length
  end

  #
  # Returns true if the stream has been exhausted.
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def end_of_string?
    @position >= @data.length
  end

  #
  # Returns true if the stream appears to be at the beginning of a stream (position = 0).
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def beginning_of_string?
    @position == 0
  end

  alias eof? end_of_string?
  alias bof? beginning_of_string?

  #
  # record the current stream location parameters in the stream's marker table and
  # return an integer-valued bookmark that may be used to restore the stream's
  # position with the #rewind method. This method is used to implement backtracking.
  #
  def mark
    state = [ @position, @line, @column ].freeze
    @markers << state
    return @markers.length - 1
  end

  #
  # restore the stream to an earlier location recorded by #mark. If no marker value is
  # provided, the last marker generated by #mark will be used.
  #
  def rewind( marker = @markers.length - 1, release = true )
    ( marker >= 0 and location = @markers[ marker ] ) or return( self )
    @position, @line, @column = location
    release( marker ) if release
    return self
  end

  #
  # the total number of markers currently in existence
  #
  def mark_depth
    @markers.length
  end

  #
  # the last marker value created by a call to #mark
  #
  def last_marker
    @markers.length - 1
  end

  #
  # let go of the bookmark data for the marker and all marker
  # values created after the marker.
  #
  def release( marker = @markers.length - 1 )
    marker.between?( 1, @markers.length - 1 ) or return
    @markers.pop( @markers.length - marker )
    return self
  end

  #
  # jump to the absolute position value given by +index+.
  # note: if +index+ is before the current position, the +line+ and +column+
  #       attributes of the stream will probably be incorrect
  #
  def seek( index )
    index = index.bound( 0, @data.length )  # ensures index is within the stream's range
    if index > @position
      skipped = through( index - @position )
      if lc = skipped.count( "\n" ) and lc.zero?
        @column += skipped.length
      else
        @line += lc
        @column = skipped.length - skipped.rindex( "\n" ) - 1
      end
    end
    @position = index
    return nil
  end

  #
  # customized object inspection that shows:
  # * the stream class
  # * the stream's location in <tt>index / line:column</tt> format
  # * +before_chars+ characters before the cursor (6 characters by default)
  # * +after_chars+ characters after the cursor (10 characters by default)
  #
  def inspect( before_chars = 6, after_chars = 10 )
    before = through( -before_chars ).inspect
    @position - before_chars > 0 and before.insert( 0, '... ' )

    after = through( after_chars ).inspect
    @position + after_chars + 1 < @data.length and after << ' ...'

    location = "#@position / line #@line:#@column"
    "#<#{ self.class }: #{ before } | #{ after } @ #{ location }>"
  end

  #
  # return the string slice between position +start+ and +stop+
  #
  def substring( start, stop )
    @string[ start, stop - start + 1 ]
  end

  #
  # identical to String#[]
  #
  def []( start, *args )
    @string[ start, *args ]
  end
end


=begin rdoc ANTLR3::FileStream

FileStream is a character stream that uses data stored in some external file. It
is nearly identical to StringStream, except that it reads its data from a file
while automatically setting up the +source_name+ and +line+ parameters. It does
not actually use any buffered IO operations throughout the stream navigation
process. Instead, it reads the file data once when the stream is initialized.

=end

class FileStream < StringStream

  #
  # creates a new FileStream object using the given +file+ object.
  # If +file+ is a path string, the file will be read and the contents
  # will be used and the +name+ attribute will be set to the path.
  # If +file+ is an IO-like object (that responds to :read),
  # the content of the object will be used and the stream will
  # attempt to set its +name+ object first trying the method #name
  # on the object, then trying the method #path on the object.
  #
  # see StringStream.new for a list of additional options
  # the constructor accepts
  #
  def initialize( file, options = {} )
    case file
    when $stdin then
      data = $stdin.read
      @name = '(stdin)'
    when ARGF
      data = file.read
      @name = file.path
    when ::File then
      file = file.clone
      file.reopen( file.path, 'r' )
      @name = file.path
      data = file.read
      file.close
    else
      if file.respond_to?( :read )
        data = file.read
        if file.respond_to?( :name ) then @name = file.name
        elsif file.respond_to?( :path ) then @name = file.path
        end
      else
        @name = file.to_s
        if test( ?f, @name ) then data = File.read( @name )
        else raise ArgumentError, "could not find an existing file at %p" % @name
        end
      end
    end
    super( data, options )
  end

end

=begin rdoc ANTLR3::CommonTokenStream

CommonTokenStream serves as the primary token stream implementation for feeding
sequential token input into parsers.

Using some TokenSource (such as a lexer), the stream collects a token sequence,
setting the token's <tt>index</tt> attribute to indicate the token's position
within the stream. The streams may be tuned to some channel value; off-channel
tokens will be filtered out by the #peek, #look, and #consume methods.

=== Sample Usage


  source_input = ANTLR3::StringStream.new("35 * 4 - 1")
  lexer = Calculator::Lexer.new(source_input)
  tokens = ANTLR3::CommonTokenStream.new(lexer)

  # assume this grammar defines whitespace as tokens on channel HIDDEN
  # and numbers and operations as tokens on channel DEFAULT
  tokens.look         # => 0 INT["35"] @ line 1 col 0 (0..1)
  tokens.look(2)      # => 2 MULT["*"] @ line 1 col 3 (3..3)
  tokens.tokens(0, 2)
    # => [0 INT["35"] @ line 1 col 0 (0..1),
    #     1 WS[" "] @ line 1 col 2 (2..2),
    #     2 MULT["*"] @ line 1 col 3 (3..3)]
    # notice the #tokens method does not filter off-channel tokens

  lexer.reset
  hidden_tokens =
    ANTLR3::CommonTokenStream.new(lexer, :channel => ANTLR3::HIDDEN)
  hidden_tokens.look # => 1 WS[" "] @ line 1 col 2 (2..2)

=end

class CommonTokenStream
  include TokenStream
  include Enumerable

  #
  # constructs a new token stream using the +token_source+ provided. +token_source+ is
  # usually a lexer, but can be any object that implements +next_token+ and includes
  # ANTLR3::TokenSource.
  #
  # If a block is provided, each token harvested will be yielded and if the block
  # returns a +nil+ or +false+ value, the token will not be added to the stream --
  # it will be discarded.
  #
  # === Options
  # [:channel] The channel value the stream should be tuned to initially
  # [:source_name] The source name (file name) attribute of the stream
  #
  # === Example
  #
  #   # create a new token stream that is tuned to channel :comment, and
  #   # discard all WHITE_SPACE tokens
  #   ANTLR3::CommonTokenStream.new(lexer, :channel => :comment) do |token|
  #     token.name != 'WHITE_SPACE'
  #   end
  #
  def initialize( token_source, options = {} )
    case token_source
    when CommonTokenStream
      # this is useful in cases where you want to convert a CommonTokenStream
      # to a TokenRewriteStream or other variation of the standard token stream
      stream = token_source
      @token_source = stream.token_source
      @channel = options.fetch( :channel ) { stream.channel or DEFAULT_CHANNEL }
      @source_name = options.fetch( :source_name ) { stream.source_name }
      tokens = stream.tokens.map { | t | t.dup }
    else
      @token_source = token_source
      @channel = options.fetch( :channel, DEFAULT_CHANNEL )
      @source_name = options.fetch( :source_name ) {  @token_source.source_name rescue nil }
      tokens = @token_source.to_a
    end
    @last_marker = nil
    @tokens = block_given? ? tokens.select { | t | yield( t, self ) } : tokens
    @tokens.each_with_index { |t, i| t.index = i }
    @position =
      if first_token = @tokens.find { |t| t.channel == @channel }
        @tokens.index( first_token )
      else @tokens.length
      end
  end

  #
  # resets the token stream and rebuilds it with a potentially new token source.
  # If no +token_source+ value is provided, the stream will attempt to reset the
  # current +token_source+ by calling +reset+ on the object. The stream will
  # then clear the token buffer and attempt to harvest new tokens. Identical in
  # behavior to CommonTokenStream.new, if a block is provided, tokens will be
  # yielded and discarded if the block returns a +false+ or +nil+ value.
  #
  def rebuild( token_source = nil )
    if token_source.nil?
      @token_source.reset rescue nil
    else @token_source = token_source
    end
    @tokens = block_given? ? @token_source.select { |token| yield( token ) } :
                             @token_source.to_a
    @tokens.each_with_index { |t, i| t.index = i }
    @last_marker = nil
    @position =
      if first_token = @tokens.find { |t| t.channel == @channel }
        @tokens.index( first_token )
      else @tokens.length
      end
    return self
  end

  #
  # tune the stream to a new channel value
  #
  def tune_to( channel )
    @channel = channel
  end

  def token_class
    @token_source.token_class
  rescue NoMethodError
    @position == -1 and fill_buffer
    @tokens.empty? ? CommonToken : @tokens.first.class
  end

  alias index position

  def size
    @tokens.length
  end

  alias length size

  ###### State-Control ################################################

  #
  # rewind the stream to its initial state
  #
  def reset
    @position = 0
    @position += 1 while token = @tokens[ @position ] and
                         token.channel != @channel
    @last_marker = nil
    return self
  end

  #
  # bookmark the current position of the input stream
  #
  def mark
    @last_marker = @position
  end

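  #
  # markers are plain stream positions, so there is no saved state to release;
  # this method exists only to satisfy the stream interface
  #
  def release( marker = nil )
    # do nothing
  end

  #
  # move the stream back to the position recorded by +marker+ (by default,
  # the position saved by the most recent call to #mark). The +release+
  # argument is accepted for interface compatibility and is ignored.
  #
  def rewind( marker = @last_marker, release = true )
    seek( marker )
  end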

  #
  # saves the current stream position, yields to the block,
  # and then ensures the stream's position is restored before
  # returning the value of the block
  #
  def hold( pos = @position )
    block_given? or return enum_for( :hold, pos )
    begin
      yield
    ensure
      seek( pos )
    end
  end

  ###### Stream Navigation ###########################################

  #
  # advance the stream one step to the next on-channel token
  #
  def consume
    token = @tokens[ @position ] || EOF_TOKEN
    if @position < @tokens.length
      @position = future?( 2 ) || @tokens.length
    end
    return( token )
  end

  #
  # jump to the stream position specified by +index+
  # note: seek does not check whether or not the
  #       token at the specified position is on-channel
  #
  def seek( index )
    @position = index.to_i.bound( 0, @tokens.length )
    return self
  end

  #
  # return the type of the on-channel token at look-ahead distance +k+. <tt>k = 1</tt> represents
  # the current token. +k+ greater than 1 represents upcoming on-channel tokens. A negative
  # value of +k+ returns previous on-channel tokens consumed, where <tt>k = -1</tt> is the last
  # on-channel token consumed. <tt>k = 0</tt> has undefined behavior and returns +nil+
  #
  def peek( k = 1 )
    tk = look( k ) and return( tk.type )
  end

  #
  # operates similarly to #peek, but returns the full token object at look-ahead position +k+
  #
  def look( k = 1 )
    index = future?( k ) or return nil
    @tokens.fetch( index, EOF_TOKEN )
  end

  alias >> look
  def << k
    self >> -k
  end

  #
  # returns the index of the on-channel token at look-ahead position +k+ or nil if no other
  # on-channel tokens exist
  #
  def future?( k = 1 )
    @position == -1 and fill_buffer

    case
    when k == 0 then nil
    when k < 0 then past?( -k )
    when k == 1 then @position
    else
      # since the stream only yields on-channel
      # tokens, the stream can't just go to the
      # next position, but rather must skip
      # over off-channel tokens
      ( k - 1 ).times.inject( @position ) do |cursor, |
        begin
          tk = @tokens.at( cursor += 1 ) or return( cursor )
          # ^- if tk is nil (i.e. cursor is outside array limits)
        end until tk.channel == @channel
        cursor
      end
    end
  end

  #
  # returns the index of the on-channel token at look-behind position +k+ or nil if no other
  # on-channel tokens exist before the current token
  #
  def past?( k = 1 )
    @position == -1 and fill_buffer

    case
    when k == 0 then nil
    when @position - k < 0 then nil
    else

      k.times.inject( @position ) do |cursor, |
        begin
          cursor <= 0 and return( nil )
          tk = @tokens.at( cursor -= 1 ) or return( nil )
        end until tk.channel == @channel
        cursor
      end

    end
  end

  #
  # yields each token in the stream (including off-channel tokens)
  # If no block is provided, the method returns an Enumerator object.
  # #each accepts the same arguments as #tokens
  #
  def each( *args )
    block_given? or return enum_for( :each, *args )
    tokens( *args ).each { |token| yield( token ) }
  end


  #
  # yields each token in the stream with the given channel value
  # If no channel value is given, the stream's tuned channel value will be used.
  # If no block is given, an enumerator will be returned.
  #
  def each_on_channel( channel = @channel )
    block_given? or return enum_for( :each_on_channel, channel )
    for token in @tokens
      token.channel == channel and yield( token )
    end
  end

  #
  # iterates through the token stream, yielding each on-channel token along the way.
  # After iteration has completed, the stream's position will be restored to where
  # it was before #walk was called. While #each and #each_on_channel do not change
  # the stream's position during iteration, #walk advances through the stream. This
  # makes it possible to look ahead and behind the current token during iteration.
  # If no block is given, an enumerator will be returned.
  #
  def walk
    block_given? or return enum_for( :walk )
    initial_position = @position
    begin
      while token = look and token.type != EOF
        consume
        yield( token )
      end
      return self
    ensure
      @position = initial_position
    end
  end

  #
  # returns a copy of the token buffer. If +start+ and +stop+ are provided, tokens
  # returns a slice of the token buffer from <tt>start..stop</tt>. The parameters
  # are converted to integers with their <tt>to_i</tt> methods, and thus tokens
  # can be provided to specify start and stop. If a block is provided, tokens are
  # yielded and filtered out of the return array if the block returns a +false+
  # or +nil+ value.
  #
  def tokens( start = nil, stop = nil )
    stop.nil?  || stop >= @tokens.length and stop = @tokens.length - 1
    start.nil? || stop < 0 and start = 0
    tokens = @tokens[ start..stop ]

    if block_given?
      tokens.delete_if { |t| not yield( t ) }
    end

    return( tokens )
  end


  def at( i )
    @tokens.at i
  end

  #
  # identical to Array#[], as applied to the stream's token buffer
  #
  def []( i, *args )
    @tokens[ i, *args ]
  end

  ###### Standard Conversion Methods ###############################
  def inspect
    string = "#<%p: @token_source=%p @ %p/%p" %
      [ self.class, @token_source.class, @position, @tokens.length ]
    tk = look( -1 ) and string << " #{ tk.inspect } <--"
    tk = look( 1 ) and string << " --> #{ tk.inspect }"
    string << '>'
  end

  #
  # fetches the text content of all tokens between +start+ and +stop+ and
  # joins the chunks into a single string
  #
  def extract_text( start = 0, stop = @tokens.length - 1 )
    start = start.to_i.at_least( 0 )
    stop = stop.to_i.at_most( @tokens.length )
    @tokens[ start..stop ].map! { |t| t.text }.join( '' )
  end

  alias to_s extract_text

end

end