Just came across Sam's REXML-compatible XML parser based on Expat, which had me thinking for a bit.
The interesting thing about this XML parser (other than that it implements the REXML interfaces on top of Expat) is that it's pull-based and uses continuations to parse files in a really effective manner.
The source is here which is probably worth a glance before reading on.
The part that made everything click for me was understanding the first line of the initialize method, which sets everything up:
    def initialize xml
      callcc { |@sax_context| return }
      ....snip....
Here, callcc creates a continuation, stores it in the instance variable @sax_context, and returns from initialize, leaving the parser constructed and ready to go. Nothing after the callcc line has run yet.
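To see that "capture, then return early" trick in isolation, here is a minimal sketch. The class Lazy, its resume method and the @rest, @caller and @events variables are made up for illustration and are not part of Sam's parser. It also assumes a modern Ruby, where callcc needs require 'continuation' and a block parameter can no longer be an instance variable, so the continuation is assigned inside the block instead.

```ruby
require 'continuation' # callcc lives in this library on Ruby 1.9 and later

# Hypothetical class, only meant to show a constructor that defers its own body.
class Lazy
  attr_reader :events

  def initialize
    @events = []
    # Save "the rest of initialize" and leave the constructor straight away.
    callcc { |k| @rest = k; return }
    # Only reached once something calls @rest.
    @events << :rest_of_initialize_ran
    @caller.call # jump back to whoever resumed us
  end

  def resume
    # Remember where we are, then jump into the saved constructor body.
    callcc { |k| @caller = k; @rest.call }
  end
end

obj = Lazy.new
before = obj.events.dup # nothing after the callcc line has run yet
obj.resume
after = obj.events
```

After Lazy.new the events list is still empty; the line after callcc only runs once resume jumps back into the constructor.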
Then, the caller of this parser invokes the pull method to get an XML token:
    def pull
      callcc { |@pull_context| @sax_context.call }
    end
which creates another continuation in the instance variable @pull_context, saving the execution frame within pull, and then calls the continuation stored earlier in @sax_context to "continue" executing.
This takes execution back to the line after the callcc in the initialize method. The code there opens the file and enters a loop to parse the XML; each time it finds a token, it 'pushes' it back to the pull method by calling the saved @pull_context continuation with the token's value:
    def push *value
      callcc { |@sax_context| @pull_context.call value }
    end
Here push creates a new @sax_context continuation (essentially freezing the parse of the file until another pull is invoked, while also saving where the parsing was up to), and calls the previously saved @pull_context continuation to "continue" executing now that we have a token to return. The value passed to that continuation becomes the return value of the callcc inside pull, and since in Ruby the value of the last expression in a method is the method's return value, pull "continues" and returns the token to the caller.
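The way a value travels through a continuation is easy to check on its own. The snippet below is a small sketch on a modern Ruby (hence the require); the method name last_expression_demo and the symbols are invented for the example. It shows what callcc evaluates to depending on how the continuation is called. Note that because push collects its arguments with *value and passes that array along as a single argument, pull in the code above would hand back an array.

```ruby
require 'continuation' # needed for callcc on Ruby 1.9 and later

# callcc is the last expression here, so whatever it evaluates to
# is what the method returns.
def last_expression_demo
  callcc { |k| k.call(:token); :never_reached }
end

one   = last_expression_demo           # one argument: callcc returns it
none  = callcc { |k| k.call }          # no arguments: callcc returns nil
many  = callcc { |k| k.call(:a, :b) }  # several arguments: returned as an array
plain = callcc { |_k| :block_value }   # continuation never called: block's value
```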
The parser has now returned the first token to the caller and is essentially waiting to be told to get the next one.
Once another invocation of pull is made, the @sax_context continuation continues (from where it was last captured, which is now inside the token parsing loop, just after the first token in the XML), and the process of pushing the next token back to the caller of pull starts again, and so on until the file has ended.
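Putting the three methods together, the whole ping-pong can be sketched in a form that runs on a modern Ruby. This is not Sam's code: the class name TokenPuller is made up, a plain array of tokens stands in for the Expat-driven parse, push takes a single value rather than *value, and the continuations are assigned inside the blocks because |@ivar| block parameters are no longer allowed. Pushing nil forever at the end is also an assumption of this sketch, used to signal end of input.

```ruby
require 'continuation' # callcc lives in this library on Ruby 1.9 and later

# Hypothetical stand-in for the parser: walks an array instead of an XML file.
class TokenPuller
  def initialize(tokens)
    # Save the "parser side" and return immediately; nothing is parsed yet.
    callcc { |k| @sax_context = k; return }
    # The first pull resumes here.
    tokens.each { |token| push(token) }
    # Never fall off the end of initialize: that would return into the
    # original call to new a second time. Keep answering with nil instead.
    loop { push(nil) }
  end

  # Consumer side: remember where the caller is, then resume the parser.
  def pull
    callcc { |k| @pull_context = k; @sax_context.call }
  end

  private

  # Parser side: remember where parsing is up to, then hand the value to pull.
  def push(value)
    callcc { |k| @sax_context = k; @pull_context.call(value) }
  end
end

puller = TokenPuller.new(["<doc>", "text", "</doc>"])
results = []
while (token = puller.pull)
  results << token
end
```

Each pull gets exactly one token, and the array iteration inside initialize stays frozen between calls, which is the same shape as the parse loop described above.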
It's quite a neat way of demonstrating the use of continuations with so little code.
Some interesting background reading is Sam's 'Continuations for Curmudgeons' post, which can make the concept of continuations a bit easier to understand, or the Cocoon web application framework, which uses continuations to implement flow between pages in web applications.
Posted by crafterm at July 31, 2006 10:58 AM