xml-match provides pattern matching of XML nodes. The pattern notation is based on that of Scheme's syntax-rules/syntax-case macro systems.
The grammar for the
xml-match syntax is given below:
]node-pattern ::= literal-pattern | pat-var-or-cata | element-pattern | list-pattern literal-pattern ::= string | character | number |
#fattribute-pattern ::= attribute-keyword attr-val-pattern attr-val-pattern ::= literal-pattern | pat-var-or-cata |
(tag-symbol attribute-pattern* nodeset-pattern
(tag-symbol attribute-pattern* nodeset-pattern?
(list)nodeset-pattern ::= node-pattern | node-pattern
...| node-pattern nodeset-pattern | node-pattern
...nodeset-pattern pat-var-or-cata ::=
Within a list or element body pattern, ellipses may appear only once, but may be followed by zero or more node patterns.
Guard expressions cannot refer to the return values of catamorphisms.
Ellipses in the output expressions must appear only in an expression context; ellipses are not allowed in a syntactic form.
The sections below illustrate specific aspects of the
xml-match pattern matcher.
The example below illustrates the pattern matching of an XML element:
(xml-match (e i: 1 3 4 5) [(e i: ,d ,a ,b ,c) (list d a b c)] [,otherwise #f])
The element and attribute tags used in patterns are the element and attribute constructors created by
define-attribute. Each clause in
xml-match contains two parts: a pattern and one or more expressions which are evaluated if the pattern is successfully match.
Pattern variables are must be "unquoted" in the pattern. The above expression binds
A nodeset pattern is designated by a list in the pattern, beginning the identifier
list. The example below illustrates matching a nodeset.
(xml-match '("i" "j" "k" "l" "m") [(list ,a ,b ,c ,d ,e) (list (h4:p a) (h4:p b) (h4:p c) (h4:p d) (h4:p e))])
This example wraps each nodeset item in an HTML paragraph element.
Matching the 'rest' of a nodeset is achieved by using a "
. rest)" pattern at the end of an element or nodeset pattern.
This is illustrated in the example below:
(xml-match (e 3 (f 4 5 6) 7) [(e ,a (f . ,y) ,d) (list a y d)])
The above expression returns
(3 (4 5 6) 7).
syntax-rules, ellipses may be used to specify a repeated pattern. Note that the pattern "
item ..." specifies zero-or-more matches of the pattern "
The use of ellipses in a pattern is illustrated in the code fragment below, where nested ellipses are used to match the children of repeated instances of an 'a' element, within an element 'd'.
(define x (d (a 1 2 3) (a 4 5) (a 6 7 8) (a 9 10))) (xml-match x [(d (a ,b ...) ...) (list (list b ...) ...)])
The above expression returns a value of
((1 2 3) (4 5) (6 7 8) (9 10)).
It is possible to specify a default value for an attribute which is used if the attribute is not present in the element being matched. This is illustrated in the following example:
(xml-match (e 3 4 5) [(e z: (,d 1) ,a ,b ,c) (list d a b c)])
1 is used when the attribute 'z' is absent from the element 'e'.
Guards may be added to a pattern clause via the
guard keyword. A guard expression may include zero or more expressions which are evaluated only if the pattern is matched. The body of the clause is only evaluated if the guard expressions evaluate to
The use of guard expressions is illustrated below:
(xml-match '(a 2 3) ((a ,n) (guard (number? n)) n) ((a ,m ,n) (guard (number? m) (number? n)) (+ m n)))
The example below illustrates the use of explicit recursion within an xml-match form. This example implements a simple calculator for the basic arithmetic operations, which are represented by the XML elements plus, minus, times, and div.
(define simple-eval (lambda (x) (xml-match x [,i (guard (integer? i)) i] [(plus ,x ,y) (+ (simple-eval x) (simple-eval y))] [(times ,x ,y) (* (simple-eval x) (simple-eval y))] [(minus ,x ,y) (- (simple-eval x) (simple-eval y))] [(div ,x ,y) (/ (simple-eval x) (simple-eval y))] [,otherwise (error "simple-eval: invalid expression" x)])))
Using the catamorphism feature of
xml-match , a more concise version of
simple-eval can be written. The pattern
,[x] recusively invokes the pattern matcher on the value bound in this position.
(define simple-eval (lambda (x) (xml-match x [,i (guard (integer? i)) i] [(plus ,[x] ,[y]) (+ x y)] [(times ,[x] ,[y]) (* x y)] [(minus ,[x] ,[y]) (- x y)] [(div ,[x] ,[y]) (/ x y)] [,otherwise (error "simple-eval: invalid expression" x)])))
It is also possible to explicitly name the operator in the 'cata' position. Where
,[id*] recurs to the top of the current
,[cata -> id*] recurs to
cata must evaluate to a procedure which takes one argument, and returns as many values as there are identifiers following
Named catamorphism patterns allow processing to be split into multiple, mutually recursive procedures. This is illustrated in the example below: a transformation that formats a "TV Guide" into HTML.
(define (tv-guide->html g) (define (cast-list cl) (xml-match cl [(CastList (CastMember (Character (Name ,ch)) (Actor (Name ,a))) ...) (h4:div (h4:ul (h4:li ch ": " a) ...))])) (define (prog p) (xml-match p [(Program (Start ,start-time) (Duration ,dur) (Series ,series-title) (Description . ,desc)) (h4:div (h4:p start-time (h4:br) series-title (h4:br) desc))] [(Program (Start ,start-time) (Duration ,dur) (Series ,series-title) (Description . ,desc) ,[cast-list -> cl]) (h4:div (h4:p start-time (h4:br) series-title (h4:br) desc) cl)])) (xml-match g [(TVGuide start: ,start-date end: ,end-date (Channel (Name ,nm) ,[prog -> p] ...) ...) (h4:html (h4:head (h4:title "TV Guide")) (h4:body (h4:h1 "TV Guide") (h4:div (h4:h2 nm) p ...) ...))]))
Within the body of an
xml-match form, a slightly extended version of
quasiquote is provided, which allows the use of ellipses. This is illustrated in the example below.
(xml-match '(e 3 4 5 6 7) [(e ,i ... 6 7) `("start" ,(list 'wrap i) ... "end")] [,otherwise #f])
The general pattern is that
`(something ,i ...) is rewritten as
xml-match-let* forms generalize the
let* forms of Scheme to allow an XML pattern in the binding position, rather than a simple variable.
For example, the expression below:
(xml-match-let ([(a ,i ,j) '(a 1 2)]) (+ i j))
binds the variables
j to 1 and 2 in the XML value given.
The syntax for these forms is given below:
(xml-match-let ([pat expr] ...) expression0 expression ...) (xml-match-let* ([pat expr] ...) expression0 expression ...)