4 Lexing and Parsing
This library provides facilities for parsing Datalog source. It can be required via:
(require (planet jaymccarthy/datalog:1:3/parse)) |
4.1 Datalog Syntax
In Datalog input, whitespace characters are ignored except when they separate adjacent tokens or when they occur in strings. Comments are also considered to be whitespace. The character % introduces a comment, which extends to the next line break. Comments do not occur inside strings.
A variable is a sequence of Latin capital and small letters, digits, and the underscore character. A variable must begin with a Latin capital letter.
An identifier is a sequence of printing characters that does not contain any of the following characters: (, `, ', ), =, :, ., ~, ?, ", %, and space. An identifier must not begin with a Latin capital letter. Note that the characters that start punctuation are forbidden in identifiers, but the hyphen character is allowed.
A string is a sequence of characters enclosed in double quotes. Characters other than double quote, newline, and backslash may be directly included in a string. The remaining characters may be specified using escape characters, \", \ , and \\ respectively.
parent(john, douglas) |
zero-arity-literal |
"="(3,3) |
""(-0-0-0,&&&,***,"\00") |
A clause is a head literal followed by an optional body. A body is a comma separated list of literals. A clause without a body is called a fact, and a rule when it has one. The punctuation :- separates the head of a rule from its body. A clause is safe if every variable in its head occurs in some literal in its body. The following are safe clauses:
parent(john, douglas) |
ancestor(A, B) :- |
parent(A, B) |
ancestor(A, B) :- |
parent(A, C), |
ancestor(C, B) |
A program is a sequence of zero or more statements. A statement is an assertion, a retraction, or a query. An assertion is a clause followed by a period, and it adds the clause to the database if it is safe. A retraction is a clause followed by a tilde, and it removes the clause from the database. A query is a literal followed by a question mark. The effect of reading a Datalog program is to modify the database as directed by its statements, and then to return the literal designated as the query. If no query is specified, a reader returns a literal know to have no answers. The following is a program:
edge(a, b). edge(b, c). edge(c, d). edge(d, a). |
path(X, Y) :- edge(X, Y). |
path(X, Y) :- edge(X, Z), path(Z, Y). |
path(X, Y)? |
The following BNF describes the syntax of Datalog.
| ‹program› | ::= | ‹statement›* |
| ‹statement› | ::= | ‹assertion› |
|
| | | ‹retraction› |
|
| | | ‹query› |
| ‹assertion› | ::= | ‹clause› . |
| ‹retraction› | ::= | ‹clause› ~ |
| ‹query› | ::= | ‹literal› ? |
| ‹clause› | ::= | ‹literal› :- ‹body› |
|
| | | ‹literal› |
| ‹body› | ::= | ‹literal› , ‹body› |
|
| | | ‹literal› |
| ‹literal› | ::= | ‹predicate-sym› ( ) |
|
| | | ‹predicate-sym› ( ‹terms› ) |
|
| | | ‹predicate-sym› |
|
| | | ‹term› = ‹term› |
| ‹predicate-sym› | ::= | ‹IDENTIFIER› |
|
| | | ‹STRING› |
| ‹terms› | ::= | ‹term› |
|
| | | ‹term› , ‹terms› |
| ‹term› | ::= | ‹VARIABLE› |
|
| | | ‹constant› |
| ‹constant› | ::= | ‹IDENTIFIER› |
|
| | | ‹STRING› |
4.2 Parser API
(parse-literal ip) → literal? |
ip : input-port? |
Examples: | ||
> (parse-literal (open-input-string "parent(john,douglas)")) | ||
`#s(literal (string 1 0 1 20) parent (#s(constant (string 1 7 8 4) john) #s(constant (string 1 12 13 7) douglas))) | ||
> (parse-literal (open-input-string "zero-arity-literal")) | ||
`#s(literal (string 1 0 1 18) zero-arity-literal ()) | ||
> (parse-literal (open-input-string "\"=\"(3,3)")) | ||
`#s(literal (string 1 0 1 8) "=" (#s(constant (string 1 4 5 1) |3|) #s(constant (string 1 6 7 1) |3|))) | ||
| ||
`#s(literal (string 1 0 1 22) "" (#s(constant (string 1 3 4 6) -0-0-0) #s(constant (string 1 10 11 3) &&&) #s(constant (string 1 14 15 3) ***) #s(constant (string 1 18 19 1) "\u0000"))) | ||
> (parse-literal (open-input-string "3 = 3")) | ||
`#s(literal (string 1 0 1 5) = (#s(constant (string 1 0 1 1) |3|) #s(constant (string 1 4 5 1) |3|))) |
(parse-clause ip) → clause? |
ip : input-port? |
Examples: | ||||
| ||||
`#s(clause (string 1 0 1 21) #s(literal (string 1 0 1 21) parent (#s(constant (string 1 7 8 4) john) #s(constant (string 1 13 14 7) douglas))) ()) | ||||
| ||||
`#s(clause (string 1 0 1 30) #s(literal (string 1 0 1 14) ancestor (#s(variable (string 1 9 10 1) A) #s(variable (string 1 12 13 1) B))) (#s(literal (string 1 18 19 12) parent (#s(variable (string 1 25 26 1) A) #s(variable (string 1 28 29 1) B))))) | ||||
| ||||
`#s(clause (string 1 0 1 45) #s(literal (string 1 0 1 14) ancestor (#s(variable (string 1 9 10 1) A) #s(variable (string 1 12 13 1) B))) (#s(literal (string 1 18 19 12) parent (#s(variable (string 1 25 26 1) A) #s(variable (string 1 28 29 1) C))) #s(literal (string 1 31 32 14) ancestor (#s(variable (string 1 40 41 1) C) #s(variable (string 1 43 44 1) B))))) |
(parse-statement ip) → statement/c |
ip : input-port? |
Examples: | ||
| ||
`#s(assertion (string 1 0 1 22) #s(clause (string 1 0 1 21) #s(literal (string 1 0 1 21) parent (#s(constant (string 1 7 8 4) john) #s(constant (string 1 13 14 7) douglas))) ())) | ||
| ||
`#s(retraction (string 1 0 1 22) #s(clause (string 1 0 1 21) #s(literal (string 1 0 1 21) parent (#s(constant (string 1 7 8 4) john) #s(constant (string 1 13 14 7) douglas))) ())) | ||
| ||
`#s(query (string 1 0 1 22) #s(literal (string 1 0 1 21) parent (#s(constant (string 1 7 8 4) john) #s(constant (string 1 13 14 7) douglas)))) |
(parse-program ip) → program/c |
ip : input-port? |
Example: | ||||||||
| ||||||||
`(#s(assertion (string 1 0 1 11) #s(clause (string 1 0 1 10) #s(literal (string 1 0 1 10) edge (#s(constant (string 1 5 6 1) a) #s(constant (string 1 8 9 1) b))) ())) #s(assertion (string 1 12 13 11) #s(clause (string 1 12 13 10) #s(literal (string 1 12 13 10) edge (#s(constant (string 1 17 18 1) b) #s(constant (string 1 20 21 1) c))) ())) #s(assertion (string 1 23 24 11) #s(clause (string 1 23 24 10) #s(literal (string 1 23 24 10) edge (#s(constant (string 1 28 29 1) c) #s(constant (string 1 31 32 1) d))) ())) #s(assertion (string 1 35 36 11) #s(clause (string 1 35 36 10) #s(literal (string 1 35 36 10) edge (#s(constant (string 1 40 41 1) d) #s(constant (string 1 43 44 1) a))) ())) #s(assertion (string 1 46 47 25) #s(clause (string 1 46 47 24) #s(literal (string 1 46 47 10) path (#s(variable (string 1 51 52 1) X) #s(variable (string 1 54 55 1) Y))) (#s(literal (string 1 60 61 10) edge (#s(variable (string 1 65 66 1) X) #s(variable (string 1 68 69 1) Y)))))) #s(assertion (string 1 71 72 37) #s(clause (string 1 71 72 36) #s(literal (string 1 71 72 10) path (#s(variable (string 1 76 77 1) X) #s(variable (string 1 79 80 1) Y))) (#s(literal (string 1 85 86 10) edge (#s(variable (string 1 90 91 1) X) #s(variable (string 1 93 94 1) Z))) #s(literal (string 1 97 98 10) path (#s(variable (string 1 102 103 1) Z) #s(variable (string 1 105 106 1) Y)))))) #s(query (string 1 108 109 11) #s(literal (string 1 108 109 10) path (#s(variable (string 1 113 114 1) X) #s(variable (string 1 116 117 1) Y))))) |