Version: 2:0
json-parsing: JSON Parsing, Folding, and Conversion in Racket
(require (planet neil/json-parsing:2:0))
1 Introduction
The json-parsing package for Racket provides JSON parsing and format conversion
using a streaming tree fold. This tree fold approach permits processing JSON
input of arbitrary size in relatively small space for some applications, unlike
the common approach of parsing the entire input to an AST before processing the
AST.
The supported JSON format is as specified on http://json.org/, as viewed on 2010-12-25.
The format converters in this package include a converter to SJSON s-expression format. SJSON was made fully compatible
with the jsexpr of Dave Herman’s PLaneT package dherman/json:3:=0.
The parser does not consume any characters not belonging to the
JSON value, and so can be used to read multiple JSON values, or be intermixed
with other kinds of reading from the same input.
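As a minimal sketch of reading several JSON values from one input, the following calls json->sjson (documented in the Conversion section) twice on the same port. It assumes that passing #:exhaust? #f leaves the rest of the port unread, and that SJSON represents objects as hashes with symbol keys and arrays as lists, as in the my-json-to-sjson example later in this document. The names in, first-value, and second-value are illustrative only.

```racket
#lang racket/base
(require (planet neil/json-parsing:2:0))

;; One port holding two JSON values separated by whitespace.
(define in (open-input-string "{\"a\": 1} [2, 3]"))

;; #:exhaust? #f asks the parser not to require that the input end after
;; the value, so the second value is still available on the port.
(define first-value  (json->sjson in #:exhaust? #f))
(define second-value (json->sjson in #:exhaust? #f))
```

With the default #:exhaust? #t, the first call would instead be expected to reject this input, because non-whitespace text follows the first value.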
The tree fold approach of this package’s parser was inspired and
informed by Oleg Kiselyov’s SSAX XML parsing work.
Implementing the json-parsing package was originally intended as an exercise for getting more
experience with SSAX-like folding, before undertaking some new XML packages,
but the JSON work has turned out useful in its own right. A future version of
this package might also implement alternative tree fold approaches.
2 Exceptions
When the parser encounters invalid JSON, it raises an exn:fail:invalid-json exception. While this exception will be caught by
handlers such as exn:fail?, the distinct exception type permits
JSON-parsing errors to be handled separately from other errors, and it also
includes some location information.
exn:fail:invalid-json? is the type predicate for this exception.
(exn:fail:invalid-json-location exn)
 → (list/c (or/c exact-positive-integer? #f)
           (or/c exact-nonnegative-integer? #f)
           (or/c exact-positive-integer? #f))
  exn : exn:fail:invalid-json?
Gets information on the location of the error within the input
stream. Currently, this is a list of three elements: the line, column, and position values
returned by Racket’s port-next-location procedure.
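A sketch of handling JSON errors separately from other errors is shown here. The helper name parse-or-location and the sample input are hypothetical; the example assumes the malformed array below is rejected by the parser, and it uses json->sjson from the Conversion section.

```racket
#lang racket/base
(require (planet neil/json-parsing:2:0))

;; Parse str as JSON. On invalid JSON, return a list describing the
;; failure instead of propagating the exception.
(define (parse-or-location str)
  (with-handlers ((exn:fail:invalid-json?
                   (lambda (e)
                     (list 'invalid-json
                           (exn-message e)
                           ;; list of line, column, position (each may be #f)
                           (exn:fail:invalid-json-location e)))))
    (json->sjson str)))

(define result (parse-or-location "[1, 2, oops]"))
```

Because the handler is keyed on exn:fail:invalid-json?, any other exn:fail raised during the call still propagates to outer handlers.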
3 Parse Fold
json-fold-lambda is special syntax that expands to a JSON parser procedure. Normally
you would use it to define a new kind of processing for the parser to perform
while it parses JSON. The procedure that results from this
syntax has the arguments:
(in seed exhaust?)
where in is an input port or string, seed is a seed value, and exhaust? is whether or not to exhaustively consume all input and ensure
that nothing other than JSON whitespace follows the JSON value.
json-fold-lambda has many arguments, all of which must be present. Here is an
example of how you might define a my-json-to-sjson procedure using json-fold-lambda:
(define my-json-to-sjson
  (json-fold-lambda
   #:error-name 'my-json-to-sjson
   #:visit-object-start (lambda (seed) (make-hasheq))
   #:visit-object-end (lambda (seed parent-seed) `(,seed ,@parent-seed))
   #:visit-member-start (lambda (name seed) '())
   #:visit-member-end (lambda (name seed parent-seed)
                        (hash-set! parent-seed (string->symbol name) (car seed))
                        parent-seed)
   #:visit-array-start (lambda (seed) '())
   #:visit-array-end (lambda (seed parent-seed) `(,(reverse seed) ,@parent-seed))
   #:visit-string (lambda (str seed) `(,str ,@seed))
   #:visit-number (lambda (num seed) `(,num ,@seed))
   #:visit-constant (lambda (name seed)
                      `(,(case name
                           ((true) #t)
                           ((false) #f)
                           ((null) #\nul)
                           (else (error 'my-json-to-sjson "invalid constant ~S" name)))
                        ,@seed))))
As you can see, the arguments provide a set of procedures that
are applied at various points in the parsing. Each of these callback
procedures accepts at least one seed value from its preceding sibling and/or
parent, and it produces a seed value for the next sibling, child, or parent.
Objects, members, and arrays are the non-leaf nodes in the tree. The start callback for each non-leaf node receives a seed from its
preceding sibling, and the value it produces is the seed for its first child.
The end callback receives both the seed from the last child, and the
parent seed (the sibling predecessor seed of the node; the same seed received
by the corresponding start).
The leaf nodes each simply receive a seed from the sibling
predecessor callback (or, if the first sibling, from the parent start; or, if the first callback, from the seed provided to the parser
call), and provide one to the sibling successor (or, if the last sibling, to
the parent end; or, if the last callback, to the result of the parser call).
Note that two different techniques are used above to build
collections of objects during processing, using seeds. The first is to use a
hash that is passed in the seed, which in this case is used because SJSON
requires a hash as part of its format. The second, and more common, is to
construct lists by incrementally consing onto the front of the list, so that
the list is ordered backwards, and waiting until the list is finished to put it in
correct order using the reverse procedure.
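The cons-then-reverse technique can be sketched on its own, without the parser. In this illustration, the hypothetical procedure collect-doubled stands in for a sequence of leaf callbacks, and the for/fold accumulator plays the role of the seed.

```racket
#lang racket/base

;; Build a list by consing each new element onto the front of the seed,
;; then restore input order with a single reverse at the end.
(define (collect-doubled nums)
  (define backwards
    (for/fold ((seed '())) ((n (in-list nums)))
      (cons (* 2 n) seed)))   ; after 1, 2, 3 the seed is '(6 4 2)
  (reverse backwards))

(define doubled (collect-doubled '(1 2 3)))  ; '(2 4 6)
```

Each cons is constant-time, and the one reverse at the end is linear, which avoids repeatedly appending to the tail of a growing list.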
The parser procedure returns either the value of the last
callback, or, if the end of the input is reached without a JSON value, the eof-object.
(make-json-fold [#:error-name error-name]
                #:visit-object-start visit-object-start
                #:visit-object-end visit-object-end
                #:visit-member-start visit-member-start
                #:visit-member-end visit-member-end
                #:visit-array-start visit-array-start
                #:visit-array-end visit-array-end
                #:visit-string visit-string
                #:visit-number visit-number
                #:visit-constant visit-constant)
 → (->* ((or/c input-port? string?) any/c) (#:exhaust? boolean?) any)
  error-name : symbol? = '<make-json-fold>
  visit-object-start : (-> any/c any/c)
  visit-object-end : (-> any/c any/c any/c)
  visit-member-start : (-> symbol? any/c any/c)
  visit-member-end : (-> symbol? any/c any/c any/c)
  visit-array-start : (-> any/c any/c)
  visit-array-end : (-> any/c any/c any/c)
  visit-string : (-> string? any/c any/c)
  visit-number : (-> number? any/c any/c)
  visit-constant : (-> symbol? any/c any/c)
This is like json-fold-lambda, except it is a procedure, rather than syntax. make-json-fold can be used in the less-common case that you need to define a
new parser dynamically.
Note that, in the produced procedure, the exhaust? argument is optional (defaulting to #t).
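As an illustration of a fold that does not build a tree at all, the following sketch uses make-json-fold to count the leaf values (strings, numbers, and constants) in a JSON input. The names pass-down, pass-up, count-leaf, and count-json-leaves are hypothetical, and the sketch assumes the seed-passing behavior described above: the seed here is simply a running count threaded through every callback.

```racket
#lang racket/base
(require (planet neil/json-parsing:2:0))

;; Non-leaf start callbacks hand the running count down to the first child.
(define (pass-down seed) seed)
;; Non-leaf end callbacks hand the last child's count on to the next sibling.
(define (pass-up seed parent-seed) seed)
;; Every leaf adds one to the count.
(define (count-leaf value seed) (add1 seed))

(define count-json-leaves
  (make-json-fold
   #:error-name         'count-json-leaves
   #:visit-object-start pass-down
   #:visit-object-end   pass-up
   #:visit-member-start (lambda (name seed) seed)
   #:visit-member-end   (lambda (name seed parent-seed) seed)
   #:visit-array-start  pass-down
   #:visit-array-end    pass-up
   #:visit-string       count-leaf
   #:visit-number       count-leaf
   #:visit-constant     count-leaf))

;; The initial seed is 0; #:exhaust? is left at its default.
(define leaf-count
  (count-json-leaves "[1, \"a\", {\"b\": true}]" 0))
```

Since the seed is only an integer, a parser like this should need space proportional to the nesting depth of the input rather than to its total size.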
4 Conversion
4.1 Conversion to SJSON
(json-to-sjson-visit-object-start seed) → any/c
  seed : any/c
(json-to-sjson-visit-object-end seed parent-seed) → any/c
  seed : any/c
  parent-seed : any/c
(json-to-sjson-visit-member-start name seed) → any/c
  name : symbol?
  seed : any/c
(json-to-sjson-visit-member-end name seed parent-seed) → any/c
  name : symbol?
  seed : any/c
  parent-seed : any/c
(json-to-sjson-visit-array-start seed) → any/c
  seed : any/c
(json-to-sjson-visit-array-end seed parent-seed) → any/c
  seed : any/c
  parent-seed : any/c
(json-to-sjson-visit-string str seed) → any/c
  str : string?
  seed : any/c
(json-to-sjson-visit-number num seed) → any/c
  num : number?
  seed : any/c
(json-to-sjson-visit-constant name seed) → any/c
  name : symbol?
  seed : any/c
Fold visitor procedures used by json->sjson. May also be used by other fold definitions.
(json->sjson in [#:exhaust? exhaust?]) → sjson?
  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
Parse a JSON value from input port or string in, and return an SJSON parsed representation.
4.2 Conversion to SXML
(json->sxml in [#:exhaust? exhaust?]) → sxml/xexp?
  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
Parse the JSON input from input port or string in, and return it in a contrived XML data format that can be processed
with various SXML tools.
4.3 Conversion to XML
(write-json-as-xml in [#:exhaust? exhaust? #:out out]) → void?
  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
  out : output-port? = (current-output-port)
Parse the JSON input from input port or string in, and write it in contrived XML data format to output port out (which defaults to the value of the current-output-port parameter). This is mainly a demonstration of “streaming”
processing that can scale to arbitrary JSON input sizes.
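A possible way to use write-json-as-xml in streaming fashion is to connect a file input port directly to a file output port, so that neither the JSON nor the XML has to be held in memory as a whole. The procedure name convert-json-file and its path arguments are hypothetical.

```racket
#lang racket/base
(require (planet neil/json-parsing:2:0))

;; Convert the JSON file at json-path to the package's XML format,
;; writing the result to xml-path (replacing any existing contents).
(define (convert-json-file json-path xml-path)
  (call-with-input-file json-path
    (lambda (in)
      (call-with-output-file xml-path
        (lambda (out)
          (write-json-as-xml in #:out out))
        #:exists 'truncate))))
```

A call such as (convert-json-file "data.json" "data.xml"), with file names of your choosing, would then perform the conversion port to port.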
(json->xml in [#:exhaust? exhaust?]) → string?
  in : (or/c input-port? string?)
  exhaust? : boolean? = #t
This is like write-json-as-xml, but instead of writing to a port, it returns the XML as a string.
Most people would not choose to do this.
5 History
- PLaneT 2:0 — 2012-06-13
  Converted to McFly and Overeasy.
- Version 0.3 — PLaneT 1:2 — 2011-08-22
  Added json-to-sjson-visit- procedures. Documentation fix.
- Version 0.2 — PLaneT 1:1 — 2010-12-27
  Added missing export.
- Version 0.1 — PLaneT 1:0 — 2010-12-26
  Initial release.
6 Legal
Copyright 2010 – 2012 Neil Van Dyke. This program is Free Software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See http://www.gnu.org/licenses/ for details. For other licenses and consulting, please contact the author.