#lang scribble/doc
@; THIS FILE IS GENERATED
@(require scribble/manual)
@(require (for-label (planet neil/json-parsing:1:=1)))

@title[#:version "0.2"]{@bold{json-parsing}: JSON Parsing, Folding, and Conversion for Racket/Scheme}

@author{Neil Van Dyke}

License: @seclink["Legal" #:underline? #f]{LGPL 3} @(hspace 1) Web: @link["http://www.neilvandyke.org/racket-json-parsing/" #:underline? #f]{http://www.neilvandyke.org/racket-json-parsing/}

@defmodule[(planet neil/json-parsing:1:=1)]

@section{Introduction}

The @bold{json-parsing} package for Racket provides JSON parsing and format conversion using a streaming tree fold.  This tree fold approach permits processing JSON input of arbitrary size in relatively small space for some applications, unlike the common approach of parsing the entire input to an AST before processing the AST.

The supported JSON format is as specified on @link["http://json.org/"]{http://json.org/}, as viewed on 2010-12-25.

The format converters in this package include a converter to SJSON s-expression format.  SJSON has been made to be fully compatible with the @emph{jsexpr} of Dave Herman's PLaneT package @tt{dherman/json:3:=0}.

The parser does not consume any characters not belonging to the JSON value, so it can be used to read multiple JSON values, or be intermixed with other kinds of reading from the same input.

The tree fold approach of this package's parser was inspired and informed by Oleg Kiselyov's @link["http://okmij.org/ftp/Scheme/xml.html"]{SSAX} XML parsing work.  Implementing the @bold{json-parsing} package was originally intended as an exercise for getting more experience with SSAX-like folding, before undertaking some new XML packages, but the JSON work has turned out to be useful in its own right.  A future version of this package might also implement alternative tree fold approaches.

@section{Exceptions}

When the parser encounters invalid JSON, it raises an @tt{exn:fail:invalid-json} exception.
While this exception will be caught by handlers that use predicates such as @tt{exn:fail?}, the distinct exception type permits JSON-parsing errors to be handled separately from other errors, and it also includes some location information.

@defthing[exn:fail:invalid-json? any/c]{
Type predicate.
}

@defproc[ (exn:fail:invalid-json-location (exn:fail:invalid-json any/c)) any/c]{
Gets information on the location of the error within the input stream.  Currently, this is a list of three elements: the three values returned by Racket's @tt{port-next-location} procedure.
}

@section{Parse Fold}

@defform[#:id json-fold-lambda (json-fold-lambda ...)]{

Special syntax that expands to a JSON parser procedure.  Normally you would use this to define a new kind of processing for the parser to perform while it is parsing JSON.

The procedure produced by this syntax accepts the arguments:

@verbatim["( in seed exhaust? )"]

where @tt{in} is an input port or string, @tt{seed} is a seed value, and @tt{exhaust?} is whether or not to exhaustively consume all input and ensure that nothing but JSON whitespace follows the JSON value.

@tt{json-fold-lambda} has many arguments, all of which must be present.  Here is an example of how you might define a @tt{my-json-to-sjson} procedure using @tt{json-fold-lambda}:

@SCHEMEBLOCK[
(define my-json-to-sjson
  (json-fold-lambda
   #:error-name 'my-json-to-sjson
   #:visit-object-start (lambda (seed)
                          (make-hasheq))
   #:visit-object-end (lambda (seed parent-seed)
                        `(,seed ,\@parent-seed))
   #:visit-member-start (lambda (name seed)
                          '())
   #:visit-member-end (lambda (name seed parent-seed)
                        (hash-set! parent-seed (string->symbol name) (car seed))
                        parent-seed)
   #:visit-array-start (lambda (seed)
                         '())
   #:visit-array-end (lambda (seed parent-seed)
                       `(,(reverse seed) ,\@parent-seed))
   #:visit-string (lambda (str seed)
                    `(,str ,\@seed))
   #:visit-number (lambda (num seed)
                    `(,num ,\@seed))
   #:visit-constant (lambda (name seed)
                      `(,(case name
                           ((true) #t)
                           ((false) #f)
                           ((null) #\null)
                           (else (error 'my-json-to-sjson
                                        "invalid constant ~S"
                                        name)))
                        ,\@seed))))
]

As you can see, the arguments provide a set of procedures that are applied at various points in the parsing.  Each of these callback procedures accepts at least one seed value from its preceding sibling and/or parent, and it produces a seed value for the next sibling, child, or parent.

The concepts @emph{object}, @emph{member}, and @emph{array} are non-leaf nodes in the tree.  The @emph{start} callback for each non-leaf node receives a seed from its preceding sibling, and the value it produces is the seed for its first child.  The @emph{end} callback receives both the seed from the last child, and the parent seed (the sibling predecessor seed of the node; the same seed received by the corresponding @emph{start}).

The leaf nodes each simply receive a seed from the sibling predecessor callback (or, if the first sibling, from the parent @emph{start}; or, if the first callback, from the seed provided to the parser call), and provide one to the sibling successor (or, if the last sibling, to the parent @emph{end}; or, if the last callback, to the result of the parser call).

Note that two different techniques are used above to build collections of objects during processing, using seeds.  The first is to use a hash that is passed in the seed, which in this case is used because SJSON requires a hash as part of its format.
The second, and more common, is to construct lists by incrementally consing onto the front of the list, so that the list is ordered backwards, and waiting until the list is finished to put it in the correct order using the @tt{reverse} procedure.

The parser procedure returns either the value of the last callback, or, if the end of the input is reached without a JSON value, the @tt{eof} object.

}

@defproc[ (make-json-fold (... any/c)) any/c]{
This is like @tt{json-fold-lambda}, except it is a procedure, rather than syntax.  @tt{make-json-fold} can be used in the less-common case that you need to define a new parser dynamically.

Note that, in the produced procedure, the @tt{exhaust?} argument is optional (defaulting to @tt{#t}).  Thus, the signature is:

@verbatim["( in seed { #:exhaust? exhaust? }? )"]
}

@section{Conversion}

@defproc[ (json->sjson (in any/c) (#:exhaust? exhaust? any/c)) any/c]{
Parse a JSON value from input port or string @schemevarfont{in}, and return an SJSON parsed representation.  SJSON is identical to the @emph{jsexpr} defined by the PLaneT package @tt{dherman/json:3:=0}.
}

@defproc[ (json->sxml (in any/c) (#:exhaust? exhaust? any/c)) any/c]{
Parse the JSON input from input port or string @schemevarfont{in}, and return it in a contrived XML data format that can be processed with various SXML tools.
}

@defproc[ (write-json-as-xml (in any/c) (#:exhaust? exhaust? any/c) (#:out out any/c)) any/c]{
Parse the JSON input from input port or string @schemevarfont{in}, and write it in a contrived XML data format to output port @schemevarfont{out} (which defaults to the value of the @tt{current-output-port} parameter).  This is mainly a demonstration of ``streaming'' processing that can scale to arbitrary JSON input sizes.
}

@defproc[ (json->xml (in any/c) (#:exhaust? exhaust? any/c)) any/c]{
This is like @tt{write-json-as-xml}, but instead of writing to a port, it returns the XML as a string.  Most people would not choose to do this.
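
As a rough sketch of the difference between the two procedures, assuming a string input and the default @tt{exhaust?} behavior (the sample JSON text and the name @tt{xml-string} are illustrative only, and the exact XML produced is not shown here):

@schemeblock[
(require (planet neil/json-parsing:1:=1))

(write-json-as-xml "[1, 2, 3]" #:out (current-output-port))

(define xml-string (json->xml "[1, 2, 3]"))
]

The first call is intended to write XML to the port while the JSON is being parsed.  The second holds the entire XML document in memory as a string before returning it, so it does not scale to large inputs in the same way.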
}

@section{History}

@itemize[

@item{Version 0.2 --- 2010-12-27 - PLaneT @tt{(1 1)}

Added missing export.
}

@item{Version 0.1 --- 2010-12-26 - PLaneT @tt{(1 0)}

Initial release.
}

]

@section[#:tag "Legal"]{Legal}

Copyright (c) 2010 Neil Van Dyke.  This program is Free Software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License (LGPL 3), or (at your option) any later version.  This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.  See http://www.gnu.org/licenses/ for details.  For other licenses and consulting, please contact the author.

@italic{@smaller{Standard Documentation Format Note: The API signatures in this documentation are likely incorrect in some regards, such as indicating type @tt{any/c} for things that are not, and not indicating when arguments are optional.  This is due to an ongoing transition from the Texinfo documentation format to Scribble, which the author intends to finish someday.}}