-------------------------------------------------------------------------------- _DATA STRUCTURES GALORE version 3_ Written by: Carl Eastlund, Jens Axel S�gaard, and Stevie Strickland Documentation by: Carl Eastlund Documentation updated: May 5, 2006 -------------------------------------------------------------------------------- ------------------------------------------------------------ _Galore Chapter 0: Introduction_ ------------------------------------------------------------ This library implements commonly used immutable data structures. Currently Galore provides: _Sets_ (collections of distinct elements) _Tables_ (finite mappings from distinct keys to arbitrary values) (Other data structures provided by previous versions of Galore will be added in time. The previous versions of Galore will continue to be available in PLaneT.) These data structures are immutable, meaning that no operation alters its input. Instead, each operation produces a new data structure, often sharing some of its structure with its input. This means that old values remain valid until they are garbage collected and that these structures are safe for multi-threaded programs. Each data structure can have multiple implementations. The primary module for each data structure provides one or more constructors for each implementation, and operations which work for any of the implementations. Different implementations may operate on different kinds of data, provide different invariants, or have different time/space efficiency tradeoffs. The operations for each data structure are provided by names describing only their purpose (and not their kind of data structure). To prevent clashes with each other and with existing Scheme operations, it is recommended to import these modules with a prefix: (require (prefix set: (planet "set.ss" ("soegaard" "galore.plt" 3))) (prefix table: (planet "table.ss" ("soegaard" "galore.plt" 3)))) Data structures are implemented in the MzScheme class system (see documentation for class.ss). Each kind of data structure is an interface, each implementation is a class, and each operation is a method. Users can write their own implementation of Galore data structures by writing classes that implement the given interfaces. So long as new implementations adhere to the interfaces and expected invariants, the functions provided by Galore will work with them and they will be interoperable with all other Galore data structures. ---------------------------------------- Previous Version ---------------------------------------- Galore version 2 supports the following datastructures: _sets_, _bags_, _heaps_ _finite maps_, _priority queues_ _stacks_, _queues_, _deques_ For any datastructures or operations not yet provided by Galore version 3, use the previous version: (require (planet "FILE.scm" ("soegaard" "galore.plt" 2))) See the documentation for version 2 at: http://ja.soegaard.net/planet/html/soegaard/galore.plt/2/0/doc.txt ---------------------------------------- Notation ---------------------------------------- Specifications for Galore functions are written in the following form: (function-name [arg1 arg2 ...]) : (PolymorphicType TypeArg) arg1 : Type1 arg2 : Type2 The first line begins with a template for applying the function. The template names the function and its arguments. Square brackets enclose optional arguments (actual applications may provide zero, some, or none of these). Ellipses follow repeatable arguments to variable-arity functions, which may be provided zero or more times. The template is followed by the function's result type. Subsequent lines list the type of each argument. Polymorphic types are applied like functions; for instance, a list of symbols would be (Listof Symbol). ------------------------------------------------------------ _Galore Chapter 1: Sets_ ------------------------------------------------------------ A (Set Elem) is a collection of distinct elements, each of type Elem. Galore provides the following operations for creating and manipulating sets. ---------------------------------------- SET CONSTRUCTORS ---------------------------------------- A set implementation must have a way to distinguish its elements (how to tell whether one element is the same as another). Some implementations need to know more about their elements, and use this to provide better invariants. Each implementation provides two constructors. The first accepts set elements as additional arguments. The second accepts set elements as a list. For the following specifications, assume: elem : Elem elems : (Listof Elem) -------------------- Unordered sets require an equality predicate for their elements. Most single element operations will complete in linear time. Unordered sets have two constructors: > (make-unordered elem=? elem ...) : (Set Elem) > (list->unordered elem=? elems) : (Set Elem) elem=? : Elem Elem -> Boolean These operations construct a set containing the given elements as distinguished by elem=?. -------------------- Ordered sets require a total ordering for their elements. These orderings are represented as comparisons in the SRFI 67 style. A comparison produces 0 if its arguments are equal, 1 if the first argument is greater, and -1 if its second argument is greater. Most single element operations will complete in logarithmic time. Ordered sets have two constructors: > (make-ordered elem-compare elem ...) : (Set Elem) > (list->ordered elem-compare elems) : (Set Elem) elem-compare : Elem Elem -> -1 or 0 or 1 These operations construct a set containing the given elements as distinguished by elem-compare. Operations which traverse the set will traverse the elements in increasing order. -------------------- Hashed sets group their elements by hash code; the hash function must respect the given equality predicate (equal values must have the same hash code). Operations will take time logarithmic in the number of distinct hash codes and linear in the number of elements in a single hash code. Hashed sets have two constructors: > (make-hashed elem-hash elem=? elem ...) : (Set Elem) > (list->hashed elem-hash elem=? elems) : (Set Elem) elem-hash : Elem -> Integer elem=? : Elem Elem -> Boolean These operations construct a set containing the given elements as distinguished by elem=?. -------------------- Galore also provides six shortcut constructors for sets whose elements are distinguished by the standard Scheme comparisons eq?, eqv?, and equal?. > (make-eq elem ...) : (Set Elem) > (list->eq elems) : (Set Elem) These operations construct a hashed set containing the given elements as distinguished by eq?. > (make-eqv elem ...) : (Set Elem) > (list->eqv elems) : (Set Elem) These operations construct an unordered set containing the given elements as distinguished by eqv?. > (make-equal elem ...) : (Set Elem) > (list->equal elems) : (Set Elem) These operations construct a hashed set containing the given elements as distinguished by equal?. ---------------------------------------- SET ACCESSORS ---------------------------------------- These operations extract information from a set. For the following specifications, assume: elem : Elem set : (Set Elem) > (elements set) : (Listof Elem) Produces a list of all of set's elements. The elements are in increasing order for ordered sets. > (empty? set) : Boolean Reports whether set contains no elements. > (size set) : NaturalNumber Produces the number of elements in set. > (member? elem set) : Boolean Reports whether set contains elem. > (lookup elem set [failure-thunk success-function]) : T failure-thunk : -> T success-function : Elem -> T Finds an element in set that is equal to elem. If such an element e is found, lookup produces (success-function e). If no such element is found, lookup produces (failure-thunk). The default success-function returns e. The default failure-thunk returns #f. ---------------------------------------- SET UPDATERS ---------------------------------------- These operations construct updated forms of their input set. The result is always of the same implementation as the input. For the following specifications, assume: elem : Elem set : (Set Elem) > (insert elem set) : (Set Elem) Produces a set containing elem and all elements of set. > (remove elem set) : (Set Elem) Produces a set containing all elements of set except elem. > (clear set) : (Set Elem) Produces an empty set. ---------------------------------------- SET TRAVERSALS ---------------------------------------- These operations work on each element of a set. Each traversal works on elements in the same order they are produced by the elements operation. For the following specifications, assume: set : (Set Elem) predicate : Elem -> Boolean > (fold combine init set) : T combine : Elem T -> T init : T Builds a result from set's elements. If (elements set) is (list e1 e2 ... en), the result is (combine en ... (combine e2 (combine e1 init))...). > (for-each action set) : Void action : Elem -> Void Calls action for each element of set. > (filter predicate set) : (Set Elem) Produces a set containing the elements of set that satisfy predicate. The result is of the same implementation as set. > (andmap predicate set) : Boolean Reports whether all elements of set satify predicate. > (ormap predicate set) : Boolean Reports whether any element of set satisfies predicate. ---------------------------------------- SET COMBINATIONS ---------------------------------------- These operations represent set-theoretic operations combining multiple sets. In all cases, the result is of the same implementation as the first set argument. Multiple sets may only be combined if their notions of distinct elements are the same. If two sets with different equality are combined the result will be unpredictable and may yield an error. Here are some concrete examples: - (make-unordered = 1 2 3) and (make-ordered number-compare 4 5 6) may be combined, because = and number-compare represent the same equality relation. Both will distinguish numbers based on numerical equivalence. - (make-equal "A" "B" "C") and (make-eq "A" "B" "C") may not be combined, because equal? and eq? do not represent the same equality relation. Specifically, it is ambiguous whether both "A"s represent equivalent elements. For the following specifications, assume: set1 : (Set Elem) set2 : (Set Elem) combine : Elem Elem -> Elem > (union set1 set2 [combine]) : (Set Elem) Produces a set containing all elements present in either set1 or set2. If set1 and set2 contain equivalent elements e1 and e2 respectively, union adds (combine e1 e2) to the result in their place. The default combine returns e1. > (intersection set1 set2 [combine]) : (Set Elem) Produces a set containing all elements present in both set1 and set2. For each pair of equivalent elements e1 and e2 found in set1 and set2 respectively, intersection adds (combine e1 e2) to the result. The default combine returns e1. > (difference set1 set2) : (Set Elem) Produces a set containing all elements of set1 not found in set2. ------------------------------------------------------------ _Galore Chapter 2: Tables_ ------------------------------------------------------------ A (Table Key Value) is a finite mapping from distinct keys of type Key to distinct values of type Value. A single key/value pair in a table is called a binding. Galore provides the following operations for creating and manipulating tables. ---------------------------------------- TABLE CONSTRUCTORS ---------------------------------------- A table implementation must be able to distinguish its keys just like a set implementation must distinguish its elements. Like sets, tables have different implementations that require different information about their keys. Each implementation provides four constructors. The first accepts bindings as additional arguments, alternating keys and values. The second accepts bindings as an s-expression: a list of two-element lists, each a key followed by a value. The third accepts bindings as an association list of key/value pairs. The last accepts bindings as a list of keys and a separate list of their respective values. For the following specifications, assume: key : Key value : Value keys : (Listof Key) values : (Listof Value) sexp : (Listof (list Key Value)) alist : (Listof (cons Key Value)) -------------------- Unordered tables require an equality predicate for their keys. Most single-binding operations will complete in linear time. Unordered sets have four constructors: > (make-unordered key=? key value ...) : (Table Key Value) > (sexp->unordered key=? sexp) : (Table Key Value) > (alist->unordered key=? alist) : (Table Key Value) > (lists->unordered key=? keys values) : (Table Key Value) key=? : (Key Key -> Boolean) These operations construct a table containing the given bindings with keys distinguished by key=?. -------------------- Ordered tables require a total ordering for their keys. These orderings are represented as comparisons in the SRFI 67 style (see description for ordered sets, above). Most single-binding operations will complete in logarithmic time. Ordered tables have four constructors: > (make-ordered key-compare key value ...) : (Table Key Value) > (sexp->ordered key-compare sexp) : (Table Key Value) > (alist->ordered key-compare alist) : (Table Key Value) > (lists->ordered key-compare keys values) : (Table Key Value) key-compare : Key Key -> -1 or 0 or 1 These operations construct a table containing the given bindings with keys distinguished by key-compare. Operations which traverse the table will traverse the bindings in increasing order of their keys. -------------------- Hashed tables group their keys by hash code; the hash function must respect the given equality predicate (equal keys must have the same hash code). Operations will take time logarithmic in the number of distinct hash codes and linear in the number of keys in a single hash code. Hashed tables have four constructors: > (make-hashed key-hash key=? key value ...) : (Table Key Value) > (sexp->hashed key-hash key=? sexp) : (Table Key Value) > (alist->hashed key-hash key=? alist) : (Table Key Value) > (lists->hashed key-hash key=? keys values) : (Table Key Value) These operations construct a table containing the given bindings with keys distinguished by key=?. -------------------- Galore also provides shortcut constructors for tables whose keys are distinguished by the standard Scheme comparisons eq?, eqv?, and equal?. > (make-eq key value ...) : (Table Key Value) > (sexp->eq sexp) : (Table Key Value) > (alist->eq alist) : (Table Key Value) > (lists->eq keys values) : (Table Key Value) These operations construct a hashed table containing the given bindings with keys distinguished by eq?. > (make-eqv key value ...) : (Table Key Value) > (sexp->eqv sexp) : (Table Key Value) > (alist->eqv alist) : (Table Key Value) > (lists->eqv keys values) : (Table Key Value) These operations construct an unordered table containing the given bindings with keys distinguished by eqv?. > (make-equal key value ...) : (Table Key Value) > (sexp->equal sexp) : (Table Key Value) > (alist->equal alist) : (Table Key Value) > (lists->equal keys values) : (Table Key Value) These operations construct a hashed table containing the given bindings with keys distinguished by equal?. ---------------------------------------- TABLE ACCESSORS ---------------------------------------- These operations extract information from a table. For the following specifications, assume: key : Key table : (Table Key Velue) > (keys table) : (Listof Key) Produces a list of table's keys. For ordered tables, the keys are sorted from least to greatest. > (values table) : (Listof Value) Produces a list of table's values. The values appear in the same order as their respective keys in (keys table). > (to-sexp table) : (Listof (list Key Value)) Produces a list of table's bindings as two-element lists, each a key followed by its value. The bindings appear in the same order as their respective keys in (keys table). > (to-alist table) : (Listof (cons Key Value)) Produces an association list of table's bindings as key/value pairs. The bindings appear in the same order as their respective keys in (keys table). > (empty? table) : Boolean Reports whether table contains no bindings. > (size table) : NaturalNumber Produces the number of bindings in table. > (contains? key table) : Boolean Reports whether table contains key. > (lookup key table [failure-thunk success-function]) : T failure-thunk : -> T success-function : Value -> T Looks up the value bound to key in table. If such a value v is found, lookup preoduces (success-function v). If no such value is found, lookup produces (failure-thunk). The default success-function returns v. The default failure-thunk returns #f. ---------------------------------------- TABLE UPDATERS ---------------------------------------- These operations construct updated forms of their input table. The result is always of the same implementation as the input. For the following specifications, assume: key : Key value : Value table : (Table Key Value) > (insert key value table) : (Table Key Value) Produces a table in which key is bound to value and containing all other bindings of table. > (remove key table) : (Table Key Value) Produces a table containing all bindings of table except any binding for key. > (clear table) : (Table Key Value) Produces an empty table. ---------------------------------------- TABLE TRAVERSALS ---------------------------------------- These operations work on each binding of a table. Each traversal works on bindings in the same order their keys are produced by the keys operation. Each traversal has three forms. The basic form processes they key and value of each binding. The "/key" form processes only the key of each binding. The "/value" form processes only the value of each binding. For the following specifications, assume: set : (Set Elem) > (fold combine init table) : T > (fold/key combine/key init table) : T > (fold/value combine/value init table) : T combine : Key Value T -> T combine/key : Key T -> T combine/value : Value T -> T These operations build a result from table's bindings. If (keys table) produces (list k1 k2 ... kn) and (values table) produces (list v1 v2 ... vn), the respective results will be: (combine kn vn ... (combine k2 v2 (combine k1 v1 init))...) (combine/key kn ... (combine/key k2 (combine/key k1 init))...) (combine/value vn ... (combine/value v2 (combine/value v1 init))...) > (for-each action table) : T > (for-each/key action/key table) : T > (for-each/value action/value table) : T action : Key Value -> Void action/key : Key -> Void action/value : Value -> Void These operations call action (or action/key or action/value) for each binding in table. > (map transform table) : (Table Key Image) > (map/key transform/key table) : (Table Key Image) > (map/value transform/value table) : (Table Key Image) transform : Key Value -> Image transform/key : Key -> Image transform/value : Value -> Image These operations produce a table binding each key from table to the image of the previous binding produced by transform (or transform/key or transform/value). The result is of the same implementation as table. > (filter predicate table) : (Table Key Value) > (filter/key predicate/key table) : (Table Key Value) > (filter/value predicate/value table) : (Table Key Value) predicate : Key Value -> Boolean predicate/key : Key -> Boolean predicate/value : Value -> Boolean These operations produce a table containing each binding in table that satisfies predicate (or predicate/key or predicate/value). The result is of the same implementation as table. > (andmap predicate table) : Boolean > (andmap/key predicate/key table) : Boolean > (andmap/value predicate/value table) : Boolean predicate : Key Value -> Boolean predicate/key : Key -> Boolean predicate/value : Value -> Boolean Reports whether all bindings of table satisfy predicate (or predicate/key or predicate/value). > (ormap predicate table) : Boolean > (ormap/key predicate/key table) : Boolean > (ormap/value predicate/value table) : Boolean predicate : Key Value -> Boolean predicate/key : Key -> Boolean predicate/value : Value -> Boolean Reports whether any binding of table satisfies predicate (or predicate/key or predicate/value). ---------------------------------------- TABLE COMBINATIONS ---------------------------------------- These operations mimic set operations combining the bindings of multiple tables by their corresponding keys. In all cases, the result is of the same implementation as the first table argument. Multiple tables may only be combined if their notion of distinct keys are the same. If two tables with different equality are combined the result will be unpredictable and may yield an error. For concrete examples, see the section on Set Combinations above. For the following specifications, assume: table1 : (Table Key Value) table2 : (Table Key Value) combine : Key Value Value -> Value combine/value : Value Value -> Value > (union table1 table2 [combine]) : (Table Key Value) > (union/value table1 table2 [combine/value]) : (Table Key Value) These operations produce a table containing all bindings present in either table1 or table2. If table1 and table2 bind key k to values v1 and v2 respectively, union binds k to (combine k v1 v2) and union/value binds k to (combine/value v1 v2) in the result. The default combine and combine/value return v1. > (intersection table1 table2 [combine]) : (Table Key Value) > (intersection/value table1 table2 [combine/value]) : (Table Key Value) These operations produce a table containing all bindings present in both table1 and table2. For each key k bound to values v1 and v2 in table1 and table2 respectively, intersection binds k to (combine k v1 v2) and intersection/value binds k to (combine/value v1 v2) in the result. The default combine and combine/value return v1. > (difference table1 table2) : (Table Key Value) Produces a table containing all bindings of table1 whose key is not in table2. ------------------------------------------------------------ Acknowledgements ------------------------------------------------------------ - The red-black trees implementation started as a port of Jean-Christophe Filliatre's Ocaml implementation. - Versions 1 and 2 of Galore and its documentation were written entirely by Jens Axel S�gaard. - Many thanks to Richard Cobbe and Samuel Tobin-Hochstadt for helpful proofreading and functionality suggestions. ------------------------------------------------------------ References ------------------------------------------------------------ For more information on persistent data structures, see the following sources. [Ada] Adams, Stepehn, "Implementing Sets Efficiently in a Functional Language" [Mar] Martinez, Conrado and Roura, Salvador: "Randomized Binary Search Trees" [Oka] Chris Okasaki, "Purely Functional Data Structures" ------------------------------------------------------------ Contact Information ------------------------------------------------------------ For any questions not answered by this documentation, write to the PLT Scheme mailing list at: plt-scheme@list.cs.brown.edu Alternately, contact one of the authors: Carl Eastlund <cce@ccs.neu.edu> Jens Axel S�gaard <jensaxel@soegaard.net> Stevie Strickland <sstrickl@ccs.neu.edu> ------------------------------------------------------------ Copyright and License ------------------------------------------------------------ This documentation and all source code of Galore are published under the terms of the GNU LGPL. See license.txt and lgpl.txt in the Galore collection for more details. ------------------------------------------------------------ Keywords ------------------------------------------------------------ _galore_ _set_ _table_ _hash_ _association_ _binding_ _mapping_ _key_ _functional_ _persistent_ _immutable_