#lang scribble/manual
@(require planet/scribble
          scribble/eval
          scribble/struct
          racket/sandbox
          "lump.rkt"
          (for-label racket
                     "filetransfer.rkt"
                     "display-measures.rkt"
                     "lump.rkt"))

@(define ex-eval (make-base-eval))
@interaction-eval[#:eval ex-eval (require "lump.rkt" "filetransfer.rkt" "display-measures.rkt")]

@title{P2P Tools}
@author+email["Erich Rast" "erich 'at snafu.de"]

@section{The LUMP Messaging Protocol}

@defmodule/this-package[lump]

LUMP is a simple binary protocol for sending messages with an arbitrary number of arguments over reliable stream ports.

@subsection{Messages}

Messages are the main data structure of LUMP. Each message has an id, a referer field, some internal flags, and an argument list.

@defstruct[message ([id (integer-in 0 65535)]
                    [seqnum (integer-in 0 4294967295)]
                    [referer (integer-in 0 65535)]
                    [flags (integer-in 0 15)]
                    [version? (integer-in 1 16)]
                    [args (listof lump-argument-type?)])]

Instead of the default message constructor, use @racket[new-message] or @racket[new-response] for creating new messages.

@subsection{Opening and Closing Sessions}

In LUMP, messages carry a sequence number that is incremented within each session. You first need to create a new session before sending any messages.

@defproc[(new-session) session?]{
Creates a new session.}

After all messages associated with a session have been sent, you should use @racket[close-session] to close it:

@defproc[(close-session [session session?]) void?]{
Closes the session.}

@subsection{Creating and Writing Messages}

@defproc[(new-message [id (integer-in 0 65535)] [args lump-argument-type? none/c] ...) message?]{
Creates a new message with the given command id and an arbitrary number of arguments.}

@examples[#:eval ex-eval
(new-message 2 "hello world!" (typed type:int32 7267) #(26 "John"))]

@defproc[(new-response [id (integer-in 1 65535)] [referer-message message?] [args lump-argument-type? none/c] ...)
message?]{
Creates a response to a given referer message with a given command id and arbitrarily many arguments. The @racket[referer] field of the message returned contains the @racket[seqnum] of the original message.}

@defproc[(write-message [session session?] [message message?] [out-stream port? (current-output-port)]) number?]{
Writes a message within a given session to a port. Returns the sequence number (@racket[seqnum]) of the message within the given session. This procedure blocks until the message has been written, but does not call @racket[flush-output]. Call @racket[flush-output] to ensure that the message is actually written to a block-buffered port.}

@subsection{Receiving Messages}

@defproc[(read-message [in-stream port? (current-input-port)]
                       [version-check-proc (number? .->. any/c)
                        (lambda (version)
                          (when (> version protocol-version)
                            (raise-argument-error 'read-message (format "LUMP protocol version ~a or lower" protocol-version) version)))])
         message?]{
Synchronously reads a message from the given port, or the current input port if no port is given, and returns it. The version check procedure can be used to test a given version number against the current @racket[protocol-version]. The version check procedure is called after the whole header has been read, but before any referer or data fields are read. The stream might be in an undefined state after a failed check.}

@defthing[protocol-version (integer-in 1 16)]
The current version of the protocol. This value is independent of the LUMP library version and is only increased when a new version is not backward compatible with previous versions. LUMP is too low-level to provide full forward and backward compatibility. It is advisable never to use senders and receivers with different protocol versions, and to define your own forward- and backward-compatible versioning scheme on top of LUMP serialization if you need one.
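
As a rough illustration of how the procedures described so far fit together, the following sketch writes one message to an in-memory pipe and reads it back. It has not been run against the library. The relative path in the @racket[require] form, the use of a pipe instead of a network port, and the accessors @racket[message-id] and @racket[message-args] (assumed to be exported along with the @racket[message] structure) are assumptions made for the example.

@codeblock|{
#lang racket/base
;; Sketch only: assumes "lump.rkt" is reachable at this relative path.
(require "lump.rkt")

;; An in-memory pipe stands in for a network connection.
(define-values (in out) (make-pipe))

(define session (new-session))
(define request (new-message 2 "hello world!" (typed type:int32 7267)))

;; write-message returns the sequence number assigned within the session.
(define seqnum (write-message session request out))
;; write-message does not flush, so do it explicitly.
(flush-output out)

;; read-message blocks until a complete message has been read.
(define received (read-message in))
(message-id received)   ; command id of the message read back (assumed accessor)
(message-args received) ; argument list of the message read back (assumed accessor)

(close-session session)
}|

A real sender and receiver would use the ports of a TCP connection instead of the pipe; the calls stay the same.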

@subsection{Supported Data Types}

LUMP supports numbers, strings, bytes, lists, symbols, and vectors as external argument types that are automatically serialized. In addition to these Racket data types, LUMP also supports the following low-level data types:

@(make-table 'boxed
  (list
   (list (make-flow (list (make-paragraph (list @bold{Type}))))
         (make-flow (list (make-paragraph (list @bold{Internal}))))
         (make-flow (list (make-paragraph (list @bold{Bytesize}))))
         (make-flow (list (make-paragraph (list @bold{Explanation})))))
   (list (make-flow (list (make-paragraph (list @tt|{bool}|))))
         (make-flow (list @t{0}))
         (make-flow (list @t{1}))
         (make-flow (list @t{Boolean value as a byte})))
   (list (make-flow (list (make-paragraph (list @tt{uint8}))))
         (make-flow (list @t{2}))
         (make-flow (list @t{1}))
         (make-flow (list @t{Unsigned Byte})))
   (list (make-flow (list (make-paragraph (list @tt{int16}))))
         (make-flow (list @t{3}))
         (make-flow (list @t{2}))
         (make-flow (list @t{Signed Short})))
   (list (make-flow (list (make-paragraph (list @tt{uint16}))))
         (make-flow (list @t{4}))
         (make-flow (list @t{2}))
         (make-flow (list @t{Unsigned Short})))
   (list (make-flow (list (make-paragraph (list @tt{int32}))))
         (make-flow (list @t{5}))
         (make-flow (list @t{4}))
         (make-flow (list @t{Signed 32-bit Integer})))
   (list (make-flow (list (make-paragraph (list @tt{uint32}))))
         (make-flow (list @t{6}))
         (make-flow (list @t{4}))
         (make-flow (list @t{Unsigned 32-bit Integer})))
   (list (make-flow (list (make-paragraph (list @tt{int64}))))
         (make-flow (list @t{7}))
         (make-flow (list @t{8}))
         (make-flow (list @t{Signed 64-bit Integer})))
   (list (make-flow (list (make-paragraph (list @tt{uint64}))))
         (make-flow (list @t{8}))
         (make-flow (list @t{8}))
         (make-flow (list @t{Unsigned 64-bit Integer})))
   (list (make-flow (list (make-paragraph (list @tt{text}))))
         (make-flow (list @t{9}))
         (make-flow (list @t{n/a}))
         (make-flow (list @t{UTF-8 String (varying length)})))
   (list (make-flow (list (make-paragraph (list @tt{symbol}))))
         (make-flow
          (list @t{10}))
         (make-flow (list @t{n/a}))
         (make-flow (list @t{Racket symbol (varying length)})))
   (list (make-flow (list (make-paragraph (list @tt{list}))))
         (make-flow (list @t{11}))
         (make-flow (list @t{n/a}))
         (make-flow (list @t{List (varying length)})))
   (list (make-flow (list (make-paragraph (list @tt{number}))))
         (make-flow (list @t{12}))
         (make-flow (list @t{n/a}))
         (make-flow (list @t{Number (varying length)})))
   (list (make-flow (list (make-paragraph (list @tt{bytes}))))
         (make-flow (list @t{13}))
         (make-flow (list @t{n/a}))
         (make-flow (list @t{Bytes (varying length)})))
   (list (make-flow (list (make-paragraph (list @tt{vector}))))
         (make-flow (list @t{14}))
         (make-flow (list @t{n/a}))
         (make-flow (list @t{Vector (varying length)})))
   ))

(The values in the Internal column are internal identifiers that are only needed if you want to implement the protocol in another language.)

To use an internal, low-level type, you need to prefix the identifier with @racket[type:], so for example @racket[type:uint32] is the type for an unsigned 32-bit integer. Use the following procedures to wrap Racket data into a @racket[typed] structure that can be provided to @racket[new-message] or @racket[new-response].

@defproc[(typed [type lump-internal-type?] [datum lump-external-type?]) typed?]{
Returns a structure that stores an external type as an explicitly provided internal type. Use this to wrap Racket numbers that fit the given byte range into smaller fixed-size internal types.}

@examples[#:eval ex-eval
(typed type:uint8 255)]

@defproc[(untype [typed-structure typed?]) lump-external-type?]{
Converts a typed structure into a Racket data type, where fixed-size numbers are converted to exact Racket numbers.}

@examples[#:eval ex-eval
(untype (typed type:uint8 255))]

@defproc[(lump-internal-type? [datum any/c]) boolean?]{
Returns true if the given datum represents an internal LUMP type identifier, false otherwise.}

@examples[#:eval ex-eval
(lump-internal-type? type:text)
(lump-internal-type?
 10)
(lump-internal-type? 130)
(lump-internal-type? "John")]

@defproc[(lump-external-type? [datum any/c]) boolean?]{
Returns true if the given datum has a valid LUMP external data type, i.e. a Racket type that can be serialized and is not a @racket[typed] structure, false otherwise. Elements of lists or vectors are not checked by this procedure.}

@examples[#:eval ex-eval
(lump-external-type? "John")
(lump-external-type? '("John" "Mary" "Brian"))
(lump-external-type? (typed type:int32 2728))
(lump-external-type? (box 10))]

@defproc[(lump-argument-type? [datum any/c]) boolean?]{
Returns true if the given datum can be used as an argument to @racket[new-message] or @racket[new-response] and serialized using the LUMP protocol, false otherwise. Elements of lists or vectors are not checked by this procedure.}

@examples[#:eval ex-eval
(lump-argument-type? "John")
(lump-argument-type? '("John" "Mary" "Brian"))
(lump-argument-type? (typed type:int32 2728))
(lump-argument-type? (box 10))]

@subsection{Description of the LUMP Protocol}

You do not need to know the internals of the protocol in order to use it, but the following information is useful if you would like to implement a receiver or sender in another programming language. All numeric values represented by multiple bytes are in little endian format.
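
To make the byte order concrete, the following sketch encodes a 32-bit integer in little-endian order with Racket's standard @racket[integer->integer-bytes] and decodes it again. It only illustrates the byte order; whether the LUMP implementation itself uses these procedures is not stated here.

@codeblock|{
#lang racket/base
;; Arguments: value, size in bytes, signed?, big-endian?
;; Passing #f as the last argument selects little-endian byte order.
(define encoded (integer->integer-bytes 7267 4 #t #f))

(bytes->list encoded)                  ; '(99 28 0 0), least significant byte first
(integer-bytes->integer encoded #t #f) ; 7267
}|

Since 7267 is 28 * 256 + 99, the low byte 99 comes first, followed by 28 and two zero bytes.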

The structure of a serialized LUMP message is as follows:

@itemlist[
 @item{Header (7 bytes):
  @itemlist[
   @item{4 bits (MSBs of the byte): Protocol version, a value from 0 to 15 that stores the internal protocol version number minus 1}
   @item{4 bits (LSBs of the byte): Flags, of which 2 are currently used:
    @itemlist[
     @item{Bit 0: the message has a data portion}
     @item{Bit 1: the message has a referer field}]}
   @item{2 bytes: Message Id}
   @item{4 bytes: Sequence number of the message}
  ]}
 @item{Referer (4 bytes, only present if referer flag is set): Sequence number of the referer message}
 @item{Data Portion (varying size, only present if data flag is set):
  @itemlist[
   @item{2 bytes: Number @italic{argnum} of following arguments}
   @item{For each argument 1...@italic{argnum}:
    @itemlist[
     @item{1 byte: LUMP internal argument type of the argument}
     @item{4 bytes: length @italic{arglen} of the following data in bytes}
     @item{1...@italic{arglen} bytes: data of the argument}]}
  ]}
]

@section{The File Transfer Library}

@defmodule/this-package[filetransfer]

This library allows you to transfer files over TCP connections. To transfer a file, the receiver must first open a listener and then the sender must send the file. Sending and receiving are asynchronous and use procedures as event callbacks.

@subsection{Receiving Files}

@defproc[(start-listen [local-port (and/c exact-nonnegative-integer? (integer-in 0 65535))]
                       [save-path path?]
                       [from-ip (or/c string? boolean?)]
                       [progress-proc ([phase (one-of/c 'listening 'preparing 'receiving)] [path path?] [progress number?] .->. any/c)]
                       [final-proc ([error-code (one-of/c 'finished 'error)] [path path?] [total-milliseconds number?] [total-bytes-read number?] .->. any/c)]
                       [timeout real? 60.0]
                       [file-table (and/c hash? hashtable-mutable?) (make-hash)]
                       [listen-timeout real?
                        604800.0])
         filetransfer?]{
Starts listening for incoming connections on the given port. Received files are saved to the folder indicated by @racket[save-path]. The @racket[from-ip] argument specifies the IP address of the remote host from which a connection is accepted, or @racket[#f] if any connection is to be accepted. The optional @racket[progress-proc] and @racket[final-proc] will be run for each transfer and may be used for keeping track of progress and of when a transfer has finished. The optional @racket[timeout] value indicates the time until a network operation will fail if no progress has been made during that interval. The optional @racket[file-table] is used to store concatenated checksum+suggested filename values used for checking whether a transfer is to be resumed. For automatic resuming to work, you need to store this table after partial transfers and provide it again in subsequent transfers. The optional @racket[listen-timeout] value indicates the time the receiver should wait for an incoming connection; if no connection is established within that period, the receiver stops listening.}

@subsection{Sending Files}

@defproc[(send-file [remote-hostname string?]
                    [remote-port (and/c exact-nonnegative-integer? (integer-in 0 65535))]
                    [file-path path?]
                    [suggested-filename string?]
                    [progress-proc ([phase (one-of/c 'connecting 'preparing 'sending)] [path path?] [progress number?] .->. any/c)]
                    [final-proc ([error-code (one-of/c 'finished 'error)] [path path?] [total-milliseconds number?] [total-bytes-read number?] .->. any/c)]
                    [timeout real? 60.0])
         filetransfer?]{
Sends the file at the given path to the given remote host at the given remote port. The suggested file name indicates to the receiver how to name the file and is also used in determining whether a previously interrupted file transfer ought to be resumed. However, it is up to the receiver how to actually name the file.
The progress and final procedures can be used to track the progress of the file transfer. The optional timeout value indicates the time in seconds until a network operation fails if no progress has been made during that period. (Currently, not all network operations implement the timeout.)}

@subsection{Interrupting Transfers}

@defproc[(filetransfer? [datum any/c]) boolean?]{
Returns @racket[#t] if the given datum is a filetransfer object, @racket[#f] otherwise. Filetransfer objects are returned by @racket[start-listen] and @racket[send-file].}

@defproc[(kill-transfer [transfer filetransfer?]) any/c]{
Kills the current file transfer, disconnecting the remote host immediately and killing any transfer threads. After this function has been called, the socket used for transferring files might be in an unusable state for some time on some systems.}

@defproc[(finish-transfer [transfer filetransfer?] [timeout real? 259200.0]) any/c]{
Blocks until the current transfer represented by @racket[transfer] has finished, and then disconnects from the remote host. If the timeout is reached while the transfer is still in progress, the transfer is killed immediately.}

@defproc[(wait-transfer [transfer filetransfer?] [timeout real? 259200.0]) any/c]{
Waits until the given transfer has finished.}

@subsection{Resuming Transfers}

File transfers are resumed automatically as long as the same file-hash table and suggested names are used for both transfers. When a file is sent by @racket[send-file], the sender produces a hash value computed from the middle of the file and a suggested file name. These are stored by the receiver in the @racket[file-table] hash. If the same file paths and file-table are used for a file transfer on the receiving side, and the sender uses the same suggested file name both times, a previously interrupted and only partially completed file transfer is resumed automatically.
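
The following sketch shows how a receiver and a sender might be set up so that resuming can work. It has not been run against the library. The relative path in the @racket[require] form, port 9000, the host name, the file and folder names, and the two callback procedures are placeholders chosen for the example, and the optional arguments of @racket[start-listen] are assumed to be positional, as the signature above suggests.

@codeblock|{
#lang racket/base
;; Sketch only: assumes "filetransfer.rkt" is reachable at this relative path.
(require "filetransfer.rkt")

;; Keep this table and pass it again in later calls so that an
;; interrupted transfer can be resumed.
(define file-table (make-hash))

;; Hypothetical callbacks that just print what they are given.
(define (show-progress phase path progress)
  (printf "~a ~a: ~a\n" phase path progress))

(define (show-result error-code path total-milliseconds total-bytes)
  (printf "~a ~a (~a bytes in ~a ms)\n"
          error-code path total-bytes total-milliseconds))

;; Receiver: accept a connection from any host (#f) and save into "downloads".
(define incoming
  (start-listen 9000 (string->path "downloads") #f
                show-progress show-result
                60.0 file-table))

;; Sender (normally a different process or machine).
(define outgoing
  (send-file "localhost" 9000 (string->path "report.pdf") "report.pdf"
             show-progress show-result))

;; Block until each side has finished.
(finish-transfer outgoing)
(finish-transfer incoming)
}|

To resume after an interruption, the receiver would call @racket[start-listen] again with the same save path and the same @racket[file-table], and the sender would call @racket[send-file] again with the same suggested file name.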

@subsection{Description of the Transfer Protocol}

You do not need to know the internals of the protocol in order to use it, but the following information is useful if you would like to implement a receiver or sender in another programming language. All numeric values represented by multiple bytes are in little endian format.

The protocol involves a single handshake reply from the receiver to indicate the position from which the file transfer is to be resumed. A complete transmission of a file works as follows:

@itemlist[#:style 'ordered
 @item{The sender sends 20 bytes of checksum data based on a SHA-1 hash of 16384 bytes from the middle of the file, or fewer if the file is shorter.}
 @item{The sender sends a 2-byte value @italic{namelen} representing the length of the suggested file name, followed by @italic{namelen} bytes of name data in UTF-8 format.}
 @item{The receiver then replies with an 8-byte value @italic{offset} into the file that indicates the position from which to continue the transfer (i.e. @italic{offset} will be 0 if no portion of the file has been received yet).}
 @item{The sender then sends an 8-byte value @italic{datalen}, followed by @italic{datalen} bytes of file content. The content sent starts at position @italic{offset} of the file and the total size of the file should be @italic{datalen}+@italic{offset}.}
]

@section{Display Measures}

@defmodule/this-package[display-measures]

This library contains utility functions for displaying transfer speeds. The data rate functions that use power-of-10 units for the amount of data are recommended.

@deftogether[(@defproc[(bits/sec->data-rate-string [bit number?] [sec number?] [decimals positive? 1]) string?]
              @defproc[(bytes/sec->data-rate-string [byte number?] [sec number?] [decimals positive? 1]) string?]
              @defproc[(bytes/msec->data-rate-string [byte number?] [msec number?] [decimals positive?
 1]) string?])]
Produce a string that expresses the data rate, given in bits per second, bytes per second, and bytes per millisecond respectively, with the given number of decimal places, using power-of-10 units for the amount of data transferred.

@examples[#:eval ex-eval
(bits/sec->data-rate-string 0 0)
(bits/sec->data-rate-string 1817 2)
(bytes/sec->data-rate-string 1024 1)
(bytes/sec->data-rate-string 18161881 2 2)]

@deftogether[(@defproc[(bytes/sec->binary-rate-string [byte number?] [sec number?] [decimals positive? 1]) string?]
              @defproc[(bytes/sec->binary-rate-string* [byte number?] [sec number?] [decimals positive? 1]) string?]
              @defproc[(bytes/msec->binary-rate-string [byte number?] [msec number?] [decimals positive? 1]) string?]
              @defproc[(bytes/msec->binary-rate-string* [byte number?] [msec number?] [decimals positive? 1]) string?])]
Produce a rate string based on power-of-2 units of data with the appropriate scale, such as "8.0 KiB/s", with the given number of decimal places. The first two functions use seconds and the last two use milliseconds as the basis. The starred versions produce the slightly incorrect but more common units "KB/s", "MB/s", etc.

@examples[#:eval ex-eval
(bytes/sec->binary-rate-string 17161817 3 2)
(bytes/sec->binary-rate-string 179171618151816786676278287 2)
(bytes/sec->binary-rate-string* 17161816719817198 1 1)
(bytes/msec->binary-rate-string* 18161881 2 2)]
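
The difference between power-of-10 and power-of-2 units can be seen in a small calculation. The two helpers below are hypothetical and not part of this library; they only divide by the respective unit size and do not format the result or pick a scale.

@codeblock|{
#lang racket/base
;; Hypothetical helpers for illustration; not part of display-measures.rkt.
(define (kilobytes/sec bytes secs) ; power of 10: 1 kB = 1000 bytes
  (/ bytes 1000.0 secs))

(define (kibibytes/sec bytes secs) ; power of 2: 1 KiB = 1024 bytes
  (/ bytes 1024.0 secs))

(kilobytes/sec 8192 1) ; 8.192
(kibibytes/sec 8192 1) ; 8.0
}|

The same 8192 bytes per second therefore read as 8.192 in decimal kilobytes and 8.0 in binary kibibytes, which is why the unit label matters.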