Benchmark: Performance Testing in Scheme

_Benchmark: Performance Testing in Scheme_
==========================================

By  Noel Welsh (noelwelsh at yahoo dot com)

This manual documents Benchmark version 1.0
Time-stamp: <2006-11-04 21:19:05 noel>

Keywords: _schemeunit_, _test_, _testing_, _unit testing_,
          _unit test_, _benchmark_, _performance_, _speed_



Introduction
============

This extension to SchemeUnit provides tools to answer two
types of questions that arise in optimising code:

1. Is method A faster than method B?  For example, is it
   faster to use the built-in map or to write a loop by
   hand?  

2. Is the current implementation faster than the previous
   one?

Answering the first question is intended for verifying
assumptions you make when optimising your code.  Generally
this code is quite short, so you can keep around both
versions.  You can ask this sort of question using the
check-faster? check.

Answering the second question is intended for ensuring that
your code is actually getting faster.  It is assumed you
can't keep around both versions of the code, so the runtime
of previous versions must be stored on disk.  You can ask
this sort of question by writing a benchmark-case, the
benchmarked form of a test-case.



Benchmark API
=============

PLaneT is the main distribution mechanism for Benchmark.  To
use Benchmark insert the require statement below, unless the
documentation for a specific section states otherwise.

  (require (planet "benchmark.ss" ("schematics" "benchmark.plt" 1)))


Checks
------

> (check-faster thunk1 thunk2 [message])
check-faster: (() -> any) (() -> any)) [string] -> #t

This check succeeds when the average runtime of thunk1 is
greater than the average runtime of thunk2.  Runtimes are
average over 10 runs.

Example: to check if map is faster than a hand written loop

(check-faster
  (lambda () (map foo a-big-list))
  (lambda ()
    (let loop ([ls a-big-list])
      (if (null? ls)
          null
          (cons (foo (car ls)) (loop (cdr ls)))))))


Test Case Extensions
--------------------

> (benchmark-case name expr ...)
syntax benchmark-case : string expression ...

This form acts exactly like SchemeUnit's test-case macro,
except runtime is measured and compared to previous records.
If the average over 10 runs is slower than the previous
average this test case fails.  If the test case succeeds,
the runtimes are appended to benchmark log file (see
this-expression-benchmark-log-file)

The name given to the benchmark test case is important, as
it is this name that is used to find previous records.

Example: to check you are improving the performance of your
procedure foo

(benchmark-case
 "foo"
 (foo some-big-input))


Miscellaneous
-------------

> (benchmark-test-case-log-file benchmark-test-case)
benchmark-test-case-log-file : benchmark-test-case -> string

Extracts the benchmark log file from a benchmark test case
constructed using benchmark-case

> (this-expression-benchmark-log-file)
syntax this-expression-benchmark-log-file

Macro that expands to the benchmark log file for this
expression.  Benchmark log files are created on a per file
basis.

> (collect-and-time thunk)
collect-and-time : (() -> any) -> (vectorof integer)

Collects garbage before running thunk and records the 10
runtimes (cpu time).  Result is a vector with 10 elements.

> (check-faster-times times1 times2)
check-faster-times : (vectorof integer) (vectorof integer) -> #t

#t if thunk1 is faster, on average, than thunk2, otherwise
 raises exn:check:fail with useful information pushed on the
 check-info stack.  Note: a function, not a check

> (faster? thunk1 thunk2)
faster? : (() -> any) (() -> any) -> (U #t #f)

#t if thunk1 is faster, on average, than thunk2.  #f
otherwise.


Benchmark Log Files
-------------------

To access these values require benchmark-log.ss

> (read-benchmark-log file-name)
read-benchmark-log : (U string path) -> (listof run)

Reads the give file-name returning the contents (which
should be a list of runs)

> (find-most-recent-run file-name name)
find-most-recent-run : (U string path) string -> (U run #f)

Find the most recent run for the benchmark case with the
given name, or #f if no run is found.

> (add-run file-name name times)
add-run : (U string path) (vectorof time) -> void

Add a run to the given benchmark log file

> (run-name run)
run-name : run -> string

> (run-times run)
run-times : run -> (vectorof integer)


Benchmark files are stored in a simple sexpression format:
  ;; A benchmark log is a (listof run)

  ;; A run is (vector name description seconds (vectorof time) mean std-dev)

  ;; name is a string, the name of the test-case the timing is for

  ;; description is a string, describing the changes implemented in this run
  
  ;; seconds is an integer, the value of current-seconds at
  ;; the time the data was first written

  ;; time is an integer, the runtime of the test

  ;; mean is a number, the mean of the times

  ;; std-dev is a number, the standard deviation of the times

Note the description field.  This is provided for you to
note down the methods you used to increase the speed of your
benchmarks.  At the moment you have to do edit it by hand.

At the moment the benchmark log file for a file is simply
the file name with .benchmark-log appended.