RSound: An Adequate Sound Engine for Racket

John Clements <clements@racket-lang.org>

 (require (planet clements/rsound))
This collection provides a means to represent, read, write, play, and manipulate sounds. It uses the portaudio library, which appears to run on Linux, Mac, and Windows.
The package contains binary versions of the Mac and Windows portaudio libraries. This is because Windows and Mac users are less likely to be able to install their own versions of the library; naturally, this is a less-than-perfect solution. In particular, it appears that Windows users often get an error message about a missing DLL, which can apparently be solved by installing a separate bundle from Microsoft.
Sound playing happens on a separate Racket thread and custodian. This means that re-running the program or interrupting with a "Kill" will not halt the sound. (Use (stop-playing) for that.)
The library represents all sounds internally as stereo 16-bit PCM, with all the attendant advantages (speed, mostly) and disadvantages (clipping).

Does it work on your machine? Try this example:
  (require (planet "main.rkt" ("clements" "rsound.plt")))
  
  (rsound-play ding)

1 Sound Control

These procedures start and stop playing sounds and loops.

(rsound-play rsound)  void?
  rsound : rsound?
Plays an rsound. Interrupts an already-playing sound, if there is one.

(rsound-loop rsound)  void?
  rsound : rsound?
Plays an rsound repeatedly. Continues looping until interrupted by another sound command.

(change-loop rsound)  void?
  rsound : rsound?
When the current sound or loop finishes, starts looping this one instead.

(stop-playing)  void?
Stops the currently playing sound.

2 Sound I/O

These procedures read and write rsounds from/to disk.

The RSound library reads and writes WAV files only; this means fewer FFI dependencies (the reading and writing is done in Racket), and the I/O code works on all platforms.

(rsound-read path)  rsound?
  path : path-string?
Reads a WAV file from the given path and returns it as an rsound.

rsound-read currently has many restrictions (it insists on 16-bit PCM encoding, for instance), but it handles a number of common, bizarre conventions found in certain WAV files (PAD chunks, extra blank bytes at the end of the fmt chunk, etc.) and tries to fail relatively gracefully on files it can’t handle.

Reading in a large sound can result in a very large value (~10 Megabytes per minute); for larger sounds, consider reading in only a part of the file, using rsound-read/clip.
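
The figure of roughly 10 megabytes per minute can be checked with a little arithmetic. The sketch below assumes a sample rate of 44100 frames per second, which is not stated above; the two channels and two bytes per sample follow from the stereo 16-bit PCM representation.

```racket
#lang racket/base

;; Assumed format: 44100 frames per second (an assumption, not stated in
;; the text), 2 channels per frame, 2 bytes (16 bits) per sample.
(define frames-per-second 44100)
(define channels 2)
(define bytes-per-sample 2)

;; Bytes needed to hold the given number of seconds of sound.
(define (sound-bytes seconds)
  (* seconds frames-per-second channels bytes-per-sample))

(sound-bytes 60) ; 10584000 bytes, roughly 10 megabytes
```

At other sample rates the size scales proportionally.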

(rsound-read/clip path start finish)  rsound?
  path : path-string?
  start : nonnegative-integer?
  finish : nonnegative-integer?
Reads a portion of a WAV file from a given path, starting at frame start and ending at frame finish.

rsound-read/clip has the same restrictions as rsound-read (it insists on 16-bit PCM encoding, for instance), handles the same WAV-file conventions (PAD chunks, extra blank bytes at the end of the fmt chunk, etc.), and tries to fail relatively gracefully on files it can’t handle.

(read-rsound-frames path)  nonnegative-integer?
  path : path-string?
Returns the number of frames in the sound indicated by the path. It parses the header only, and is therefore much faster than reading in the whole sound.

The file must be encoded as a WAV file readable with rsound-read.

(read-rsound-sample-rate path)  number?
  path : path-string?
Returns the sample-rate of the sound indicated by the path. It parses the header only, and is therefore much faster than reading in the whole sound.

The file must be encoded as a WAV file readable with rsound-read.

(rsound-write rsound path)  void?
  rsound : rsound?
  path : path-string?
Writes an rsound to a WAV file, using stereo 16-bit PCM encoding. It overwrites an existing file at the given path, if one exists.

3 Rsound Manipulation

These procedures allow the creation, analysis, and manipulation of rsounds.

(struct rsound (data frames sample-rate)
  #:extra-constructor-name make-rsound)
  data : s16vector?
  frames : nonnegative-integer?
  sample-rate : nonnegative-number?
Represents a sound.

(make-silence frames sample-rate)  rsound?
  frames : nonnegative-integer?
  sample-rate : nonnegative-number?
Returns an rsound of length frames containing silence. This procedure is relatively fast.

(rsound-ith/left rsound frame)  real?
  rsound : rsound?
  frame : nonnegative-integer?
Returns the sample at index frame from the left channel of the rsound, represented as a number in the range -1.0 to 1.0.

(rsound-ith/right rsound frame)  real?
  rsound : rsound?
  frame : nonnegative-integer?
Returns the sample at index frame from the right channel of the rsound, represented as a number in the range -1.0 to 1.0.

(rsound-clip rsound start finish)  rsound?
  rsound : rsound?
  start : nonnegative-integer?
  finish : nonnegative-integer?
Returns a new rsound containing the frames in rsound from frame start up to, but not including, frame finish. This procedure copies the required portion of the sound.
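
The index convention (the frame at start is included, the frame at finish is not) is the same one that vector-copy uses. The sketch below illustrates it on a plain vector that stands in for a sound, one number per frame; it does not use rsound-clip itself.

```racket
#lang racket/base
(require racket/vector)

;; Hypothetical stand-in for a sound: one number per frame.
(define frames (vector 10 11 12 13 14 15))

;; Keep the frames at indices 2, 3, and 4; index 5 (the finish) is excluded.
(define clipped (vector-copy frames 2 5))

clipped ; a fresh 3-element vector holding 12, 13, 14
```

The result always has finish - start frames.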

(rsound-append* rsounds)  rsound?
  rsounds : (listof rsound?)
Returns a new rsound containing the given rsounds, appended sequentially. This procedure is relatively fast. All of the given rsounds must have the same sample-rate.

(rsound-overlay* assembly-list)  rsound?
  assembly-list : (listof (list/c rsound? nonnegative-integer?))
Returns a new rsound containing all of the given rsounds. Each sound begins at the frame number indicated by its associated offset. The rsound will be exactly the length required to contain all of the given sounds.

So, suppose we have two rsounds: one called a, of length 20000, and one called b, of length 10000. Evaluating

  (rsound-overlay* (list (list a 5000)
                         (list b 0)
                         (list b 11000)))

... would produce a sound of 25000 frames (a starts at frame 5000 and is 20000 frames long), where each instance of b overlaps with the single instance of a.
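
Since the result is exactly long enough to contain every sound, its length is the largest value of offset plus length over all entries. The sketch below computes that for a hypothetical pair of sounds; plain numbers stand in for the rsounds, and overlay-length is an illustrative helper, not part of the library.

```racket
#lang racket/base

;; Each entry is (list length-in-frames offset-in-frames). Plain numbers
;; stand in for rsounds here; only the lengths matter for this calculation.
(define (overlay-length entries)
  (for/fold ([longest 0]) ([entry (in-list entries)])
    (max longest (+ (car entry) (cadr entry)))))

;; Hypothetical sounds: 300 frames starting at frame 100,
;; and 200 frames starting at frame 0.
(overlay-length (list (list 300 100) (list 200 0))) ; 400
```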

(fun->mono-rsound frames sample-rate fun)  rsound?
  frames : nonnegative-integer?
  sample-rate : nonnegative-integer?
  fun : signal?
Builds a sound of length frames and sample-rate sample-rate by calling fun with integers from 0 up to frames-1. Each call should return an inexact number in the range -1.0 to 1.0. Values outside this range are clipped. Both channels are identical.
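
As a rough model of what "clipped" means here, the sketch below calls a function on each frame number from 0 up to frames-1 and clamps every result into the range -1.0 to 1.0. The names clip-sample, sample-signal, and ramp are hypothetical, and the library's own conversion to 16-bit samples is not shown.

```racket
#lang racket/base

;; Clamp a sample into the range -1.0 to 1.0.
(define (clip-sample x)
  (max -1.0 (min 1.0 x)))

;; Call fun on 0, 1, ..., frames-1 and clip each result.
(define (sample-signal fun frames)
  (for/list ([t (in-range frames)])
    (clip-sample (fun t))))

;; Hypothetical signal that leaves the legal range at frame 3.
(define (ramp t)
  (- t 1.0))

(sample-signal ramp 4) ; '(-1.0 0.0 1.0 1.0)
```

The last frame would have been 2.0 without clipping, which is why loud signals come out distorted rather than louder.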

(fun->stereo-rsound frames    
  sample-rate    
  left-fun    
  right-fun)  rsound?
  frames : nonnegative-integer?
  sample-rate : nonnegative-integer?
  left-fun : signal?
  right-fun : signal?
Builds a stereo sound of length frames and sample-rate sample-rate by calling left-fun and right-fun with integers from 0 up to frames-1. Each call should return an inexact number in the range -1.0 to 1.0. Values outside this range are clipped.

4 Signals

A signal is a function mapping a frame number to a real number in the range -1.0 to 1.0. There are several built-in functions that produce signals.

(sine-wave frequency sample-rate)  signal?
  frequency : nonnegative-number?
  sample-rate : nonnegative-number?
Produces a signal representing a sine wave of the given frequency, of amplitude 1.0.
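
To make the relationship between frame numbers, frequency, and sample rate concrete, here is a minimal model of such a signal in plain Racket. It is a sketch, not the library's implementation: the name my-sine-wave is hypothetical, and starting the wave at phase zero is an assumption.

```racket
#lang racket/base
(require racket/math) ; for pi

;; Hypothetical model of a sine signal; not the library's implementation.
;; Frame t corresponds to time t / sample-rate seconds.
(define (my-sine-wave frequency sample-rate)
  (lambda (t)
    (sin (* 2 pi frequency (/ t sample-rate)))))

;; 441 Hz at 44100 frames per second: one full cycle every 100 frames.
(define s (my-sine-wave 441 44100))

(s 25) ; approximately 1.0, a quarter of the way through the cycle
(s 75) ; approximately -1.0, three quarters of the way through
```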

(sawtooth-wave frequency sample-rate)  signal?
  frequency : nonnegative-number?
  sample-rate : nonnegative-number?
Produces a signal representing a naive sawtooth wave of the given frequency, of amplitude 1.0. Note that since this is a simple -1.0 up to 1.0 sawtooth wave, it’s got horrible aliasing all over the spectrum.

(square-wave frequency sample-rate)  signal?
  frequency : nonnegative-number?
  sample-rate : nonnegative-number?
Produces a signal representing a naive square wave of the given frequency, of amplitude 1.0. Note that since this is a simple 1/-1 square wave, it’s got horrible aliasing all over the spectrum.

(dc-signal amplitude)  signal?
  amplitude : real?
Produces a constant signal at amplitude. Inaudible unless used to multiply by another signal.

(fader fade-samples)  signal?
  fade-samples : number?
Produces a signal that decays exponentially. After fade-samples frames, its value is 0.001. Inaudible unless used to multiply by another signal.
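
One curve with this property can be sketched in plain Racket as follows. The name my-fader is hypothetical, and the starting value of 1.0 at frame 0 is an assumption; only the value 0.001 at fade-samples comes from the description above.

```racket
#lang racket/base

;; Hypothetical model: starts at 1.0 (an assumption) and shrinks by a
;; constant factor per frame, chosen so that the value after
;; fade-samples frames is 0.001.
(define (my-fader fade-samples)
  (lambda (t)
    (expt 0.001 (exact->inexact (/ t fade-samples)))))

(define f (my-fader 1000))

(f 0)    ; 1.0
(f 1000) ; 0.001
(f 500)  ; about 0.0316, the square root of 0.001
```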

(signal proc args ...)  signal?
  proc : procedure?
  args : (listof any/c)
Produces a signal whose values are computed by calling proc with the current frame and the additional values args.

So, for instance, if we defined the function flatline as

  (define (flatline t l)
    l)

... then (signal flatline 0.4) would produce the same result as (dc-signal 0.4).
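
Read this way, signal fixes the extra arguments up front and leaves the frame number to be supplied later. A minimal model of that behavior, using the hypothetical name my-signal rather than the library's own definition, might look like this:

```racket
#lang racket/base

;; Hypothetical model of `signal`: remember proc and the extra arguments
;; now, and supply the frame number t later.
(define (my-signal proc . args)
  (lambda (t)
    (apply proc t args)))

(define (flatline t l)
  l)

(define flat (my-signal flatline 0.4))

(flat 0)    ; 0.4
(flat 9999) ; 0.4
```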

There are also a number of functions that combine existing signals, called "signal combinators":

(signal-+s signals)  signal?
  signals : (listof signal?)
Produces the signal that is the sum of the input signals.

(signal-*s signals)  signal?
  signals : (listof signal?)
Produces the signal that is the product of the input signals.
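
Both combinators work frame by frame: every input signal is evaluated at the same frame number, and the results are added or multiplied. The sketch below models this in plain Racket with hypothetical names (my-signal-+s, my-signal-*s, ramp, half); it is not the library's implementation.

```racket
#lang racket/base

;; Hypothetical models of the two combinators: evaluate every signal at
;; frame t, then add (or multiply) the results.
(define (my-signal-+s signals)
  (lambda (t)
    (for/fold ([sum 0]) ([s (in-list signals)])
      (+ sum (s t)))))

(define (my-signal-*s signals)
  (lambda (t)
    (for/fold ([product 1]) ([s (in-list signals)])
      (* product (s t)))))

(define (ramp t) (* 0.25 t)) ; hypothetical input signal
(define (half t) 0.5)        ; constant signal

((my-signal-+s (list ramp half)) 1) ; 0.75
((my-signal-*s (list ramp half)) 1) ; 0.125
```

Multiplying by a slowly changing signal acts as a volume envelope, which is why constant and decaying signals are described above as useful only in products.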

5 Visualizing Rsounds

 (require (planet clements/rsound/draw))

(rsound-draw rsound    
  #:title title    
  [#:width width    
  #:height height])  void?
  rsound : rsound?
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Displays a new window containing a visual representation of the sound as a waveform.

(rsound-fft-draw rsound    
  #:zoom-freq zoom-freq    
  #:title title    
  [#:width width    
  #:height height])  void?
  rsound : rsound?
  zoom-freq : nonnegative-real?
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Draws an FFT of the sound by breaking it into windows of 2048 samples and performing an FFT on each. Each FFT is represented as a column of gray rectangles, where darker grays indicate more energy in the given frequency band.

(vector-pair-draw/magnitude left    
  right    
  #:title title    
  [#:width width    
  #:height height])  void?
  left : (vectorof complex?)
  right : (vectorof complex?)
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Displays a new window containing a visual representation of the two vectors’ magnitudes as a waveform. The lines connecting the dots are really somewhat inappropriate in the frequency domain, but they aid visibility.

(vector-draw/real/imag vec    
  #:title title    
  [#:width width    
  #:height height])  void?
  vec : (vectorof complex?)
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Displays a new window containing a visual representation of the vector’s real and imaginary parts as a waveform.

6 RSound Utilities

(make-harm3tone frequency    
  volume?    
  frames    
  sample-rate)  rsound?
  frequency : nonnegative-number?
  volume? : nonnegative-number?
  frames : nonnegative-integer?
  sample-rate : nonnegative-number?
Produces an rsound containing a semi-percussive tone of the given frequency, frames, and volume. The tone contains the first three harmonics of the specified frequency. This function is memoized, so that subsequent calls with the same parameters will return existing values, rather than recomputing them each time.

(rsound-fft/left rsound)  (vectorof complex?)
  rsound : rsound?
Produces the complex-valued vector that represents the Fourier transform of the rsound’s left channel. Since the FFT takes time proportional to N*log(N) in the size of the input, running this on rsounds with more than a few thousand frames is probably going to be slow unless the number of frames is a power of 2.

(rsound-fft/right rsound)  (vectorof complex?)
  rsound : rsound?
Produces the complex-valued vector that represents the Fourier transform of the rsound’s right channel. Since the FFT takes time proportional to N*log(N) in the size of the input, running this on rsounds with more than a few thousand frames is probably going to be slow unless the number of frames is a power of 2.
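
Because the frame count matters so much here, it can help to check it before calling either FFT procedure. The helpers below are hypothetical and not part of the library; they test whether a frame count is a power of 2 and find the largest power of 2 that fits within a given count.

```racket
#lang racket/base

;; Largest power of two that is no greater than n, for n >= 1.
(define (largest-power-of-two-at-most n)
  (expt 2 (- (integer-length n) 1)))

;; True when n is a positive power of two (1, 2, 4, 8, ...).
(define (power-of-two? n)
  (and (exact-positive-integer? n)
       (= n (largest-power-of-two-at-most n))))

(power-of-two? 4096)                 ; #t
(power-of-two? 44100)                ; #f
(largest-power-of-two-at-most 44100) ; 32768
```

One option, then, is to trim a sound to such a length with rsound-clip before transforming it.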

not-yet-documented: (provide twopi make-tone make-squaretone ding make-ding split-in-4 times vectors->rsound echo1 fft-complex-forward fft-complex-inverse)

7 Reporting Bugs

For Heaven’s sake, report lots of bugs!