8 Data Collection

    8.1 Variables

      8.1.1 The variable Structure

      8.1.2 Tally and Accumulate

      8.1.3 Variable Statistics

      8.1.4 Variable History

      8.1.5 History Graphics

      8.1.6 Variable Monitors

    8.2 Example—Tally and Accumulate Example

    8.3 Example—Data Collection

    8.4 Data Collection Across Multiple Simulation Runs

      8.4.1 Open Loop Processing

      8.4.2 Example—Open Loop Processing

      8.4.3 Closed Loop Processing

      8.4.4 Example—Closed Loop Processing

Most simulation models exist to collect data whose analysis gives insight into the system being simulated. In the simulation collection, numeric data that is subject to automatic data collection is stored in instances of the variable structure (i.e., variables).

Data for a variable may either be collected in a time-dependent manner, specified using the accumulate macro, or in a time-independent manner, specified using the tally macro.

Currently, both statistical data and history data may be automatically collected for a variable. (Both may in turn be either time-dependent or time-independent.) History data allows more sophisticated analysis to be performed on the data using other analysis tools (e.g., the statistics routines in the science collection). A function to plot history data is also provided.

8.1 Variables

A variable represents a numeric variable in the model for which data can automatically be collected as specified by the model developer.

8.1.1 The variable Structure

(struct variable (initial-value
    value
    time-last-synchronized
    statistics
    history
    continuous
    state-index
    get-monitors
    set-monitors)
  #:mutable)
  initial-value : (or/c 'uninitialized real?)
  value : (or/c 'uninitialized real?)
  time-last-synchronized : (>=/c 0.0)
  statistics : (or/c statistics? false/c)
  history : (or/c history? false/c)
  continuous : boolean?
  state-index : (or/c -1 exact-nonnegative-integer?)
  get-monitors : list?
  set-monitors : list?
Instances of the variable structure represent variables in the simulation model. The variable structure has the following fields:
  • initial-value: The initial value of the variable. This is not currently used, but may be used in the future to reset variables.

  • value: The current value of the variable.

  • time-last-synchronized: The time the variable was last synchronized. This is used internally to implement time-dependent data collectors.

  • statistics: The statistics data collector for the variable, or #f.

  • history: The history data collector for the variable, or #f.

  • continuous?: True, #t, if the variable is a continuous variable, or false, #f, otherwise.

  • state-index: The index for the variable in the state vector, or -1 if the variable is not a continuous variable or is not currently allocated to the state vector (i.e., the process owning the continuous variable is not currently in a work/continuously). (See Chapter 10, Continuous Simulation Models.)

  • get-monitors: A list of get monitors for the variable.

  • set-monitors: A list of set monitors for the variable.

(make-variable [initial-value])  variable?
  initial-value : (or/c (symbols uninitialized) real?)
   = 'uninitialized
Returns a newly created variable with the specified initial-value. If initial-value is not provided, 'uninitialized is used indicating that the variable has no value.

By default, all variables accumulate statistics on their values. To turn this off, set the statistics field to #f using (set-variable-statistics! variable #f).

(variable-initialized? variable)  boolean?
  variable : variable?
Returns true, #t, if variable is initialized (i.e., its value is not 'uninitialized).

(variable-synchronize! variable)  void?
  variable : variable?
Synchronizes variable to the current (simulated) time.

The following functions are shortcuts to the statistics for a variable. They raise an error if the variable has no associated statistics (i.e., its statistics field is #f).

(variable-minimum variable)  real?
  variable : variable?
Returns the minimum value of variable.

(variable-maximum variable)  real?
  variable : variable?
Returns the maximum value of variable.

(variable-n variable)  real?
  variable : variable?
If variable is time-independent (tallied), returns the number of values tallied for variable. Otherwise (accumulated), returns the time over which variable has been accumulated.

(variable-sum variable)  real?
  variable : variable?
Returns the sum of the values of variable.

(variable-mean variable)  real?
  variable : variable?
Returns the mean of the values of variable.

(variable-variance variable)  real?
  variable : variable?
Returns the variance of the values of variable.

(variable-standard-deviation variable)  real?
  variable : variable?
Returns the standard deviation of the values of variable.

To create continuous variables, see Chapter 10, Continuous Simulation Models.

8.1.2 Tally and Accumulate

The tally and accumulate macros specify the automatic data collection for a variable.

(tally (variable-statistics variable))
(tally (variable-history variable))
Specifies time-independent data collection for the specified variable. variable-statistics specifies that statistics are to be tallied for variable. variable-history specifies that a history is to be tallied for variable.

Whenever the value of variable is changed, any tallied data collectors are updated with the new value.

(accumulate (variable-statistics variable))
(accumulate (variable-history variable))
Specifies time-dependent data collection for the specified variable. variable-statistics specifies that statistics are to be accumulated for variable. variable-history specifies that a history is to be accumulated for variable.

Whenever the value of variable is accessed or before its value is changed, any accumulated data collectors are synchronized with the current value over the time since it was last synchronized.
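The difference between the two update rules can be sketched in plain Racket. This is only an illustration and not the simulation collection's implementation: the collector structure and the functions tally!, synchronize!, collector-mean, and run are hypothetical names, and simulated time is passed in explicitly rather than read from a simulation environment.

```racket
#lang racket/base
;; Hypothetical sketch, not the simulation collection's implementation.
;; A collector holds n and sum; `last` is the time of the last synchronization.
(struct collector (n sum last) #:mutable)

;; Tally: each new value counts once, regardless of elapsed time.
(define (tally! c value)
  (set-collector-n! c (+ (collector-n c) 1))
  (set-collector-sum! c (+ (collector-sum c) value)))

;; Accumulate: weight the value held since the last synchronization
;; by the elapsed time, then move the synchronization time to `now`.
(define (synchronize! c held-value now)
  (define dt (- now (collector-last c)))
  (set-collector-n! c (+ (collector-n c) dt))
  (set-collector-sum! c (+ (collector-sum c) (* held-value dt)))
  (set-collector-last! c now))

(define (collector-mean c)
  (/ (collector-sum c) (collector-n c)))

;; Drive both collectors with a list of (value duration) pairs,
;; starting at time 0.
(define (run pairs)
  (define tallied (collector 0 0 0))
  (define accumulated (collector 0 0 0))
  (for/fold ([now 0]) ([p (in-list pairs)])
    (define value (car p))
    (define end (+ now (cadr p)))
    (tally! tallied value)
    (synchronize! accumulated value end)
    end)
  (values tallied accumulated))
```

For the pairs '((1 2) (2 1) (3 2) (4 3)), the tallied collector ends with n = 4, sum = 10, and mean 5/2, while the accumulated collector ends with n = 8, sum = 22, and mean 11/4.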

8.1.3 Variable Statistics

(struct statistics (time-dependent?
    minimum
    maximum
    n
    sum
    sum-of-squares)
  #:mutable)
  time-dependent? : boolean?
  minimum : real?
  maximum : real?
  n : (>=/c 0)
  sum : real?
  sum-of-squares : real?
The statistics structure maintains statistics for a variable.

The following table shows the statistics that are gathered and how they are computed for both tally and accumulate.

Statistic            tally                    accumulate
n                    number of samples of X   timeC - time0
sum                  Σ X                      Σ (X × (timeC - timeL))
mean                 sum/n                    sum/n
sum-of-squares       Σ X^2                    Σ (X^2 × (timeC - timeL))
mean-square          sum-of-squares/n         sum-of-squares/n
variance             mean-square - mean^2     mean-square - mean^2
standard-deviation   variance^(1/2)           variance^(1/2)
minimum              minimum X for all X      minimum X for all X
maximum              maximum X for all X      maximum X for all X

where,

(statistics-accumulate! statistics    
  value    
  time)  any
  statistics : statistics?
  value : real?
  time : (>=/c 0.0)
Accumulates the statistics with the value at the specified time.

(statistics-tally! statistics value)  any
  statistics : statistics?
  value : real?
Tallies the statistics with the value.

(statistics-mean statistics)  real?
  statistics : statistics?
Returns the mean of the values in statistics.

(statistics-mean-square statistics)  real?
  statistics : statistics?
Returns the mean of the squares of the values in statistics.

(statistics-variance statistics)  real?
  statistics : statistics?
Returns the sample variance of the values in statistics.

(statistics-standard-deviation statistics)  real?
  statistics : statistics?
Returns the standard deviation of the values in statistics.
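As a rough check on the formulas for the derived statistics, the following sketch computes them from the raw fields n, sum, and sum-of-squares. The function names are hypothetical stand-ins written for this illustration; they take plain numbers instead of a statistics instance and are not the library's accessors.

```racket
#lang racket/base
;; Hypothetical helpers: derive statistics from the raw fields
;; n, sum, and sum-of-squares using mean-square - mean^2 for the variance.
(define (mean n sum)
  (/ sum n))

(define (mean-square n sum-of-squares)
  (/ sum-of-squares n))

(define (variance n sum sum-of-squares)
  (- (mean-square n sum-of-squares)
     (* (mean n sum) (mean n sum))))

(define (standard-deviation n sum sum-of-squares)
  (sqrt (variance n sum sum-of-squares)))

;; Tallied samples 1, 2, 3, 4:
;; n = 4, sum = 10, sum-of-squares = 1 + 4 + 9 + 16 = 30.
(variance 4 10 30)   ; 30/4 - (5/2)^2 = 5/4

;; The same values held for 2, 1, 2, and 3 time units (accumulated):
;; n = 8, sum = 22, sum-of-squares = 1*2 + 4*1 + 9*2 + 16*3 = 72.
(variance 8 22 72)   ; 9 - (11/4)^2 = 23/16
```

Because the inputs are exact integers, Racket returns exact rationals here; standard-deviation generally returns an inexact number because of sqrt.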

8.1.4 Variable History

(struct history (time-dependent?
    n
    values
    last-value-cell
    durations
    last-duration-cell)
  #:mutable)
  time-dependent? : boolean?
  n : exact-nonnegative-integer?
  values : mlist?
  last-value-cell : (or/c mpair? false/c)
  durations : mlist?
  last-duration-cell : (or/c mpair? false/c)
The history structure maintains a history of the values for a variable. For accumulated histories (i.e., those specified using the accumulate macro), the durations for each value are also computed.

(history-accumulate! history value time)  any
  history : history?
  value : real?
  time : (>=/c 0.0)
Accumulates the history with the value at the specified time.

(history-tally! history value)  any
  history : history?
  value : real?
Tallies the history with the value.
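To illustrate how durations follow from the times at which values change, here is a minimal sketch. It is an assumption-laden simplification: accumulated-history is a hypothetical function, it uses immutable lists rather than the mutable lists and cell pointers of the history structure, and it takes the complete list of changes at once instead of being updated incrementally.

```racket
#lang racket/base
;; Hypothetical sketch of an accumulated history.
;; changes:  list of (time value) pairs in increasing time order.
;; end-time: the time at which the last value stops being held.
;; Returns two lists: the values and the duration each one was held.
(define (accumulated-history changes end-time)
  (let loop ([cs changes] [vals '()] [durs '()])
    (cond
      [(null? cs) (values (reverse vals) (reverse durs))]
      [else
       (define t (car (car cs)))
       (define v (cadr (car cs)))
       ;; The value is held until the next change, or until end-time.
       (define next-t (if (null? (cdr cs)) end-time (car (cadr cs))))
       (loop (cdr cs) (cons v vals) (cons (- next-t t) durs))])))
```

For example, (accumulated-history '((0 1) (2 2) (3 3) (5 4)) 8) returns the values '(1 2 3 4) and the durations '(2 1 2 3). A tallied history would keep only the first list.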

8.1.5 History Graphics

(history-plot history [title])  any
  history : history?
  title : string? = "History"
Plots history using the specified title. The string "History" is used if title is not specified.

8.1.6 Variable Monitors

Variable monitors are discussed in Chapter 9, Monitors.

8.2 Example—Tally and Accumulate Example

This example shows how the tally and accumulate macros work. Two variables are created: tallied and accumulated. Statistics and history data are collected for each, using tally for the variable tallied and accumulate for the variable accumulated. The process test-process iterates through a list of values and durations, setting each of the variables to the specified value for the specified duration of time. Representative statistics (n, sum, and mean) are printed and the histories are plotted for each of the variables.

#lang racket/base
; Test Tally and Accumulate
 
(require (planet williams/simulation/simulation-with-graphics))
 
(define tallied #f)
(define accumulated #f)
 
(define-process (test-process value-duration-list)
  (for ((vdl (in-list value-duration-list)))
    (let ((value (car vdl))
          (duration (cadr vdl)))
      (set-variable-value! tallied value)
      (set-variable-value! accumulated value)
      (wait duration))))
 
(define (main value-duration-list)
  (with-new-simulation-environment
    (set! tallied (make-variable))
    (tally (variable-statistics tallied))
    (tally (variable-history tallied))
    (set! accumulated (make-variable))
    (accumulate (variable-statistics accumulated))
    (accumulate (variable-history accumulated))
    (schedule #:at 0.0 (test-process value-duration-list))
    (start-simulation)
    (printf "--- Test Tally and Accumulate ---~n")
    (printf "~n--- Tally ---~n")
    (printf "N    = ~a~n" (variable-n tallied))
    (printf "Sum  = ~a~n" (variable-sum tallied))
    (printf "Mean = ~a~n" (variable-mean tallied))
    (printf "~a~n"
            (history-plot (variable-history tallied)))
    (printf "~n--- Accumulate ---~n")
    (printf "N    = ~a~n" (variable-n accumulated))
    (printf "Sum  = ~a~n" (variable-sum accumulated))
    (printf "Mean = ~a~n" (variable-mean accumulated))
    (printf "~a~n"
            (history-plot (variable-history accumulated)))))
 
(main '((1 2)(2 1)(3 2)(4 3)))

The program is run with the following value/duration pairs: ((1 2) (2 1) (3 2) (4 3)). That is, each variable will have a value of 1 for 2 units of time (from time 0 to time 2), a value of 2 for 1 unit of time (from time 2 to time 3), a value of 3 for 2 units of time (from time 3 to time 5), and a value of 4 for 3 units of time (from time 5 to time 8). The simulation ends at time 8.

The following is the resulting output.

--- Test Tally and Accumulate ---

--- Tally ---
N    = 4
Sum  = 10.0
Mean = 2.5

--- Accumulate ---
N    = 8.0
Sum  = 22.0
Mean = 2.75

8.3 Example—Data Collection

The examples in previous chapters (Examples 0, 1, and 2) relied on printf statements to print the output of the simulation model. This was sufficient to show how the models worked, but would be impractical for large models. This example is the same simulation model as Example 2 from Chapter 7, Resources (using with-resource instead of the individual calls to request and relinquish), but uses automatic data collection to collect data.

No explicit variables are needed for this example since resources already provide variables for their satisfied and queue fields—since they are, in turn, implemented using sets. (See Chapter 10, Sets.)

Note that the statements:

(accumulate (variable-statistics
             (resource-queue-variable-n attendant)))
(accumulate (variable-statistics
             (resource-satisfied-variable-n attendant)))

are not required since statistics are accumulated for any variable by default. [Although it would work the same if one or both were included.]

#lang racket/base
; Example 3 - Data Collection
 
(require (planet williams/simulation/simulation-with-graphics))
 
(define n-attendants 2)
(define attendant #f)
 
(define-process (generator n)
  (for ((i (in-range n)))
    (wait (random-exponential 4.0))
    (schedule #:now (customer i))))
 
(define-process (customer i)
  (with-resource (attendant)
    (work (random-flat 2.0 10.0))))
 
(define (run-simulation n)
  (with-new-simulation-environment
   (set! attendant (make-resource n-attendants))
   (schedule #:at 0.0 (generator n))
   (accumulate (variable-statistics (resource-queue-variable-n attendant)))
   (accumulate (variable-history (resource-queue-variable-n attendant)))
   (start-simulation)
   (printf "--- Example 3 - Data Collection ---~n")
   (printf "Maximum queue length = ~a~n"
           (variable-maximum (resource-queue-variable-n attendant)))
   (printf "Average queue length = ~a~n"
           (variable-mean (resource-queue-variable-n attendant)))
   (printf "Variance             = ~a~n"
           (variable-variance (resource-queue-variable-n attendant)))
   (printf "Utilization          = ~a~n"
           (variable-mean (resource-satisfied-variable-n attendant)))
   (printf "Variance             = ~a~n"
           (variable-variance (resource-satisfied-variable-n attendant)))
   (write-special (history-plot (variable-history
                                 (resource-queue-variable-n attendant))))
   (newline)))
 
(run-simulation 1000)

The following is the resulting output.

--- Example 3 - Data Collection ---
Maximum queue length = 8
Average queue length = 0.9120534884951139
Variance             = 2.2453788694193957
Utilization          = 1.4320511974417858
Variance             = 0.5885107114317054

This is the first useful example we’ve shown in the sense that we simulate enough customers to be meaningful and provide statistical output of the simulation.

A few things to note here:

8.4 Data Collection Across Multiple Simulation Runs

Even as simplistic as our example simulation model is, it is still useful in illustrating some advanced data collection techniques. In particular, in this section we will show how to collect statistics across multiple simulation runs.

8.4.1 Open Loop Processing

Open loop processing is a technique where a resource is considered to have an infinite number of units available for allocation. That is, no process will ever block waiting for such a resource. Statistics on the demand for such resources can be collected by looking at the resource-satisfied-variable-n variable. Typically, this is done in a Monte Carlo fashion across multiple simulation runs.

In the simulation collection, we denote an open-loop resource by specifying an infinite number of units when the resource is created. In Racket, +inf.0 denotes positive infinity and is used in specifying an open-loop resource.
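The quantity an open-loop run measures can be sketched without the simulation machinery: when no one ever waits, the number of units in use at any time equals the number of overlapping service intervals. The following plain-Racket sketch computes the peak of that count for a fixed list of customers. The function peak-demand and its input format are hypothetical, and processing departures before arrivals at equal times is an assumption of this sketch, not a documented property of the simulation collection.

```racket
#lang racket/base
;; Hypothetical sketch: peak number of units in use with unlimited units.
;; customers: list of (arrival-time service-duration) pairs.
(define (peak-demand customers)
  ;; One +1 event per arrival and one -1 event per departure.
  (define events
    (apply append
           (for/list ([c (in-list customers)])
             (list (cons (car c) 1)
                   (cons (+ (car c) (cadr c)) -1)))))
  ;; Sort by time; at equal times, departures (-1) come first (assumption).
  (define sorted
    (sort events
          (lambda (a b)
            (or (< (car a) (car b))
                (and (= (car a) (car b)) (< (cdr a) (cdr b)))))))
  ;; Scan the events, tracking units in use and the maximum seen so far.
  (define-values (in-use peak)
    (for/fold ([in-use 0] [peak 0]) ([e (in-list sorted)])
      (define now-in-use (+ in-use (cdr e)))
      (values now-in-use (max peak now-in-use))))
  peak)
```

For example, (peak-demand '((0 5) (1 3) (2 4) (6 1))) returns 3, because the first three customers overlap between times 2 and 4. In a Monte Carlo study, one such peak would be produced per run and tallied across runs.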

8.4.2 Example—Open Loop Processing

This example collects statistics on the maximum number of attendants required in the system (a measure of demand) when there is no blocking.

There is an outer simulation environment that exists solely for data collection and a variable max-attendants to gather statistics on the maximum number of attendants required. Note that these statistics must be tallied at this level because (simulated) time does not exist across multiple simulation runs.

The inner loop creates a new simulation environment for each simulation run. This ensures that each run is properly initialized. It is in this inner loop that the attendant resource is created with an infinite number of units using (make-resource +inf.0). When the simulation in the inner loop terminates, the max-attendants variable (in the outer loop) is updated with the maximum number of attendants from the simulation. This is done with:

(set-variable-value! max-attendants
  (variable-maximum
   (resource-satisfied-variable-n attendant)))

Finally, the statistics and histogram of the maximum attendants across all of the simulation runs are printed.

#lang racket/base
; Open Loop Example
 
(require (planet williams/simulation/simulation-with-graphics))
 
(define attendant #f)
 
(define-process (generator n)
  (for ((i (in-range n)))
    (wait (random-exponential 4.0))
    (schedule #:now (customer i))))
 
(define-process (customer i)
  (with-resource (attendant)
    (wait/work (random-flat 2.0 10.0))))
 
(define (run-simulation n1 n2)
  (with-new-simulation-environment
   (let ((max-attendants (make-variable)))
     (tally (variable-statistics max-attendants))
     (tally (variable-history max-attendants))
     (for ((i (in-range n1)))
       (with-new-simulation-environment
        (set! attendant (make-resource +inf.0))
        (schedule #:at 0.0 (generator n2))
        (start-simulation)
        (set-variable-value! max-attendants
         (variable-maximum (resource-satisfied-variable-n attendant)))))
     (printf "--- Open Loop Example ---~n")
     (printf "Number of experiments      = ~a~n"
             (variable-n max-attendants))
     (printf "Minimum maximum attendants = ~a~n"
             (variable-minimum max-attendants))
     (printf "Maximum maximum attendants = ~a~n"
             (variable-maximum max-attendants))
     (printf "Mean maximum attendants    = ~a~n"
             (variable-mean max-attendants))
     (printf "Variance                   = ~a~n"
             (variable-variance max-attendants))
     (write-special (history-plot (variable-history max-attendants)
                                  "Maximum Attendants"))
     (newline))))
 
(run-simulation 1000 1000)

The following shows the output of the simulation for 1000 runs of 1000 customers each.

--- Open Loop Example ---
Number of experiments      = 1000
Minimum maximum attendants = 6
Maximum maximum attendants = 11
Mean maximum attendants    = 7.525
Variance                   = 0.6653749999999903

This can be interpreted as saying that in order to service all customers with no wait time for any customer, a minimum of six and a maximum of eleven attendants were required, with a mean of approximately 7.5.

Note the use of write-special to output the history plot (instead of the more convenient printf). This will produce the graphical plot when the output is directed to an editor canvas in GRacket as well as in DrRacket. The (newline) call performs the same function as the ~n in printf.

8.4.3 Closed Loop Processing

Closed loop processing is the "normal" processing mode in a simulation model, where the number of units of a resource is specified and processes are queued (i.e., blocked) when there are not sufficient units of the resource to satisfy a request. Statistics on the contention for such resources can be collected by looking at the resource-queue-variable-n variable. Typically, this is done across multiple simulation runs.
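The contention measure can also be sketched in plain Racket. The time integral of the queue length equals the sum of the individual waiting times, so the time-averaged queue length is that sum divided by the length of the run. The function average-queue-length below is hypothetical; it assumes first-come, first-served allocation, at least one unit, customers sorted by arrival time, and a run that starts at time 0 and ends at the last departure. None of these assumptions is taken from the simulation collection.

```racket
#lang racket/base
;; Hypothetical sketch: time-averaged queue length for a resource with a
;; fixed number of units and first-come, first-served allocation.
;; n-units:   positive number of units.
;; customers: list of (arrival-time service-duration), sorted by arrival.
(define (average-queue-length n-units customers)
  (let loop ([cs customers]
             ;; Time at which each unit next becomes free.
             [free-at (for/list ([i (in-range n-units)]) 0)]
             [total-wait 0]
             [end-time 0])
    (cond
      [(null? cs) (if (zero? end-time) 0 (/ total-wait end-time))]
      [else
       (define arrival (car (car cs)))
       (define service (cadr (car cs)))
       ;; The customer takes the unit that becomes free earliest.
       (define sorted (sort free-at <))
       (define start (max arrival (car sorted)))
       (define finish (+ start service))
       (loop (cdr cs)
             (cons finish (cdr sorted))
             (+ total-wait (- start arrival))
             (max end-time finish))])))
```

For the customers '((0 4) (1 2) (2 2)), one unit gives waits of 0, 3, and 4 over a run of length 8, so the average queue length is 7/8. With two units, only the third customer waits (for 1 time unit), the run ends at time 5, and the result is 1/5.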

8.4.4 Example—Closed Loop Processing

This example collects statistics on the average attendant queue length in the system (a measure of contention) when there is a specified number of attendants.

There is an outer simulation environment that exists solely for data collection and a variable avg-queue-length to gather statistics on the average attendant queue length. Note that the statistics must be tallied at this level because (simulated) time does not exist across multiple simulation runs.

The inner loop creates a new simulation environment for each simulation run. This ensures that each run is properly initialized. It is in this inner loop that the attendant resource is created with the specified number of units using (make-resource n-attendants). When the simulation in the inner loop terminates, the avg-queue-length variable (in the outer loop) is updated with the average attendant queue length from the simulation. This is done with:

(set-variable-value! avg-queue-length
  (variable-mean
   (resource-queue-variable-n attendant)))

Finally, the statistics and histogram of the average attendant queue length across all of the simulation runs are printed.

#lang racket/base
; Closed Loop Example
 
 
(require (planet williams/simulation/simulation-with-graphics))
 
(define n-attendants 2)
(define attendant #f)
 
(define-process (generator n)
  (for ((i (in-range n)))
    (wait (random-exponential 4.0))
    (schedule #:now (customer i))))
 
(define-process (customer i)
  (with-resource (attendant)
    (work (random-flat 2.0 10.0))))
 
(define (run-simulation n1 n2)
  (let ((avg-queue-length (make-variable)))
    (tally (variable-statistics avg-queue-length))
    (tally (variable-history avg-queue-length))
    (for ((i (in-range n1)))
      (with-new-simulation-environment
       (set! attendant (make-resource n-attendants))
       (schedule #:at 0.0 (generator n2))
       (start-simulation)
       (set-variable-value! avg-queue-length
         (variable-mean (resource-queue-variable-n attendant)))))
    (printf "--- Closed Loop Example ---~n")
    (printf "Number of attendants         = ~a~n" n-attendants)
    (printf "Number of experiments        = ~a~n"
            (variable-n avg-queue-length))
    (printf "Minimum average queue length = ~a~n"
            (variable-minimum avg-queue-length))
    (printf "Maximum average queue length = ~a~n"
            (variable-maximum avg-queue-length))
    (printf "Mean average queue length    = ~a~n"
            (variable-mean avg-queue-length))
    (printf "Variance                     = ~a~n"
            (variable-variance avg-queue-length))
    (print (history-plot (variable-history avg-queue-length)
                                 "Average Queue Length"))
    (newline)))
 
(run-simulation 1000 1000)

The following shows the output of the simulation for 1000 runs of 1000 customers each.

--- Closed Loop Example ---
Number of attendants         = 2
Number of experiments        = 1000
Minimum average queue length = 0.5792057912006373
Maximum average queue length = 3.182757214703683
Mean average queue length    = 1.1123279920475524
Variance                     = 0.08869696318792064

This shows that with two attendants, on average, over 1000 runs of 1000 customers, there were 1.1 people in the queue waiting for an attendant.