Data Collection
The purpose of most simulation models is to collect data that can be analyzed to gain insight into the system being simulated. In the PLT Scheme Simulation Collection, numeric data subject to automatic data collection is stored in variable structures (i.e., variables).
Data about a variable may be collected either in a time-dependent manner, specified using the accumulate macro, or in a time-independent manner, specified using the tally macro.
Currently, either statistics data or history data may be automatically collected for a variable. (Both may in turn be either time dependent or time independent.) History data allows more sophisticated analysis to be performed on the data using other analysis tools. Also, a function to plot history data is provided.
8.1 Variables
A variable represents a numeric variable in the model for which data can automatically be collected, as specified by the model builder.
8.1.1 The variable Structure
Structure:
variable
Contract:
(struct variable
  ((initial-value (union/c (symbols uninitialized) real?))
   (value (union/c (symbols uninitialized) real?))
   (time-last-synchronized real?)
   (statistics (union/c statistics? #f))
   (history (union/c history? #f))
   (continuous? boolean?)
   (state-index (integer-in 1 +inf.0))
   (get-monitors list?)
   (set-monitors list?)))

The variable structure represents variables in the simulation model. Its fields are those listed in the contract above.
Function:
(make-variable initial-value)
(make-variable)
Contract:
(case-> (-> real? variable?)
        (-> variable?))

Returns a new variable with the given initial-value. If initial-value is not specified, 'uninitialized is used.
By default, all variables accumulate statistics on their values. To turn this off, set the statistics field to #f.
To create continuous variables, see Chapter 10 Continuous Simulation Models.
8.2 Tally and Accumulate
The tally and accumulate macros specify data collection for variables.
8.2.1 Tally
Macro:
(tally (variable-statistics variable))
(tally (variable-history variable))

variable-statistics specifies that statistics are to be tallied for variable. variable-history specifies that a history is to be tallied for variable. Each time a variable value is changed, any tallied data collectors are updated with the new value.
8.2.2 Accumulate
Macro:
(accumulate (variable-statistics variable))
(accumulate (variable-history variable))

variable-statistics specifies that statistics are to be accumulated for variable. variable-history specifies that a history is to be accumulated for variable. Each time a variable data collector is accessed, or before a variable value is changed, any accumulated data collectors are synchronized with the current value over the time since it was last synchronized.
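The difference between the two macros can be pictured with a small model in plain Scheme. This is only a sketch of the weighting idea, not the collection's implementation: the helpers tally-mean and accumulate-mean and the observations list are hypothetical names introduced for illustration.

```scheme
;; Illustrative model only: these helpers are hypothetical and are not
;; part of the simulation collection.

;; Each observation is a list (value duration): the variable held
;; `value` for `duration` units of simulated time.
(define observations '((10 1) (0 9)))

;; Tally: every value counts once, however long it was held.
(define (tally-mean obs)
  (let loop ((obs obs) (n 0) (sum 0))
    (if (null? obs)
        (/ sum n)
        (loop (cdr obs) (+ n 1) (+ sum (caar obs))))))

;; Accumulate: every value is weighted by the time it was held.
(define (accumulate-mean obs)
  (let loop ((obs obs) (n 0) (sum 0))
    (if (null? obs)
        (/ sum n)
        (let ((value (caar obs))
              (duration (cadar obs)))
          (loop (cdr obs) (+ n duration) (+ sum (* value duration)))))))

(display (tally-mean observations))      ; prints 5
(newline)
(display (accumulate-mean observations)) ; prints 1
(newline)
```

A value of 10 held for one time unit followed by a value of 0 held for nine time units has a tallied mean of 5 but a time-weighted mean of 1, which is why quantities such as queue lengths are normally accumulated rather than tallied.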
8.3 Statistics and Histories
8.3.1 Statistics
Structure:
statistics
Contract:
(struct statistics
  ((time-dependent? boolean?)
   (minimum real?)
   (maximum real?)
   (n real?)
   (sum real?)
   (sum-of-squares real?)))

The statistics structure maintains statistics for a variable. Table 1 shows the statistics that are gathered and how they are computed for both tally and accumulate.

Table 1: Statistics collected and how they are computed for both tallied and accumulated data collectors.

time_C = current simulation time
time_L = simulation time the variable was set to its current value
time_0 = simulation time the variable was created
X = variable value before change occurs
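As a hedged sketch of how these symbols could enter the computation, the running quantities might be updated as follows. The update rules are inferred from the field names of the statistics structure and from the descriptions of tally and accumulate; they are an assumption, not a reproduction of Table 1. Here x denotes a newly assigned value, sumsq abbreviates sum-of-squares, and the mean and variance formulas are likewise assumed.

```latex
\begin{align*}
\textbf{tally:}\quad
  n &\leftarrow n + 1, &
  \mathit{sum} &\leftarrow \mathit{sum} + x, &
  \mathit{sumsq} &\leftarrow \mathit{sumsq} + x^{2} \\
\textbf{accumulate:}\quad
  n &\leftarrow n + (time_{C} - time_{L}), &
  \mathit{sum} &\leftarrow \mathit{sum} + X\,(time_{C} - time_{L}), &
  \mathit{sumsq} &\leftarrow \mathit{sumsq} + X^{2}\,(time_{C} - time_{L}) \\
\textbf{both:}\quad
  \mathit{mean} &= \frac{\mathit{sum}}{n}, &
  \mathit{variance} &= \frac{\mathit{sumsq}}{n} - \mathit{mean}^{2}
\end{align*}
```

Under these assumptions, n for an accumulated collector that has held a value since creation equals time_C - time_0 after synchronization, which agrees with the value N = 8.0 reported for the accumulated variable in the example in Section 8.3.3.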
8.3.2 History
Structure:
history
Contract:
(struct history
  ((time-dependent? boolean?)
   (initial-time real?)
   (n real?)
   (values list?)
   (last-value-cell (union/c pair? #f))
   (durations list?)
   (last-duration-cell (union/c pair? #f))))

The history structure maintains a history of the values of a variable. For accumulated histories (i.e., those specified using the accumulate macro), the durations for each value are also computed.
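The notion of a duration per value can be illustrated in plain Scheme. The helper below is hypothetical and is not how the collection stores or computes its history; it simply shows that the duration of each value is the time until the next change, with the last value lasting until the final synchronization time.

```scheme
;; Hypothetical helper, not part of the simulation collection.
;; times    : simulation times at which the variable was set, in order
;; end-time : time at which the history is finally synchronized
(define (durations times end-time)
  (cond ((null? times) '())
        ((null? (cdr times))
         ;; The last value lasts until the end time.
         (list (- end-time (car times))))
        (else
         ;; Every other value lasts until the next change.
         (cons (- (cadr times) (car times))
               (durations (cdr times) end-time)))))

(display (durations '(0 2 3 5) 8)) ; prints (2 1 2 3)
(newline)
```

A variable set at times 0, 2, 3, and 5 and synchronized at time 8 therefore has durations 2, 1, 2, and 3 for its four values.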
8.3.2.1 History Graphics
Function:
(history-plot history title)
(history-plot history)
Contract:
(case-> (-> history? string? any)
        (-> history? any))

Plots history using the given title. "History" is used if title is not specified.
8.3.3 Example - Tally and Accumulate Example
This example shows how the tally and accumulate macros work. Two variables are created, tallied and accumulated. Statistics and history data are collected for each, using tally for the variable tallied and accumulate for the variable accumulated. The process test-process iterates through a list of values and durations, setting each of the variables to the specified value for the specified duration of time. Representative statistics (n, sum, and mean) are printed and the histories plotted for each of the variables.

;; Test Tally and Accumulate
(require (planet "simulation-with-graphics.ss" ("williams" "simulation.plt")))

(define tallied #f)
(define accumulated #f)

;; Set both variables to each value and hold it for its duration.
(define-process (test-process value-duration-list)
  (let loop ((vdl value-duration-list))
    (when (not (null? vdl))
      (let ((value (caar vdl))
            (duration (cadar vdl)))
        (set-variable-value! tallied value)
        (set-variable-value! accumulated value)
        (wait duration)
        (loop (cdr vdl))))))

(define (main value-duration-list)
  (with-new-simulation-environment
    (set! tallied (make-variable))
    (tally (variable-statistics tallied))
    (tally (variable-history tallied))
    (set! accumulated (make-variable))
    (accumulate (variable-statistics accumulated))
    (accumulate (variable-history accumulated))
    (schedule (at 0.0) (test-process value-duration-list))
    (start-simulation)
    (printf " Test Tally and Accumulate ~n")
    (printf "~n Tally ~n")
    (printf "N = ~a~n" (variable-n tallied))
    (printf "Sum = ~a~n" (variable-sum tallied))
    (printf "Mean = ~a~n" (variable-mean tallied))
    (printf "~a~n" (history-plot (variable-history tallied)))
    (printf "~n Accumulate ~n")
    (printf "N = ~a~n" (variable-n accumulated))
    (printf "Sum = ~a~n" (variable-sum accumulated))
    (printf "Mean = ~a~n" (variable-mean accumulated))
    (printf "~a~n" (history-plot (variable-history accumulated)))))
Here are the results of running the program for the following value, duration pairs: ((1 2)(2 1)(3 2)(4 3)). That is, each variable will have a value of 1 for 2 units of time (from time 0 to time 2), a value of 2 for 1 unit of time (from time 2 to time 3), a value of 3 for 2 units of time (from time 3 to time 5), and a value of 4 for 3 units of time (from time 5 to time 8). The simulation ends at time 8.
> (main '((1 2) (2 1) (3 2) (4 3)))
 Test Tally and Accumulate

 Tally
N = 4
Sum = 10.0
Mean = 2.5

 Accumulate
N = 8.0
Sum = 22.0
Mean = 2.75
>
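The printed figures can be checked by hand from the value, duration pairs ((1 2)(2 1)(3 2)(4 3)). The worked arithmetic below assumes that a tallied collector counts each value once, while an accumulated collector weights each value by its duration and counts elapsed time in N.

```latex
\begin{align*}
\text{Tally:}\quad
  N &= 4, &
  \mathit{Sum} &= 1 + 2 + 3 + 4 = 10, &
  \mathit{Mean} &= 10 / 4 = 2.5 \\
\text{Accumulate:}\quad
  N &= 2 + 1 + 2 + 3 = 8, &
  \mathit{Sum} &= 1\cdot 2 + 2\cdot 1 + 3\cdot 2 + 4\cdot 3 = 22, &
  \mathit{Mean} &= 22 / 8 = 2.75
\end{align*}
```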
8.3.4 Variable Monitors
Variable monitors are discussed in the Monitors chapter.
8.4 Example - Data Collection
The previous examples (Examples 0, 1, and 2) relied on printf statements to print the output of the simulation model. This was sufficient to show how the models worked, but would be impractical for large models. This example is the same simulation model as Example 2 (using with-resource instead of the individual calls to resource-request and resource-relinquish), but with the printf statements removed.
No explicit variables are needed for this example, since resources already provide variables for their satisfied and queue fields, which are in turn implemented using sets.
Note that the statement:
(accumulate (variable-statistics (resource-queue-variable-n attendant)))
isn't actually needed, since statistics are accumulated for any variable by default. It is included as an example. Note that the corresponding accumulate is not included for the satisfied field, yet its statistics are still available.
; Example 3 - Data Collection
(require (planet "simulation-with-graphics.ss" ("williams" "simulation.plt")))
(require (planet "random-distributions.ss" ("williams" "science.plt")))

(define n-attendants 2)
(define attendant #f)

;; Generate n customers with exponentially distributed interarrival times.
(define-process (generator n)
  (do ((i 0 (+ i 1)))
      ((= i n) (void))
    (wait (random-exponential 4.0))
    (schedule now (customer i))))

;; Each customer holds one attendant for a uniformly distributed time.
(define-process (customer i)
  (with-resource (attendant)
    (work (random-flat 2.0 10.0))))

(define (run-simulation n)
  (with-new-simulation-environment
    (set! attendant (make-resource n-attendants))
    (schedule (at 0.0) (generator n))
    (accumulate (variable-statistics (resource-queue-variable-n attendant)))
    (accumulate (variable-history (resource-queue-variable-n attendant)))
    (start-simulation)
    (printf " Example 3  Data Collection ~n")
    (printf "Maximum queue length = ~a~n"
            (variable-maximum (resource-queue-variable-n attendant)))
    (printf "Average queue length = ~a~n"
            (variable-mean (resource-queue-variable-n attendant)))
    (printf "Variance = ~a~n"
            (variable-variance (resource-queue-variable-n attendant)))
    (printf "Utilization = ~a~n"
            (variable-mean (resource-satisfied-variable-n attendant)))
    (printf "Variance = ~a~n"
            (variable-variance (resource-satisfied-variable-n attendant)))
    (print (history-plot (variable-history (resource-queue-variable-n attendant))))))

Here is the output for the example when run for 1000 customers.

> (run-simulation 1000)
 Example 3  Data Collection
Maximum queue length = 8
Average queue length = 0.9120534884951139
Variance = 2.2420855874934826
Utilization = 1.4320511974417858
Variance = 0.5885107114317054
>
8.5 Data Collection Across Multiple Simulation Runs
Even as simplistic as our example has been, it is still useful in illustrating some advanced data collection techniques. In particular, we will show how to collect statistics across multiple runs.
8.5.1 Open Loop Processing
Open Loop processing is a technique where a resource is considered to have an infinite number of units. That is, no process will ever block waiting for such a resource. Statistics on the demand for such resources can be collected by looking at the resource-satisfied-variable-n variable. Typically, this is done across multiple simulation runs.
In the simulation collection, an open-loop resource is denoted by specifying an infinite number of units when it is created. In PLT Scheme, +inf.0 denotes positive infinity.
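The arithmetic behind this choice can be seen in plain Scheme, without the simulation library. The sketch below only demonstrates how +inf.0 behaves under comparison and subtraction; the names units-available and enough-units? are hypothetical, and nothing here is claimed about how the collection itself checks resource requests.

```scheme
;; Plain Scheme arithmetic; no simulation library is involved.
(define units-available +inf.0)

;; Hypothetical check: could a request for `requested` units be satisfied?
(define (enough-units? available requested)
  (>= available requested))

;; Any finite request fits within an infinite pool.
(display (enough-units? units-available 1000000)) ; prints #t
(newline)

;; Taking units away from an infinite pool leaves it infinite.
(display (= (- units-available 1000000) +inf.0))  ; prints #t
(newline)
```

Because every finite request compares as smaller than +inf.0, a resource created with that many units never has a reason to queue a process.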
8.5.1.1 Example - Open Loop Processing
This example collects statistics on the maximum number of attendants required in the system (e.g. a measure of demand) when there is no blocking.
There is an outer simulation environment that exists solely for data collection and a variable max-attendants to gather statistics on the maximum number of attendants required. Note that these statistics must be tallied at this level because (simulated) time does not exist across multiple simulation runs.
The inner loop creates a new simulation environment for each simulation run. This ensures each run is properly initialized. It is in this inner loop that the attendant resource is created with an infinite number of units: (make-resource +inf.0). When the simulation in the inner loop terminates, the max-attendants variable is updated with the maximum number of attendants from the simulation. This is done with:
(set-variable-value! max-attendants (variable-maximum (resource-satisfied-variable-n attendant)))
Finally, the statistics and histogram of the maximum attendants across all of the simulation runs are printed.

; Open Loop Example
(require (planet "simulation-with-graphics.ss" ("williams" "simulation.plt")))
(require (planet "random-distributions.ss" ("williams" "science.plt")))

(define attendant #f)

(define (generator n)
  (do ((i 0 (+ i 1)))
      ((= i n) (void))
    (wait (random-exponential 4.0))
    (schedule now (customer i))))

(define-process (customer i)
  (with-resource (attendant)
    (wait/work (random-flat 2.0 10.0))))

(define (run-simulation n1 n2)
  ;; Outer environment: used only for data collection across runs.
  (with-new-simulation-environment
    (let ((max-attendants (make-variable)))
      (tally (variable-statistics max-attendants))
      (tally (variable-history max-attendants))
      (do ((i 1 (+ i 1)))
          ((> i n1) (void))
        ;; Inner environment: a fresh simulation for each run.
        (with-new-simulation-environment
          (set! attendant (make-resource +inf.0))
          (schedule (at 0.0) (generator n2))
          (start-simulation)
          (set-variable-value! max-attendants
            (variable-maximum (resource-satisfied-variable-n attendant)))))
      (printf " Open Loop Example ~n")
      (printf "Number of experiments = ~a~n" (variable-n max-attendants))
      (printf "Minimum maximum attendants = ~a~n" (variable-minimum max-attendants))
      (printf "Maximum maximum attendants = ~a~n" (variable-maximum max-attendants))
      (printf "Mean maximum attendants = ~a~n" (variable-mean max-attendants))
      (printf "Variance = ~a~n" (variable-variance max-attendants))
      (print (history-plot (variable-history max-attendants) "Maximum Attendants"))
      (newline))))

The following shows the output of the simulation for 1000 runs of 1000 customers each.

> (run-simulation 1000 1000)
 Open Loop Example
Number of experiments = 1000
Minimum maximum attendants = 6
Maximum maximum attendants = 11
Mean maximum attendants = 7.525
Variance = 0.6653749999999903
>
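The across-run tally can be pictured in plain Scheme as a summary over one number per run. This is a sketch under stated assumptions: run-results holds hypothetical values rather than simulation output, summarize is a hypothetical helper, and the variance is computed as the mean of squares minus the square of the mean, which is assumed here and not confirmed as the formula the collection uses.

```scheme
;; Hypothetical per-run results, for illustration only
;; (these are not output from the simulation above).
(define run-results '(6 7 8 7))

;; Summarize per-run values the way a tallied variable would:
;; each run contributes exactly one observation, with no time weighting.
;; Returns (n minimum maximum mean variance).
(define (summarize results)
  (let* ((n (length results))
         (sum (apply + results))
         (sum-of-squares (apply + (map (lambda (x) (* x x)) results)))
         (mean (/ sum n)))
    (list n
          (apply min results)
          (apply max results)
          mean
          ;; Assumed formula: mean of squares minus squared mean.
          (- (/ sum-of-squares n) (* mean mean)))))

(display (summarize run-results)) ; prints (4 6 8 7 1/2)
(newline)
```

Each run supplies a single observation, so tallying is the natural choice here: there is no shared simulated clock across runs by which the values could be weighted.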
8.5.2 Closed Loop Processing
Closed Loop processing is the "normal" processing where the number of units of a resource is specified and processes are queued (i.e., blocked) when there are not sufficient units of the resource to satisfy a request. Statistics on the utilization of such resources can be collected by looking at the resource-queue-variable-n variable. Typically, this is done across multiple simulation runs.
8.5.2.1 Example - Closed Loop Processing
This example collects statistics on the average attendant queue length in the system (e.g. a measure of utilization) when there is a specified number of attendants.
There is an outer simulation environment that exists solely for data collection and a variable avg-queue-length to gather statistics on the average attendant queue length. Note that these statistics must be tallied at this level because (simulated) time does not exist across multiple simulation runs.
The inner loop creates a new simulation environment for each simulation run. This ensures each run is properly initialized. It is in this inner loop that the attendant resource is created with the specified number of units: (make-resource n-attendants). When the simulation in the inner loop terminates, the avg-queue-length variable is updated with the average attendant queue length from the simulation. This is done with:
(set-variable-value! avg-queue-length (variable-mean (resource-queue-variable-n attendant)))
Finally, the statistics and histogram of the average attendant queue length across all of the simulation runs are printed.

; Closed Loop Example
(require (planet "simulation-with-graphics.ss" ("williams" "simulation.plt")))
(require (planet "random-distributions.ss" ("williams" "science.plt")))

(define n-attendants 2)
(define attendant #f)

(define-process (generator n)
  (do ((i 0 (+ i 1)))
      ((= i n) (void))
    (wait (random-exponential 4.0))
    (schedule now (customer i))))

(define-process (customer i)
  (with-resource (attendant)
    (work (random-flat 2.0 10.0))))

(define (run-simulation n1 n2)
  ;; Outer environment: used only for data collection across runs,
  ;; as described in the text above.
  (with-new-simulation-environment
    (let ((avg-queue-length (make-variable)))
      (tally (variable-statistics avg-queue-length))
      (tally (variable-history avg-queue-length))
      (do ((i 1 (+ i 1)))
          ((> i n1) (void))
        ;; Inner environment: a fresh simulation for each run.
        (with-new-simulation-environment
          (set! attendant (make-resource n-attendants))
          (schedule (at 0.0) (generator n2))
          (start-simulation)
          (set-variable-value! avg-queue-length
            (variable-mean (resource-queue-variable-n attendant)))))
      (printf " Closed Loop Example ~n")
      (printf "Number of attendants = ~a~n" n-attendants)
      (printf "Number of experiments = ~a~n" (variable-n avg-queue-length))
      (printf "Minimum average queue length = ~a~n" (variable-minimum avg-queue-length))
      (printf "Maximum average queue length = ~a~n" (variable-maximum avg-queue-length))
      (printf "Mean average queue length = ~a~n" (variable-mean avg-queue-length))
      (printf "Variance = ~a~n" (variable-variance avg-queue-length))
      (print (history-plot (variable-history avg-queue-length) "Average Queue Length"))
      (newline))))

The following shows the output of the simulation for 1000 runs of 1000 customers each.

> (run-simulation 1000 1000)
 Closed Loop Example
Number of attendants = 2
Number of experiments = 1000
Minimum average queue length = 0.5792057912006373
Maximum average queue length = 3.182757214703683
Mean average queue length = 1.1123279920475524
Variance = 0.08869696318792064
>