Universally Unique Identifiers (UUIDs)

by Doug Williams

m.douglas.williams at gmail.com

This library provides Universally Unique Identifiers (UUIDs) as defined in RFC 4122. A UUID is a 128-bit value that is externally encoded as a string in 8-4-4-4-12 format. This library provides functions for constructing time-based (type 1), name-based using MD5 hashing (type 3), (pseudo-)random (type 4), and name-based using SHA-1 hashing (type 5) UUIDs. A copy of RFC 4122 is included with this library.

An example of a time-based (type 1) UUID is:

f81d4fae-7dec-11d0-a765-00a0c91e6bf6

which was generated on Monday, February 3, 1997 at 5:43:12 PM GMT on a machine with an IEEE 802 MAC address of 00-a0-c9-1e-6b-f6.
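To make the 8-4-4-4-12 encoding concrete, here is a minimal Python sketch (plain Python, not this library's API) that formats a 128-bit integer as a UUID string and reads the MAC address out of the last group. The helper name format_uuid is hypothetical.

```python
def format_uuid(value):
    """Format a 128-bit integer as an 8-4-4-4-12 hexadecimal string."""
    h = f"{value:032x}"  # 32 hexadecimal digits, zero-padded
    return "-".join((h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]))

value = 0xF81D4FAE7DEC11D0A76500A0C91E6BF6
text = format_uuid(value)

# The last group is the 48-bit node field, here the IEEE 802 MAC address.
node_hex = text[24:]
mac = "-".join(node_hex[i:i + 2] for i in range(0, 12, 2))
print(text, mac)
```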

The UUID library is available from the PLaneT repository.

 (require (planet williams/uuid/uuid))

    1 Interface

      1.1 Time-Based (Type 1) UUIDs

      1.2 Name-Based (Types 3 and 5) UUIDs

      1.3 Pseudo-Random (Type 4) UUIDs

    2 Example

    3 Issues and Comments

1 Interface

The UUID library provides the following functions:

(uuid? x)  boolean?
  x : any/c
Returns #t if x is a UUID.

(uuid-RFC-4122? uuid)  boolean?
  uuid : uuid?
Returns #t if uuid is a UUID of the variant defined by RFC 4122. Note that the functions in this library only create and manipulate this variant of UUID.

(uuid-version uuid)
  uuid : uuid?
Returns the version (i.e., type, or more accurately, sub-type) of uuid. The currently known versions are:

Version  Description
1        The time-based version specified in RFC 4122.
2        DCE Security version, with embedded POSIX UIDs.
3        The name-based version specified in RFC 4122 that uses MD5 hashing.
4        The randomly or pseudo-randomly generated version specified in RFC 4122.
5        The name-based version specified in RFC 4122 that uses SHA-1 hashing.

The version is more accurately a sub-type, but the term is retained for compatibility.
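As an illustration of where the version and variant live in the 128-bit value, the following Python sketch reads them with bit operations. It uses Python's standard uuid module only to parse the string; the helper name version_and_variant is hypothetical and is not part of this library.

```python
import uuid

def version_and_variant(text):
    """Return (version, is_rfc4122) read directly from the 128-bit value."""
    n = uuid.UUID(text).int
    # The version is the high nibble of octet 6 (bits 79..76).
    version = (n >> 76) & 0xF
    # The RFC 4122 variant has "10" in the top two bits of octet 8 (bits 63..62).
    is_rfc4122 = ((n >> 62) & 0b11) == 0b10
    return version, is_rfc4122

print(version_and_variant("f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
print(version_and_variant("00000000-0000-0000-0000-000000000000"))
```

The nil UUID reports version 0 and is not of the RFC 4122 variant, because all of its bits are zero.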

The following routines compare UUIDs by treating each as equivalent to an unsigned 128-bit integer.

(uuid=? uuid-1 uuid-2)  boolean?
  uuid-1 : uuid?
  uuid-2 : uuid?
Returns #t if uuid-1 = uuid-2.

(uuid<? uuid-1 uuid-2)  boolean?
  uuid-1 : uuid?
  uuid-2 : uuid?
Returns #t if uuid-1 < uuid-2.

(uuid>? uuid-1 uuid-2)  boolean?
  uuid-1 : uuid?
  uuid-2 : uuid?
Returns #t if uuid-1 > uuid-2.
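The integer interpretation can be sketched in Python with the standard uuid module, whose int attribute exposes the unsigned 128-bit value. This is an independent illustration, not this library's implementation; the two UUIDs are the ones compared in the Example section.

```python
import uuid

u4 = uuid.UUID("ab595962-0a37-4520-8bef-afc559955201")
dns = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")

# Each comparison is an ordinary comparison of unsigned 128-bit integers.
print(u4.int == u4.int)
print(u4.int == dns.int)
print(u4.int < dns.int)
print(u4.int > dns.int)
```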

(hex-string? x)  boolean?
  x : any/c
Returns #t if x is a hexadecimal string.

(hex-string->uuid hex-string)  uuid?
  hex-string : hex-string?
Returns the UUID represented by hex-string. An error is raised if hex-string is not exactly 32 characters in length.

(uuid-string? x)  boolean?
  x : any/c
Returns #t if x is a string representing a UUID. The string representation of a UUID may be a 32-character hexadecimal string, a 36-character string in 8-4-4-4-12 format, or a 45-character UUID URN ("urn:uuid:" prepended to an 8-4-4-4-12 formatted UUID).

(string->uuid string)  (or/c uuid? false/c)
  string : string?
Returns the UUID represented by string or #f if the string is not a UUID.

Examples:

(string->uuid "f81d4fae7dec11d0a76500a0c91e6bf6")
#<uuid f81d4fae-7dec-11d0-a765-00a0c91e6bf6>

(string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
#<uuid f81d4fae-7dec-11d0-a765-00a0c91e6bf6>
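The three accepted string forms can be sketched as a small recognizer. The following Python example is an independent illustration using the standard re and uuid modules; the names UUID_STRING and parse_uuid are hypothetical, and treating the "urn:uuid:" prefix as lowercase-only is an assumption, since the text above does not say how this library handles case.

```python
import re
import uuid

HEX32 = r"[0-9a-fA-F]{32}"
DASHED = (r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
          r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}")
# 32 hex digits, 8-4-4-4-12, or "urn:uuid:" followed by 8-4-4-4-12.
UUID_STRING = re.compile(rf"(?:{HEX32}|{DASHED}|urn:uuid:{DASHED})")

def parse_uuid(text):
    """Return a uuid.UUID for any of the three forms, or None otherwise."""
    if UUID_STRING.fullmatch(text) is None:
        return None
    # uuid.UUID itself strips the "urn:uuid:" prefix and the hyphens.
    return uuid.UUID(text)

print(parse_uuid("f81d4fae7dec11d0a76500a0c91e6bf6"))
print(parse_uuid("urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
print(parse_uuid("not a uuid"))
```

Returning None for unrecognized input mirrors the #f result described above.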

(uuid->hex-string uuid)  string?
  uuid : uuid?
Returns uuid as a 32-character hexadecimal string.

Example:

(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(uuid->hex-string UUID)
"f81d4fae7dec11d0a76500a0c91e6bf6"

(uuid->string uuid)  string?
  uuid : uuid?
Returns uuid as a 36-character string in 8-4-4-4-12 format.

Example:

(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(uuid->string UUID)
"f81d4fae-7dec-11d0-a765-00a0c91e6bf6"

(uuid->urn-string uuid)  string?
  uuid : uuid?
Returns uuid as a Uniform Resource Name (URN), which is "urn:uuid:" prepended to the 8-4-4-4-12 formatted value of uuid.

Example:

(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(uuid->urn-string UUID)
"urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"

The nil UUID is a special form of UUID that is specified to have all 128 bits set to zero.

nil-uuid
#<uuid 00000000-0000-0000-0000-000000000000>

The following identifiers are bound to predefined UUIDs that represent specific name spaces that are used to generate name-based UUIDs.

Used to generate name-based UUIDs from Domain Name Space (DNS) names.

namespace-DNS
#<uuid 6ba7b810-9dad-11d1-80b4-00c04fd430c8>

Used to generate name-based UUIDs from Uniform Resource Locators (URLs).

namespace-URL
#<uuid 6ba7b811-9dad-11d1-80b4-00c04fd430c8>

Used to generate name-based UUIDs from ISO Object IDs (OIDs).

namespace-OID
#<uuid 6ba7b812-9dad-11d1-80b4-00c04fd430c8>

Used to generate name-based UUIDs from X.500 Distinguished Names (DNs).

namespace-X500
#<uuid 6ba7b814-9dad-11d1-80b4-00c04fd430c8>

1.1 Time-Based (Type 1) UUIDs

A time-based (type 1) UUID uses the current time in number of 100-nanosecond intervals since 00:00:00.00 UTC, 15 October 1582 (60 bits), a clock sequence number to help avoid duplicates (14 bits), and the IEEE 802 MAC address (48 bits) to generate a unique identifier. Note that the current time field will not roll over until around A.D. 3400.
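How the three logical fields are laid out in the 128 bits can be sketched as follows. This Python example uses the standard uuid module, whose fields argument takes the six RFC 4122 fields; the helper name pack_uuid1 is hypothetical and this is not the library's own code.

```python
import uuid

def pack_uuid1(timestamp, clock_seq, node):
    """Assemble a type 1 UUID from a 60-bit timestamp, 14-bit clock
    sequence, and 48-bit node."""
    time_low = timestamp & 0xFFFFFFFF                 # low 32 bits
    time_mid = (timestamp >> 32) & 0xFFFF             # middle 16 bits
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | (1 << 12)  # version 1
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80     # variant "10"
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

# Round trip: take an existing type 1 UUID apart and put it back together.
example = uuid.UUID("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
rebuilt = pack_uuid1(example.time, example.clock_seq, example.node)
print(rebuilt == example)
```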

(make-uuid-1)  uuid?
Returns a time-based (type 1) UUID as specified in RFC 4122. The current time has a resolution of milliseconds; that is, the four least-significant decimal digits of the timestamp are always zero. This implementation does not maintain any state information on UUID generation and always uses a random clock sequence number. This is fine for low-volume UUID generation (e.g., tens per millisecond). The primary MAC address of the host computer is used. If this cannot be determined, a random multicast MAC address is used (47 random bits plus the multicast bit set), which cannot clash with the MAC address of any real hardware device.

Example:

(make-uuid-1)
#<uuid d2177dd0-eaa2-11de-a572-001b779c76e3>
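Two of the properties described above can be checked with a short Python sketch using the standard uuid and secrets modules (not this library). The first part confirms that the timestamp of the example UUID is a whole number of milliseconds; the second shows one way to build a random node with the multicast bit set. The helper name random_multicast_node is hypothetical, and the use of secrets as the random source is an assumption.

```python
import secrets
import uuid

# A timestamp counts 100 ns intervals, so 1 ms is 10,000 intervals and a
# millisecond-resolution clock leaves the last four decimal digits at zero.
example = uuid.UUID("d2177dd0-eaa2-11de-a572-001b779c76e3")
print(example.time % 10_000)

def random_multicast_node():
    """48-bit node: 47 random bits plus the multicast bit, which is the
    least-significant bit of the first octet (bit 40 of the node)."""
    return secrets.randbits(48) | (1 << 40)

print(f"{random_multicast_node():012x}")
```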

(uuid-1->date uuid)  date?
  uuid : uuid?
Returns the date and time that a time-based (type 1) UUID was created.

Note that the following examples were run with the time zone set to MST (GMT-7).

Examples:

(define UUID (string->uuid "d2177dd0-eaa2-11de-a572-001b779c76e3"))
(date->string (uuid-1->date UUID) #t)
"Wednesday, December 16th, 2009 5:26:29pm"

(define UUID (string->uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
(date->string (uuid-1->date UUID) #t)
"Monday, February 3rd, 1997 10:43:12am"
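The conversion from timestamp to date, including the time zone offset, can be sketched in Python with the standard uuid and datetime modules. The helper name uuid1_to_datetime is hypothetical and this is not the library's implementation; MST is modeled here as a fixed GMT-7 offset.

```python
import uuid
from datetime import datetime, timedelta, timezone

def uuid1_to_datetime(u, tz=timezone.utc):
    """Creation time of a type 1 UUID as an aware datetime in time zone tz."""
    # Type 1 timestamps count 100 ns intervals since 15 October 1582 UTC.
    epoch = datetime(1582, 10, 15, tzinfo=timezone.utc)
    # Ten 100 ns intervals make one microsecond.
    return (epoch + timedelta(microseconds=u.time // 10)).astimezone(tz)

MST = timezone(timedelta(hours=-7), "MST")
u = uuid.UUID("f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
print(uuid1_to_datetime(u, MST).strftime("%Y-%m-%d %H:%M:%S %Z"))
print(uuid1_to_datetime(u).strftime("%Y-%m-%d %H:%M:%S %Z"))
```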

1.2 Name-Based (Types 3 and 5) UUIDs

The version 3 or 5 UUID is meant for generating UUIDs from names that are drawn from, and unique within, some name space. The concept of name and name space should be broadly construed, and not limited to textual names. For example, some name spaces are the domain name space, URLs, ISO Object IDs (OIDs), X.500 Distinguished Names (DNs), and reserved words in a programming language.

Name-based UUIDs may be generated using either MD5 hashing, for type 3 UUIDs, or SHA-1 hashing, for type 5 UUIDs. If backward compatibility is not an issue, SHA-1 is preferred.

Note that there is an apparent error in the RFC 4122 specification. (See http://www.rfc-editor.org/errata_search.php?rfc=4122.) Specifically, the reference implementation swaps the eight octets 0..3, 4..5, and 6..7 twice, once for the name space UUID and once for the MD5 output, as would be needed for little-endian input, but the values are already big-endian; that is, only one swap is needed. Most implementations (e.g., the Unix uuid command and the Python library) use the corrected implementation, but some others do not. We have added a Boolean-valued #:legacy keyword to specify which result to compute: #f for the corrected version or #t for the original (i.e., 'buggy') version. The default is the corrected version.
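The corrected (non-legacy) computation can be sketched in Python with the standard hashlib and uuid modules. This is an independent illustration rather than this library's code: the helper name name_based_uuid is hypothetical, encoding the name as UTF-8 is an assumption, and the legacy byte-swapped variant is deliberately not reproduced here. Python's uuid.NAMESPACE_DNS has the same value as namespace-DNS.

```python
import hashlib
import uuid

def name_based_uuid(namespace, name, version):
    """Corrected name-based UUID: version 3 uses MD5, version 5 uses SHA-1."""
    if version not in (3, 5):
        raise ValueError("version must be 3 or 5")
    hasher = hashlib.md5 if version == 3 else hashlib.sha1
    # Hash the name space UUID in network (big-endian) byte order, then the
    # name; keep the first 16 octets of the digest.
    data = namespace.bytes + name.encode("utf-8")
    digest = bytearray(hasher(data).digest()[:16])
    digest[6] = (digest[6] & 0x0F) | (version << 4)  # version nibble, octet 6
    digest[8] = (digest[8] & 0x3F) | 0x80            # RFC 4122 variant, octet 8
    return uuid.UUID(bytes=bytes(digest))

print(name_based_uuid(uuid.NAMESPACE_DNS, "www.widgets.com", 3))
print(name_based_uuid(uuid.NAMESPACE_DNS, "www.widgets.com", 5))
```

The results agree with Python's own uuid.uuid3 and uuid.uuid5, which compute the corrected values.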

(make-uuid-3 namespace-uuid    
  name    
  [#:legacy legacy?])  uuid?
  namespace-uuid : uuid?
  name : string?
  legacy? : boolean? = #f
Returns a name-based (type 3) UUID as specified in RFC 4122 using MD5 hashing. If legacy? is #t, then the output value matches that of the original ('buggy') RFC 4122 reference implementation. This should only be used when compatibility with a known buggy implementation is required.

Examples:

(make-uuid-3 namespace-DNS "www.widgets.com")
#<uuid 3d813cbb-47fb-32ba-91df-831e1593ac29>

(make-uuid-3 namespace-DNS "www.widgets.com" #:legacy #t)
#<uuid e902893a-9d22-3c7e-a7b8-d6e313b71d9f>

(make-uuid-5 namespace-uuid    
  name    
  [#:legacy legacy?])  uuid?
  namespace-uuid : uuid?
  name : string?
  legacy? : boolean? = #f
Returns a name-based (type 5) UUID as specified in RFC 4122 using SHA-1 hashing. If legacy? is #t, then the output value matches that of the original ('buggy') RFC 4122 reference implementation. This should only be used when compatibility with a known buggy implementation is required.

Examples:

(make-uuid-5 namespace-DNS "www.widgets.com")
#<uuid 21f7f8de-8051-5b89-8680-0195ef798b6a>

(make-uuid-5 namespace-DNS "www.widgets.com" #:legacy #t)
#<uuid 13726f09-44a9-5eeb-8910-3525a23fb23b>

1.3 Pseudo-Random (Type 4) UUIDs

A type 4 UUID is created using pseudo-random numbers. The resulting 128-bit UUID contains 122 random bits plus 2 bits specifying the variant (RFC 4122) and 4 bits specifying the version (4).

(make-uuid-4)  uuid?
Returns a pseudo-random (type 4) UUID.

Example:

(make-uuid-4)
#<uuid 177f42e6-6f22-44d9-93f6-8c475170daf6>
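The bit layout of a type 4 UUID can be sketched in Python as follows. This uses the standard secrets and uuid modules and is not this library's implementation; the helper name random_uuid4 is hypothetical, and drawing the bits from secrets is an assumption about the random source.

```python
import secrets
import uuid

def random_uuid4():
    """Build a type 4 UUID: 122 random bits, 4 version bits, 2 variant bits."""
    raw = bytearray(secrets.token_bytes(16))  # start from 128 random bits
    raw[6] = (raw[6] & 0x0F) | 0x40           # overwrite 4 bits with version 0100
    raw[8] = (raw[8] & 0x3F) | 0x80           # overwrite 2 bits with variant 10
    return uuid.UUID(bytes=bytes(raw))

print(random_uuid4())
```

In the string form, the first digit of the third group is therefore always 4, and the first digit of the fourth group is always 8, 9, a, or b.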

2 Example

The following example demonstrates various functions of the UUID library.

  #lang scheme
  
  (require scheme/date)
  (require "uuid.ss")
  
  ; Time-Based UUIDs
  (define U1 (make-uuid-1))
  (printf "(make-uuid-1)~n~a~n"
          U1)
  (printf "Created ~a~n~n"
          (date->string (uuid-1->date U1) #t))
  
  ; Name-Based UUID Using MD5 Hashing
  (printf
   "(make-uuid-3 namespace-DNS \"www.widgets.com\")~n~a~n"
   (make-uuid-3 namespace-DNS "www.widgets.com"))
  (printf
   "(make-uuid-3 namespace-DNS \"www.widgets.com\" #:legacy #t)~n~a~n~n"
   (make-uuid-3 namespace-DNS "www.widgets.com" #:legacy #t))
  
  ; Name-Based UUID Using SHA-1 Hashing
  (printf
   "(make-uuid-5 namespace-DNS \"www.widgets.com\")~n~a~n"
   (make-uuid-5 namespace-DNS "www.widgets.com"))
  (printf
   "(make-uuid-5 namespace-DNS \"www.widgets.com\" #:legacy #t)~n~a~n~n"
   (make-uuid-5 namespace-DNS "www.widgets.com" #:legacy #t))
  
  ; (Pseudo-)Random UUID
  (define U4 (make-uuid-4))
  (printf "(make-uuid-4)~n~a~n~n" U4)
  
  (printf "U4 = ~a~n~n" U4)
  (printf "(uuid->string U4)~n~s~n~n" (uuid->string U4))
  (printf "(uuid->urn-string U4)~n~s~n~n" (uuid->urn-string U4))
  
  ; Comparisons
  (printf "namespace-DNS = ~a~n" namespace-DNS)
  (printf "(uuid=? U4 U4) = ~a~n" (uuid=? U4 U4))
  (printf "(uuid=? U4 namespace-DNS) = ~a~n" (uuid=? U4 namespace-DNS))
  (printf "(uuid<? U4 namespace-DNS) = ~a~n" (uuid<? U4 namespace-DNS))
  (printf "(uuid>? U4 namespace-DNS) = ~a~n" (uuid>? U4 namespace-DNS))

This produces the following output.

  (make-uuid-1)
  #<uuid 7c769460-eac0-11de-a1ca-001b779c76e3>
  Created Wednesday, December 16th, 2009 8:58:51pm

  (make-uuid-3 namespace-DNS "www.widgets.com")
  #<uuid 3d813cbb-47fb-32ba-91df-831e1593ac29>
  (make-uuid-3 namespace-DNS "www.widgets.com" #:legacy #t)
  #<uuid e902893a-9d22-3c7e-a7b8-d6e313b71d9f>

  (make-uuid-5 namespace-DNS "www.widgets.com")
  #<uuid 21f7f8de-8051-5b89-8680-0195ef798b6a>
  (make-uuid-5 namespace-DNS "www.widgets.com" #:legacy #t)
  #<uuid 13726f09-44a9-5eeb-8910-3525a23fb23b>

  (make-uuid-4)
  #<uuid ab595962-0a37-4520-8bef-afc559955201>

  U4 = #<uuid ab595962-0a37-4520-8bef-afc559955201>

  (uuid->string U4)
  "ab595962-0a37-4520-8bef-afc559955201"

  (uuid->urn-string U4)
  "urn:uuid:ab595962-0a37-4520-8bef-afc559955201"

  namespace-DNS = #<uuid 6ba7b810-9dad-11d1-80b4-00c04fd430c8>
  (uuid=? U4 U4) = #t
  (uuid=? U4 namespace-DNS) = #f
  (uuid<? U4 namespace-DNS) = #f
  (uuid>? U4 namespace-DNS) = #t

3 Issues and Comments

The biggest issue is that of the 'buggy' reference implementation in RFC 4122 with regard to the generation of name-based UUIDs. It seems that the implementation in the Unix uuid command and the Python library (among others) is the correct implementation, and we use this as the default behavior. However, it also seems that there are implementations 'in the wild' that match the original RFC 4122 reference implementation. Therefore, we also provide this behavior using the #:legacy keyword.

The current time is measured at millisecond accuracy, which means we lose a significant amount of the available timestamp resolution: the four least-significant decimal digits of the 100-nanosecond count are always zero. The advantage is a simple, portable implementation. This is fine for low-volume UUID generation.

The current implementation does not maintain any state information for UUID generation. This means that we generate a new random clock sequence for every new time-based UUID, which increases the probability of collisions. Again, this is fine for low-volume UUID generation.
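To give a rough sense of the collision risk, the following Python sketch computes the probability that k random 14-bit clock sequences are not all distinct. It assumes the k UUIDs are generated on the same node within the same millisecond, so that only the clock sequence distinguishes them; the function name is hypothetical.

```python
def clock_seq_collision_probability(k, bits=14):
    """Chance that k uniformly random clock sequences are not all distinct."""
    space = 2 ** bits
    p_distinct = 1.0
    for i in range(k):
        # The (i+1)-th draw must avoid the i values already taken.
        p_distinct *= (space - i) / space
    return 1.0 - p_distinct

for k in (2, 10, 50):
    print(k, round(clock_seq_collision_probability(k), 4))
```

Under this assumption the risk is well under one percent for ten UUIDs in the same millisecond and grows to several percent for fifty.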

At some point, (make-uuid-1) needs to allow the optional specification of the node to use. Currently, this is the primary MAC address for the machine on which the code is run, which could be considered a security issue in some cases.