5 S3 (Storage)

(require (planet gh/aws:1:=2/s3))

AWS S3 provides a fairly simple and REST-ful interface. Putting an object to S3 is a simple HTTP PUT request. Getting an object is a simple GET request. And so on. As a result, you may feel you don’t need a lot of “wrapper” around this.

Where you definitely will want help is in constructing the Authorization header S3 uses to authenticate requests. This requires making a string out of specific elements of your request and “signing” it with your AWS private key. Even a small discrepancy will cause the request to fail authentication. As a result, aws/s3 makes it easy for you to create the authentication header correctly and successfully.

Plus, aws/s3 does provide wrappers and tries to help with some wrinkles. For example, S3 may give you a 302 redirect when you do a PUT or POST. You don’t want to transmit the entire entity, only to have S3 ignore it and you have to transmit it all over again. Instead, you want to supply the request header Expect: 100-continue, which lets S3 respond before you transmit the entity.

5.1 Endpoint

parameter
(s3-host) → string?
(s3-host v) → void?
v : string?

The hostname used for the S3 REST API. Defaults to "s3.amazonaws.com".

parameter
(s3-scheme) → (or/c "http" "https")
(s3-scheme v) → void?
v : (or/c "http" "https")

The scheme used for the S3 REST API. Defaults to "http". Set to "https" to connect using SSL.

5.2 Authentication signatures

procedure
(bucket&path->uri bucket path-to-resource) → string?
bucket : string?
path-to-resource : string?

Given bucket and path (both of which should not start with a leading "/"), use s3-scheme and s3-host to make the URI for the resource.

Example:

> (bucket&path->uri "bucket" "path/to/file")
"http://bucket.s3.amazonaws.com/path/to/file"

procedure
(bucket+path->bucket&path&uri b+p) →
string? string? string?
b+p : string?

Given a combined bucket+path string such as "bucket/path/to/resource", return the bucket portion, path portion and URI.

Example:

> (bucket+path->bucket&path&uri "bucket/path/to/file")
"bucket"
"path/to/file"
"http://bucket.s3.amazonaws.com/path/to/file"

procedure
(uri&headers b+p method headers) →
string? dict?
  b+p : string?
  method : string?
  headers : dict?

Return the URI and headers for which to make an HTTP request to S3. Constructs an Authorization header based on the inputs.

5.3 Conveniences

procedure
(create-bucket bucket-name) → void?
bucket-name : string?

Create a bucket named bucket-name.

Keep in mind that bucket names on S3 are global—shared among all users of S3. You may want to make your bucket names include a domain name that you own.

If you try to create a bucket with a name that is already used by another AWS account, you will get a 409 Conflict response.

If you create a bucket that already exists under your own account, this operation is idempotent (it doesn’t cause an error, it’s simply a no-op).

procedure
(delete-bucket bucket-name) → void?
bucket-name : string?

Delete a bucket named bucket-name.

This operation is idempotent (it is a no-op to delete a bucket that has already been deleted).

procedure
(list-buckets) → (listof string?)

List all the buckets belonging to your AWS account.

procedure
(ls bucket+path) → (listof string?)
bucket+path : string?

List the names of objects whose names start with the pathname bucket+path (which is the form "bucket/path/to/resource").

procedure
(ll bucket+path) → (listof (list/c string? string? xexpr?))
bucket+path : string?

List objects whose names start with the path in bucket+path (which is the form "bucket/path/to/resource"):

name (as with ls)
response headers from a HEAD request
an xexpr representing the ACL for the object

procedure
(head bucket+path) → string?
bucket+path : string?

Make a HEAD request for bucket+path (which is the form "bucket/path/to/resource") and return the headers as a string in net/head format.

bucket+path is the form "bucket/path/to/resource".

procedure
(delete bucket+path) → void?
bucket+path : string?

Make a DELETE request to delete bucket+path (which is the form "bucket/path/to/resource")

procedure
(copy bucket+path/from bucket+path/to) → string?
bucket+path/from : string?
bucket+path/to : string?

Tip: To rename an object, copy it then delete the original.

Copy an existing S3 object bucket+path/from to bucket+path/to, including its metadata. Both names are of the form "bucket/path/to/resource".

It is not an error to copy to an existing object (it will be replaced). It is even OK to copy an existing object to itself.

procedure
(get-acl bucket+path [heads]) → xexpr?
bucket+path : string?
heads : dict? = '()

Make a GET request for the ACL of racket+path (which is the form "bucket/path/to/resource"). (In the REST API, this is done simply by appending an "?acl" query parameter.)

S3 responds with an XML representation of the ACL, which is returned as an xexpr?.

procedure
(get bucket+path
reader
[ heads
range-begin
range-end]) → any/c
  bucket+path : string?
  reader : (input-port? string? -> any/c)
  heads : dict? = '()
  range-begin : (or/c #f exact-nonnegative-integer?) = #f
  range-end : (or/c #f exact-nonnegative-integer?) = #f

Although you may use get directly, it is also a building block for other procedures that you may find more convenient, such as get/bytes and get/file.

Make a GET request for bucket+path (which is the form "bucket/path/to/resource").

The reader procedure is called with an input-port? and a string? respresenting the response headers. The reader should read the response entity from the port, being careful to read the exact number of bytes as specified in the response header’s Content-Length field. The return value of reader is the return value of get.

You may pass request headers in the optional heads argument.

The optional arguments range-begin and range-end are used to supply an HTTP Range request header. This header, which Amazon S3 supports, enables a getting only a subset of the bytes. Note that range-end is exclusive to be consistent with the Racket convention, e.g. subbytes. (The HTTP Range header specifies the end as inclusive, so your range-end argument is decremented to make the value for the header.)

procedure
(get/bytes bucket+path
[ heads
range-begin
range-end]) → bytes?
  bucket+path : string?
  heads : dict? = '()
  range-begin : (or/c #f exact-nonnegative-integer?) = #f
  range-end : (or/c #f exact-nonnegative-integer?) = #f

Make a GET request for bucket+path (which is the form "bucket/path/to/resource") and return the response entity as bytes?.

You may pass request headers in the optional heads argument.

The optional arguments range-begin and range-end are used to supply an optional Range request header. This header, which Amazon S3 supports, enables a getting only a subset of the bytes. Note that range-end is exclusive to be consistent with the Racket convention, e.g. subbytes. (The HTTP Range header specifies the end as inclusive, so your range-end argument is decremented to make the value for the header.)

The ETag response header from S3 is an MD5 checksum, and error will be called if the received bytes do not match the MD5 checksum. (This only happens if the entire object is requested. Otherwise—if the range-begin and range-end are not #f—then the checksum is ignored.)

The response entity is held in memory; if it is very large and you want to "stream" it instead, consider using get.

procedure
(get/file bucket+path
pathname
[ heads
#:mode mode-flag
#:exists exists-flag]) → void?
  bucket+path : string?
  pathname : path-string?
  heads : dict? = '()
  mode-flag : (or/c 'binary 'text) = 'binary
   exists-flag : (or/c 'error 'append 'update 'replace 'truncate 'truncate/replace)
= 'error

Make a GET request for bucket+path (which is the form "bucket/path/to/resource") and copy the the response entity directly to the file specified by pathname. The keyword arguments #:mode and #:exists are identical to those for call-with-output-file*.

The ETag response header from S3 is an MD5 checksum, and error will be called if the received bytes do not match the MD5 checksum.

You may pass request headers in the optional heads argument.

procedure
(put bucket+path
writer
data-length
mime-type
reader
[ heads]) → void?
  bucket+path : string?
  writer : (output-port . -> . void?)
  data-length : (or/c #f exact-nonnegative-integer?)
  mime-type : string?
  reader : (input-port? string? . -> . any/c)
  heads : dict? = '()

Although you may use put directly, it is also a building block for other procedures that you may find more convenient, such as put/bytes and put/file.

Makes a PUT request for bucket+path (which is the form "bucket/path/to/resource"), using the writer procedure to write the request entity and the reader procedure to read the response entity. Returns the response header (unless it raises exn:fail:aws).

The writer procedure is given an output-port? and a string? representing the response headers. It should write the request entity to the port. The amount written should be exactly the same as data-length, which is used to create a Content-Length request header. You must also supply mime-type (for example "text/plain") which is used to create a Content-Type request header.

The reader procedure is the same as for get. The response entity for a PUT request usually isn’t interesting, but you should read it anyway.

Note: If you want a Content-MD5 request header, you must calculate and supply it yourself in heads.

procedure
(put/bytes bucket+path data mime-type [heads]) → void?
  bucket+path : string?
  data : bytes?
  mime-type : string?
  heads : dict? = '()

Makes a PUT request for bucket+path (which is the form "bucket/path/to/resource"), sending data as the request entity and creating a Content-Type header from mime-type. Returns the response header (unless it raises exn:fail:aws).

A Content-MD5 request header is automatically created from data. To ensure data integrity, S3 will reject the request if the bytes it receives do not match the MD5 checksum.

procedure
(put/file bucket+path
pathname
[ #:mime-type mime-type
#:mode mode-flag]) → void?
  bucket+path : string?
  pathname : path-string?
  mime-type : (or/c #f string?) = #f
  mode-flag : (or/c 'binary 'text) = 'binary

Makes a PUT request for bucket+path (which is the form "bucket/path/to/resource") and copy the the request entity directly from the file specified by pathname. The #:mode-flag argument is identical to that for call-with-input-file*, which is used. Returns the response header (unless it raises exn:fail:aws).

If #:mime-type is #f, then the Content-Type header is guessed from the file extension, using a (very short!) list of common extensions. If no match is found, then "application/x-unknown-content-type" is used. You can customize the MIME type guessing by setting the path->mime-proc parameter to your own procedure.

A Content-MD5 request header is automatically created from the contents of the file represented by path. To ensure data integrity, S3 will reject the request if the bytes it receives do not match the MD5 checksum.

A Content-Disposition request header is automatically created from path. For example if path is "/foo/bar/test.txt" or "c:\\foo\\bar\\test.txt" then the header "Content-Disposition:attachment; filename=\"test.txt\"" is created. This is helpful because a web browser that is given the URI for the object will propmt the user to download it as a file.

parameter
(path->mime-proc) → procedure?
(path->mime-proc proc) → void?
proc : procedure?

A procedure which takes a path-string? and returns a string? with a MIME type.

5.4 S3 examples

(require (planet gh/aws/keys)
         (planet gh/aws/s3))

(define (member? x xs)
  (not (not (member x xs))))

;; Make a random name for the bucket. Remember bucket names are a
;; global space shared by all AWS accounts. In a real-world app, if
;; you have a domain name, you probably want to include that as part
;; of your name.
(define test-bucket
  (for/fold ([s "test.bucket."])
      ([x (in-range 32)])
    (string-append s
                   (number->string (truncate (random 15)) 16))))

(ensure-have-keys)

(create-bucket test-bucket)
(member? test-bucket (list-buckets))

(define test-pathname "path/to/file")
(define b+p (string-append test-bucket "/" test-pathname))

(define data #"Hello, world.")
(put/bytes b+p data "text/plain")
(get/bytes b+p)
(get/bytes b+p '() 0 5)
(head b+p)

(ls (string-append test-bucket "/"))
(ls (string-append test-bucket "/" test-pathname))
(ls (string-append test-bucket "/" (substring test-pathname 0 2)))

(define p (build-path 'same
                      "tests"
                      "s3-test-file-to-get-and-put.txt"))
(put/file b+p p #:mime-type "text/plain")
(get/file b+p p #:exists 'replace)
(head b+p)
(member? test-pathname (ls b+p))

(define b+p/copy (string-append b+p "-copy"))
(copy b+p b+p/copy)
(ls (string-append test-bucket "/"))
(head b+p/copy)
(delete b+p/copy)

(delete b+p)
(delete-bucket test-bucket)

← prev up next →

1	Introduction
2	Names
3	AWS Keys
4	Exception handling
5	S3 (Storage)
6	SDB (Database)
7	SES (Email)
8	SNS (Notifications)
9	SQS (Queues)
10	Cloud Watch (Monitoring)
11	Utilities
12	Unit tests
13	License

5.1	Endpoint
5.2	Authentication signatures
5.3	Conveniences
5.4	S3 examples