
REST API

Hatrac (pronounced "hat rack") is a simple object storage service for web-based, data-oriented collaboration. It presents a simple HTTP REST API, which is documented in the sections below.

Contents

Main Topics

This documentation is broken down into the following general topics:

  1. URL Conventions
  2. Resource Types Overview
  3. Root Namespace
  4. Nested Namespaces
  5. Objects
  6. Object Versions
  7. Metadata
  8. Access Control Lists
  9. Chunked Uploads

Quick Links to Operations

The REST API supports the following operations.

  1. Namespace operations
  2. Object operations
  3. Object version operations
  4. Metadata management operations
  5. Access control list operations
  6. Chunked upload operations

URL Conventions

Any Hatrac URL is a valid HTTP URL and contains user-generated content which may need to be escaped. Several reserved characters from RFC 3986 are used as meta-syntax in Hatrac and MUST be escaped if they are meant to be part of a user-generated name value and MUST NOT be escaped if they are meant to indicate the Hatrac meta-syntax:

  • The '/' or forward-slash, used as a path separator character
  • The ':' or colon, used to separate object names from version IDs
  • The ';' or semi-colon, used to separate resource names and sub-resource keywords

All other reserved characters should be escaped in user-generated content in URLs, but have no special meaning to Hatrac when appearing in unescaped form.
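
As a rough illustration of these escaping rules, the following Python sketch percent-encodes each user-generated name and then joins the pieces with unescaped meta-syntax characters. The helper names `escape_name` and `hatrac_path` are hypothetical, and passing the version identifier through unchanged is an assumption (it is service-issued rather than user-generated).

```python
from typing import List, Optional
from urllib.parse import quote


def escape_name(name: str) -> str:
    """Percent-encode one user-generated name component."""
    # safe="" makes quote() escape '/', ':' and ';' along with every
    # other reserved character, so none can be read as meta-syntax.
    return quote(name, safe="")


def hatrac_path(prefix: str, names: List[str],
                version_id: Optional[str] = None) -> str:
    """Join escaped names with unescaped '/' (and ':' for a version)."""
    path = prefix.rstrip("/") + "/" + "/".join(escape_name(n) for n in names)
    if version_id is not None:
        # Assumption: the service-issued version id is already URL-safe.
        path += ":" + version_id
    return path
```

For example, a name containing a literal colon such as `run:1.csv` is escaped to `run%3A1.csv`, so that only the final unescaped `:` separates the object name from the version identifier.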

Resource Naming and Lifecycle Rules

The hierarchical Hatrac naming model defines three main types of resource:

  1. Namespace
  2. Object
  3. Object Version

The model supports a hierarchy of nested namespaces while objects and their versions appear only at the leaves of the tree. A particular hierarchical name has a three-phase lifecycle that can oscillate in the latter two phases:

  1. Undefined, until the name is bound
  2. Defined, from the moment the name is bound until deleted
    • Namespaces: a name bound as a namespace will always be a namespace if available to access.
    • Objects: a name bound as an object will always be an object if available to access.
  3. Deleted, from the moment the name is deleted until restored to its previous definition.

An implementation MAY permit restoration of a deleted namespace or object, but it is not required to do so. An implementation MUST prevent other reuse of names. The definition of restoration is that all of the following hold:

  • The name is defined as the same type of resource that was previously defined prior to deletion;
  • At the moment of restoration, the parent namespace encoded in the name is still a defined namespace;
  • When an object name is restored, the constraints on object version naming continue to hold as if the object had never been deleted;
  • When a namespace is restored, any child namespace or object MAY be restored and MAY remain deleted. Any child restoration MUST follow the same restoration rules, recursively.

Object Version Naming

A particular object name can be qualified with a version identifier having a three-phase lifecycle that can oscillate on the latter two phases:

  1. Undefined, until the version identifier is issued to a content value
  2. Defined, from the moment a version is created until the version is deleted
  3. Unavailable, from the moment a version is deleted until it is restored with the same content value

Hatrac allows object versions to be deleted in order to reclaim backing storage resources, as an alternative to simply making versions unavailable by restricting their access control settings.

Referential Stability

A particular namespace or object name denotes the same abstract container from the point of definition into the indefinite future, but that container can change. Namespaces can gain or lose children (nested namespaces and objects) and objects can gain or lose object versions.

A particular object version reference (name plus version identifier) is a permanent, immutable reference to a specific content value. Such a referenced content value MAY become unavailable if the object version is deleted or its access control rules are restrictive. However, when and if it becomes available, it MUST always denote the same content value for all clients able to access it at any point in time. To guarantee this stability for clients while giving some freedom to implementers, the following rules are defined for version identifiers:

  • Version identifiers are opaque, local qualifiers for a specific object name. An implementation MAY use globally distinct version identifiers but clients SHOULD NOT compare version identifiers associated with different object names.
  • Any two distinct content values applied as updates to the same object MUST be issued distinct version identifiers. Hence, any two reference URLs with identical object name and version identifier MUST denote the same content value.
  • Any two updates applied to the same object with identical content value are subject to more complex rules depending on the sequencing of operations:
    1. If the first update yields an object version which is deleted prior to the second update operation
      • The implementation MAY reuse the same version identifier used for the previously deleted object version which denoted the same content value.
      • The implementation MAY issue a distinct version identifier for each object version that has non-overlapping lifetimes while denoting the same content value.
    2. If the first update yields an object version which is still defined prior to the second update operation
      • The second operation MAY fail with a conflict if the implementation does not support storage and tracking of duplicate content values
      • The second operation MAY issue a new, distinct version identifier
    3. Simultaneous updates must be logically resolved as if one of the updates occurred before the other, satisfying the preceding rules.

These rules allow a broad range of implementation techniques while preventing collaboration hazards such as unstable references denoting different data values at different times or data value collisions causing ambiguous object ownership and privileges.
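
To make the identifier rules concrete, here is a toy Python model of one policy that the rules appear to permit: distinct content always gets a distinct identifier, an identifier is reused when the same content returns after deletion, and a duplicate of a still-defined version is rejected. The class `VersionIssuer`, the `v1`, `v2`, ... identifier format, and the use of SHA-256 to compare content are all illustrative assumptions, not part of Hatrac.

```python
import hashlib
import itertools


class VersionIssuer:
    """Toy per-object version-id issuer following one permitted policy."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._ids = {}      # content digest -> version id ever issued
        self._live = set()  # version ids that are currently defined

    def update(self, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        vid = self._ids.get(digest)
        if vid is None:
            # Distinct content values always get distinct identifiers.
            vid = "v%d" % next(self._counter)
            self._ids[digest] = vid
        elif vid in self._live:
            # Duplicate of a still-defined version: this sketch conflicts.
            raise ValueError("409 Conflict: duplicate content")
        # Otherwise the earlier version was deleted; its id is reused.
        self._live.add(vid)
        return vid

    def delete(self, vid: str) -> None:
        self._live.discard(vid)
```

An implementation could equally issue a fresh identifier in both duplicate cases; the rules only forbid one identifier from ever denoting two different content values.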

Root Namespace Resource

The root of a Hatrac deployment is an HTTPS URL of the form:

  • https:// authority / prefix

Where authority is a DNS hostname and optional port number, and prefix is a '/' separated string of any length. A deployment MAY use a fixed path to route HTTP requests to Hatrac alongside other services in the same authority (host and port), or it MAY use an empty prefix if the entire HTTP namespace of the authority is dedicated to Hatrac resources.

In all documentation below, the "/ prefix" is treated as part of the parent path in each URL. Therefore every example URL is a hierarchical name starting with a "/" character.

Namespace Listing Retrieval

The GET operation is used to list direct children of a namespace:

GET /parent_path/namespace_id
Host: authority_name
If-None-Match: etag_value

for which a successful response is a JSON array of child resource URLs:

200 OK
Content-Type: application/json
Content-Length: N
ETag: etag_value

["/hatrac/parent_path/namespace_id/child1", ...]

If the text/uri-list content-type is negotiated, the response will be a whitespace separated list of child URLs:

200 OK
Content-Type: text/uri-list
Content-Length: N
ETag: etag_value

/hatrac/parent_path/namespace_id/child1
/hatrac/parent_path/namespace_id/child2
...
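
A client might normalize the two listing representations along these lines. This is a minimal sketch: the function name `parse_listing` is hypothetical, and it assumes the response body has already been decoded to text.

```python
import json
from typing import List


def parse_listing(content_type: str, body: str) -> List[str]:
    """Return child URLs from a namespace listing response body."""
    # Ignore any parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return list(json.loads(body))
    if media_type == "text/uri-list":
        # The listing is whitespace separated, one URL per token.
        return body.split()
    raise ValueError("unexpected content type: " + content_type)
```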

Namespace Listing Metadata Retrieval

The HEAD operation is used to get basic status information:

HEAD /parent_path/namespace_id
Host: authority_name

for which a successful response is:

200 OK
Content-Type: application/json
Content-Length: N

indicating that an N byte JSON representation is available. This operation is essentially equivalent to the GET operation but with the actual child listing elided.

Deletion of Root Namespace Forbidden

The root namespace of a Hatrac deployment SHOULD forbid delete operations. It is nonsensical to have a Hatrac deployment without a root namespace.

Nested Namespace Resources

Any hierarchical namespace in Hatrac has an HTTPS URL of the form:

  • https:// authority / parent path / namespace id

Where parent path is the name of the enclosing namespace and namespace id is the relative name of the nested namespace. Of course, the enclosing namespace may be the root namespace of the deployment, e.g. /hatrac, or another nested namespace, e.g. /hatrac/some/ancestors.

Nested Namespace Creation

The PUT operation is used to create a new nested namespace:

PUT /parent_path/namespace_id
Host: authority_name
Content-Type: application/x-hatrac-namespace

which may also be modified with the parents query parameter:

PUT /parent_path/namespace_id?parents=true
Host: authority_name
Content-Type: application/x-hatrac-namespace

to request automatic creation of missing ancestors in parent path. In either case, a successful response is:

201 Created
Location: /parent_path/namespace_id
Content-Type: text/uri-list
Content-Length: N

/parent_path/namespace_id

Note: see related object resource interface for pragmatic discussion of the use of Content-Type to disambiguate namespace and object creation requests.

Typical PUT error responses would be:

  • 401 Unauthorized: the client is not authenticated and anonymous creation of such a namespace is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to create or update the namespace.
  • 404 Not Found: the parent namespace does not exist and parents=true query parameter was not supplied to request automatic creation of missing ancestors.
  • 409 Conflict: the namespace cannot be created due to a conflict with existing state of the service:
    • The parent path does not denote a namespace
    • The namespace already exists

Nested Namespace Listing Retrieval

The same GET and HEAD operations documented above for the Root Namespace Resource can also list direct children of any nested namespace.

For nested namespaces, typical GET or HEAD error responses would be:

  • 404 Not Found: the name does not map to an available resource on the server.

Note: since nested namespaces and objects share the same hierarchical name structure, a GET operation on a name might resolve to an object rather than a namespace. As such, error responses applicable to an object might be encountered as well.

Nested Namespace Deletion

The DELETE operation is used to delete a nested namespace:

DELETE /parent_path/namespace_id
Host: authority_name

for which a successful response is:

204 No Content

An implementation SHOULD NOT allow deletion of non-empty namespaces. It is RECOMMENDED that deletion of non-empty namespaces be rejected, but an implementation MAY treat it as a bulk request by the same client to delete everything contained in the namespace prior to deleting the namespace itself. It is further RECOMMENDED that such a deletion be processed atomically, considering all client privileges in advance, but an implementation MAY partially delete contents before failing due to an authorization error on some subset of contents.

Typical DELETE error responses would be:

  • 401 Unauthorized: the client is not authenticated and anonymous deletion of such a resource is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to delete the resource.
  • 404 Not Found: the name does not denote an existing resource.
  • 409 Conflict: the resource cannot be deleted at this time, e.g. because the namespace is not empty.

Object Resources

Any unversioned object name in Hatrac has an HTTPS URL of the form:

  • https:// authority / namespace path / object name

Where namespace path is the name of the enclosing namespace and object name is the relative name of the object.

Object Creation and Update

The PUT operation is used to create a new object or a new version of an existing object. Literal object content is provided as input:

PUT /namespace_path/object_name
Host: authority_name
Content-Type: text/plain
Content-Length: 14
Content-MD5: ZXS/CYPMeEBJpBYNGYhyjA==
Content-SHA256: 5+aEMqzlEZxe9xPaDUZ0GyBvTUaZf4s0yMpPgV/0yt0=
Content-Disposition: filename*=UTF-8''test.txt
If-Match: etag_value
If-None-Match: *

...content...

This example has metadata consistent with an object containing a single Unix-style text line ...content...\n inclusive of the line-terminator.

As with nested namespace creation, an optional query parameter may be included to enable automatic namespace creation:

PUT /namespace_path/object_name?parents=true
...

The optional If-Match and If-None-Match headers MAY be specified to limit object update to specific scenarios. In a normal situation, only one of these two headers is specified in a single request:

  • An etag value with the If-Match header requires that the current version of the object on the server match the version indicated by the etag value in order for the object to be updated as per the request content.
  • An * with the If-None-Match header requires that the object lack a current version on the server in order for the object to be created or updated as per the request content.

Without either If-Match or If-None-Match headers in the request, the update will be unconditionally applied if allowed by policy and the current state of the server. When supplied, they select HTTP standard conditional request processing.
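
The effect of the two headers can be sketched as a small predicate. This is a simplified model, assuming each header carries a single etag value or `*` and ignoring etag lists and weak validators; the name `update_allowed` is hypothetical, and a `False` result corresponds to a 412 Precondition Failed response.

```python
from typing import Optional


def update_allowed(current_etag: Optional[str],
                   if_match: Optional[str] = None,
                   if_none_match: Optional[str] = None) -> bool:
    """Evaluate update preconditions; current_etag is None when the
    object has no current version on the server."""
    if if_match is not None:
        # If-Match needs an existing version with a matching etag.
        if current_etag is None:
            return False
        if if_match != "*" and if_match != current_etag:
            return False
    if if_none_match is not None:
        if if_none_match == "*":
            # If-None-Match: * needs the object to lack a current version.
            if current_etag is not None:
                return False
        elif if_none_match == current_etag:
            return False
    # With neither header present the update is unconditional.
    return True
```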

The optional Content-MD5 and Content-SHA256 headers can carry an MD5 or SHA-256 hash value, respectively. The hash value SHOULD be the base64-encoded representation of the underlying bit sequence defined by the relevant hash algorithm standard. Either or both, if supplied, will be stored and returned with data retrieval responses, which is useful for end-to-end data integrity checks by clients. An implementation MAY checksum the supplied content and reject the request if it mismatches any supplied hash value or if any hash value is malformed. An implementation MAY recognize and accept a hex-encoded hash value or MAY reject it as a bad request, but in either case it MUST always return a properly base64-encoded hash value in any service-issued Content-MD5 or Content-SHA256 response header.
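
A client can compute both header values with the Python standard library as sketched below. The helper names `hash_headers` and `hex_to_base64` are hypothetical; the second shows how a hex digest, such as the output of a command-line checksum tool, maps to the base64 form expected in these headers.

```python
import base64
import hashlib
from typing import Dict


def hash_headers(content: bytes) -> Dict[str, str]:
    """Return base64-encoded MD5 and SHA-256 header values for content."""
    md5_bits = hashlib.md5(content).digest()        # 16 raw bytes
    sha_bits = hashlib.sha256(content).digest()     # 32 raw bytes
    return {
        "Content-MD5": base64.b64encode(md5_bits).decode("ascii"),
        "Content-SHA256": base64.b64encode(sha_bits).decode("ascii"),
    }


def hex_to_base64(hex_digest: str) -> str:
    """Re-encode a hex digest as base64 of the same underlying bits."""
    return base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")
```

Note that the base64 value encodes the raw digest bytes, not the hex string, so the MD5 header value is always 24 characters long.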

The optional Content-Disposition header will be stored and returned with data retrieval responses. An implementation MAY restrict which values are acceptable as content disposition instructions. Every implementation SHOULD support the filename*=UTF-8'' filename syntax where filename is a basename with no path separator characters. According to the web standards, the filename component embedded in this header MUST be UTF-8 which is then URL-escaped (percent-encoded) on an octet by octet basis, just like URL components in this REST API.
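
The filename syntax could be produced as in this sketch, which percent-encodes the UTF-8 octets of a basename. The function name `content_disposition` is hypothetical, and rejecting backslashes as well as forward slashes is an extra assumption about what counts as a path separator.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build a filename*=UTF-8'' disposition value for a basename."""
    if "/" in filename or "\\" in filename:
        raise ValueError("filename must be a basename without path separators")
    # quote() encodes to UTF-8 and escapes octet by octet by default.
    return "filename*=UTF-8''" + quote(filename, safe="")
```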

A successful response is:

201 Created
Location: /namespace_path/object_name:version_id
Content-Type: text/uri-list
Content-Length: N

/namespace_path/object_name:version_id

The successful response includes the version id qualified name of the newly updated object.
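
Because a literal `:` inside a name must be escaped, a client can recover the unversioned object URL and the version identifier from the returned location by splitting the last path segment on its unescaped colon. A minimal sketch, with the hypothetical helper name `split_version_url`:

```python
from typing import Optional, Tuple


def split_version_url(location: str) -> Tuple[str, Optional[str]]:
    """Split a (possibly versioned) object URL path into the
    unversioned object path and the version id, if any."""
    parent, _, last = location.rpartition("/")
    # Any ':' that is part of a name is escaped as %3A, so an
    # unescaped ':' in the last segment can only be the separator.
    name, sep, version_id = last.partition(":")
    object_path = parent + "/" + name
    return object_path, (version_id if sep else None)
```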

Typical PUT error responses would be:

  • 400 Bad Request: the client supplied a Content-MD5 header with a hash value that does not match the entity content which was received.
  • 401 Unauthorized: the client is not authenticated and anonymous creation of such an object is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to create the object.
  • 404 Not Found: the parent path does not exist and the parents=true query parameter was not supplied to request automatic creation of missing ancestors.
  • 409 Conflict: the object cannot be created due to a conflict with existing state of the service:
    • The namespace path may not denote a namespace
    • The object name may already be in use as a namespace, therefore preventing its use as an object.
  • 412 Precondition Failed: the object cannot be created or updated due to its current state on the server not meeting the requirements indicated by the If-Match and/or If-None-Match request headers.

Note: There is ambiguity in the meaning of a URL when creating a new object or nested namespace because they have the same syntactic structure. Hatrac disambiguates such requests in a decision process:

  1. If the full path denotes an existing object, the PUT request MUST denote a request to update the content of the existing object, regardless of what Content-Type is present.
  2. If Content-Type is application/x-hatrac-namespace, the PUT request MUST denote a request to create a new nested namespace.
  3. Any other PUT request not matching the above is considered an object creation request.

This set of rules makes it simple to create any common object or namespace. In the degenerate case where one wishes to create an object with content that looks exactly like a namespace request input, the solution is to first create an empty object (e.g. with Content-Type: text/plain) and then immediately update its content with the desired content.
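
The three-step decision process can be written as a short function. This is only a sketch of the rules as stated in the note; the function name `classify_put` and its string results are hypothetical labels.

```python
from typing import Optional


def classify_put(path_is_existing_object: bool,
                 content_type: Optional[str]) -> str:
    """Decide what a PUT request on a hierarchical name means."""
    # Rule 1: an existing object is always updated, whatever the type.
    if path_is_existing_object:
        return "update-object"
    # Rule 2: the special content type requests a nested namespace.
    if content_type == "application/x-hatrac-namespace":
        return "create-namespace"
    # Rule 3: everything else creates a new object.
    return "create-object"
```

The ordering matters: because rule 1 is checked first, the two-step workaround for the degenerate case works, since the second PUT targets an existing object and is therefore treated as an update.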

Object Retrieval

The GET operation is used to retrieve the current version of an object:

GET /namespace_path/object_name
Host: authority_name
Accept: *
If-None-Match: etag_value

The optional If-None-Match header MAY supply an ETag value obtained from a previous retrieval operation, to inform the server that the client already has a copy of a particular version of the object.

A successful response to the GET request is:

200 OK
Content-Type: content_type
Content-Length: N
Content-MD5: hash_value
Content-SHA256: hash_value
Content-Disposition: filename*=UTF-8''filename
Content-Location: /namespace_path/object_name:version
ETag: etag_value

...content...

The optional Content-MD5, Content-SHA256, and Content-Disposition headers MUST be present if supplied during object creation and MAY be present if the service computes missing values in other cases. The Content-Location header SHOULD be present and specifies the URL for the version of the object which was retrieved.

It is RECOMMENDED that a Hatrac server return an ETag indicating the version of the content returned to the client.

Typical GET error responses would be:

  • 304 Not Modified: the etag value supplied in the If-None-Match header matches the current object version on the server.
  • 401 Unauthorized: the client is not authenticated and anonymous retrieval of such an object is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to retrieve the object.
  • 404 Not Found: the name does not denote a defined object.
  • 409 Conflict: the object cannot be retrieved at this time, e.g. there are currently no object versions defined.

Object Metadata Retrieval

The HEAD operation is used to retrieve information about the current version of an object:

HEAD /namespace_path/object_name
Host: authority_name
Accept: *

for which a successful response is:

200 OK
Accept-Ranges: bytes
Content-Type: content_type
Content-Length: N
Content-MD5: hash_value
Content-SHA256: hash_value
Content-Disposition: filename*=UTF-8''filename
Content-Location: /namespace_path/object_name:version

The HEAD operation is essentially equivalent to the GET operation but with the actual object content elided.

Object Version List Retrieval

The GET operation is used to list versions of an object:

GET /namespace_path/object_name;versions
Host: authority_name

for which a successful response is a JSON array of version resource URLs:

200 OK
Content-Type: application/json
Content-Length: N

["/hatrac/namespace_path/object_name:version_id", ...]

If the text/uri-list content-type is negotiated, the response is a whitespace separated list of version URLs:

200 OK
Content-Type: text/uri-list
Content-Length: N

/hatrac/namespace_path/object_name:version1_id
/hatrac/namespace_path/object_name:version2_id
...

Object Deletion

The DELETE operation is used to delete an object:

DELETE /namespace_path/object_name
Host: authority_name
If-Match: etag_value

The optional If-Match header MAY be specified to prevent object deletion unless the current object version on the server matches the version indicated by the etag value.

A successful response to the DELETE request is:

204 No Content

An implementation SHOULD NOT allow deletion of objects with existing object versions. It is RECOMMENDED that an implementation treat deletion as a bulk request by the same client to delete all versions of the object prior to deleting the object itself, however an implementation MAY signal an error when object versions exist. It is further RECOMMENDED that such a deletion be processed atomically, considering all client privileges in advance, but an implementation MAY partially delete versions before failing due to an authorization error on some subset of the versions.

Typical DELETE error responses would be:

  • 401 Unauthorized: the client is not authenticated and anonymous deletion of such a resource is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to delete the resource.
  • 404 Not Found: the name does not denote an existing resource.
  • 409 Conflict: the resource cannot be deleted at this time, e.g. because object versions still exist.
  • 412 Precondition Failed: the deletion was aborted because the current object version on the server does not match the version indicated by the If-Match request header.

Object Version Resources

Any versioned object name in Hatrac has an HTTPS URL of the form:

  • https:// authority / namespace path / object name : version id

Where version id is the service-issued identifier for a particular version of the named object.

Object Version Creation

See the previous section on Object Creation and Update. Object versions are created by performing an update on the unversioned object URL.

Object Version Retrieval

A particular version of an object can be retrieved using the GET operation whether or not it is the current version of the object:

GET /namespace_path/object_name:version_id
Host: authority_name
If-None-Match: etag_value

for which the successful response is:

200 OK
Content-Type: content_type
Content-MD5: hash_value
Content-SHA256: hash_value
Content-Disposition: filename*=UTF-8''filename
Content-Length: N
ETag: etag_value

...content...

with the same interpretation as documented for Object Retrieval above.

The ETag and If-None-Match headers allow client-side caching of object versions. Because a Hatrac object version is immutable, the etag value for a given object version SHOULD NOT change over its lifetime.
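
A client-side cache built on these headers might look like the following sketch. The class `VersionCache` is hypothetical and deliberately transport-agnostic: the caller sends the request with the suggested headers and passes the resulting status, ETag, and body back in.

```python
from typing import Dict, Optional, Tuple


class VersionCache:
    """Minimal ETag cache for immutable object version content."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, bytes]] = {}  # url -> (etag, body)

    def request_headers(self, url: str) -> Dict[str, str]:
        """Headers to add to a GET for this URL."""
        entry = self._entries.get(url)
        return {"If-None-Match": entry[0]} if entry else {}

    def handle_response(self, url: str, status: int,
                        etag: Optional[str], body: bytes) -> bytes:
        """Return the content, using the cached copy on 304."""
        if status == 304:
            return self._entries[url][1]
        if status == 200:
            if etag is not None:
                self._entries[url] = (etag, body)
            return body
        raise ValueError("unexpected status %d" % status)
```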

Object Version Metadata Retrieval

Metadata for a particular version of an object can be retrieved using the HEAD operation whether or not it is the current version of the object:

HEAD /namespace_path/object_name:version_id
Host: authority_name

for which the successful response is:

200 OK
Accept-Ranges: bytes
Content-Type: content_type
Content-MD5: hash_value
Content-SHA256: hash_value
Content-Disposition: filename*=UTF-8''filename
Content-Length: N

with the same interpretation as documented for Object Metadata Retrieval above.

Object Version Deletion

The DELETE operation is used to delete an object version:

DELETE /namespace_path/object_name:version_id
Host: authority_name

for which a successful response is:

204 No Content

For completeness in the protocol, an If-Match header MAY be specified to control deletion of object versions, but it is redundant since object versions are immutable and their content cannot be in a different state than observed on a previous access.

Typical DELETE error responses would be:

  • 401 Unauthorized: the client is not authenticated and anonymous deletion of such a resource is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to delete the resource.
  • 404 Not Found: the name does not denote an existing resource.
  • 412 Precondition Failed: the If-Match request header was specified with an etag value which does not match this object version.

Versions of objects can be deleted whether or not they are the current version:

  • Deletion of any version MUST make that version unavailable.
  • Deletion of any version MAY permanently discard content. An implementation MAY retain deleted content to allow restoration procedures not documented here.
  • Deletion of the current version will cause the next most recent version of the object to become its new current version.
  • An object may be left empty, i.e. with no current version, if all versions have been deleted. A subsequent update can reintroduce content for the object.
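
The effect of version deletion on the current version can be modeled with a simple ordered list, as in this toy sketch. The class `ObjectModel` is hypothetical, and it assumes "most recent" means most recently created.

```python
from typing import List, Optional


class ObjectModel:
    """Toy model of an object's defined versions, oldest first."""

    def __init__(self) -> None:
        self.versions: List[str] = []

    def update(self, version_id: str) -> None:
        # A new version always becomes the current version.
        self.versions.append(version_id)

    @property
    def current(self) -> Optional[str]:
        # None models an empty object with no current version.
        return self.versions[-1] if self.versions else None

    def delete_version(self, version_id: str) -> None:
        # Removing the newest entry exposes the next most recent one.
        self.versions.remove(version_id)
```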

Metadata Sub-Resources

The service also exposes sub-resources for metadata management on existing object versions:

  • https:// authority / resource name ;metadata
  • https:// authority / resource name ;metadata/ fieldname

Where resource name is currently restricted to object version names as described above. The fieldname is a lower-case string which matches an HTTP request header suitable for describing content metadata. The currently recognized fieldnames include:

  • content-type
  • content-disposition
  • content-md5
  • content-sha256

Lifecycle and Ownership

Metadata are sub-resources of the main resource identified in the resource name in the URL, and their lifetime is bounded by the lifetime of that main resource.

  1. Initial metadata MAY be specified during object creation and update.
  2. Immutable checksums MAY be added on existing object versions.
  3. Mutable metadata MAY be added, removed, or modified on existing object versions.

Metadata Collection Retrieval

The GET operation is used to retrieve all metadata sub-resources en masse as a document:

GET /resource_name;metadata
Host: authority_name
Accept: application/json
If-None-Match: etag_value

for which the successful response is:

200 OK
Content-Type: application/json
Content-Length: N
ETag: etag_value

{"content-type": content_type, 
 "content-md5": hash_value,
 "content-sha256": hash_value,
 "content-disposition": disposition}

The standard object version metadata retrieval operation uses the HEAD method on the main resource to retrieve this same metadata as HTTP response headers.
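
The correspondence between the JSON metadata document and the HTTP headers can be sketched as a simple mapping. The names `FIELD_TO_HEADER` and `metadata_to_headers` are hypothetical, and the sketch silently ignores any field name outside the four currently recognized ones.

```python
import json
from typing import Dict

# Recognized lower-case fieldnames and their HTTP header spellings.
FIELD_TO_HEADER = {
    "content-type": "Content-Type",
    "content-disposition": "Content-Disposition",
    "content-md5": "Content-MD5",
    "content-sha256": "Content-SHA256",
}


def metadata_to_headers(document: str) -> Dict[str, str]:
    """Convert a ;metadata JSON document into header name/value pairs."""
    metadata = json.loads(document)
    return {
        FIELD_TO_HEADER[field]: value
        for field, value in metadata.items()
        if field in FIELD_TO_HEADER
    }
```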

Metadata Value Retrieval

The GET operation is used to retrieve one metadata sub-resource as a text value:

GET /resource_name;metadata/fieldname
Host: authority_name
Accept: text/plain
If-None-Match: etag_value

for which the successful response is:

200 OK
Content-Type: text/plain
Content-Length: N
ETag: etag_value

value

The textual value is identical to what would be present in the HTTP response header value when retrieving the main resource content.

Metadata Value Creation and Update

The PUT operation is used to create or update one metadata sub-resource as a text value:

PUT /resource_name;metadata/fieldname
Host: authority_name
Content-Type: text/plain
If-Match: etag_value

value

for which the successful response is:

204 No Content

The textual value is identical to what would be present in the HTTP request header value when creating the main resource content.

Metadata Value Deletion

The DELETE operation is used to delete one metadata sub-resource:

DELETE /resource_name;metadata/fieldname
Host: authority_name

for which the successful response is:

204 No Content

Access Control List Sub-Resources

An access control policy sub-resource has an HTTPS URL of the form:

  • https:// authority / resource name ;acl
    • (for a list of all ACLs on the resource)
  • https:// authority / resource name ;acl/ access
    • (for a specific ACL)
  • https:// authority / resource name ;acl/ access / entry
    • (for a specific ACL entry)

Where resource name is a namespace, object, or object version name as described above, access is a specific access mode that is applicable to that type of resource, and entry is a specific role name or the * wildcard. The full set of access control lists for each resource type is:

  • Namespace
    • owner: lists roles considered to be owners of the namespace.
    • create: lists roles permitted to create new children in the namespace.
    • read: lists roles permitted to read the list of child names in the namespace.
    • subtree-owner: lists roles considered to be owners of any namespace, object, or object version beneath the namespace.
    • subtree-create: lists roles permitted to create new children of the namespace or of any namespace beneath the namespace.
    • subtree-update: lists roles permitted to update data on any object beneath the namespace.
    • subtree-read: lists roles permitted to read any object version or list children of any namespace beneath the namespace.
  • Object
    • owner: lists roles considered to be owners of the object.
    • update: lists roles permitted to update object with new versions.
    • read: lists roles permitted to list versions of the named object.
    • subtree-owner: lists roles considered to be owners of any object version for the object.
    • subtree-read: lists roles permitted to read any object version content for the object.
  • Object Version
    • owner: lists roles considered to be owners of the object version.
    • read: lists roles permitted to read the object version content.
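
The ACL names per resource type and the three URL forms can be captured in a small lookup table, as sketched here. The names `ACL_NAMES` and `acl_url`, and the `object_version` key spelling, are hypothetical; the `resource_url` and `entry` arguments are assumed to be already escaped.

```python
from typing import Optional

# Access control lists defined for each resource type.
ACL_NAMES = {
    "namespace": ("owner", "create", "read", "subtree-owner",
                  "subtree-create", "subtree-update", "subtree-read"),
    "object": ("owner", "update", "read", "subtree-owner", "subtree-read"),
    "object_version": ("owner", "read"),
}


def acl_url(resource_url: str, resource_type: str,
            access: Optional[str] = None,
            entry: Optional[str] = None) -> str:
    """Build the ;acl, ;acl/access, or ;acl/access/entry URL."""
    url = resource_url + ";acl"
    if access is None:
        if entry is not None:
            raise ValueError("an entry requires an access mode")
        return url
    if access not in ACL_NAMES[resource_type]:
        raise ValueError("%r is not an ACL of a %s" % (access, resource_type))
    url += "/" + access
    if entry is not None:
        url += "/" + entry
    return url
```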

Lifecycle and Ownership

Access control lists are sub-resources of the main resource identified by the resource name in the URL, and they exist for the entire lifetime of the main resource.

  1. The root namespace is configured out of band with initial ACL content when a service is deployed.
  2. When a client creates a nested namespace or a new object, the ownership of the new resource is set to the authenticated client by default, but the client may specify an alternative owner list as part of the creation request. In the case of a new object, the initial object version gets the same ACL settings as the newly created object. TODO: define header to control initial ACLs during PUT.

Access Control Retrieval

The GET operation is used to retrieve ACL settings en masse:

GET /resource_name;acl
Host: authority_name
Accept: application/json
If-None-Match: etag_value

for which the successful response is:

200 OK
Content-Type: application/json
Content-Length: N
ETag: etag_value

{"access": ["role", ...], ...}

where the response contains a JSON object with one field per access mode and an array of role names and/or the * wildcard for each such access list.

The HEAD operation can likewise retrieve en masse ACL metadata:

HEAD /resource_name;acl
Host: authority_name
Accept: application/json

for which the successful response is:

200 OK
Content-Type: application/json
Content-Length: N

Access Control List Retrieval

The GET operation is also used to retrieve a specific ACL:

GET /resource_name;acl/access
Host: authority_name
Accept: application/json
If-None-Match: etag_value

for which the successful response is:

200 OK
Content-Type: application/json
Content-Length: N
ETag: etag_value

["role",...]

where the response contains just one array of role names or the * wildcard.

The HEAD operation can likewise retrieve individual ACL metadata:

HEAD /resource_name;acl/access
Host: authority_name
Accept: application/json

for which the successful response is:

200 OK
Content-Type: application/json
Content-Length: N

Access Control List Entry Retrieval

The GET operation is also used to retrieve a specific ACL entry:

GET /resource_name;acl/access/entry
Host: authority_name
Accept: application/json
If-None-Match: etag_value

for which the successful response is:

200 OK
Content-Type: text/plain
Content-Length: N
ETag: etag_value

role

where the response contains just one role name or * wildcard entry.

The HEAD operation is also used to retrieve metadata for a specific ACL entry:

HEAD /resource_name;acl/access/entry
Host: authority_name
Accept: application/json

for which the successful response is:

200 OK
Content-Type: text/plain
Content-Length: N

For all of the ACL sub-resource retrieval operations, an If-None-Match header MAY be specified with an etag value to indicate that the client already possesses a copy of the sub-resource which was returned with an ETag header containing that same etag value. This is useful for cache control. The etag value, if returned by the server, MUST indicate a specific configuration of the ACL sub-resource such that proper caching and precondition-protected updates are possible using the related HTTP protocol features.

Typical GET error responses would be:

  • 401 Unauthorized: the client is not authenticated and anonymous retrieval of such a policy is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to retrieve the policy.
  • 404 Not Found: the namespace or object resource or ACL subresource is not found.
  • 304 Not Modified: the current state of the ACL sub-resource matches the etag value specified in the If-None-Match request header.
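
The conditional retrieval described above can be sketched with the Python standard library. This is only an illustration under stated assumptions: the helper names build_acl_get and fetch_acl, the https://example.org/hatrac base URL, and the /ns/obj resource path are hypothetical, and the resource path is assumed to be URL-escaped already.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote


def build_acl_get(base_url, resource_path, access=None, etag=None):
    """Build (but do not send) a GET request for an ACL sub-resource.

    base_url and resource_path are hypothetical inputs; resource_path is
    assumed to be escaped already according to the URL conventions.
    """
    url = base_url.rstrip("/") + resource_path + ";acl"
    if access is not None:
        # Escape the access mode name so reserved characters are not
        # mistaken for Hatrac meta-syntax.
        url += "/" + quote(access, safe="")
    req = urllib.request.Request(url, method="GET")
    req.add_header("Accept", "application/json")
    if etag is not None:
        # Ask the server to answer 304 if our cached copy is still current.
        req.add_header("If-None-Match", etag)
    return req


def fetch_acl(req, cached_body=None):
    """Send the request; fall back to the cached body on 304 Not Modified.

    Only meant for the JSON forms (;acl and ;acl/access), not for single
    entries, which are returned as text/plain.
    """
    try:
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read().decode("utf-8")), resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # urllib reports 304 as an HTTPError; the cached copy is valid.
            return cached_body, req.get_header("If-none-match")
        raise


req = build_acl_get("https://example.org/hatrac", "/ns/obj",
                    access="owner", etag='"abc123"')
print(req.get_method(), req.full_url)
```

The request is built separately from sending it so that the URL and headers can be inspected without a running service.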

Access Control List Update

The PUT operation is used to rewrite a specific ACL:

PUT /resource_name;acl/access
Host: authority_name
Content-Type: application/json
If-Match: etag_value

["role", ...]

The optional If-Match header MAY be specified with the etag value corresponding to the last retrieved ACL sub-resource configuration, in order to prevent update in the case that another client has simultaneously updated the same ACL sub-resource while this request was being prepared and submitted.

The successful response is:

204 No Content

where the input JSON array completely replaces the existing ACL.

It is RECOMMENDED that the implementation reject changes which would strip too many permissions, e.g. leaving a resource with no owner.

The PUT operation is also used to add one entry to a specific ACL:

PUT /resource_name;acl/access/entry
Host: authority_name

for which the successful response is:

204 No Content

where the entry role name or * wildcard is now present in the ACL.

Typical PUT error responses would be:

  • 400 Bad Request: the resource cannot be updated as requested, e.g. because insufficient permissions would remain.
  • 401 Unauthorized: the client is not authenticated and anonymous update of such a resource is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to update the resource.
  • 404 Not Found: the name does not denote an existing resource.
  • 412 Precondition Failed: the update was aborted because the ACL sub-resource state on the server did not match the etag value present in an If-Match request header.
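
A precondition-protected ACL rewrite might be prepared as in the following sketch. The helper name build_acl_put, the base URL, the resource path, and the role names are all hypothetical; only the method, URL shape, headers, and JSON array body come from the text above.

```python
import json
import urllib.request
from urllib.parse import quote


def build_acl_put(base_url, resource_path, access, roles, etag=None):
    """Build (but do not send) a PUT request replacing one ACL.

    resource_path is assumed to be URL-escaped already; roles is a list of
    role names and/or the "*" wildcard.
    """
    url = "{}{};acl/{}".format(base_url.rstrip("/"), resource_path,
                               quote(access, safe=""))
    body = json.dumps(list(roles)).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="PUT")
    req.add_header("Content-Type", "application/json")
    if etag is not None:
        # The server should answer 412 Precondition Failed if the ACL
        # changed since this etag value was retrieved.
        req.add_header("If-Match", etag)
    return req


req = build_acl_put("https://example.org/hatrac", "/ns/obj",
                    "owner", ["alice", "*"], etag='"abc123"')
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

A client that receives 412 would typically retrieve the ACL again, reapply its change to the fresh state, and retry with the new etag value.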

Access Control List Deletion

The DELETE operation is used to clear a specific ACL:

DELETE /resource_name;acl/access
Host: authority_name
If-Match: etag_value

The optional If-Match header MAY be specified with the etag value corresponding to the last retrieved ACL sub-resource configuration, in order to prevent deletion in the case that another client has simultaneously updated the same ACL sub-resource while this request was being prepared and submitted.

The successful response is:

204 No Content

where the ACL is now empty.

It is RECOMMENDED that the implementation reject changes which would strip too many permissions, e.g. leaving a resource with no owner.

The DELETE operation is also used to remove one entry from a specific ACL:

DELETE /resource_name;acl/access/entry
Host: authority_name

for which the successful response is:

204 No Content

where the entry role name or * wildcard is no longer present in the ACL.

Typical DELETE error responses would be:

  • 400 Bad Request: the resource cannot be changed as requested, e.g. because insufficient permissions would remain.
  • 401 Unauthorized: the client is not authenticated and anonymous update of such a resource is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to update the resource.
  • 404 Not Found: the name does not denote an existing resource.
  • 412 Precondition Failed: the deletion was aborted because the ACL sub-resource state on the server did not match the etag value present in an If-Match request header.

Chunked Upload Resources

To efficiently support restartable transfer for very large object content, a stateful job management sub-resource exposes URLs of the form:

  • https:// authority / namespace path / object name ;upload
    • (the set of upload jobs for a given object)
  • https:// authority / namespace path / object name ;upload / job id
    • (a single upload job)
  • https:// authority / namespace path / object name ;upload / job id / chunk number
    • (a single chunk of data)

where job id is a service-issued identifier for one transfer job, and chunk number is a zero-based ordinal for the chunk within the series of chunks where chunk number n starts at byte-offset n * K for a job using K byte chunk size.

To allow different implementations, the upload job processes a set of chunks of equal size determined at the time the job is created. Arbitrary byte offsets are not allowed. The final chunk may be less than the chunk size to account for arbitrary length jobs.
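
The chunk layout follows directly from the two job parameters. The sketch below works through the arithmetic; the function names chunk_count and chunk_range are hypothetical and are not part of the Hatrac API.

```python
def chunk_count(content_length, chunk_length):
    """Number of chunks needed for content_length bytes (ceiling division)."""
    if chunk_length <= 0:
        raise ValueError("chunk_length must be positive")
    return (content_length + chunk_length - 1) // chunk_length


def chunk_range(n, content_length, chunk_length):
    """Half-open byte range [start, end) covered by zero-based chunk n."""
    start = n * chunk_length  # chunk n starts at byte offset n * K
    if n < 0 or start >= content_length:
        raise IndexError("chunk number out of range for this job")
    # Only the final chunk may be shorter than chunk_length.
    end = min(start + chunk_length, content_length)
    return start, end


# A 25-byte object uploaded with a 10-byte chunk size needs three chunks.
total = chunk_count(25, 10)
ranges = [chunk_range(n, 25, 10) for n in range(total)]
print(total, ranges)
```

For a 25-byte object and a 10-byte chunk size, this yields chunks 0 and 1 at full size and a final 5-byte chunk 2.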

The three-phase chunked upload job has an effect equivalent to a single PUT request on an object:

  1. Create service-side transfer job state
  2. Send set of idempotent chunks
  3. Signal job completion

The benefit of this technique is that individual HTTP requests can be kept to a reasonable size to prevent timeouts, and in the face of transient communication failures the data transfer can easily resume in the middle. Only chunks that were partially in flight need to be retransmitted.

This interface has been designed to accommodate two important implementation strategies:

  • The fixed chunk size and ordinal position can be used to compute a byte offset for direct assembly of data into sparse files in a filesystem. The chunks are non-overlapping byte ranges at fixed offsets. Idempotent retransmission of chunks is permitted, but a client SHOULD NOT send different content for multiple requests using the same chunk number. An implementation MAY mix content of multiple transmissions for the same chunk number. An implementation MAY accept completion of an upload job that has missing chunks.
  • The individual requests easily map to similar chunked upload interfaces in object systems such as Amazon S3, allowing a thin proxy to implement Hatrac on top of such services. Retransmission or out-of-order transmission of chunks is permitted, but a client SHOULD NOT skip any chunks. An implementation MAY reject completion of an upload job that has missing chunks.

Hence, it is the client's responsibility to track acknowledgment of individual chunk transfers and defer completion of an upload job until all chunks have been successfully transmitted.
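
One way a client might keep this bookkeeping is sketched below. The ChunkTracker class is hypothetical and purely client-side; it assumes the client treats a successful chunk response as the acknowledgment.

```python
class ChunkTracker:
    """Client-side record of which chunk numbers are still unacknowledged."""

    def __init__(self, content_length, chunk_length):
        total = (content_length + chunk_length - 1) // chunk_length
        self.pending = set(range(total))

    def acknowledge(self, n):
        # Call after the server confirms receipt of chunk n. Repeated
        # acknowledgment is harmless because retransmission is idempotent.
        self.pending.discard(n)

    def next_pending(self):
        """Lowest chunk number still to be sent, or None if none remain."""
        return min(self.pending) if self.pending else None

    def ready_to_finalize(self):
        """True only when every chunk has been acknowledged."""
        return not self.pending


tracker = ChunkTracker(content_length=25, chunk_length=10)
tracker.acknowledge(0)
tracker.acknowledge(2)  # out-of-order transmission is permitted
print(tracker.next_pending(), tracker.ready_to_finalize())
```

After a transient failure, the client resumes from the pending set and signals completion only once ready_to_finalize() returns True.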

Chunked Upload Job Creation

The POST operation is used to create a new upload job:

POST /namespace_path/object_name;upload
Host: authority_name
Content-Type: application/json

{"chunk-length": K, 
 "content-length": N,
 "content-type": "content_type",
 "content-md5": "hash_value",
 "content-sha256": "hash_value",
 "content-disposition": "disposition"}

where the JSON attributes chunk-length and content-length are mandatory to describe the shape of the data upload, while content-type, content-disposition, content-md5, and content-sha256 are optional and provide additional metadata for the completed object, with the same semantics as if the object had been created as a simple object (without the upload job API) and those same fields had been provided as HTTP PUT request headers. For backwards compatibility, these JSON attribute names are also supported as aliases:

  • chunk_bytes: deprecated alias for chunk-length
  • total_bytes: deprecated alias for content-length
  • content_md5: deprecated alias for content-md5

As with object and namespace creation, an optional query parameter may be supplied to request automatic creation of ancestor namespaces:

POST /namespace_path/object_name;upload?parents=true
...

In either case, the successful response is:

201 Created
Location: /namespace_path/object_name;upload/job_id
Content-Type: text/uri-list
Content-Length: N

/namespace_path/object_name;upload/job_id

where the new job is ready to receive data chunks.

Typical POST error responses would be:

  • 401 Unauthorized: the client is not authenticated and anonymous creation of a job is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to create the job.
  • 409 Conflict: the object name is unavailable for such use.

Chunked Upload Job Listing Retrieval

The GET operation is used to list pending upload jobs on an object:

GET /namespace_path/object_name;upload
Host: authority_name

where the successful response is a JSON array of job URLs:

200 OK
Content-Type: application/json
Content-Length: N

["/hatrac/namespace_path/object_name;upload/job_id", ...]

If the text/uri-list content-type is negotiated, the response will be a whitespace-separated list of job URLs:

200 OK
Content-Type: text/uri-list
Content-Length: N

/hatrac/namespace_path/object_name;upload/job_id1
/hatrac/namespace_path/object_name;upload/job_id2
...
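
A client can handle both listing representations with one small parser, sketched below. The function name parse_job_listing is hypothetical. Lines starting with # are skipped because the text/uri-list format reserves them for comments.

```python
import json


def parse_job_listing(content_type, body):
    """Return the list of job URLs from a listing response body (a str)."""
    # Ignore media type parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return list(json.loads(body))
    if media_type == "text/uri-list":
        urls = []
        for line in body.splitlines():
            if line.lstrip().startswith("#"):
                continue  # comment line in text/uri-list
            urls.extend(line.split())  # URLs are whitespace-separated
        return urls
    raise ValueError("unexpected content type: " + content_type)


print(parse_job_listing("application/json",
                        '["/hatrac/ns/obj;upload/job1"]'))
print(parse_job_listing("text/uri-list",
                        "/hatrac/ns/obj;upload/job1\n/hatrac/ns/obj;upload/job2\n"))
```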

Chunk Upload

The PUT operation is used to send data chunks for an existing job:

PUT /namespace_path/object_name;upload/job_id/chunknumber
Host: authority_name
Content-Type: application/octet-stream
Content-Length: K

...data...

for which the successful response is:

204 No Content

where the data was received and stored.

Typical PUT error responses would be:

  • 401 Unauthorized: the client is not authenticated and anonymous upload of the chunk is not supported.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to upload the chunk.
  • 400 Bad Request: the chunk number is not a non-negative integer.
  • 409 Conflict: the chunk number is too large for the defined job.

Chunked Upload Job Finalization

The POST operation is used to signal completion of an upload job:

POST /namespace_path/object_name;upload/job_id
Host: authority_name

for which the successful response is:

201 Created
Location: /namespace_path/object_name:version_id
Content-Type: text/uri-list
Content-Length: N

/namespace_path/object_name:version_id

where Location includes the URL of the newly created object version that is comprised of all the uploaded data chunks as if it had been created by a corresponding PUT request:

PUT /namespace_path/object_name
Host: authority_name
Content-Type: content_type
Content-MD5: hash_value
Content-Length: N

...content...

Typical POST error responses would be:

  • 401 Unauthorized: the client is not authenticated.
  • 403 Forbidden: the client is authenticated but does not have sufficient privilege to finalize the upload.
  • 409 Conflict: the currently uploaded content does not match the Content-MD5 header of the original upload job. An implementation MAY skip this validation but it is RECOMMENDED to perform this validation rather than create broken objects.

Chunked Upload Job Status Retrieval

The GET operation is used to view the status of a pending upload:

GET /namespace_path/object_name;upload/job_id
Host: authority_name

for which the successful response is:

200 OK
Content-Type: application/json
Content-Length: N

{"url": "/namespace_path/object_name;upload/job_id",
 "owner": ["role", ...],
 "target": "/namespace_path/object_name",
 "chunk-length": K,
 "content-length": N,
 ...
}

summarizing the parameters set when the job was created including optional object metadata such as content-type. Note, there is no support for determining which chunks have or have not been uploaded as such tracking is not a requirement placed on Hatrac implementations.

Chunked Upload Job Cancellation

The DELETE method can be used to cancel an upload job that has not yet been finalized:

DELETE /namespace_path/object_name;upload/job_id
Host: authority_name

for which the successful response is:

204 No Content

Once canceled, the job resource no longer exists and associated storage SHOULD be reclaimed.