<!DOCTYPE html>
<html>
  <head>
    <title>Secure Data Store 0.1</title>
    <meta http-equiv='Content-Type' content='text/html;charset=utf-8'/>
    <!--
      === NOTA BENE ===
      For the three scripts below, if your spec resides on dev.w3 you can check them
      out in the same tree and use relative links so that they'll work offline,
     -->
    <script src='https://www.w3.org/Tools/respec/respec-w3c-common' class='remove'></script>
    <script src="./common.js" class="remove"></script>
    <script type="text/javascript" class="remove">
      var respecConfig = {
        // specification status (e.g., WD, LCWD, NOTE, etc.). If in doubt use ED.
        specStatus: "unofficial",

        // the specification's short name, as in http://www.w3.org/TR/short-name/
        shortName: "secure-data-store",

        // subtitle for the spec
        subtitle: "Secure Data Store",

        // if you wish the publication date to be other than today, set this
        //publishDate: "2019-03-26",
        //crEnd: "2019-04-23",
        //implementationReportURI: "https://w3c.github.io/sdh-test-suite",
        previousMaturity: "CG-DRAFT",
        previousPublishDate: "2020-01-26",
        prevVersion: "https://digitalbazaar.github.io/encrypted-data-vaults/",

        // extend the bibliography entries
        localBiblio: ccg.localBiblio,
        doJsonLd: true,

        github: "https://github.com/decentralized-identity/secure-data-store",
        includePermalinks: false,

        // if there is a publicly available Editor's Draft, this is the link
        edDraftURI: "https://identity.foundation/secure-data-store/",

        // Override respec autogenerated w3c URLs
        thisVersion: "https://identity.foundation/secure-data-store/",
        latestVersion: "https://identity.foundation/secure-data-store/",

        // if this is a LCWD, uncomment and set the end of its review period
        // lcEnd: "2009-08-05",

        // editors, add as many as you like
        // only "name" is required
        editors: [
          { name: "Manu Sporny", url: "http://manu.sporny.org/",
            company: "Digital Bazaar", companyURL: "http://digitalbazaar.com/"},
          { name: "Daniel Buchner", url: "https://www.linkedin.com/in/dbuchner/",
            company: "Microsoft", companyURL: "https://microsoft.com/"},
          { name: "Orie Steele", url: "https://www.linkedin.com/in/or13b/",
            company: "Transmute", companyURL: "https://transmute.industries" },
        ],
        // authors, add as many as you like.
        // This is optional, uncomment if you have authors as well as editors.
        // only "name" is required. Same format as editors.
        // authors:
        // [
        // ],
        // name of the WG
        // wg:     "Secure Data Storage Working Group",

        // URI of the public WG page
        // wgURI:        "https://www.w3.org/community/credentials/",

        // name (with the @w3c.org) of the public mailing to which comments are due
        // wgPublicList: "public-credentials",

        // URI of the patent status for this WG, for Rec-track documents
        // !!!! IMPORTANT !!!!
        // This is important for Rec-track documents, do not copy a patent URI from a random
        // document unless you know what you're doing. If in doubt ask your friendly neighborhood
        // Team Contact.
        // wgPatentURI:  "https://www.w3.org/2004/01/pp-impl/98922/status",
        maxTocLevel: 2,
        inlineCSS: true
      };
    </script>
    <style>
pre .highlight {
  font-weight: bold;
  color: green;
}
pre .comment {
  font-weight: bold;
  color: Gray;
}
.color-text {
  font-weight: bold;
  text-shadow: -1px 0 black, 0 1px black, 1px 0 black, 0 -1px black;
}
    </style>
  </head>
  <body>
    <section id='abstract'>
      <p>
We store a significant amount of sensitive data online, such as personally
identifying information (PII), trade secrets, family pictures, and customer
information. The data that we store is often not protected in an appropriate
manner.
      </p>
      <p>
This specification describes a privacy-respecting mechanism for storing,
indexing, and retrieving encrypted data at a storage provider. It is often
useful when an individual or organization wants to protect data in a way that
the storage provider cannot view, analyze, aggregate, or resell the data.
This approach also ensures that application data is portable and protected
from storage provider data breaches.
      </p>
    </section>
    <section id='sotd'>
      <p>
This specification is a joint work item of the
<a href="https://www.w3.org/community/credentials/">W3C Credentials Community
Group</a> and the <a href="https://identity.foundation/">Decentralized
Identity Foundation</a>.
This specification is a combination of and iteration on work done by both of
these groups. Input documents, or parts thereof, which have not yet been
integrated into the specification may be found in the appendices.
      </p>
    </section>

    <section class="informative">
      <h2>
Introduction
      </h2>

      <p>
We store a significant amount of sensitive data online, such as personally
identifying information (PII), trade secrets, family pictures, and customer
information. The data that we store is often not protected in an appropriate
manner.
      </p>
      <p>
Legislation, such as the General Data Protection Regulation (GDPR), incentivizes
service providers to better preserve individuals' privacy, primarily through
making the providers liable in the event of a data breach. This liability
pressure has revealed a technological gap, whereby providers are often not
equipped with technology that can suitably protect their customers. Encrypted
Data Vaults fill this gap and provide a variety of other benefits.
      </p>
      <p>
This specification describes a privacy-respecting mechanism for storing,
indexing, and retrieving encrypted data at a storage provider. It is often
useful when an individual or organization wants to protect data in a way that
the storage provider cannot view, analyze, aggregate, or resell the data.
This approach also ensures that application data is portable and protected
from storage provider data breaches.
      </p>

      <section class="informative">
        <h3>
Why Do We Need Encrypted Data Vaults?
        </h3>

        <p class="issue">
Explain why individuals and organizations that want to protect their privacy,
trade secrets, and ensure data portability will benefit from using this
technology. Explain how providing a standard API for the storage of user data
empowers users to "bring their own storage", giving them control of their
own information. Explain how applications that are written against a standard
API and assume that users will bring their own storage can separate concerns
and focus on the functionality of their application, removing the need to
deal with storage infrastructure (instead leaving it to a specialist service
provider that is chosen by the user).
        </p>
        <p>
Requiring client-side (edge) encryption for all data and metadata at the same
time as enabling the user to store data on multiple devices and to share data
with others, whilst also having searchable or queryable data, has been
historically very difficult to implement in one system. Trade-offs are often
made which sacrifice privacy in favor of usability, or vice versa.
        </p>
        <p>
Due to a number of maturing technologies and standards, we are hopeful that such
trade-offs are no longer necessary, and that it is possible to design a
privacy-preserving protocol for encrypted decentralized data storage that has
broad practical appeal.
        </p>
      </section>

      <section class="informative">
        <h3>
Ecosystem Overview
        </h3>

        <p>
The problem of decentralized data storage has been approached from various
angles, and personal data stores (PDS), decentralized or otherwise,
have a long history in commercial and academic settings. Different approaches
have resulted in variations in terminology and architectures.
The diagram below shows the types of components that are emerging, and the roles
they play. Encrypted Data Vaults fulfill the low-level encrypted
<em>storage</em> role.
        </p>

        <figure>
          <img style="margin: auto; display: block; width: 75%;"
         src="diagrams/SDS_Layers.svg" alt="diagram showing
         the roles of different technologies in the encrypted
         data vaults and secure data store ecosystem and how they interact.">
          <figcaption style="text-align: center;">
            Secure Data Storage layers
          </figcaption>
        </figure>

        <p>
This section describes the roles of the core actors and the relationships
between them in an ecosystem where this specification is expected
to be useful. A role is an abstraction that might be implemented in many
different ways. The separation of roles suggests likely interfaces and
protocols for standardization. The following roles are introduced in this
specification:
        </p>

        <dl>
          <dt><dfn>data vault controller</dfn></dt>
          <dd>
A role an <a>entity</a> might perform by creating, managing, and deleting
data vaults. This entity is also responsible for granting and revoking
authorization to <a>storage agents</a> to the data vaults that are under its
control.
          </dd>
          <dt><dfn data-lt="storage agents">storage agent</dfn></dt>
          <dd>
A role an <a>entity</a> might perform by creating, updating, and deleting
data in a data vault. This entity is typically granted authorization
to access a data vault by a <a>data vault controller</a>.
          </dd>
          <dt><dfn>storage provider</dfn></dt>
          <dd>
A role an <a>entity</a> might perform by providing a raw data storage
mechanism to a <a>data vault controller</a>. It is impossible for this entity
to see the data that it is storing due to all data being encrypted at rest
and in transit to and from the <a>storage provider</a>.
          </dd>
        </dl>

      </section>
      <section class="informative">
      <h3>
Prior Art
        </h3>

        <h4>
          Peergos
        </h4>
        <p>
          <a href="https://github.com/peergos/peergos">Peergos</a> has many of the same requirements as SDS (and in fact stronger privacy requirements, especially against a quantum computer) and is built on top of IPFS.
          In summary, its properties include:
        </p>
        <ul>
          <li>
            global p2p private filesystem
          </li>
          <li>
            strong end-to-end encryption and fine-grained access control (read-only and writable); access control is capability-based, so servers are trustless
          </li>
          <li>
            data model is hash-linked data (IPLD) with updates signed by a keypair
          </li>
          <li>
            hides metadata from the server, including file names, MIME types, file sizes, and directory topology
          </li>
          <li>
            hides the social graph from the server (the server cannot see who has been granted access to what)
          </li>
          <li>
            directories are indistinguishable from small files to the server
          </li>
          <li>
            independent of DNS and TLS certificate authorities (though there is a web interface if you trust them)
          </li>
          <li>
            handles arbitrarily large files, including streaming and O(1) seeking
          </li>
          <li>
            data can be trivially mirrored, which provides live redundancy over the IPFS protocol
          </li>
          <li>
            efficient modification of large files without having to re-encrypt the entire file
          </li>
          <li>
            resistant to quantum computer based attacks: unshared files are already fully post-quantum, while shared files currently have a limited time window of vulnerability to a large quantum computer
          </li>
          <li>
            access control is done with a version of the <a href="https://github.com/Peergos/Peergos/raw/master/papers/wuala-cryptree.pdf">cryptree</a> data structure, improved to fit the IPFS data model. More details can be read <a href="https://book.peergos.org/security/cryptree.html">here</a>.
          </li>
          <li>
            all data is stored in a Merkle-CHAMP (compressed hash-array mapped prefix-trie), a data structure that is well suited to content-addressed mutable data and also works well with CRDTs. The original paper on the CHAMP structure is available at <a href="https://michael.steindorfer.name/publications/oopsla15.pdf">https://michael.steindorfer.name/publications/oopsla15.pdf</a>. The useful properties of CHAMPs are insertion-order independence (giving a canonical root for a given set of mappings) and a balanced structure.
          </li>
          <li>
            Unshared files are safe from exposure by a large quantum computer (this is because their privacy only relies on hashing and symmetric encryption, neither of which is significantly weakened by a quantum computer).
          </li>
          <li>
            Peergos servers are trustless. The worst that a malicious server could do is delete your data or withhold valid updates, both of which are easily detected and mitigated by running a mirror.
          </li>
        </ul>
        <p>
          More technical descriptions are available <a href="https://book.peergos.org">here</a>.
        </p>
      </section>
      <section class="informative">
        <h3>
Use Cases
        </h3>

        <p class="issue">
          Use cases have been moved to a distinct <a href="use_cases.md">markdown document</a>.
        </p>

        <section>
          <h4>
Deployment topologies
          </h4>
          <p>
Based on the use cases, we consider the following deployment topologies:
          </p>
          <ul>
            <li>
<strong>Mobile Device Only:</strong> The server and the client reside on the
same device. The vault is a library providing functionality via a binary API,
using local storage to provide an encrypted database.
            </li>
            <li>
<strong>Mobile Device Plus Cloud Storage:</strong> A mobile device plays the role
of a client, and the server is a remote cloud-based service provider that has
exposed the storage via a network-based API (e.g., REST over HTTPS). Data is not
stored on the mobile device.
            </li>
            <li>
<strong>Multiple Devices (Single User) Plus Cloud Storage:</strong> When adding
more devices managed by a single user, the vault can be used to synchronize data
across devices.
            </li>
            <li>
<strong>Multiple Devices (Multiple Users) Plus Cloud Storage:</strong> When
pairing multiple users with cloud storage, the vault can be used to synchronize
data between multiple users with the help of replication and merge strategies.
            </li>
            <li>
<p><strong>Multi-/Cross-cloud:</strong> Some use cases (IoT / machine to machine
/ Skynet / guardianship) require a non-human or non-functioning actor to
delegate KMS/key control to a cloud vault for oversight or human intervention.
In the case of some password manager use case architectures or biometrically
accessed/deployed key material storage, as well as some multi-cloud/hybrid-cloud
architectures, key material will need to be retrieved from at least one other
vault before accessing the vault being specified here.</p>

<p>Keys in control of such an entity might still need to securely store signed
credentials or data in a separate vault. Additional diagramming or
specifications will be needed to show how this 2-vault solution could be
constrained to be secure and feasible, even if non-normative.</p>
            </li>
            <li>
<strong>Self-Hosted and/or Home-based Server:</strong> Alice wants to host her
own SDS software instance, on her own server.
            </li>
            <li>
<strong>Support Low Power Devices/Non-private computing:</strong> To support
users without access to private computing resources, the following three
components need to be considered:

<ol>
  <li>Secure Storage</li>
  <li>Key vault - private key storage and recovery (Key management)</li>
  <li>Trusted computing - computational resources which have access to private
keys and plain text private data</li>
</ol>
            </li>
          </ul>
        </section>
      </section>

      <section class="informative">
        <h3>
Requirements
        </h3>

        <p>
The following sections elaborate on the requirements that have been gathered
from the core use cases.
        </p>

        <section>
          <h4>
Privacy and multi-party encryption
          </h4>
          <p>
One of the main goals of this system is ensuring the privacy of an entity's
data so that it cannot be accessed by unauthorized parties, including the
storage provider.
          </p>
          <p>
To accomplish this, the data must be encrypted both while it is in transit
(being sent over a network) and while it is at rest (on a storage system).
          </p>
          <p>
Since data could be shared with more than one entity, it is also necessary for
the encryption mechanism to support encrypting data to multiple parties.
          </p>
        </section>

        <section>
          <h4>
Sharing and authorization
          </h4>
          <p>
It is necessary to have a mechanism that enables authorized sharing
of encrypted information among one or more entities.
          </p>
          <p>
The system is expected to specify one mandatory authorization scheme,
but also allow other alternate authorization schemes. Examples of
authorization schemes include OAuth2, Web Access Control, and
[[ZCAP]]s (Authorization Capabilities).
          </p>
        </section>

        <section>
          <h4>
Identifiers
          </h4>
          <p>
The system should be identifier agnostic. In general, identifiers that are a
form of URN or URL are preferred. While it is presumed that [[DID-CORE]]
(Decentralized Identifiers, DIDs) will be used by the system in a few important ways, hard-coding the implementations to DIDs would be an anti-pattern.
          </p>
        </section>

        <section>
          <h4>
Versioning and replication
          </h4>
          <p>
It is expected that information can be backed up on a continuous basis. For this
reason, it is necessary for the system to support at least one mandatory
versioning strategy and one mandatory replication strategy, but also allow other
alternate versioning and replication strategies.
          </p>
        </section>

        <section>
          <h4>
Metadata and searching
          </h4>
          <p>
Large volumes of data are expected to be stored using this system, which then
need to be efficiently and selectively retrieved. To that end, an encrypted
search mechanism is a necessary feature of the system.
          </p>
          <p>
It is important for clients to be able to associate metadata with the data such
that it can be searched. At the same time, since privacy of both data <em>and</em>
metadata is a key requirement, the metadata must be stored in an encrypted
state, and service providers must be able to perform those searches in an opaque
and privacy-preserving way, without being able to see the metadata.
          </p>
        </section>

        <section>
          <h4>
Protocols
          </h4>
          <p>
Since this system can reside in a variety of operating environments, it is
important that at least one protocol is mandatory, but that other protocols are
also allowed by the design. Examples of protocols include HTTP, gRPC, Bluetooth,
and various binary on-the-wire protocols. An HTTPS API is defined in <a href="#data-vault-https-api"></a>.
          </p>
        </section>

      </section>

      <section class="informative">
        <h3>
  Design goals
        </h3>
        <p>
This section elaborates upon a number of guiding principles and design goals
that shape Encrypted Data Vaults.
        </p>

        <section>
          <h4>
  Layered and modular architecture
          </h4>
          <p>
A layered architectural approach is used to ensure that the foundation for the
system is easy to implement while allowing more complex functionality to be
layered on top of the lower foundations.
          </p>
          <p>
For example, Layer 1 might contain the mandatory features for the most basic
system, Layer 2 might contain useful features for most deployments, Layer 3
might contain advanced features needed by a small subset of the ecosystem, and
Layer 4 might contain extremely complex features that are needed by a very small
subset of the ecosystem.
          </p>
        </section>

        <section>
          <h4>
Prioritize privacy
          </h4>
          <p>
This system is intended to protect an entity's privacy. When exploring new
features, always ask "How would this impact privacy?". New features that
negatively impact privacy are expected to undergo extreme scrutiny to determine
if the trade-offs are worth the new functionality.
          </p>
        </section>

        <section>
          <h4>
Push implementation complexity to the client
          </h4>
          <p>
Servers in this system are expected to provide functionality strongly focused on
the storage and retrieval of encrypted data. The more a server knows, the
greater the risk to the privacy of the entity storing the data, and the more
liability the service provider might have for hosting data. In addition, pushing
complexity to the client enables service providers to provide stable server-side
implementations while innovation can be carried out by clients.
          </p>
        </section>

      </section>

      <section id="conformance" class="normative">
      </section>

    </section>

    <section class="informative">
      <h2>
Terminology
      </h2>

      <div data-include="./terms.html"
        data-oninclude="restrictReferences">
      </div>
    </section>

    <section class="informative">
      <h2>
Core Concepts
      </h2>

      <p>
The following sections outline core concepts, such as encrypted storage,
which form the foundation of this specification.
      </p>

      <section class="normative">
        <h2>
  Encrypted Storage
      </h2>

        <p>
An important consideration of encrypted data stores is which components of the
architecture have access to the (unencrypted) data, or who controls the private
keys. There are roughly three approaches: storage-side encryption, client-side
(edge) encryption, and gateway-side encryption (which is a hybrid of the
previous two).
        </p>
        <p>
Any data storage system that lets the user store arbitrary data also supports
client-side encryption at the most basic level. That is, it lets the user
encrypt data themselves, and then store it. This doesn't mean such systems are
optimized for encrypted data, however. Querying and access control for encrypted
data may be difficult.
        </p>
        <p>
Storage-side encryption is usually implemented as
<a href="https://en.wikipedia.org/wiki/Disk_encryption">whole-disk encryption</a>
or filesystem-level encryption. This is widely supported and understood, and any
type of hosted cloud storage is likely to use storage-side encryption. In this
scenario the private keys are managed by the service provider or controller of
the storage server, which may be a different entity than the user who is storing
the data. Encrypting the data while it resides on disk is a useful security
measure should physical access to the storage hardware be compromised, but does
not guarantee that <em>only</em> the original user who stored the data has access.
        </p>
        <p>
Conversely, client-side encryption offers a high level of
security and privacy, especially if metadata can be encrypted as well. Encryption
is done at the individual data object level, usually aided by a keychain or wallet
client, so the user has direct access to the private keys. This comes at a cost,
however, since the significant responsibility of key management and recovery falls
squarely onto the end user. In addition, the question of key management becomes
more complex when data needs to be shared.
        </p>
        <p>
Gateway-side encryption systems take an approach that combines
techniques from storage-side and client-side encryption architectures. These
storage systems, typically encountered among multi-server clusters or some
"encryption as a platform" cloud service providers, recognize that client-side
key management may be too difficult for some users and use cases, and offer to
perform encryption and decryption themselves in a way that is transparent to
the client application. At the same time, they aim to minimize the number of
components (storage servers) that have access to the private decryption keys.
As a result, the keys usually reside on "gateway" servers, which encrypt the
data before passing it to the storage servers. The encryption/decryption is
transparent to the client, and the data is opaque to the storage servers, which
can be modular/pluggable as a result. Gateway-side encryption provides some
benefits over storage-side systems, but also shares their drawbacks: the gateway
sysadmin controls the keys, not the user.
        </p>
      </section>

      <section class="normative">
        <h2>
  Structured Documents
      </h2>

        <p>
The fundamental unit of storage in data vaults is the encrypted
structured document which, when decrypted, provides a data structure that
can be expressed in popular syntaxes such as JSON and CBOR. Documents can
store structured data and metadata about the structured data. Structured
document sizes are limited to 16MB.
        </p>
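        <p>
As a rough illustration of the size limit, a client could check a document's
serialized size before deciding whether to store it as a structured document.
The sketch below is illustrative only: the function names are hypothetical,
"16MB" is assumed to mean 16 MiB, and the check is applied to a compact JSON
encoding rather than to the encrypted form.
        </p>

```python
import json

# Limit stated in the text above; reading "16MB" as 16 MiB is an assumption.
MAX_DOCUMENT_BYTES = 16 * 1024 * 1024


def serialized_size(document: dict) -> int:
    """Return the size in bytes of the document's compact UTF-8 JSON encoding."""
    encoded = json.dumps(document, separators=(",", ":")).encode("utf-8")
    return len(encoded)


def fits_in_structured_document(document: dict) -> bool:
    """True if the serialized document is within the structured document limit."""
    return MAX_DOCUMENT_BYTES >= serialized_size(document)


# Hypothetical example documents.
small = {"id": "example-document", "content": {"message": "hello"}}
large = {"id": "example-large", "content": {"blob": "x" * (MAX_DOCUMENT_BYTES + 1)}}
```

        <p>
A document that fails this check would be a candidate for the streaming
mechanism instead.
        </p>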
      </section>

      <section class="normative">
        <h2>
  Streams
      </h2>

        <p>
For files larger than 16MB or for raw binary data formats such as audio,
video, and office productivity files, a streaming API is provided that
enables data to be streamed to/from a data vault. Streams are described using
structured documents, but the storage of the data is separated from the
structured document using a hashlink to the encrypted content.
        </p>
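        <p>
The separation between a stream's descriptor and its content might be sketched
as follows. This is a minimal illustration, not the stream format of this
specification: the <code>stream</code>, <code>sha256</code>, and
<code>byteLength</code> field names are hypothetical, and a plain SHA-256 hex
digest stands in for a real hashlink, which uses a different encoding.
        </p>

```python
import hashlib
import io


def digest_stream(stream, read_size=64 * 1024):
    """Hash a binary stream incrementally; returns (hex digest, total bytes)."""
    hasher = hashlib.sha256()
    total = 0
    while True:
        block = stream.read(read_size)
        if not block:
            break
        hasher.update(block)
        total += len(block)
    return hasher.hexdigest(), total


def describe_stream(doc_id, content_type, encrypted_stream):
    """Build a structured document that points at separately stored content."""
    digest, size = digest_stream(encrypted_stream)
    return {
        "id": doc_id,
        "meta": {"contentType": content_type},
        # Hypothetical field names; a real hashlink uses a multihash encoding.
        "stream": {"sha256": digest, "byteLength": size},
    }


ciphertext = io.BytesIO(b"\x00" * 200_000)  # placeholder for encrypted bytes
descriptor = describe_stream("example-video", "video/mp4", ciphertext)
```

        <p>
The descriptor itself is a structured document, so it would be encrypted like
any other document before being sent to the storage provider.
        </p>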
      </section>

      <section class="normative">
        <h2>
  Indexing
      </h2>

        <p>
Data vaults are expected to store a very large number of documents
of varying kinds. This means that it is important to be able to search the
documents in a timely way, which creates a challenge for the storage provider
as the content is encrypted. Previously this has been worked around
with a certain amount of unencrypted metadata attached to the data objects.
Another possibility is unencrypted listings of pointers to filtered subsets
of data.
        </p>
        <p>
In the case of data vaults, an encrypted search scheme is provided that enables
data vault clients to index metadata while <em>not leaking</em> that metadata
to the storage provider.
        </p>
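        <p>
One common way to build such an index is to blind attribute names and values
with a keyed hash before they leave the client. The sketch below assumes
HMAC-SHA-256 and a hypothetical <code>index_entry</code> helper; it is not the
scheme defined by this specification. Note that deterministic tags still reveal
to the server when two documents share the same attribute value.
        </p>

```python
import hashlib
import hmac


def blind(index_key: bytes, value: str) -> str:
    """Return an HMAC-SHA-256 tag that hides the value from anyone without the key."""
    return hmac.new(index_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def index_entry(index_key: bytes, name: str, value: str) -> dict:
    """Blind the attribute name and the name/value pair (hypothetical layout)."""
    return {
        "name": blind(index_key, name),
        "value": blind(index_key, name + ":" + value),
    }


# The key stays with the client; a fixed key is used here only for illustration.
index_key = b"example-index-key-not-for-production"

stored = index_entry(index_key, "contentType", "image/png")  # sent with a document
query = index_entry(index_key, "contentType", "image/png")   # sent with a search
other = index_entry(index_key, "contentType", "text/plain")
```

        <p>
The server can match <code>query</code> against <code>stored</code> by simple
equality, without learning the attribute name or value.
        </p>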
      </section>

    </section>

    <section class="normative">
      <h2>
Architecture
      </h2>

      <p class="issue">
Review this section for language that should be properly normative.
      </p>

      <p>
This section describes the architecture of the Encrypted Data Vault protocol, in
the form of a client-server relationship. The vault is regarded as the server and
the client acts as the interface used to interact with the vault.
      </p>
      <p>
This architecture is layered in nature, where the foundational layer consists of
an operational system with minimal features, and where more advanced features are
layered on top. Implementations can choose to implement only the foundational
layer, or optionally, additional layers consisting of a richer set of features
for more advanced use cases.
      </p>

      <section>
        <h3>
Server and client responsibilities
        </h3>
        <p>
The server is assumed to be of low trust, and must have no visibility into the
data that it persists. However, even in this model, the server still has a set
of minimum responsibilities it must adhere to.
        </p>
        <p>
The client is responsible for providing an interface to the server, with
bindings for each relevant protocol (HTTP, RPC, or binary over-the-wire
protocols), as required by the implementation.
        </p>
        <p>
All encryption and decryption of data is done on the client side, at the edges.
The data (including metadata) MUST be opaque to the server, and the architecture
is designed to prevent the server from being able to decrypt it.
        </p>
      </section>

      <section>
        <h3>
Layer 1 (L1) responsibilities
        </h3>
        <p>
Layer 1 consists of a client-server system that is capable of encrypting data in
transit and at rest.
        </p>

        <section>
          <h4>
Server: validate requests (L1)
          </h4>
          <p>
When a vault client makes a request to store, query, modify, or delete data in
the vault, the server validates the request. Since the actual data and metadata
in any given request is encrypted, such validation is necessarily limited and
largely depends on the protocol and the semantics of the request.
          </p>
        </section>

        <section>
          <h4>
Server: Persist data (L1)
          </h4>
          <p>
The mechanism a server uses to persist data, such as storage on a local,
networked, or distributed file system, is determined by the implementation. The
persistence mechanism is expected to adhere to the common expectations of a data
storage provider, such as reliable storage and retrieval of data.
          </p>
        </section>

        <section>
          <h4>
Server: Persist global configuration (L1)
          </h4>
          <p>
A vault has a global configuration that defines the following properties:
          </p>
          <ul>
            <li>
Stream chunk size
            </li>
            <li>
Other config metadata
            </li>
          </ul>
          <p>
The configuration allows the client to perform capability discovery
regarding things like authorization, protocol, and replication mechanisms that are used
by the server.
          </p>
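          <p>
A client-side view of capability discovery could look like the sketch below.
All property names (<code>streamChunkSize</code>, <code>authorization</code>,
<code>protocols</code>, <code>replication</code>) and their values are
assumptions made for illustration; this section does not define them.
          </p>

```python
import json

# Hypothetical property names; the specification text does not define them.
vault_config = {
    "streamChunkSize": 1024 * 1024,
    "authorization": ["zcap"],
    "protocols": ["https"],
    "replication": [],
}


def supports(config: dict, capability: str, mechanism: str) -> bool:
    """Check whether the server advertises a mechanism for a capability."""
    return mechanism in config.get(capability, [])


def chunk_size_for(config: dict, default: int = 1024 * 1024) -> int:
    """Return the advertised stream chunk size, or a client default."""
    return int(config.get("streamChunkSize", default))


# A client would typically receive the configuration as JSON from the server.
received = json.loads(json.dumps(vault_config))
```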
        </section>

        <section>
          <h4>
Server: enforcement of authorization policies (L1)
          </h4>
          <p>
When a client makes a request to store, query, modify, or delete data in
the vault, the server enforces any authorization policy that is associated with
the request.
          </p>
        </section>

        <section>
          <h4>
Client: encrypted data chunking (L1)
          </h4>
          <p>
An Encrypted Data Vault is capable of storing many different types of data,
including large unstructured binary data. This means that storing a file as a
single entry would be challenging for systems that have limits on single record
sizes. For example, some databases set the maximum size for a single record to
16MB. As a result, it is necessary that large data is chunked into sizes that
are easily managed by a server. It is the responsibility of the client to set
the chunk size of each resource and chunk large data into manageable chunks for
the server. It is the responsibility of the server to deny requests to store
chunks larger than it can handle.
          </p>
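          <p>
The division of responsibilities described above can be sketched as follows.
The <code>SketchServer</code> class and <code>ChunkTooLarge</code> exception
are hypothetical names, and encryption is omitted: in a real client each chunk
would be encrypted before being sent.
          </p>

```python
class ChunkTooLarge(Exception):
    """Raised by the sketch server when a chunk exceeds its limit."""


def split_into_chunks(data: bytes, chunk_size: int) -> list:
    """Client side: split data into consecutive chunks of at most chunk_size bytes."""
    if 1 > chunk_size:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class SketchServer:
    """Server side: stores opaque chunks and rejects any above max_chunk_bytes."""

    def __init__(self, max_chunk_bytes: int):
        self.max_chunk_bytes = max_chunk_bytes
        self.chunks = []

    def store_chunk(self, chunk: bytes) -> int:
        if len(chunk) > self.max_chunk_bytes:
            raise ChunkTooLarge(f"{len(chunk)} bytes exceeds {self.max_chunk_bytes}")
        self.chunks.append(chunk)
        return len(self.chunks) - 1  # position of the stored chunk


server = SketchServer(max_chunk_bytes=4)
chunks = split_into_chunks(b"0123456789", chunk_size=4)
positions = [server.store_chunk(chunk) for chunk in chunks]
```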
          <p>
Each chunk is encrypted individually using authenticated encryption. Doing so
protects against attacks in which a malicious server replaces chunks in a large
file, forcing the victim to download and decrypt the entire file before
discovering that it has been compromised. Encrypting each chunk with
authenticated encryption ensures that a client knows that it has a valid chunk
before proceeding to the next one. Note that another authorized client can still
perform an attack by doing authenticated encryption on a chunk, but a server is
not capable of launching the same attack.
          </p>
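          <p>
The client-side chunking step could be sketched in Python as follows. The
helper name <code>split_into_chunks</code> and the 1 MiB default are
assumptions made for illustration only. Per-chunk authenticated encryption is
not shown; each yielded chunk would be encrypted individually before it is
sent to the server.
          </p>

          <pre class="example highlight" title="Sketch: splitting data into chunks (illustrative)">
from typing import Iterator

# Hypothetical default; a real client would pick a size the server accepts.
CHUNK_SIZE = 1024 * 1024


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most chunk_size bytes."""
    if not chunk_size > 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]
          </pre>

          <p>
Only the last chunk may be shorter than <code>chunk_size</code>, so a server
limit can be respected by choosing a chunk size at or below that limit.
          </p>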
        </section>

        <section>
          <h4>
Client: Resource structure (L1)
          </h4>
          <p>
The process of storing encrypted data starts with the creation of a Resource by
the client, with the following structure.
          </p>
          <p>
Resource:
          </p>
          <ul>
            <li>
<code>id</code> (required)
            </li>
            <li>
<code>meta</code>
              <ul>
                <li>
  <code>meta.contentType</code> MIME type
                </li>
              </ul>
            </li>
            <li>
<code>content</code> - entire payload, or a manifest-like list of hashlinks to individual chunks
            </li>
          </ul>
          <p>
If the data is less than the chunk size, it is embedded directly into the
<code>content</code>.
          </p>
          <p>
Otherwise, the data is sharded into chunks by the client (see next section), and
each chunk is encrypted and sent to the server. In this case, <code>content</code>
contains a manifest-like listing of URIs to individual chunks (integrity-protected
by [[HASHLINK]]).
          </p>
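          <p>
The choice between embedding the payload and listing chunk URIs could look
roughly like the Python sketch below. The function name
<code>build_resource</code>, the <code>chunk_base_url</code> parameter, the hex
encoding of embedded content, and the <code>sha256</code> query parameter are
all assumptions made for illustration. In particular, the digest parameter is a
simplified stand-in and not the hashlink encoding, and encryption of the chunks
is omitted.
          </p>

          <pre class="example highlight" title="Sketch: building a Resource (illustrative)">
import hashlib
from typing import Any, Dict, List


def build_resource(resource_id: str, content_type: str, data: bytes,
                   chunk_size: int, chunk_base_url: str) -> Dict[str, Any]:
    """Embed small payloads directly; otherwise list chunk URIs in a manifest."""
    resource: Dict[str, Any] = {
        "id": resource_id,
        "meta": {"contentType": content_type},
    }
    if chunk_size > len(data):
        # Smaller than one chunk: embed the payload (hex-encoded here so the
        # result stays JSON-serializable; the encoding is an assumption).
        resource["content"] = data.hex()
        return resource
    manifest: List[str] = []
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]
        # Placeholder integrity parameter. A real client would compute a
        # hashlink over the encrypted chunk instead.
        digest = hashlib.sha256(chunk).hexdigest()
        manifest.append(f"{chunk_base_url}/{index}?sha256={digest}")
    resource["content"] = manifest
    return resource
          </pre>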
        </section>

        <section>
          <h4>
Client: Encrypted resource structure (L1)
          </h4>
          <p>
The client then creates the Encrypted Resource, which has the structure listed
below. If the data was sharded into chunks, this is done after the individual
chunks are written to the server.
          </p>
          <ul>
            <li>
<code>id</code>
            </li>
            <li>
<code>index</code> - encrypted index tags prepared by the client (for use with
privacy-preserving querying over encrypted resources)
            </li>
            <li>
<em>Chunk size</em> (if different from the default in global config)
            </li>
            <li>
<em>Versioning metadata</em> - such as sequence numbers, Git-like hashes, or other mechanisms
            </li>
            <li>
<em>Encrypted resource payload</em> - encoded as a <code>jwe</code> [[RFC7516]], <code>cwe</code> [[RFC8152]] or other appropriate mechanism
            </li>
          </ul>
        </section>
      </section>

      <section>
        <h3>
Layer 2 (L2) responsibilities
        </h3>
        <p>
Layer 2 consists of a system that is capable of sharing data among multiple
entities, of versioning and replication, and of performing privacy-preserving searches
in an efficient manner.
        </p>

        <section>
          <h4>
Client: Encrypted search indexes (L2)
          </h4>
          <p>
To enable privacy-preserving querying (where the search index is opaque to the
server), the client must prepare a list of encrypted index tags (which are stored
in the Encrypted Resource, alongside the encrypted data contents).
          </p>
          <p class="issue">
Need details about salting and encryption mechanism of index tags.
          </p>
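          <p>
Pending those details, the Python sketch below shows one possible way a client
might derive opaque index tags. It assumes HMAC-SHA-256, in line with the
<code>Sha256HmacKey2019</code> key type used in this document's examples. The
function name <code>blind_index_tag</code>, the unpadded base64url output, and
the <code>name:value</code> concatenation are assumptions and are not defined
by this specification.
          </p>

          <pre class="example highlight" title="Sketch: deriving encrypted index tags (illustrative)">
import base64
import hashlib
import hmac
from typing import Dict


def blind_index_tag(hmac_key: bytes, name: str, value: str) -> Dict[str, str]:
    """Return HMAC-SHA-256 tags for an attribute name and its value."""
    def tag(text: str) -> str:
        digest = hmac.new(hmac_key, text.encode("utf-8"), hashlib.sha256).digest()
        return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # Binding the value to its name keeps equal values stored under
    # different attribute names from producing the same tag.
    return {"name": tag(name), "value": tag(name + ":" + value)}
          </pre>

          <p>
Because the tags are deterministic for a given key, a client holding the same
HMAC key can recompute a tag and ask the server for matching documents, while
the server only ever sees the opaque values.
          </p>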
        </section>

        <section>
          <h4>
Client: Versioning and replication (L2)
          </h4>
          <p>
A server must support <em>at least one</em> versioning/change control mechanism.
Replication is done by the client, not by the server (since the client controls
the keys, knows about which other servers to replicate to, etc.). If an
Encrypted Data Vault implementation aims to provide replication functionality,
it MUST also pick a versioning/change control strategy (since replication
necessarily involves conflict resolution). Some versioning strategies are
implicit ("last write wins", e.g. <code>rsync</code> or uploading a file to a file
hosting service), but keep in mind that a replication strategy <em>always</em> implies
that some sort of conflict resolution mechanism should be involved.
          </p>
        </section>

        <section>
          <h4>
Client: Sharing with other entities
          </h4>
          <p>
An individual vault's choice of authorization mechanism determines how a client
shares resources with other entities (authorization capability link or similar
mechanism).
          </p>
        </section>

      </section>

      <section>
        <h3>
Layer 3 (L3) responsibilities
        </h3>

        <section>
          <h4>
Server: Notifications (L3)
          </h4>
          <p>
It is helpful if data storage providers are able to notify clients when changes
to persisted data occur. A server may optionally implement a mechanism by which
clients can subscribe to changes in the vault.
          </p>
        </section>

        <section>
          <h4>
Client: Vault-wide integrity protection (L3)
          </h4>
          <p>
Vault-wide integrity protection is provided to prevent a variety of storage
provider attacks where data is modified in a way that is undetectable, such as
if documents are reverted to older versions or deleted. This protection
requires that a global catalog of all the resource identifiers that belong to a
user, along with the most recent version, is stored and kept up to date by the
client. Some clients may store a copy of this catalog locally (and
include an integrity protection mechanism such as [[HASHLINK]]) to guard against
interference or deletion by the server.
          </p>
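          <p>
A client-side check against such a catalog might look like the Python sketch
below. It assumes the catalog maps each resource identifier to the latest
sequence number the client has seen; the function name
<code>find_integrity_violations</code> and both dictionary shapes are
hypothetical.
          </p>

          <pre class="example highlight" title="Sketch: checking a vault against a local catalog (illustrative)">
from typing import Dict, List


def find_integrity_violations(catalog: Dict[str, int],
                              server_view: Dict[str, int]) -> List[str]:
    """Compare the client's catalog (id to latest sequence) with the server's."""
    problems: List[str] = []
    for doc_id, expected in sorted(catalog.items()):
        if doc_id not in server_view:
            problems.append(f"{doc_id}: missing on server")
        elif expected > server_view[doc_id]:
            problems.append(f"{doc_id}: reverted to sequence {server_view[doc_id]}")
    return problems
          </pre>

          <p>
An empty result means that every catalogued resource is present on the server
at the expected version or a newer one.
          </p>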
        </section>

      </section>
    </section>

    <section class="normative">
      <h2>
Data Model
      </h2>

      <p>
The following sections outline the data model for data vaults.
      </p>

      <section>
        <h3>
DataVaultConfiguration
        </h3>

        <p class="issue">
Data vault configuration isn't strictly necessary for using the other features
of data vaults. This should have its own conformance section/class or potentially
even be non-normative.
        </p>

        <p>
A data vault configuration specifies the properties a particular data vault
will have.
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">Property</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>sequence</td>
              <td>
A unique counter for the data vault in order to ensure that
clients are properly synchronized to the data vault. The value is required and
MUST be an unsigned 64-bit number.
              </td>
            </tr>
            <tr>
              <td>controller</td>
              <td>
The entity or cryptographic key that is in control of the
data vault. The value is required and MUST be a URI.
              </td>
            </tr>
            <tr>
              <td>invoker</td>
              <td>
The root entities or cryptographic key(s) that are authorized to invoke
an authorization capability to modify the data vault's configuration or
read or write to it. The value is optional, but if present, MUST be a URI
or an array of URIs. When this value is not present, the value of the controller
property is used for the same purpose.
              </td>
            </tr>
            <tr>
              <td>delegator</td>
              <td>
The root entities or cryptographic key(s) that are authorized to delegate
authorization capabilities to modify the data vault's configuration or
read or write to it. The value is optional, but if present, MUST be a URI
or an array of URIs. When this value is not present, the value of the controller
property is used for the same purpose.
              </td>
            </tr>
            <tr>
              <td>referenceId</td>
              <td>
Used to express an application-specific reference identifier. The value is
optional and, if present, MUST be a string.
              </td>
            </tr>
            <tr>
              <td>keyAgreementKey.id</td>
              <td>
An identifier for the key agreement key. The value is required and MUST be
a URI. The key agreement key is used to derive a secret that is then used to
generate a key encryption key for the receiver.
              </td>
            </tr>
            <tr>
              <td>keyAgreementKey.type</td>
              <td>
The type of key agreement key. The value is required and MUST be or map to
a URI.
              </td>
            </tr>
            <tr>
              <td>hmac.id</td>
              <td>
An identifier for the HMAC key. The value is required and MUST be or map to a URI.
              </td>
            </tr>
            <tr>
              <td>hmac.type</td>
              <td>
The type of HMAC key. The value is required and MUST be or map to
a URI.
              </td>
            </tr>
          </tbody>
        </table>

        <pre class="example highlight" title="Example data vault configuration">
{
  "sequence": 0,
  "controller": "did:example:123456789",
  "referenceId": "my-primary-data-vault",
  "keyAgreementKey": {
    "id": "https://example.com/kms/12345",
    "type": "X25519KeyAgreementKey2019"
  },
  "hmac": {
    "id": "https://example.com/kms/67891",
    "type": "Sha256HmacKey2019"
  }
}
        </pre>
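
        <p>
The constraints in the table above could be checked with a small validator
such as the Python sketch below. The function names <code>is_uri</code> and
<code>validate_config</code> are hypothetical, and the URI test is deliberately
loose (it only requires a scheme), which is an assumption rather than a rule
from this specification. The optional <code>invoker</code> and
<code>delegator</code> properties are not checked.
        </p>

        <pre class="example highlight" title="Sketch: validating a data vault configuration (illustrative)">
from typing import Any, Dict, List
from urllib.parse import urlparse

MAX_UINT64 = 2**64 - 1


def is_uri(value: Any) -> bool:
    """Loose check: a string that carries a scheme, e.g. did: or https:."""
    return isinstance(value, str) and bool(urlparse(value).scheme)


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    errors: List[str] = []
    sequence = config.get("sequence")
    if not (isinstance(sequence, int) and not isinstance(sequence, bool)
            and sequence >= 0 and MAX_UINT64 >= sequence):
        errors.append("sequence must be an unsigned 64-bit number")
    if not is_uri(config.get("controller")):
        errors.append("controller must be a URI")
    for key in ("keyAgreementKey", "hmac"):
        entry = config.get(key)
        if not isinstance(entry, dict) or not is_uri(entry.get("id")):
            errors.append(f"{key}.id must be a URI")
        elif not isinstance(entry.get("type"), str):
            errors.append(f"{key}.type must be present")
    if "referenceId" in config and not isinstance(config["referenceId"], str):
        errors.append("referenceId must be a string")
    return errors
        </pre>

        <p>
Applied to the example configuration above, this sketch reports no problems.
        </p>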

      </section>

      <section>
        <h3>
  <dfn>StructuredDocument</dfn>
        </h3>

        <p>
A structured document is used to store application data as well as metadata
about the application data. This information is typically encrypted and then
stored on the data vault.
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">Property</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>id</td>
              <td>
An identifier for the structured document. The value is required and MUST be a
Base58-encoded 128-bit random value.
              </td>
            </tr>
            <tr>
              <td>meta</td>
              <td>
Key-value metadata associated with the structured document.
              </td>
            </tr>
            <tr>
              <td>content</td>
              <td>
Key-value content for the structured document.
              </td>
            </tr>
          </tbody>
        </table>

        <pre class="example highlight" title="Example structured document">
{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "meta": {
    "created": "2019-06-18"
  },
  "content": {
    "message": "Hello World!"
  }
}
        </pre>
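
        <p>
An identifier that satisfies the "Base58-encoded 128-bit random value"
requirement in the table above could be generated as in the Python sketch
below. The Bitcoin-style alphabet is an assumption, since this section does not
name a Base58 variant, and the function names are hypothetical.
        </p>

        <pre class="example highlight" title="Sketch: generating a document identifier (illustrative)">
import secrets

# Bitcoin-style Base58 alphabet (an assumption; the text does not name one).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as Base58, preserving leading zero bytes."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return ALPHABET[0] * leading_zeros + encoded


def new_document_id() -> str:
    """Base58-encode 128 bits (16 bytes) from a cryptographic random source."""
    return base58_encode(secrets.token_bytes(16))
        </pre>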

        <section>
          <h4>
  <dfn>Streams</dfn>
          </h4>

          <p>
Streams can be used to store images, video, backup files, and any other
binary data of arbitrary length. This is performed by using the
<code>stream</code> property and additional metadata that further identifies
the type of stream being stored. The table below provides the metadata
to be stored in addition to the values specified in <a>StructuredDocument</a>.
          </p>

          <table class="simple">
            <thead>
              <th style="white-space: nowrap">Property</th>
              <th>Description</th>
            </thead>
            <tbody>
              <tr>
                <td>meta.chunks</td>
                <td>
Specifies the number of chunks in the stream.
                </td>
              </tr>
              <tr>
                <td>stream.id</td>
                <td>
The identifier for the stream. The stream identifier MUST be a URI that
references a stream on the same data vault. Once the stream has been written
to the data vault, the content identifier MUST be updated such that it is a
valid hashlink. To allow for streaming encryption, the value of the digest
for the stream is assumed to be unknowable until after the
stream has been written. The hashlink MUST exist as a content hash for the
stream that has been written to the data vault.
                </td>
              </tr>
            </tbody>
          </table>

          <pre class="example highlight" title="Example structured document containing a stream">
{
  "id": "urn:uuid:41289468-c42c-4b28-adb0-bf76044aec77",
  "meta": {
    "created": "2019-06-19",
    "contentType": "video/mpeg",
    "chunks": 16
  },
  "stream": {
    "id": "https://example.com/encrypted-data-vaults/zMbxmSDn2Xzz?hl=zb47JhaKJ3hJ5Jkw8oan35jK23289Hp"
  }
}
          </pre>
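
          <p>
In the example above, the hashlink is carried in the <code>hl</code> query
parameter of <code>stream.id</code>. A client could read it back as in the
Python sketch below; the function name <code>stream_hashlink</code> is
hypothetical, and verifying the digest against the stream contents is not
shown.
          </p>

          <pre class="example highlight" title="Sketch: reading the hashlink from a stream identifier (illustrative)">
from typing import Optional
from urllib.parse import parse_qs, urlparse


def stream_hashlink(stream_id: str) -> Optional[str]:
    """Return the value of the hl query parameter, or None if it is absent."""
    values = parse_qs(urlparse(stream_id).query).get("hl")
    return values[0] if values else None
          </pre>

          <p>
A return value of <code>None</code> corresponds to a stream identifier that has
not yet been updated with a hashlink.
          </p>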

        </section>

      </section>

      <section>
        <h3>
  <dfn>EncryptedDocument</dfn>
        </h3>

        <p>
An encrypted document is used to store a structured document in a way that
ensures that no entity can read the information without the consent of the
data controller.
        </p>

        <p class="issue">
While the table below is a simple version of an EncryptedDocument, there is
no other table that yet describes the indexed property and its subproperties,
should it be present on an EncryptedDocument.
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">Property</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>id</td>
              <td>
An identifier for the encrypted document. The value is required and MUST be a
Base58-encoded 128-bit random value.
              </td>
            </tr>
            <tr>
              <td>sequence</td>
              <td>
A unique counter for the data vault in order to ensure that
clients are properly synchronized to the data vault. The value is required and
MUST be an unsigned 64-bit number.
              </td>
            </tr>
            <tr>
              <td>jwe or cwe</td>
              <td>
A JSON Web Encryption or COSE Encrypted value that, if decoded, results in
the corresponding <a>StructuredDocument</a>.
              </td>
            </tr>
          </tbody>
        </table>

        <p class="issue">
Another example should be added that shows that a Diffie-Hellman key can
be identified in the JWE recipients field. This type of key can be used for
key agreement on a key wrapping key.
        </p>

        <p class="issue">
Another section should detail that data vault servers may omit certain fields
or certain values in certain fields, such as the recipients field, based on
whether or not the entity requesting an EncryptedDocument is authorized to
see the field or its values. This can be finely controlled through the use
of Authorization Capabilities.
        </p>

        <pre class="example highlight" title="Example encrypted document">
{
  "id":"z19x9iFMnfo4YLsShKAvnJk4L",
  "sequence":0,
  "indexed":[
    {
      "hmac":{
        "id":"did:ex:12345#key1",
        "type":"Sha256HmacKey2019"
      },
      "sequence":0,
      "attributes":[
      ]
    }
  ],
  "jwe":{
    "protected":"eyJlbmMiOiJDMjBQIn0",
    "recipients":[
      {
        "header":{
          "kid":"urn:123",
          "alg":"ECDH-ES+A256KW",
          "epk":{
            "kty":"OKP",
            "crv":"X25519",
            "x":"d7rIddZWblHmCc0mYZJw39SGteink_afiLraUb-qwgs"
          },
          "apu":"d7rIddZWblHmCc0mYZJw39SGteink_afiLraUb-qwgs",
          "apv":"dXJuOjEyMw"
        },
        "encrypted_key":"4PQsjDGs8IE3YqgcoGfwPTuVG25MKjojx4HSZqcjfkhr0qhwqkpUUw"
      }
    ],
    "iv":"FoJ5uPIR6HDPFCtD",
    "ciphertext":"tIupQ-9MeYLdkAc1Us0Mdlp1kZ5Dbavq0No-eJ91cF0R0hE",
    "tag":"TMRcEPc74knOIbXhLDJA_w"
  }
}
        </pre>

      </section>

    </section>

    <section class="normative">
      <h2>
Data vault HTTPS API
      </h2>

      <p>
This section introduces the HTTPS API for interacting with data vaults and
their contents.
      </p>

      <section>
        <h3>
Discovering Service Endpoints
        </h3>

        <p>
A website may provide service endpoint discovery by embedding JSON-LD in its
top-most HTML web page (e.g. at <code>https://example.com/</code>):
        </p>

        <pre class="example highlight" title="Example of HTML-based service description">
&lt;!DOCTYPE html>
&lt;html lang="en">
  &lt;head>
    &lt;meta charset="utf-8">
    &lt;title>Example Website&lt;/title>
    &lt;link rel="stylesheet" href="style.css">
    &lt;script src="script.js">&lt;/script>
    &lt;script type="application/ld+json">
{
  "@context": "https://w3id.org/encrypted-data-vaults/v1",
  "id": "https://example.com/",
  "name": "Example Website",
  "dataVaultCreationService": "https://example.com/data-vaults"
}
    &lt;/script>
  &lt;/head>
  &lt;body>
    &lt;!-- page content -->
  &lt;/body>
&lt;/html>
        </pre>
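
        <p>
A client could extract such an embedded description with Python's standard
<code>html.parser</code> module, as in the sketch below. The class name
<code>JsonLdExtractor</code> is hypothetical, and the sketch treats the
embedded JSON-LD as plain JSON without any JSON-LD processing.
        </p>

        <pre class="example highlight" title="Sketch: extracting an embedded service description (illustrative)">
import json
from html.parser import HTMLParser
from typing import Any, Dict, List


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of script elements typed application/ld+json."""

    def __init__(self) -> None:
        super().__init__()
        self.descriptions: List[Dict[str, Any]] = []
        self._in_json_ld = False
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.descriptions.append(json.loads("".join(self._buffer)))
            self._in_json_ld = False
        </pre>

        <p>
After calling <code>feed()</code> with the page source,
<code>descriptions</code> holds one dictionary per embedded JSON-LD block.
        </p>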

        <p>
Service descriptions may also be requested via content negotiation.
In the following example a JSON-compatible service description is provided
(e.g. <code>curl -H "Accept: application/json" https://example.com/</code>):
        </p>

        <pre class="example highlight" title="Example of a JSON-based service description">
{
  "@context": "https://w3id.org/encrypted-data-vaults/v1",
  "id": "https://example.com/",
  "name": "Example Website",
  "dataVaultCreationService": "https://example.com/data-vaults"
}
        </pre>

      </section>

      <section class="normative">
        <h2>
  Creating a data vault
      </h2>

        <p>
A data vault is created by performing an HTTP POST of a
<a href="#datavaultconfiguration">DataVaultConfiguration</a>
to the <code>dataVaultCreationService</code>. The following HTTP
status codes are defined for this service:
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">HTTP Status</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>201</td>
              <td>
Data vault creation was successful. The HTTP <code>Location</code> header will
contain the URL for the newly created data vault.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
Data vault creation failed.
              </td>
            </tr>
            <tr>
              <td>409</td>
              <td>
A duplicate data vault exists.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
An example exchange of a data vault creation is shown below:
        </p>

        <pre class="example highlight" title="data vault creation request">
POST /data-vaults HTTP/1.1
Host: example.com
Content-Type: application/json
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

{
  "sequence": 0,
  "controller": "did:example:123456789",
  "referenceId": "urn:uuid:abc5a436-21f9-4b4c-857d-1f5569b2600d",
  "keyAgreementKey": {
    "id": "https://example.com/kms/12345",
    "type": "X25519KeyAgreementKey2019"
  },
  "hmac": {
    "id": "https://example.com/kms/67891",
    "type": "Sha256HmacKey2019"
  }
}
        </pre>

        <p class="issue">
Explain the purpose of the controller property to root authority. Explain how
Authorization Capabilities can be created and invoked via HTTP signatures to
authorize reading and writing from/to data vaults.
        </p>

        <p>
If the creation of the data vault was successful, an HTTP 201 status code is
expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault creation response">
HTTP/1.1 201 Created
Location: https://example.com/encrypted-data-vaults/z4sRgBJJLnYy
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Fri, 14 Jun 2019 18:35:33 GMT
Connection: keep-alive
Transfer-Encoding: chunked
        </pre>
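
        <p>
The creation request could be prepared with Python's standard library as in
the sketch below. The function name <code>build_creation_request</code> is
hypothetical, the request is only constructed and not sent, and the
authorization that a real deployment would require is not shown.
        </p>

        <pre class="example highlight" title="Sketch: preparing a data vault creation request (illustrative)">
import json
import urllib.request
from typing import Any, Dict


def build_creation_request(service_url: str,
                           config: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) the POST that creates a data vault."""
    return urllib.request.Request(
        service_url,
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
        </pre>

        <p>
Passing the returned object to <code>urllib.request.urlopen</code> would send
the request. On a 201 response, the URL of the new data vault would be read
from the <code>Location</code> header.
        </p>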

      </section>

      <section class="normative">
        <h2>
  Creating a Document
      </h2>

        <p>
A structured document is stored in a data vault by encoding a
<a>StructuredDocument</a> as an <a>EncryptedDocument</a> and then performing
an HTTP POST to a data vault endpoint created via
<a href="#creating-a-data-vault"></a>.
The following HTTP status codes are defined for this service:
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">HTTP Status</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>201</td>
              <td>
Structured document creation was successful. The HTTP <code>Location</code>
header will contain the URL for the newly created document.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
Structured document creation failed.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
In order to convert a <a>StructuredDocument</a> to an
<a>EncryptedDocument</a> an implementer MUST encode the
<a>StructuredDocument</a> as a JWE or a COSE Encrypted object. Once the
document is encrypted, it can be sent to the document creation service.
        </p>

        <p>
A protocol example of a document creation is shown below:
        </p>

        <pre class="example highlight" title="data vault document creation request">
POST /encrypted-data-vaults/z4sRgBJJLnYy/docs HTTP/1.1
Host: example.com
Content-Type: application/json
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "sequence": 0,
  "jwe": {
    "protected": "eyJlbmMiOiJDMjBQIn0",
    "recipients": [{
      "header": {
        "alg": "A256KW",
        "kid": "https://example.com/kms/zSDn2MzzbxmX"
      },
      "encrypted_key": "OR1vdCNvf_B68mfUxFQVT-vyXVrBembuiM40mAAjDC1-Qu5iArDbug"
    }],
    "iv": "i8Nins2vTI3PlrYW",
    "ciphertext": "Cb-963UCXblINT8F6MDHzMJN9EAhK3I",
    "tag": "pfZO0JulJcrc3trOZy8rjA"
  }
}
        </pre>

        <p>
If the creation of the structured document was successful, an HTTP 201 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault document creation response">
HTTP/1.1 201 Created
Location: https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/docs/zMbxmSDn2Xzz
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Fri, 14 Jun 2019 18:37:12 GMT
Connection: keep-alive
Transfer-Encoding: chunked
        </pre>

      </section>

      <section class="normative">
        <h2>
  Reading a Document
      </h2>

        <p>
Reading a document from a data vault is performed by retrieving the
<a>EncryptedDocument</a> and then decrypting it to a
<a>StructuredDocument</a>. The following HTTP status codes are defined for
this service:
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">HTTP Status</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>200</td>
              <td>
EncryptedDocument retrieval was successful.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
EncryptedDocument retrieval failed.
              </td>
            </tr>
            <tr>
              <td>404</td>
              <td>
EncryptedDocument with given id was not found.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
In order to convert an <a>EncryptedDocument</a> to a
<a>StructuredDocument</a> an implementer MUST decode the
<a>EncryptedDocument</a> from a JWE or a COSE Encrypted object. Once the
document is decrypted, it can be processed by the web application.
        </p>

        <p>
A protocol example of a document retrieval is shown below:
        </p>

        <p class="issue">
Explain that the URL path structure is fixed for all data vaults to enable
portability and the use of stable URLs (such as through DID URLs) to reference
certain documents while allowing users to change their data vault service
providers. Explain how this enables portability.
        </p>

        <pre class="example highlight" title="data vault encrypted document retrieval">
GET /encrypted-data-vaults/z4sRgBJJLnYy/docs/zMbxmSDn2Xzz HTTP/1.1
Host: example.com
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate
        </pre>

        <p>
If the retrieval of the encrypted document was successful, an HTTP 200 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault document read response">
HTTP/1.1 200 OK
Date: Fri, 14 Jun 2019 18:37:12 GMT
Connection: keep-alive

{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "sequence": 0,
  "jwe": {
    "protected": "eyJlbmMiOiJDMjBQIn0",
    "recipients": [{
      "header": {
        "alg": "A256KW",
        "kid": "https://example.com/kms/zSDn2MzzbxmX"
      },
      "encrypted_key": "OR1vdCNvf_B68mfUxFQVT-vyXVrBembuiM40mAAjDC1-Qu5iArDbug"
    }],
    "iv": "i8Nins2vTI3PlrYW",
    "ciphertext": "Cb-963UCXblINT8F6MDHzMJN9EAhK3I",
    "tag": "pfZO0JulJcrc3trOZy8rjA"
  }
}
        </pre>
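
        <p>
Before decrypting, a client can inspect the JWE protected header of the
returned document, for example to learn which content encryption algorithm was
used. The Python sketch below decodes that header; the function name
<code>decode_protected_header</code> is hypothetical, and decryption itself is
not shown.
        </p>

        <pre class="example highlight" title="Sketch: decoding the JWE protected header (illustrative)">
import base64
import json
from typing import Any, Dict


def decode_protected_header(encrypted_document: Dict[str, Any]) -> Dict[str, Any]:
    """Decode the base64url-encoded JWE protected header into a dict."""
    protected = encrypted_document["jwe"]["protected"]
    # base64url values in a JWE omit padding; restore it before decoding.
    padded = protected + "=" * (-len(protected) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
        </pre>

        <p>
For the response above, whose <code>protected</code> value is
<code>eyJlbmMiOiJDMjBQIn0</code>, this yields <code>{"enc": "C20P"}</code>.
        </p>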

      </section>

      <section class="normative">
        <h2>
  Updating a Document
      </h2>

        <p>
A structured document is updated in a data vault by encoding the updated
<a>StructuredDocument</a> as an <a>EncryptedDocument</a> and then performing
an HTTP POST to a data vault endpoint created via
<a href="#creating-a-data-vault"></a>.
The following HTTP status codes are defined for this service:
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">HTTP Status</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>200</td>
              <td>
Structured document update was successful.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
Structured document update failed.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
In order to convert a <a>StructuredDocument</a> to an
<a>EncryptedDocument</a> an implementer MUST encode the
<a>StructuredDocument</a> as a JWE or a COSE Encrypted object. Once the
document is encrypted, it can be sent to the data vault in place of the previous version.
        </p>

        <p>
A protocol example of a document update is shown below:
        </p>

        <pre class="example highlight" title="data vault document update request">
POST /encrypted-data-vaults/z4sRgBJJLnYy/docs/zMbxmSDn2Xzz HTTP/1.1
Host: example.com
Content-Type: application/json
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "sequence": 1,
  "jwe": {
    "protected": "eyJlbmMiOiJDMjBQIn0",
    "recipients": [{
      "header": {
        "alg": "A256KW",
        "kid": "https://example.com/kms/zSDn2MzzbxmX"
      },
      "encrypted_key": "OR1vdCNvf_B68mfUxFQVT-vyXVrBembuiM40mAAjDC1-Qu5iArDbug"
    }],
    "iv": "i8Nins2vTI3PlrYW",
    "ciphertext": "Cb-963UCXblINT8F6MDHzMJN9EAhK3I",
    "tag": "pfZO0JulJcrc3trOZy8rjA"
  }
}
        </pre>

        <p>
If the update to the encrypted document was successful, an HTTP 200 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault document update response">
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Fri, 14 Jun 2019 18:39:52 GMT
Connection: keep-alive
        </pre>
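
        <p>
In the update request above, <code>sequence</code> rises from 0 to 1. One way
an implementation might use that counter to reject conflicting writes is the
optimistic check sketched below in Python. The rule that an update must carry
exactly the stored sequence plus one is an assumption made for illustration,
not a requirement stated in this section, and the names
<code>apply_update</code> and <code>SequenceConflict</code> are hypothetical.
        </p>

        <pre class="example highlight" title="Sketch: sequence check on update (illustrative)">
from typing import Any, Dict


class SequenceConflict(Exception):
    """Raised when an update does not directly follow the stored version."""


def apply_update(stored: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Return the document to store, or raise if the update is out of order."""
    if incoming["id"] != stored["id"]:
        raise ValueError("document id mismatch")
    expected = stored["sequence"] + 1
    if incoming["sequence"] != expected:
        raise SequenceConflict(
            f"expected sequence {expected}, got {incoming['sequence']}")
    return incoming
        </pre>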

      </section>

      <section class="normative">
        <h2>
  Deleting a Document
      </h2>

        <p>
A structured document is deleted by performing an HTTP DELETE to a data vault
endpoint created via <a href="#creating-a-data-vault"></a>.
The following HTTP status codes are defined for this service:
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">HTTP Status</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>200</td>
              <td>
Structured document was deleted successfully.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
Structured document deletion failed.
              </td>
            </tr>
            <tr>
              <td>404</td>
              <td>
Structured document was not found.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
A protocol example of a document deletion is shown below:
        </p>

        <pre class="example highlight" title="data vault document deletion request">
DELETE /encrypted-data-vaults/z4sRgBJJLnYy/docs/zMbxmSDn2Xzz HTTP/1.1
Host: example.com
        </pre>

        <p>
If the deletion of the encrypted document was successful, an HTTP 200 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault document deletion response">
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Fri, 14 Jun 2019 18:40:18 GMT
Connection: keep-alive
        </pre>
      </section>

      <section class="normative">
        <h2>
  Creating a Stream
      </h2>

        <p class="issue">
This section is out of date; do not implement.
        </p>

        <p class="issue">
Another design is being considered that would transform streams into a
single index document and a collection of documents, each of which contains
a chunk of the stream. This would be done to help prevent misuse of a
decryption stream prior to its authentication. In order for this approach to
be implemented in a Web browser, it also requires certain File or Blob APIs.
Further investigation is needed to ensure that support of these APIs would be
sufficient for this design approach, as it would be preferred to prevent data
misuse and to make better use of native implementations of authenticated
encryption modes.
        </p>

        <p>
A stream is stored in a data vault by writing a document containing metadata
about the stream, encrypting the stream, writing it to a data vault, and then
updating the document containing metadata about the stream. The following
HTTP status codes are defined for this service:
        </p>

        <table class="simple">
          <thead>
            <th style="white-space: nowrap">HTTP Status</th>
            <th>Description</th>
          </thead>
          <tbody>
            <tr>
              <td>201</td>
              <td>
Stream creation was successful. The HTTP <code>Location</code>
header will contain the URL for the newly created stream.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
Stream creation failed.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
Implementations first encode the metadata associated with the stream into a
<a>StructuredDocument</a>:
        </p>

        <pre class="example highlight" title="StructuredDocument associated with a stream">
{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "meta": {
    "created": "2019-06-18",
    "contentType": "video/mpeg",
    "contentLength": 56735817
  },
  "content": {
    "id": "https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/streams/zMbxmSDn2Xzz"
  }
}
        </pre>

        <p class="note">
In this case, the value of <code>content.id</code> is a reference to the
stream located at
<code>https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/streams/zMbxmSDn2Xzz</code>,
which is the location that the stream MUST be written to. This content
identifier MUST be updated to include a hashlink once the stream has been
written and its digest is known.
        </p>

        <p>
The <a>StructuredDocument</a> above is then transformed to an
<a>EncryptedDocument</a> and the procedure in
<a href="#creating-a-document"></a> is executed:
        </p>

        <pre class="example highlight" title="Encrypted document creation request for stream metadata">
POST /encrypted-data-vaults/z4sRgBJJLnYy/docs HTTP/1.1
Host: example.com
Content-Type: application/json
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "sequence": 0,
  "jwe": {
    "protected": "eyJlbmMiOiJDMjBQIn0",
    "recipients": [{
      "header": {
        "alg": "A256KW",
        "kid": "https://example.com/kms/zSDn2MzzbxmX"
      },
      "encrypted_key": "OR1vdCNvf_B68mfUxFQVT-vyXVrBembuiM40mAAjDC1-Qu5iArDbug"
    }],
    "iv": "i8Nins2vTI3PlrYW",
    "ciphertext": "Cb-963UCXblINT8F6MDHzMJN9EAhK3I",
    "tag": "pfZO0JulJcrc3trOZy8rjA"
  }
}
        </pre>

        <p>
If the creation of the structured document was successful, an HTTP 201 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault document creation response">
HTTP/1.1 201 Created
Location: https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/docs/zp4H8ekWn
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Fri, 14 Jun 2019 18:37:12 GMT
Connection: keep-alive
Transfer-Encoding: chunked
        </pre>

        <p>
Next, in order to convert a stream to an <a>EncryptedStream</a> an implementer
MUST encrypt the stream. Once the stream is encrypted (or as it is encrypted),
it can be sent to the stream creation service.
        </p>
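
        <p>
As a non-normative illustration of sending a stream as it is encrypted, the
Python sketch below reads a stream in fixed-size chunks, passes each chunk
through an encryption callable, and updates a running digest of the encrypted
output, so that the digest is available as soon as the last chunk has been
produced. The names <code>encrypted_chunks</code> and
<code>encrypt_chunk</code>, the chunk size, and the choice to hash the
encrypted bytes are assumptions made for this example and are not defined by
this specification. The stand-in transform in the usage lines is not
encryption.
        </p>

        <pre class="example highlight" title="Sketch: encrypting a stream chunk by chunk (non-normative)">
import hashlib
import io


def encrypted_chunks(stream, encrypt_chunk, digest, chunk_size=1048576):
    """Yield encrypted chunks of a stream and feed each one into a digest.

    encrypt_chunk is a hypothetical callable (bytes to bytes) supplied by the
    caller; a real client would back it with an authenticated cipher.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        sealed = encrypt_chunk(chunk)
        digest.update(sealed)
        yield sealed


# Usage with a stand-in transform. Reversing bytes is NOT encryption; it only
# makes the data flow visible.
digest = hashlib.sha256()
source = io.BytesIO(b"abcdef")
body = b"".join(encrypted_chunks(source, lambda c: c[::-1], digest, chunk_size=4))
        </pre>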

        <p>
A protocol example of a stream creation is shown below:
        </p>

        <pre class="example highlight" title="Encrypted stream creation request">
POST /encrypted-data-vaults/z4sRgBJJLnYy/streams HTTP/1.1
Host: example.com
Content-Type: application/octet-stream
Transfer-Encoding: chunked
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

TBD
        </pre>

        <p>
If the creation of the stream was successful, an HTTP 201 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful stream creation response">
HTTP/1.1 201 Created
Location: https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/streams/zMbxmSDn2Xzz
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Fri, 14 Jun 2019 18:37:12 GMT
Connection: keep-alive
Transfer-Encoding: chunked
        </pre>

        <p>
Once a stream is created, the metadata related to the stream can be updated
in the data vault using the protocol defined in
<a href="#updating-a-document"></a>. An example of updating a link to a video
file is shown below.
        </p>

        <p>
Implementations update the metadata associated with the stream in its
<a>StructuredDocument</a>:
        </p>

        <pre class="example highlight" title="Updating StructuredDocument associated with a stream">
{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "sequence": 1,
  "meta": {
    "created": "2019-06-18",
    "contentType": "video/mpeg",
    "contentLength": 56735817
  },
  "content": {
    "id": "https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/streams/zMbxmSDn2Xzz?hl=zb47JhaKJ3hJ5Jkw8oan35jK23289Hp",
    "jwe": {
      "protected": "eyJlbmMiOiJDMjBQIn0",
      "recipients": [{
        "header": {
          "alg": "A256KW",
          "kid": "https://example.com/kms/zSDn2MzzbxmX"
        },
        "encrypted_key": "OR1vdCNvf_B68mfUxFQVT-vyXVrBembuiM40mAAjDC1-Qu5iArDbug"
      }],
      "iv": "i8Nins2vTI3PlrYW",
      "tag": "pfZO0JulJcrc3trOZy8rjA"
    }
  }
}
        </pre>

        <p class="note">
The value of <code>content.id</code> MUST be updated to include a hashlink
now that the stream has been written and its digest is known.
        </p>
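
        <p>
The Python sketch below shows one way the updated <code>content.id</code>
could be derived. It assumes that the hashlink is a SHA-256 multihash encoded
as multibase base58btc and carried in an <code>hl</code> query parameter, as
the example identifier above suggests, and that the digest is taken over the
encrypted stream bytes. Both points are assumptions for illustration, the
function names are hypothetical, and the identifier in the example above is
illustrative rather than the output of this code.
        </p>

        <pre class="example highlight" title="Sketch: appending a hashlink to a stream URL (non-normative)">
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(data):
    """Encode bytes with the Bitcoin base58 alphabet."""
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by the character "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def hashlink_value(encrypted_stream):
    """Return a multibase (base58btc) encoded SHA-256 multihash."""
    digest = hashlib.sha256(encrypted_stream).digest()
    multihash = bytes([0x12, 0x20]) + digest  # sha2-256 code, 32-byte length
    return "z" + base58btc(multihash)


def with_hashlink(stream_url, encrypted_stream):
    """Append the hashlink to the stream URL as an hl query parameter."""
    return stream_url + "?hl=" + hashlink_value(encrypted_stream)


url = with_hashlink(
    "https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/streams/zMbxmSDn2Xzz",
    b"encrypted stream bytes",
)
        </pre>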

        <p>
The <a>StructuredDocument</a> above is then transformed to an
<a>EncryptedDocument</a> and the procedure in
<a href="#updating-a-document"></a> is executed:
        </p>

        <pre class="example highlight" title="Encrypted document update request for stream metadata">
POST /encrypted-data-vaults/z4sRgBJJLnYy/docs HTTP/1.1
Host: example.com
Content-Type: application/json
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

{
  "id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044",
  "sequence": 1,
  "jwe": {
    "protected": "eyJlbmMiOiJDMjBQIn0",
    "recipients": [{
      "header": {
        "alg": "A256KW",
        "kid": "https://example.com/kms/zSDn2MzzbxmX"
      },
      "encrypted_key": "OR1vdCNvf_B68mfUxFQVT-vyXVrBembuiM40mAAjDC1-Qu5iArDbug"
    }],
    "iv": "i8Nins2vTI3PlrYW",
    "ciphertext": "Cb-963UCXblINT8F6MDHzMJN9EAhK3I",
    "tag": "pfZO0JulJcrc3trOZy8rjA"
  }
}
        </pre>

        <p>
If the update of the structured document was successful, an HTTP 200 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault document update response">
HTTP/1.1 200 OK
Location: https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/docs/zp4H8ekWn
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Fri, 14 Jun 2019 18:37:12 GMT
Connection: keep-alive
Transfer-Encoding: chunked
        </pre>

      </section>

      <section class="normative">
        <h2>
  Reading a Stream
      </h2>

        <p class="issue">
This section is out of date, do not implement.
        </p>

        <p>
Reading a stream from a data vault is performed by retrieving the associated
metadata, which is stored as an <a>EncryptedDocument</a>, decoding the
hashlink information, retrieving the <a>EncryptedStream</a>, and then
decrypting it. The following HTTP status codes are defined for this service:
        </p>

        <table class="simple">
          <thead>
            <tr>
              <th style="white-space: nowrap">HTTP Status</th>
              <th>Description</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>200</td>
              <td>
Encrypted stream retrieval was successful.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
Encrypted stream retrieval failed.
              </td>
            </tr>
            <tr>
              <td>404</td>
              <td>
Encrypted stream with given id was not found.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
In order to convert an <a>EncryptedStream</a> to a stream,
an implementer MUST decrypt the <a>EncryptedStream</a> using the information
provided in the associated <a>EncryptedDocument</a>. Once the stream is
decrypted, it can be processed by the web application.
        </p>

        <p class="note">
Implementers can perform random seeking in the stream by using the
<code>Content-Range</code> HTTP Header.
        </p>

        <p>
A protocol example of a stream retrieval is shown below:
        </p>

        <pre class="example highlight" title="data vault encrypted stream retrieval">
GET https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/streams/zn2XmSDzMbxz HTTP/1.1
Host: example.com
Content-Range: 0-1048576
Accept: application/octet-stream
Accept-Encoding: gzip, deflate
        </pre>

        <p>
If the retrieval of the encrypted stream was successful, an HTTP 200 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault stream read response">
HTTP/1.1 200 OK
Date: Fri, 14 Jun 2019 18:37:12 GMT
Content-Range: 0-1048576
Content-Length: 1048576
Connection: keep-alive

...
        </pre>

      </section>

      <section class="normative">
        <h2>
  Deleting a Stream
      </h2>

        <p class="issue">
This section is out of date, do not implement.
        </p>

        <p>
A stream is deleted by performing an HTTP DELETE to a data vault
stream endpoint created via <a href="#creating-a-stream"></a> and
the corresponding metadata document created via
<a href="#creating-a-document"></a>. The following HTTP status codes are
defined for this service:
        </p>

        <table class="simple">
          <thead>
            <tr>
              <th style="white-space: nowrap">HTTP Status</th>
              <th>Description</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>200</td>
              <td>
Stream was deleted successfully.
              </td>
            </tr>
            <tr>
              <td>400</td>
              <td>
Stream deletion failed.
              </td>
            </tr>
            <tr>
              <td>404</td>
              <td>
Stream was not found.
              </td>
            </tr>
          </tbody>
        </table>

        <p>
A protocol example of a stream deletion is shown below:
        </p>

        <pre class="example highlight" title="data vault stream deletion request">
DELETE /encrypted-data-vaults/z4sRgBJJLnYy/streams/zMbxmSDn2Xzz HTTP/1.1
Host: example.com
        </pre>

        <p>
If the deletion of the encrypted stream was successful, an HTTP 200 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault stream deletion response">
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Fri, 14 Jun 2019 18:40:18 GMT
Connection: keep-alive
        </pre>

        <p>
Once the stream is deleted, implementations MUST also delete the corresponding
metadata document:
        </p>

        <pre class="example highlight" title="data vault stream metadata document deletion request">
DELETE /encrypted-data-vaults/z4sRgBJJLnYy/docs/zp4H8ekWn HTTP/1.1
Host: example.com
        </pre>

        <p>
If the deletion of the metadata document was successful, an HTTP 200 status
code is expected in return:
        </p>

        <pre class="example highlight" title="Successful data vault stream metadata document deletion response">
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Fri, 14 Jun 2019 18:40:18 GMT
Connection: keep-alive
        </pre>
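
        <p>
The two-step deletion can be sketched in Python as follows. The function name
and parameters are hypothetical, and the document identifier used in the
usage lines is the one returned in the <code>Location</code> header of the
document creation example. The requests are only constructed here, not sent.
        </p>

        <pre class="example highlight" title="Sketch: building the two deletion requests (non-normative)">
import urllib.request


def deletion_requests(vault_url, stream_id, document_id):
    """Return the two DELETE requests in the order they are to be sent."""
    stream_url = vault_url + "/streams/" + stream_id
    document_url = vault_url + "/docs/" + document_id
    return [
        urllib.request.Request(stream_url, method="DELETE"),
        urllib.request.Request(document_url, method="DELETE"),
    ]


requests = deletion_requests(
    "https://example.com/encrypted-data-vaults/z4sRgBJJLnYy",
    "zMbxmSDn2Xzz",
    "zp4H8ekWn",
)
# Each request would be sent with urllib.request.urlopen(request); the
# metadata document is deleted only after the stream deletion succeeds.
        </pre>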

      </section>

      <section class="normative">
        <h2>
  Creating Encrypted Indexes
      </h2>

        <p>
It is often useful to search a data vault for structured documents that contain
specific metadata. Efficient searching requires the use of search indexes
and local access to data. This poses an interesting challenge as the
search has to be performed on the <a>storage provider</a> without leaking
information that could violate the privacy of the entities that are storing
information in the data vault. This section details how encrypted indexes
can be created and used to perform efficient searching while protecting
the privacy of entities that are storing information in the data vault.
        </p>

        <p>
When creating an <a>EncryptedDocument</a>, blinded index properties MAY
be used to perform efficient searches. An example of the use of these
properties is shown below:
        </p>

        <pre class="example highlight" title="Example encrypted document with encrypted indexes">
{
  "id": "urn:uuid:698f3fb6-592f-4d22-9e04-462cc4606a23",
  "sequence": 0,
  "indexed": [{
    "sequence": 0,
    "hmac": {
      "id": "https://example.com/kms/z7BgF536GaR",
      "type": "Sha256HmacKey2019"
    },
    "attributes": [{
      "name": "CUQaxPtSLtd8L3WBAIkJ4DiVJeqoF6bdnhR7lSaPloZ",
      "value": "RV58Va4904K-18_L5g_vfARXRWEB00knFSGPpukUBro",
      "unique": true
    }, {
      "name": "DUQaxPtSLtd8L3WBAIkJ4DiVJeqoF6bdnhR7lSaPloZ",
      "value": "QV58Va4904K-18_L5g_vfARXRWEB00knFSGPpukUBro"
    }]
  }],
  "jwe": {
    "protected": "eyJlbmMiOiJDMjBQIn0",
    "recipients": [
      {
        "header": {
          "alg": "A256KW",
          "kid": "https://example.com/kms/z7BgF536GaR"
        },
        "encrypted_key":
          "OR1vdCNvf_B68mfUxFQVT-vyXVrBembuiM40mAAjDC1-Qu5iArDbug"
      }
    ],
    "iv": "i8Nins2vTI3PlrYW",
    "ciphertext": "Cb-963UCXblINT8F6MDHzMJN9EAhK3I",
    "tag": "pfZO0JulJcrc3trOZy8rjA"
  }
}
        </pre>

        <p class="note">
The example above demonstrates the use of unique index values as well as
non-unique indexes.
        </p>

        <p>
The example above enables the <a>storage provider</a> to build
efficient indexes on encrypted properties while enabling
<a>storage agents</a> to search the information without leaking information
that would create privacy concerns.
        </p>
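
        <p>
As a rough, non-normative illustration, blinded attribute names and values of
the shape shown in the example could be produced with an HMAC as in the
Python sketch below. The use of HMAC-SHA-256 with unpadded base64url output
is inferred from the <code>Sha256HmacKey2019</code> key type and the length
of the example values. How names and values are serialized before the HMAC is
applied is not specified here, so the UTF-8 encoding, the function names, and
the attribute name <code>content.email</code> are assumptions.
        </p>

        <pre class="example highlight" title="Sketch: blinding index attributes with an HMAC key (non-normative)">
import base64
import hashlib
import hmac


def blind(hmac_key, text):
    """Return the unpadded base64url HMAC-SHA-256 of text under hmac_key."""
    mac = hmac.new(hmac_key, text.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(mac).rstrip(b"=").decode("ascii")


def blinded_attribute(hmac_key, name, value, unique=False):
    """Build one entry for the attributes array of an indexed item."""
    entry = {"name": blind(hmac_key, name), "value": blind(hmac_key, value)}
    if unique:
        entry["unique"] = True
    return entry


key = bytes(32)  # all-zero demonstration key; never use a fixed key in practice
attribute = blinded_attribute(key, "content.email", "alice@example.com", unique=True)
        </pre>

        <p>
Because the same key and input always yield the same output, the
<a>storage provider</a> can compare blinded values for equality without
learning the underlying name or value.
        </p>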

        <p class="issue">
Provide instructions and examples for how indexes are blinded using an
HMAC key.
        </p>

        <p class="issue">
Explain that multiple entities can maintain their own independent indexes
(using their own HMAC key) provided they have been granted this capability.
Explain that indexes can be sparse/partial. Explain that indexes have their
own sequence number and that it will match the document's sequence number
once it is updated.
        </p>

        <p class="issue">
Add a section showing the update index endpoint and how it works.
        </p>

      </section>

      <section class="normative">
        <h2>
  Searching Encrypted Documents
      </h2>

        <p>
The contents of a data vault can be searched using encrypted indexes created
using the processes described in <a href="#creating-encrypted-indexes"></a>.
There are two primary ways of searching for encrypted documents. The first
is to search for a specific value associated with a specific index. The
second is to search to see if a specific index exists on a document.
        </p>

        <p>
The example below demonstrates how to search for a specific value associated
with a specific index.
        </p>

        <pre class="example highlight" title="data vault query for a specifically indexed value">
POST https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/queries HTTP/1.1
Host: example.com
Content-Type: application/json
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

{
  "index": "DUQaxPtSLtd8L3WBAIkJ4DiVJeqoF6bdnhR7lSaPloZ",
  "equals": [
    {"QV58Va4904K-18_L5g_vfARXRWEB00knFSGPpukUBro":
      "dh327d234h8437hc34f43f43ZXGHDXG"}
  ]
}
        </pre>

        <p>
A successful query will result in a standard HTTP 200 response with a list of
identifiers for all encrypted documents that match the query:
        </p>

        <pre class="example highlight" title="Successful query response from a data vault">
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Fri, 14 Jun 2019 18:45:18 GMT
Connection: keep-alive

["https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/docs/zMbxmSDn2Xzz"]
        </pre>

        <p>
The contents of a data vault can also be searched to see if a certain attribute
name is indexed by using the <code>has</code> keyword.
        </p>

        <pre class="example highlight" title="data vault query for a particular attribute name">
POST https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/queries HTTP/1.1
Host: example.com
Content-Type: application/json
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate

{
  "has": ["CUQaxPtSLtd8L3WBAIkJ4DiVJeqoF6bdnhR7lSaPloZ"]
}
        </pre>

        <p>
If the query above is successful, an HTTP 200 code is expected with a list
of identifiers of the <a>EncryptedDocument</a>s that index the attribute name.
        </p>

        <pre class="example highlight" title="Successful query response for a specific attribute name">
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Fri, 14 Jun 2019 18:45:18 GMT
Connection: keep-alive

["https://example.com/encrypted-data-vaults/z4sRgBJJLnYy/docs/zMbxmSDn2Xzz"]
        </pre>
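
        <p>
On the storage provider side, a <code>has</code> query can be answered by
comparing blinded attribute names only. The Python sketch below is a
non-normative illustration over an in-memory list of documents. The function
name is hypothetical, treating several names as "all must be present" is an
assumption, and the mapping from document identifiers to the URLs returned in
the response above is left out.
        </p>

        <pre class="example highlight" title="Sketch: matching a has query against stored indexes (non-normative)">
def documents_with_attribute(documents, blinded_names):
    """Return ids of documents whose indexes contain every given blinded name."""
    wanted = set(blinded_names)
    matches = []
    for document in documents:
        present = {
            attribute["name"]
            for index in document.get("indexed", [])
            for attribute in index.get("attributes", [])
        }
        if wanted.issubset(present):
            matches.append(document["id"])
    return matches


stored = [
    {
        "id": "urn:uuid:698f3fb6-592f-4d22-9e04-462cc4606a23",
        "indexed": [{
            "attributes": [
                {"name": "CUQaxPtSLtd8L3WBAIkJ4DiVJeqoF6bdnhR7lSaPloZ",
                 "value": "RV58Va4904K-18_L5g_vfARXRWEB00knFSGPpukUBro",
                 "unique": True},
            ],
        }],
    },
    {"id": "urn:uuid:94684128-c42c-4b28-adb0-aec77bf76044"},  # no indexes
]
found = documents_with_attribute(stored, ["CUQaxPtSLtd8L3WBAIkJ4DiVJeqoF6bdnhR7lSaPloZ"])
        </pre>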

      </section>

    </section>

  <section class="informative">
    <h2>Extension points
          </h2>
    <p>
Encrypted Data Vaults support a number of extension points:
    </p>
    <ul>
      <li>
Protocol/API - One or more protocols such as library APIs, HTTPS, gRPC, or Bluetooth can be used to access the system.
      </li>
      <li>
Encryption Strategies - One or more encryption strategies such as AES-GCM or XSalsa20Poly1305 can be used to encrypt data.
      </li>
      <li>
Authorization Strategies - One or more authorization strategies such as
OAuth2, HTTP Signatures, or Authorization Capabilities can be used to protect
access to encrypted data.
      </li>
      <li>
Versioning Strategies and Replication Strategies - One or more versioning and replication strategies such as counters, cryptographic hashes, or CRDTs
(Conflict-free Replicated Data Types) can be used to synchronize data.
      </li>
      <li>
Notification mechanisms - One or more notification mechanisms can be used to
signal to clients that data has changed in the vault.
      </li>
    </ul>
  </section>

  <section class="informative">
    <h2>Privacy Considerations
          </h2>

    <p>
This section details the general privacy considerations and specific privacy
implications of deploying this specification into production environments.
    </p>

    <p class="issue">
Write privacy considerations.
    </p>

  </section>

  <section class="informative">
    <h2>
Security Considerations
    </h2>

    <p>
There are a number of security considerations that implementers should be
aware of when processing data described by this specification. Ignoring or
not understanding the implications of this section can result in
security vulnerabilities.
    </p>

    <p>
While this section attempts to highlight a broad set of security
considerations, it is not a complete list. Implementers are urged to seek the
advice of security and cryptography professionals when implementing mission
critical systems using the technology outlined in this specification.
    </p>

    <section>
      <h3>
Malicious or accidental modification of data
      </h3>
      <p>
While a service provider is not able to read data in an
Encrypted Data Vault, it is possible for a service provider to delete, add,
or modify encrypted data. The deletion, addition, or modification of encrypted
data can be detected by keeping a global manifest of data in the data vault.
      </p>
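
      <p>
A manifest of this kind can be sketched in Python as a mapping from document
identifier to a digest of the stored bytes, which is later compared with what
the service provider returns. The function names and the use of SHA-256 are
assumptions made for this example, as is the premise that the manifest is
held somewhere the service provider cannot alter it.
      </p>

      <pre class="example highlight" title="Sketch: comparing vault contents with a manifest (non-normative)">
import hashlib


def build_manifest(documents):
    """Map each document id to the SHA-256 hex digest of its stored bytes."""
    return {doc_id: hashlib.sha256(data).hexdigest() for doc_id, data in documents.items()}


def compare_with_manifest(manifest, documents):
    """Report ids that were added, deleted, or modified since the manifest was built."""
    current = build_manifest(documents)
    return {
        "added": sorted(set(current) - set(manifest)),
        "deleted": sorted(set(manifest) - set(current)),
        "modified": sorted(
            doc_id for doc_id in set(current).intersection(manifest)
            if current[doc_id] != manifest[doc_id]
        ),
    }


stored = {"doc-1": b"ciphertext-1", "doc-2": b"ciphertext-2"}
manifest = build_manifest(stored)  # kept by the data controller, not the provider

# Later, the provider returns different contents for the same vault.
returned = {"doc-1": b"tampered", "doc-3": b"ciphertext-3"}
report = compare_with_manifest(manifest, returned)
      </pre>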
    </section>

    <section>
      <h3>
Compromised vault
      </h3>
      <p>
An Encrypted Data Vault can be compromised if the data controller (the entity
who holds the decryption keys and appropriate authorization credentials)
accidentally grants access to an attacker. For example, a victim might
accidentally authorize an attacker to the entire vault or mishandle their
encryption key. Once an attacker has access to the system, they may modify or
remove data, or change the vault's configuration.
      </p>
    </section>

    <section>
      <h3>
Data access timing attacks
      </h3>
      <p>
While it is normally difficult for a server to determine the identity of an
entity as well as the purpose for which that entity is accessing the
Encrypted Data Vault, there is always metadata related to access patterns,
rough file sizes, and other information that is leaked when an entity accesses
the vault. The system has been designed to not leak information that creates
privacy concerns, an approach that protects against many, but
not all, surveillance strategies that may be used by servers that are not
acting in the best interest of the privacy of the vault's users.
      </p>
    </section>

    <section>
      <h3>Encrypted data on public networks
              </h3>
      <p>
When protecting one's data, it is safe to assume that all encryption schemes
will eventually be broken. For this reason, it is inadvisable for servers to
store encrypted data on any sort of public storage network.
      </p>
    </section>

    <section>
      <h3>Unencrypted data on server
              </h3>
      <p>
While this system goes to great lengths to encrypt content and metadata,
there are a handful of fields that cannot be encrypted in order to ensure
the server can provide the features outlined in this specification.
For example, a version number associated with data provides insight into
how often the data is modified. The identifiers associated with
encrypted content enable a server to gain knowledge by possibly correlating
identifiers across documents. Implementations are advised to minimize the
amount of information that is stored in an unencrypted fashion.
      </p>
    </section>

    <section>
      <h3>Partial matching on encrypted indexes
              </h3>
      <p>
The encrypted indexes used by this system are designed to maximize privacy.
As a result, there are a number of operations that are common in search
systems that are not available with encrypted indexes, such as partial
matching on encrypted text fields or searches over a scalar range. These
features might be added in the future through the use of zero-knowledge
encryption schemes.
      </p>
    </section>

    <section>
      <h3>
Threat model for malicious service provider
      </h3>
      <p>
While it is expected that most service providers are not malicious, it is
also important to understand what a malicious service provider can and
cannot do. The following attacks are possible given a malicious service
provider:
      </p>
      <ul>
        <li>
Correlation of entities accessing information in a vault
        </li>
        <li>
Speculation about the types of files stored in a vault depending on file size and access patterns
        </li>
        <li>
Addition, deletion, and modification of encrypted data
        </li>
        <li>
Not enforcing authorization policy set on the encrypted data
        </li>
        <li>
Exfiltrating encrypted data to an unknown external system
        </li>
      </ul>
    </section>

  </section>

    <section class="informative">
      <h2>
Accessibility Considerations
      </h2>

      <p>
There are a number of accessibility considerations implementers should be
aware of when processing data described in this specification. As with any web
standards or protocols implementation, ignoring accessibility issues makes
this information unusable to a large subset of the population. It is important
to follow accessibility guidelines and standards, such as [[WCAG21]], to ensure
all people, regardless of ability, can make use of this data. This is
especially important when establishing systems using cryptography, which
have historically created problems for assistive technologies.
      </p>

      <p>
This section details the general accessibility considerations to take into
account when using this data model.
      </p>

      <p class="issue">
Write accessibility considerations.
      </p>

    </section>

    <section class="informative">
      <h2>
Internationalization Considerations
      </h2>

      <p>
There are a number of internationalization considerations implementers should be
aware of when publishing data described in this specification. As with any web
standards or protocols implementation, ignoring internationalization makes it
difficult for data to be produced and consumed across a disparate set of
languages and societies, which would limit the applicability of the
specification and significantly diminish its value as a standard.
      </p>

      <p>
This section outlines general internationalization considerations to take into
account when using this data model.
      </p>

      <p class="issue">
Write i18n considerations.
      </p>

    </section>

    <section class="informative appendix">
      <h2>
Identity Hubs
      </h2>
      <p>
Hubs let you securely store and share data. A Hub is a datastore containing semantic data objects at well-known locations.  Each object in a Hub is signed by an identity and accessible via a globally recognized API format that explicitly maps to semantic data objects.  Hubs are addressable via unique identifiers maintained in a global namespace.
      </p>
      <section>
        <h3>
One DID to Many Hub Instances
        </h3>
        <p>
A single entity may have one or more instances of a Hub, all of which are addressable via a URI routing mechanism linked to the entity's identifier.  Hub instances sync state changes, ensuring the owner can access data and attestations from anywhere, even when offline.
        </p>
        <section>
          <h4>
DID Document Service Endpoint Descriptors
          </h4>
          <p>
There are two different variations of Hub-specific DID Document Service Endpoint descriptors, one that users associate with their DIDs, and another that Hosts use to direct requests to locations where their Hub infrastructure resides.
          </p>
          <p>
Users specify their Hub instances (different Hub Hosts) via the <code>UserServiceEndpoint</code> descriptor:
          </p>
          <pre class="example">
"service": [{
  "type": "IdentityHub",
  "publicKey": "did:foo:123#key-1",
  "serviceEndpoint": {
    "@context": "schema.identity.foundation/hub",
    "@type": "UserServiceEndpoint",
    "instances": ["did:bar:456", "did:zaz:789"]
  }
}]
          </pre>
          <p>
Hosts specify the locations of their Hub offerings via the <code>HostServiceEndpoint</code> descriptor:
          </p>
          <pre class="example">
"service": [{
  "type": "IdentityHub",
  "publicKey": "did:bar:456#key-1",
  "serviceEndpoint": {
    "@context": "schema.identity.foundation/hub",
    "@type": "HostServiceEndpoint",
    "locations": ["https://hub1.bar.com/.identity", "https://hub2.bar.com/blah/.identity"]
  }
}]
          </pre>
        </section>
        <section>
          <h4>
Syncing data between Hubs
          </h4>
          <p>
Hub instances must sync data without requiring master-slave relationships or forcing a single implementation for storage or application logic. This requires a shared replication protocol for broadcasting and resolving changes. The protocol for reproducing Hub state across multiple instances is in the formative phases of definition/selection, but should be relatively straightforward to integrate on top of any NoSQL datastore.
          </p>
        </section>
        <section>
          <h4>
Hub data serialization and export
          </h4>
          <p>
All Hubs must support the export of their serialized state. This is to ensure the user retains full control over the portability of their data. A later revision to this document will specify the process for invoking this intent and retrieving the serialized data from a Hub instance.
          </p>
        </section>
      </section>
      <section>
        <h3>
Hub Protocol Schemes
        </h3>
        <section>
          <h4>
Hub URI Scheme
          </h4>
          <p>
In addition to the URL path convention for individual Hub instances, it is important that links to an identity owner's data not be encoded with a dependency on a specific Hub instance. To make this possible, we propose the introduction of the following Hub URI scheme:
          </p>
          <pre class="example">
hub://did:foo:123abc/
          </pre>
          <p>
User Agents that understand this scheme will leverage the Universal Resolver to look up the Hub instances of the target DID and address the Hub endpoints via the Service Endpoints it finds within. This allows the formation of URIs that are not Hub instance-specific, allowing a more natural way to link to a DID's data without having to specify a particular instance. This also prevents the circulation of dead links across the Web, given that an identity owner can choose to add or remove Hub instances at any time.
          </p>
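          <p>
The resolution flow described above can be sketched in Python as follows. The
sketch is non-normative: the function names are hypothetical, and
<code>resolve</code> is a stand-in for a DID resolver that returns the
<code>service</code> array of a DID Document, backed here by an in-memory
dictionary built from the service endpoint examples earlier in this appendix.
          </p>
          <pre class="example">
HUB_SCHEME = "hub://"


def did_from_hub_uri(uri):
    """Extract the DID from a Hub URI such as hub://did:foo:123abc/."""
    if not uri.startswith(HUB_SCHEME):
        raise ValueError("not a Hub URI: " + uri)
    return uri[len(HUB_SCHEME):].split("/", 1)[0]


def hub_locations(uri, resolve):
    """Return candidate Hub locations for the DID named in the Hub URI."""
    locations = []
    for service in resolve(did_from_hub_uri(uri)):
        endpoint = service["serviceEndpoint"]
        if endpoint["@type"] != "UserServiceEndpoint":
            continue
        for host_did in endpoint["instances"]:
            for host_service in resolve(host_did):
                host_endpoint = host_service["serviceEndpoint"]
                if host_endpoint["@type"] == "HostServiceEndpoint":
                    locations.extend(host_endpoint["locations"])
    return locations


# In-memory stand-in for resolved DID Documents (service arrays only).
documents = {
    "did:foo:123": [{
        "type": "IdentityHub",
        "serviceEndpoint": {
            "@type": "UserServiceEndpoint",
            "instances": ["did:bar:456", "did:zaz:789"],
        },
    }],
    "did:bar:456": [{
        "type": "IdentityHub",
        "serviceEndpoint": {
            "@type": "HostServiceEndpoint",
            "locations": ["https://hub1.bar.com/.identity",
                          "https://hub2.bar.com/blah/.identity"],
        },
    }],
}
locations = hub_locations("hub://did:foo:123/", lambda did: documents.get(did, []))
          </pre>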
        </section>
      </section>
      <section>
        <h3>
Authentication
        </h3>
        <p>
The process of authenticating requests to identity hubs will follow the DIF/W3C DID Auth schemes. These standards are in early phases of formation - <a href="https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust-spring2018/blob/master/draft-documents/did_auth_draft.md">more info is available here</a>.
        </p>

        <p>
The current Identity Hub authentication scheme seeks to accomplish two tasks:
        </p>
        <ul>
          <li>
Mutually authenticate the client and the Hub using their respective DIDs and associated keys.
          </li>
          <li>
Encrypt all Hub requests and responses such that their contents are only available to the holders of the DID keys involved in the message exchange.
          </li>
        </ul>
        <p>
The current authentication scheme is an implementation of DID Auth, <a
href="https://github.com/WebOfTrustInfo/rwot6-santabarbara/blob/master/final-documents/did-auth.md">as
described here</a>. This document will outline how to authenticate Hub requests
and responses. For full details on the authentication protocol used, and for a
reference implementation of the protocol, please refer to the <a
href="https://github.com/decentralized-identity/did-auth-jose/blob/master/docs/Authentication.md">`did-auth-jose`
library</a>.
        </p>

        <section>
          <h4>
Authenticating a Hub request
          </h4>
          <p>
Identity Hub requests and responses are signed and encrypted using the DID keys
of the sender and the recipient. This protects the message over any
transportation medium. All encrypted requests and responses follow the
<a href="https://tools.ietf.org/html/rfc7516">JSON Web Encryption (JWE)
standard</a>.
          </p>
          <p>
The steps to construct a JWE are as follows. First, construct a JWT. This JWT
will be signed by the sender of the Hub request (the `iss`). This JWT must have
the following form:
          </p>

          <pre class="example">
// JWT headers
{
  "alg": "RS256",
  "kid": "did:example:abc123#key-abc",
  "did-requester-nonce": "randomized-string",
  "did-access-token": "eyJhbGciOiJSUzI1N..."
}

// JWT body
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "WriteRequest",
  "iss": "did:example:abc123",
  ...
}

// JWT signature
uQRqsaky-SeP3m9QPZmTGtRtMoKzyg6wwWF...
          </pre>
          <p>
The JWT body is just the request whose format is outlined in
<a href="#identity-hubs"></a>. The header values must be the following:
          </p>

          <table class="simple">
            <tr>
              <th>
Header
              </th>
              <th>
Description
              </th>
            </tr>
            <tr>
              <td>
`alg`
              </td>
              <td>
Standard JWT header. Indicates the algorithm used to sign the JWT.
              </td>
            </tr>
            <tr>
              <td>
`kid`
              </td>
              <td>
Standard JWT header. The value should take the form `{did}#{key-id}`. Another
app can take this value, resolve the DID, and find the indicated public key that
can be used for signature validation of the commit. Here we have used
`did:example:abc123`, because the request is signed with the user's private key.
              </td>
            </tr>
            <tr>
              <td>
`did-requester-nonce`
              </td>
              <td>
A randomly generated string that must be cached on the client side. This string
will be used to verify the response from the Hub in the sections below.
              </td>
            </tr>
            <tr>
              <td>
`did-access-token`
              </td>
              <td>
A token that should be cached on the client side and included in each request
sent to the Hub. Since we do not have an access token yet, leave this property
out on the initial request. Sections below explain how to get an access token.
              </td>
            </tr>
          </table>
          <p>
This JWT must use the typical JWT compact serialization format.
          </p>
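          <p>
For orientation, the compact serialization can be sketched in Python as
follows. This is a non-normative illustration: <code>compact_jwt</code> and
<code>sign</code> are hypothetical names, <code>sign</code> stands in for an
RS256 signer provided by a cryptographic library, the request body is
abbreviated to the fields shown above, and the usage lines pass placeholder
bytes rather than a real signature.
          </p>
          <pre class="example">
import base64
import json


def b64url(data):
    """Base64url-encode bytes without padding, as compact JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def compact_jwt(header, body, sign):
    """Serialize a JWT as header.payload.signature.

    sign is a hypothetical callable that takes the signing input as bytes
    and returns the raw signature bytes.
    """
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, body)
    )
    return signing_input + "." + b64url(sign(signing_input.encode("ascii")))


header = {
    "alg": "RS256",
    "kid": "did:example:abc123#key-abc",
    "did-requester-nonce": "randomized-string",
}
body = {
    "@context": "https://schema.identity.foundation/0.1",
    "@type": "WriteRequest",
    "iss": "did:example:abc123",
}
# Placeholder bytes only; this is NOT a valid signature.
token = compact_jwt(header, body, lambda data: b"placeholder-signature")
          </pre>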
          <p>
We can now use this JWT as the plaintext used to construct our JWE. The JWE must
have the following format.
          </p>

          <pre class="example">
// JWE protected header
{
  "alg": "RSA-OAEP-256",
  "kid": "did:example:abc456#abc-123",
  "enc": "A128GCM"
}

// JWE encrypted content encryption key
OKOawDo13gRp2ojaHV7LFpZcgV7T6DVZKTyKOM...

// JWE initialization vector
48V1_ALb6US04U3b...

// JWE plaintext (the JWT from above)
eyJhbGciOiJSUzI1NiIsImtpZCI6InRlc3R...

// JWE authentication tag
XFBoMYUZodetZdv...
          </pre>

          <p>
We strongly recommend using a JWT library to produce the above JWE. Using a
library, you should only need to provide the protected headers and the
plaintext. The plaintext value should be the JWT constructed above. The header
values are:
          </p>

          <table class="simple">
            <tr>
              <th>
Header
              </th>
              <th>
Description
              </th>
            </tr>
            <tr>
              <td>
`alg`
              </td>
              <td>
Standard JWE header. Indicates the algorithm used to encrypt the JWE content
encryption key.
              </td>
            </tr>
            <tr>
              <td>
`kid`
              </td>
              <td>
Standard JWT header. The value should take the form `{did}#{key-id}`. Indicates
the Hub's key that is used to encrypt the JWE content encryption key. Here we
have used `did:example:abc456`, since that is the DID used by the Hub. The DID
used here should match the `aud` value in the Hub `WriteRequest`.
              </td>
            </tr>
            <tr>
              <td>
`enc`
              </td>
              <td>
Standard JWT header. Indicates the algorithm used to encrypt the plaintext using
the content encryption key to produce the ciphertext and authentication tag.
              </td>
            </tr>
          </table>
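          <p>
As a rough illustration of the structure described above, the sketch below
builds the protected header and joins five already-computed parts into the
JWE compact serialization. The helper names (`b64url`, `jwe_protected_header`,
`jwe_compact`) are hypothetical, and the sketch performs no encryption: the
encrypted key, initialization vector, ciphertext, and tag are assumed to come
from a JOSE library.
          </p>
          <pre class="example">
```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JOSE specifications require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwe_protected_header(hub_did: str, key_id: str) -> dict:
    # Header values are copied from the example above. key_id is the
    # fragment naming the Hub's encryption key in its DID document.
    return {
        "alg": "RSA-OAEP-256",
        "kid": f"{hub_did}#{key_id}",
        "enc": "A128GCM",
    }


def jwe_compact(header: dict, encrypted_key: bytes, iv: bytes,
                ciphertext: bytes, tag: bytes) -> str:
    # Joins the five JWE parts with "."; this function only serializes,
    # it does not encrypt anything.
    protected = b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    return ".".join([protected, b64url(encrypted_key), b64url(iv),
                     b64url(ciphertext), b64url(tag)])
```
          </pre>
          <p>
In practice a library would produce the whole string in one call; the sketch
only shows where each header value and each part ends up.
          </p>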
          <p>
Finally, you have a signed and encrypted Hub request that can be transmitted to
the user's Identity Hub for secure storage.
          </p>
        </section>
        <section>
          <h4>
Caching the access token
          </h4>
          <p>
To send a successful request to an Identity Hub, you need to include an access
token in the `did-access-token` header of the JWE. The access token is a
short-lived JWT that can be used across many Hub requests until it expires.
          </p>
          <p>
On an initial request to an Identity Hub, omit the `did-access-token` header.
When a Hub request does not include this header, the Hub will not carry out the
request. Instead, the Hub will return a JWE response (as described in the next
section) whose payload is an access token. Extract the access token from the
response and cache it somewhere safe. The access token can then be used for
subsequent requests.
          </p>
          <p>
Once you've cached the access token, include it in each request in the
`did-access-token` JWE header as described above.
          </p>
          <p>
Eventually, the access token will expire. Its expiry time can be found in the
`exp` claim inside the access token. If you attempt to use an expired access
token, the Identity Hub will return an error indicating a new access token is
required. To get a new access token, send another hub request without the
`did-access-token` header.
          </p>
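          <p>
The caching behavior described above might be sketched as follows. This is a
minimal illustration, not a reference implementation: `AccessTokenCache`,
`jwt_expiry`, and the 30-second leeway are assumptions, and the token's
signature is not verified here.
          </p>
          <pre class="example">
```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> int:
    """Read the `exp` claim from a compact JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return int(json.loads(base64.urlsafe_b64decode(padded))["exp"])


class AccessTokenCache:
    """Holds one access token and drops it once it has expired."""

    def __init__(self, leeway_seconds: int = 30) -> None:
        self._token: Optional[str] = None
        self._leeway = leeway_seconds  # hypothetical safety margin

    def store(self, token: str) -> None:
        self._token = token

    def get(self, now: Optional[float] = None) -> Optional[str]:
        # Returns None when the caller should send a request without
        # `did-access-token` in order to obtain a fresh token.
        if self._token is None:
            return None
        current = time.time() if now is None else now
        if current + self._leeway >= jwt_expiry(self._token):
            self._token = None
        return self._token
```
          </pre>
          <p>
A client would call `get()` before each request, include the result as
`did-access-token` when it is not `None`, and call `store()` whenever a
response carries a new token.
          </p>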
        </section>
        <section>
          <h4>
Receiving a hub response
          </h4>
          <p>
When possible, a hub will respond with a JWE encrypted with the client's DID
keys:
          </p>
          <pre class="example">
eyJhbGciOiJSU0EtT0FFUCIsImVuYyI6IkEyNTZHQ00ifQ...
          </pre>
          <p>
This JWE can be decrypted with the client's private key following the JWE
standard to reproduce the response's plaintext.
          </p>
          <p>
The contents of the JWE will either be a valid hub response or a new access
token. A new access token will only be included if the `did-access-token` header
was omitted in the request.
          </p>
        </section>
      </section>
      <section>
        <h3>
Authorization
        </h3>
        <p>
Access control for data stored in Hubs is currently implemented via a bare-bones
permission layer. Access to data can be granted by a Hub owner, and can be
restricted to certain types of data. More features to improve control over data
access will be added in the future.
        </p>
        <p>
The success of a decentralized identity platform depends on the ability of
users to share their data with other people, organizations, apps, and services
in a way that respects and protects the user's privacy. In our decentralized
platform, all user information and data reside in the user's Identity Hub. This
section outlines the current proposal for Identity Hub authorization.
        </p>

        <section>
          <h4>
Scope of the current design
          </h4>
          <p>
This proposal is a first cut. The intention is to start extremely simple, and
extend the model to include more richness over time. We choose to focus on two
simple use cases, described below.
          </p>

          <section>
            <h5>
Use case 1: Registering for a website
            </h5>

            <blockquote>
Alice has added some useful data about her wardrobe style to her Hub: her
measurements from her tailor, and a list of her favorite clothing brands. When
Alice goes to try out a new online clothing retailer, the retailer's website
allows her to set up an account using her DID. After she signs in with her DID, the
retailer's website is able to access Alice's style data. Alice does not have to
re-enter her sizes in the site, and the site can give her recommended options
based on her brand preferences.
            </blockquote>

            <figure>
              <img src="diagrams/permissions-use-case-1.png" />
              <figcaption>
Permission request flow
              </figcaption>
            </figure>
          </section>

          <section>
            <h5>
Use case 2: Reviewing &amp; managing access
            </h5>
            <blockquote>
Alice learns that one of the websites she visited is making improper use of her
personal data. She would like to immediately remove that website's access to her
Hub.
            </blockquote>

            <figure>
              <img src="diagrams/permissions-use-case-2.png" />
              <figcaption>
Permission denied flow
              </figcaption>
            </figure>
          </section>

          <section>
            <h5>
Out of scope
            </h5>
            <p>
These use cases and the current Hub authorization system are not sufficient to
consider Identity Hubs ready for real-world usage. They leave out several
features that have been discussed as necessary for a minimally viable
authorization layer, including:
            </p>
            <p>
<strong>Features that control <em>what</em> is being granted:</strong>
            </p>
            <ul>
              <li>
How to grant a permission to a specific object by ID, rather than all objects of
a certain type.
              </li>
              <li>
How to grant a permission to a property of some object type, rather than the
entire object.
              </li>
              <li>
How to grant permission to an object type and all of the children object types
in its respective schema.
              </li>
              <li>
How to filter a permission to only:
                <ul>
                  <li>
objects created by a specific DID.
                  </li>
                  <li>
objects created in a certain time period.
                  </li>
                  <li>
objects larger than some byte size.
                  </li>
                </ul>
              </li>
              <li>
How to grant a permission to a zero-knowledge proof of some object, rather than the entire object.
              </li>
              <li>
How to grant permission to act as a delegate of a DID when interacting with other Hubs.
              </li>
            </ul>

            <p>
<strong>Features that control <em>who</em> is being granted access:</strong>
            </p>
            <ul>
              <li>
How to grant a permission to all DIDs, and therefore make some data public.
              </li>
              <li>
How to create a permission that explicitly denies a DID access to an object.
              </li>
            </ul>

            <p>
<strong>Features that limit/expand <em>where or when</em> access is granted:</strong>
            </p>
            <ul>
              <li>
How to time-bound permissions, such that a permission expires automatically.
              </li>
              <li>
How to grant permissions to an app on some devices, but not others.
              </li>
            </ul>

            <p>
<strong>Features that control <em>why</em> access is granted:</strong>
            </p>
            <ul>
              <li>
How an app can specify why permission is being requested.
              </li>
              <li>
How a user can specify why permission is being denied.
              </li>
              <li>
How relying parties and trust providers are reviewed for trustworthiness and integrity.
              </li>
            </ul>

            <p>
<strong>Features that are related to Hub authorization, but will be addressed at a later time:</strong>
            </p>
            <ul>
              <li>
How to request & send callbacks to notify apps of changes to data and
permissions in a Hub.
              </li>
              <li>
How to authorize the execution of services, or extensions, in a Hub.
              </li>
              <li>
What format(s) the Hub uses for requests & responses.
              </li>
              <li>
How to encrypt data in a Hub such that the Hub provider cannot access it.
              </li>
            </ul>

            <p>
Clearly, there is a large body of functionality that can be added to Hub
authorization over time. This is why this initial document intentionally strives
to be as simple as possible. We'll incorporate these things into Hub
authorization over time as we receive feedback from early adopters of Identity
Hubs.
            </p>
          </section>
        </section>

        <section>
          <h4>
Authorization Model
          </h4>
          <p>
Access to data in Identity Hubs is controlled by a special object stored in Hubs
called a `PermissionGrant`. The structure of a `PermissionGrant` is:
          </p>

          <pre class="example">
{
  "owner": "did:example:12345", // the identity owner's (granter's) DID
  "grantee": "did:example:67890", // the grantee's DID
  "context": "schemas.clothing.org", // the data schema context
  "type": "measurements", // the data type
  "allow": "-R--", // create, read, update, delete flags; this value allows read only
  ... // additional richness and specificity can be added in the future
}
          </pre>

          <section>
            <h5>
Granting permissions
            </h5>
            <p>
When a hub owner grants a permission to another DID, they do so by specifying
the context and type of the objects in the permission grant. When permissions
span more than one data type, several PermissionGrant objects can be created.
For each PermissionGrant, the following object should be written to the
`Permissions` interface of the owner's Hub, typically via a user agent:
            </p>
            <pre class="example">
{
  "@context": "schema.identity.foundation/Hub/",
  "@type": "PermissionGrant",
  "owner": "did:example:12345",
  "grantee": "did:example:67890",
  "context": "schemas.clothing.org",
  "type": "measurements",
  "allow": "-R--"
}
            </pre>
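            <p>
One way a user agent might assemble such an object is sketched below. The
helper names are hypothetical. The sketch also assumes that the four positions
of `allow` stand for create, read, update, and delete, written as `C`, `R`,
`U`, and `D`; the examples in this document only show the `R` position set.
            </p>
            <pre class="example">
```python
VERBS = ("create", "read", "update", "delete")


def allow_string(verbs: set) -> str:
    """Encode permitted verbs as a four-character mask such as "-R--"."""
    unknown = set(verbs) - set(VERBS)
    if unknown:
        raise ValueError(f"unknown verbs: {sorted(unknown)}")
    # Assumption: each position holds the verb's initial or "-" when denied.
    return "".join(v[0].upper() if v in verbs else "-" for v in VERBS)


def permission_grant(owner: str, grantee: str, context: str,
                     type_: str, verbs: set) -> dict:
    """Build the object written to the owner's Permissions interface."""
    return {
        "@context": "schema.identity.foundation/Hub/",
        "@type": "PermissionGrant",
        "owner": owner,
        "grantee": grantee,
        "context": context,
        "type": type_,
        "allow": allow_string(verbs),
    }
```
            </pre>
            <p>
Calling `permission_grant` with the DIDs above, the `measurements` type, and
the single verb `read` yields the same object as the example.
            </p>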
            <p>
Note that the Hub Permissions interface only supports the single PermissionGrant
object type. The Hub should reject any requests to create objects of other types
in the Permissions interface, barring future updates to the PermissionGrant
model.
            </p>
            <p>
The response format, and any error conditions, should be consistent with all
other requests to Hubs. Upon creation of this permission grant object in a
user's Hub, the permission will be propagated to all other Hub instances listed
in the user's DID document via the Hub's standard sync & replication protocol.
This will ensure that all Hub instances are up-to-date with all new permission
grants in a timely manner.
            </p>
          </section>
          <section>
            <h5>
Checking permissions
            </h5>
            <p>
The following describes the logic implemented by the Hub's authorization layer
when a request arrives.
            </p>

            <figure>
              <img src="diagrams/permissions-verification.png"/>
              <figcaption>
              </figcaption>
            </figure>

            <ol>
              <li>
Receive incoming request from client
              </li>
              <li>
Determine relevant schema, verb, and client from request
              </li>
              <li>
Query for all PermissionGrants whose object_type matches the schema, for
the given client DID
              </li>
              <li>
Check if any query results allow the verb in question
              </li>
              <li>
Return success/failed authorization check
              </li>
            </ol>
            <p>
Note that PermissionGrants do not understand or evaluate the structure of a
given schema. For instance, if a user grants access to all
"https://schema.org/game" objects, they <strong>do not</strong> implicitly grant
access to all "https://schema.org/videogame" objects (which is a child of game
in schema.org's hierarchy).
            </p>
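            <p>
The steps above can be sketched as a single function. This is an illustrative
approximation only: `is_allowed` is a hypothetical name, grants are plain
dictionaries shaped like the `PermissionGrant` example, and verbs are assumed
to map to the letters `C`, `R`, `U`, and `D` in `allow`.
            </p>
            <pre class="example">
```python
def is_allowed(grants: list, client_did: str, context: str,
               type_: str, verb: str) -> bool:
    """Return True if any grant for this client and exact type allows the verb."""
    flag = verb[0].upper()  # "read" -> "R"; assumes CRUD letters in `allow`
    for grant in grants:
        matches = (
            grant["grantee"] == client_did
            and grant["context"] == context
            and grant["type"] == type_  # exact match; no schema hierarchy
        )
        if matches and flag in grant["allow"]:
            return True
    return False
```
            </pre>
            <p>
Because the type comparison is an exact string match, a grant for one type
never covers its child types, which mirrors the note above.
            </p>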
          </section>
          <section>
            <h5>
Reviewing &amp; managing permissions
            </h5>

            <p>
`PermissionGrant` objects can be created, read, modified, and deleted just like
any other object in a hub. To revoke access to data, the Hub owner needs to
simply modify an existing `PermissionGrant` or delete it entirely. Instructions
for reading and writing data in Identity Hubs are available in
<a href="#api"></a>.
            </p>
          </section>
          <section>
            <h5>
Requesting permissions
            </h5>
            <p>
At this time, proposals for how to request access to data in an identity hub via
a user agent are still being evaluated. In the future, we will update this
document with details on how a client can request access from a user.
            </p>
          </section>
        </section>

      </section>
      <section>
        <h3>
API
        </h3>
        <p>
Because of the sensitive nature of the data being transmitted to Identity Hubs,
the Identity Hub request and response API may look a bit different to developers
who are used to a traditional REST service API. Most of the differences are
based on the high level of security and privacy decentralized identity
inherently demands.
        </p>
        <section>
          <h4>
Commits
          </h4>
          <p>
All data in identity hubs is represented as a series of "commits". A commit is
similar to a git commit; it represents a change to an object. To write data to
an identity hub, you need to construct and send a new commit to the hub. To read
data from an identity hub, you need to fetch all commits from the hub. An
object's current value can be constructed by applying all its commits in order.
          </p>
          <p>
The use of commits to represent data in identity hubs offers a few distinct
advantages:
          </p>
          <ul>
            <li>
it facilitates the hub's replication protocol, enabling multiple hub instances
to sync data.
            </li>
            <li>
it creates an auditable history of changes to an object, especially when each
commit is signed by a DID.
            </li>
            <li>
it eases implementation for use cases that need offline data modification and
require conflict resolution.
            </li>
          </ul>
          <p>
Each commit in a hub is a
<a href="https://en.wikipedia.org/wiki/JSON_Web_Token">JWT</a> whose body
contains the data to be written to the hub. Here's an example of a deserialized
and decoded JWT:
          </p>

          <pre class="example">
// JWT headers
{
  "alg": "RS256",
  "kid": "did:foo:123abc#key-abc",
  "interface": "Collections",
  "context": "https://schema.org",
  "type": "MusicPlaylist",
  "operation": "create",
  "committed_at": "2018-10-24T18:39:10.10:00Z",
  "commit_strategy": "basic",
  "sub": "did:bar:456def",

  // Example metadata about the object that is intended to be "public"
  "meta": {
    "tags": ["classic rock", "rock", "rock n roll"],
    "cache-intent": "full"
  }
}

// JWT body
{
  "@context": "http://schema.org/",
  "@type": "MusicPlaylist",
  "description": "The best rock of the 60s, 70s, and 80s",
  "tracks": ["..."]
}

// JWT signature
uQRqsaky-SeP3m9QPZmTGtRtMoKzyg6wwWF...
          </pre>
          <p>
The commit is signed by the committer writing the data, in this case
<code>did:foo:123abc</code>. To write the commit into a hub, the committer must
send a Hub write request.
          </p>
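          <p>
To illustrate how an object's current value is built by applying its commits
in order, here is a minimal sketch. It rests on several assumptions that this
document does not define: that under the `basic` commit strategy each `create`
or `update` commit carries the complete object, that a `delete` operation
exists and removes the object, and that each commit has already been verified
and decoded into a dictionary with `headers` and `body` keys.
          </p>
          <pre class="example">
```python
from typing import Optional


def current_value(commits: list) -> Optional[dict]:
    """Replay decoded commits in order and return the object's latest state."""
    state: Optional[dict] = None
    # Assumes `committed_at` values are ISO 8601 strings in one time zone,
    # so that sorting them as text also sorts them by time.
    for commit in sorted(commits, key=lambda c: c["headers"]["committed_at"]):
        operation = commit["headers"]["operation"]
        if operation == "delete":  # hypothetical operation name
            state = None
        else:
            # Assumption: "create" and "update" commits replace the whole object.
            state = dict(commit["body"])
    return state
```
          </pre>
          <p>
A real implementation would also verify each commit's signature before
applying it and would need a rule for resolving conflicting commits.
          </p>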
        </section>
        <section>
          <h4>
Write Request &amp; Response Format
          </h4>
          <p>
Instead of a REST-based scheme where data like the username, object types, and
query strings are present in the URL, Identity Hub requests are self-contained
message objects that carry everything they need while remaining shielded from
observing entities during transport.
          </p>
          <p>
Each Hub request is a JSON object which is then signed and encrypted as outlined
in the authentication section. The outer envelope is signed with the key of the
"iss" DID, and encrypted with the Hub's DID key(s).
          </p>

          <pre class="example">
{
  "iss": "did:foo:123abc",
  "sub": "did:bar:456def",
  "aud": "did:baz:789ghi",
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "WriteRequest",

  // The commit in JSON Serialization Format
  // See: https://tools.ietf.org/html/rfc7515#section-3.1
  "commit": {
    "protected": "ewogICJpbnRlcmZhY2...",

    // Optional metadata information not protected by the JWT signature
    "header": {
      "iss": "did:foo:123abc"
    },

    "payload": "ewogICJAY29udGV4dCI6...",
    "signature": "b7V2UpDPytr-kMnM_YjiQ3E0J2..."
  }
}
          </pre>
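          <p>
A request envelope of this shape could be assembled from a commit that has
already been signed in compact form, as in the sketch below. The function
names are hypothetical, and the signing and encryption of the outer envelope
are left out.
          </p>
          <pre class="example">
```python
def commit_from_compact(compact_jws: str, iss: str) -> dict:
    """Split a compact JWS ("a.b.c") into the JSON form used by `commit`."""
    protected, payload, signature = compact_jws.split(".")
    return {
        "protected": protected,
        "header": {"iss": iss},  # unprotected, so not covered by the signature
        "payload": payload,
        "signature": signature,
    }


def write_request(iss: str, sub: str, aud: str, compact_jws: str) -> dict:
    """Wrap an already-signed commit in a WriteRequest envelope."""
    return {
        "iss": iss,
        "sub": sub,
        "aud": aud,
        "@context": "https://schema.identity.foundation/0.1",
        "@type": "WriteRequest",
        "commit": commit_from_compact(compact_jws, iss),
    }
```
          </pre>
          <p>
Reusing the three encoded segments unchanged keeps the commit's signature
valid, since the signature covers the encoded header and payload.
          </p>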
          <p>
Each response is also a JSON object, signed and encrypted in the same way as the request. Its contents are:
          </p>
          <pre class="example">
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "WriteResponse",
  "developer_message": "completely optional message from the hub",
  "revisions": ["aHashOfTheCommitSubmitted"]
}
          </pre>
        </section>
        <section>
          <h4>
Object Read Request &amp; Response Format
          </h4>
          <p>
An object is one logical entity tracked across multiple commits. Object reads
return only the metadata associated with an object, not its literal data.
Objects may be queried using the following request format:
          </p>
          <pre class="example">
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "ObjectQueryRequest",
  "iss": "did:foo:123abc",
  "sub": "did:bar:456def",
  "aud": "did:baz:789ghi",
  "query": {
    "interface": "Collections",
    "context": "http://schema.org",
    "type": "MusicPlaylist",

    // Optional object_id filters
    "object_id": ["3a9de008f526d239..", "a8f3e7..."]
  }
}
          </pre>
          <p>
The response to a query for objects returns a list of object IDs along with the
object metadata. The format is:
          </p>
          <pre class="example">
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "ObjectQueryResponse",
  "developer_message": "completely optional",
  "objects": [
    {
      // object metadata
      "interface": "Collections",
      "context": "http://schema.org",
      "type": "MusicPlaylist",
      "id": "3a9de008f526d239...",
      "created_by": "did:foo:123abc",
      "created_at": "2018-10-24T18:39:10.10:00Z",
      "sub": "did:foo:123abc",
      "commit_strategy": "basic",
      "meta": {
        "tags": ["classic rock", "rock", "rock n roll"],
        "cache-intent": "full"
      }
    },
    // ...more objects
  ],

  // potential pagination token
  "skip_token": "ajfl43241nnn1p;u9390"
}
          </pre>
        </section>
        <section>
          <h4>
Commit Read Request &amp; Response Format
          </h4>
          <p>
To get the actual data in an object, you must read the commits from the Identity
Hub:
          </p>
          <pre class="example">
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "CommitQueryRequest",
  "iss": "did:foo:123abc",
  "sub": "did:bar:456def",
  "aud": "did:baz:789ghi",
  "query": {
    "object_id": ["3a9de008f526d239..."],
    "revision": ["abc", "def", ...]
  }
}
          </pre>
          <p>
A response to a query for commits contains a list of commit JWTs:
          </p>
          <pre class="example">
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "CommitQueryResponse",
  "developer_message": "completely optional",
  "commits": [
    {
      "protected": "ewogICJpbnRlcmZhY2UiO...",
      "header": {
        "iss": "did:foo:123abc",
        // Hubs may add additional information to the unprotected headers for convenience
        "rev": "aHashOfTheCommit"
      },
      "payload": "ewogICJAY29udGV4dCI6ICdo...",
      "signature": "b7V2UpDPytr-kMnM_YjiQ3E0J2..."
    },
    // ...
  ],

  // potential pagination token
  "skip_token": "ajfl43241nnn1p;u9390"
}
          </pre>
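          <p>
Reading an object's data therefore takes two steps: an object query to find
object IDs, then a commit query for those IDs. The sketch below shows one way
the second request might be derived from the first response; the function name
`commit_query_for` is hypothetical.
          </p>
          <pre class="example">
```python
def commit_query_for(object_response: dict, iss: str, sub: str, aud: str) -> dict:
    """Build a CommitQueryRequest for every object in an ObjectQueryResponse."""
    object_ids = [obj["id"] for obj in object_response.get("objects", [])]
    return {
        "@context": "https://schema.identity.foundation/0.1",
        "@type": "CommitQueryRequest",
        "iss": iss,
        "sub": sub,
        "aud": aud,
        # `revision` is omitted here, so no revision filter is applied.
        "query": {"object_id": object_ids},
    }
```
          </pre>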
        </section>

        <section>
          <h4>
Paging
          </h4>
          <p>
<code>skip_token</code> is an opaque token to be used for continuation of a
request.
          </p>
          <p>
A <code>skip_token</code> may be returned on responses with multiple results.
To fetch the next page, resend the initial request with the token added to its
<code>query</code> object:
          </p>
          <pre class="example">
{
  "iss": "did:foo:123abc",
  "sub": "did:bar:456def",
  "aud": "did:baz:789ghi",
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "ObjectQueryRequest",
  "query": {
    "interface": "Collections",
    "context": "http://schema.org",
    "type": "MusicPlaylist",
    "skip_token": "ajfl43241nnn1p;u9390"
  }
}
          </pre>
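          <p>
A client might follow continuation tokens with a loop like the one sketched
here. The `send` callable is a hypothetical stand-in for whatever signs,
encrypts, and transmits a request and returns the decrypted response as a
dictionary.
          </p>
          <pre class="example">
```python
import copy
from typing import Callable, Iterator


def iter_objects(send: Callable[[dict], dict], request: dict) -> Iterator[dict]:
    """Yield every object across pages by following `skip_token`."""
    page_request = copy.deepcopy(request)  # leave the caller's request untouched
    while True:
        response = send(page_request)
        yield from response.get("objects", [])
        token = response.get("skip_token")
        if not token:
            return  # no token means there are no further pages
        page_request["query"]["skip_token"] = token
```
          </pre>
          <p>
The token is treated as opaque: it is copied into the next request unchanged
and never parsed.
          </p>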
        </section>
      </section>
      <section>
        <h3>
Interfaces
        </h3>
        <p>
To facilitate common interactions and data storage, Hubs provide a set of standard interfaces that can be written to:
        </p>
        <ul>
          <li>
<code>Profile</code> ➜ The owning entity's primary descriptor object (schema
agnostic)
          </li>
          <li>
<code>Permissions</code> ➜ The access control JSON document
          </li>
          <li>
<code>Actions</code> ➜ A known endpoint for the relay of actions to the identity
owner
          </li>
          <li>
<code>Stores</code> ➜ Scoped 1:1 storage space that is directly assigned to
another, external DID
          </li>
          <li>
<code>Collections</code> ➜ The owning entity's identity collections (access
limited)
          </li>
          <li>
<code>Services</code> ➜ any custom, service-based functionality the identity
exposes
          </li>
        </ul>

        <section>
          <h4>
Profile
          </h4>
          <p>
Each Hub has a <code>profile</code> object that describes the owning entity. The
profile object should use whichever schema and object best represent the
entity. To get the profile for a DID, send an object query to the
<code>Profile</code> interface:
          </p>
          <pre class="example">
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "ObjectQueryRequest",
  "iss": "did:foo:123abc",
  "sub": "did:bar:456def",
  "aud": "did:baz:789ghi",
  "query": {
    "interface": "Profile"
  }
}
          </pre>
        </section>
        <section>
          <h4>
Permissions
          </h4>
          <p>
All access and manipulation of identity data is subject to the permissions
established by the owning entity. See the <a href="#authorization"></a> section
for details.
          </p>
        </section>
        <section>
          <h4>
Actions
          </h4>
          <p>
The <code>Actions</code> interface is for sending a target identity semantically
meaningful objects that convey the sender's intent, which often involves the
data payload of the object. The <code>Actions</code> interface is not
constrained to simple human-centric communications. Rather, it is intended as a
universal conduit through which identities can transact all manner of
activities, exchanges, and notifications.
          </p>
          <p>
The base data format for conveying an action shall be:
<a href="http://schema.org/Action">http://schema.org/Action</a>
          </p>
          <p>
Here is a list of examples to show the range of use-cases this interface is
intended to support:
          </p>
          <ul>
            <li>
Human user contacts another with a textual message
(<a href="http://schema.org/ReadAction">ReadAction</a>)
            </li>
            <li>
Event app sends a request to RSVP for an event
(<a href="http://schema.org/RsvpAction">RsvpAction</a>)
            </li>
            <li>
Voting agency prompts a user to submit a vote
(<a href="http://schema.org/VoteAction">VoteAction</a>)
            </li>
          </ul>
          <pre class="example">
{
  "@context": "http://schema.org/",
  "@type": "ReadAction",
  "name": "Acme Bank - March 2018 Statement",
  "description": "Your Acme Bank statement for account #1734765",
  "object": PDF_SOURCE
}
          </pre>
        </section>
        <section>
          <h4>
Stores
          </h4>
          <p>
The best way to describe Stores is as a 1:1 DID-scoped variant of the W3C DOM's
origin-scoped <code>window.localStorage</code> API. The key difference is
that this form of persistent, pairwise object storage transcends providers,
platforms, and devices. For each storage relationship between the DID owner and
external DIDs, the Hub shall create a key-value document-based storage area. The
DID owner or external DID can store unstructured JSON data to the document, in
relation to the keys they specify. The Hub implementer may choose to limit the
available space of the storage document, with the option to expand the storage
limit based on criteria the implementer defines.
          </p>
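          <p>
To make the pairwise model concrete, the following sketch keeps one key-value
document per pair of owner DID and external DID and enforces a size limit on
each document. Everything here is an assumption made for illustration: the
class name, the method names, the in-memory storage, and the default limit are
not defined by this document.
          </p>
          <pre class="example">
```python
import json


class PairwiseStore:
    """In-memory key-value documents, one per (owner DID, external DID) pair."""

    def __init__(self, max_bytes: int = 4096) -> None:
        self._docs: dict = {}
        self._max_bytes = max_bytes  # hypothetical per-document limit

    def set_item(self, owner: str, external: str, key: str, value) -> None:
        # Work on a copy so a rejected write leaves the stored document intact.
        doc = dict(self._docs.get((owner, external), {}))
        doc[key] = value
        if len(json.dumps(doc).encode("utf-8")) > self._max_bytes:
            raise ValueError("storage limit exceeded for this DID pair")
        self._docs[(owner, external)] = doc

    def get_item(self, owner: str, external: str, key: str):
        # Returns None when the pair or the key has no stored value.
        return self._docs.get((owner, external), {}).get(key)
```
          </pre>
          <p>
Keying the documents by DID pair is what keeps one external DID from reading
or overwriting the storage area that belongs to another.
          </p>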
        </section>
        <section>
          <h4>
Collections
          </h4>
          <p>
Data discovery has been a problem since the inception of the Web. Most previous
attempts to solve this begin with the premise that discovery is about individual
entities providing a mapping of their own service-specific API and data schemas.
While you can certainly create a common format for expressing different APIs and
data schemas, you are left with the same basic issue: a sea of services that
can't efficiently interoperate without specific review, effort, and integration.
Hubs avoid this issue entirely by recognizing that the problem with <em>data
discovery</em> is that it relies on <em>discovery</em>. Instead, Hubs assert the
position that locating and retrieving data should be an <em>implicitly
knowable</em> process.
          </p>
          <p>
Collections provide an interface for accessing data objects across all Hubs,
regardless of their implementation. This interface exerts almost no opinion on
what data schemas entities use. To do this, the Hub Collection interface allows
objects from any schema to be stored, indexed, and accessed in a unified manner.
          </p>
          <p>
With Collections, you store, query, and retrieve data based on the very schema
and type of data you seek. Here are a few example data objects from a variety of
common schemas that entities may write and access via a user's Hub:
          </p>
          <p>
<strong>Locate any offers a user might want to share with apps</strong> (http://schema.org/Offer)
          </p>
          <pre class="example">
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "ObjectQueryRequest",
  "iss": "did:foo:123abc",
  "sub": "did:bar:456def",
  "aud": "did:baz:789ghi",
  "query": {
    "interface": "Collections",
    "context": "http://schema.org",
    "type": "Offer"
  }
}
          </pre>
        </section>
        <section>
          <h4>
 Services
          </h4>
          <p>
 Services offer a means to surface custom service calls an identity wishes to
 expose publicly or in an access-limited fashion. Services should not require
 the Hub host to directly execute code the service calls describe; service
 descriptions should link to a URI where execution takes place.
          </p>
          <p>
Sending a <code>ServicesRequest</code> to the base <code>Services</code>
interface returns an object that contains an entry for every service
description the requesting entity is permitted to access.
          </p>
          <pre class="example">
// request
{
  "iss": "did:foo:123abc",
  "sub": "did:bar:456def",
  "aud": "did:baz:789ghi",
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "ServicesRequest"
}

// response
{
  "@context": "https://schema.identity.foundation/0.1",
  "@type": "ServicesResponse",
  "developer_message": "optional message",
  "services": [{
    // Open API service descriptors
  }]
}
          </pre>
          <p>
All definitions shall conform to the <a href="https://github.com/OAI/OpenAPI-Specification">Open API descriptor format</a>.
          </p>
      </section>
    </section>

  </body>
</html>