Object Update Commands (OLAP)

JoeWinter edited this page Feb 19, 2015 · 2 revisions

OLAP REST Commands: Object Update Commands


This section describes REST commands for adding, updating, and deleting objects in OLAP applications. Doradus uses idempotent update semantics, which means repeating an update is a no-op. If a REST update command fails due to a network failure or similar error, it is safe to perform the same command again.

Add Batch

A batch of new, updated, and/or deleted objects is loaded into a specific shard of an application using the following REST command:

POST /{application}/{shard}[?Overwrite={true|false}]

where {application} is the application name and {shard} is the shard to which the batch is to be added. Shard names are text strings and are not predefined: a shard is started when the first batch is added to it.

The optional Overwrite parameter (case-insensitive) indicates whether the batch should replace existing field values. The default is true, which means that a value in the batch replaces any existing value of the same field for the corresponding object. If multiple batches are added to the same shard with Overwrite=true, the value added last takes precedence when the shard is merged. If Overwrite is set to false, values in the corresponding batch are only additive: they never replace existing field values when the shard is merged.
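As a minimal sketch of how a client might assemble the request path for this command, the helper below percent-encodes the application and shard names and appends the Overwrite parameter only when it is given. The function name add_batch_path, the application name Email, and the shard name 2015-02-19 are hypothetical and chosen for illustration only.

```python
from urllib.parse import quote


def add_batch_path(application, shard, overwrite=None):
    """Build the path for POST /{application}/{shard}[?Overwrite={true|false}]."""
    # Percent-encode each path segment so unusual shard names stay valid in a URI.
    path = "/" + quote(application, safe="") + "/" + quote(shard, safe="")
    # Leave the parameter off entirely to get the documented default (true).
    if overwrite is not None:
        path += "?Overwrite=" + ("true" if overwrite else "false")
    return path


print(add_batch_path("Email", "2015-02-19"))
print(add_batch_path("Email", "2015-02-19", overwrite=False))
```

Omitting the parameter rather than always sending Overwrite=true keeps the request identical to the form shown above when the default is wanted.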

The Add Batch command must include an input entity that contains the objects to be added, updated, and/or deleted. An example input message in XML is shown below:

<batch>
	<docs>
		<doc _table="Message">
			<field name="_ID">92XJeDwQ8lD3/RS4yM5gTg==</field>
			<field name="Size">10334</field>
			<field name="Tags">
				<add>
					<value>Confidential</value>
					<value>Sensitive</value>
				</add>
			</field>
			...
		</doc>
		<doc _table="Message" _deleted="true">
			<field name="_ID">95lfrCiljbiKOQK9UH7LYg==</field>
		</doc>
		<doc _table="Person">
			<field name="_ID">x3OKbjCmKw47wEHaqV0nLQ==</field>
			<field name="FirstName">John</field>
			<field name="Manager">
				<add>
					<value>LjJEtDcwp1ltqJWJ980+HQ==</value>
				</add>
			</field>
			...
		</doc>
		...
	</docs>
</batch>

In JSON:

{"batch": {
	"docs": [
		{"doc": {
			"_table": "Message",
			"_ID": "92XJeDwQ8lD3/RS4yM5gTg==",
			"Size": "10334",
			"Tags": {
				"add": ["Confidential", "Sensitive"]
			},
			...
		}},
		{"doc": {
			"_table": "Message",
			"_deleted": "true",
			"_ID": "95lfrCiljbiKOQK9UH7LYg=="
		}},
		{"doc": {
			"_table": "Person",
			"_ID": "x3OKbjCmKw47wEHaqV0nLQ==",
			"FirstName": "John",
			"Manager": {
				"add": ["LjJEtDcwp1ltqJWJ980+HQ=="]
			},
			...
		}},
		...
	]
}}

As shown, objects from multiple tables can be mixed in the same batch. The _table property identifies the table that the object will be inserted into, updated in, or deleted from. An object is added the first time fields are assigned to its _ID. Assigning an SV scalar field for an existing object replaces its current value. MV scalar and link field values are added in add groups. An object is deleted by giving its _ID value and setting the system field _deleted to true.
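The JSON form of a batch can be assembled programmatically. The following is a sketch only: the helper names make_doc and make_batch are hypothetical, and the structure simply mirrors the JSON example above (SV fields as plain values, MV scalar and link values inside an add group, deletes flagged with _deleted).

```python
import json


def make_doc(table, obj_id, sv_fields=None, mv_adds=None, deleted=False):
    """Build one doc entry for a batch."""
    doc = {"_table": table, "_ID": obj_id}
    if deleted:
        # A delete needs only the table, the ID, and the _deleted flag.
        doc["_deleted"] = "true"
        return {"doc": doc}
    # SV scalar fields are written as plain name/value pairs.
    doc.update(sv_fields or {})
    # MV scalar and link values are wrapped in an "add" group.
    for name, values in (mv_adds or {}).items():
        doc[name] = {"add": list(values)}
    return {"doc": doc}


def make_batch(docs):
    """Wrap doc entries in the batch/docs envelope."""
    return {"batch": {"docs": list(docs)}}


batch = make_batch([
    make_doc("Message", "92XJeDwQ8lD3/RS4yM5gTg==",
             sv_fields={"Size": "10334"},
             mv_adds={"Tags": ["Confidential", "Sensitive"]}),
    make_doc("Message", "95lfrCiljbiKOQK9UH7LYg==", deleted=True),
    make_doc("Person", "x3OKbjCmKw47wEHaqV0nLQ==",
             sv_fields={"FirstName": "John"},
             mv_adds={"Manager": ["LjJEtDcwp1ltqJWJ980+HQ=="]}),
])
body = json.dumps(batch, indent=2)
print(body)
```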

When an object is added, the _ID field can be omitted, in which case Doradus assigns a unique ID. The ID is a base64-encoded 120-bit value that is generated in a way that ensures uniqueness even in a multi-node cluster. However, automatic IDs remove the idempotency of updates: if the same object is added twice without the _ID field assigned, it will be inserted twice with different ID values.
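One way a client can keep updates idempotent is to assign _ID itself, deriving it deterministically from data the object already has. The sketch below is an illustration under assumptions, not the scheme Doradus uses for automatic IDs: the function name stable_id is hypothetical, it assumes the application has a natural key for each object, and the choice of SHA-256 truncated to 120 bits is arbitrary.

```python
import base64
import hashlib


def stable_id(*natural_key_parts):
    """Derive a repeatable ID from an object's natural key (hypothetical scheme)."""
    # Join the parts with a separator that is unlikely to appear in the data,
    # so ("ab", "c") and ("a", "bc") do not collide.
    joined = "\x1f".join(natural_key_parts).encode("utf-8")
    # Keep 15 bytes (120 bits) of the digest and base64-encode them.
    digest = hashlib.sha256(joined).digest()[:15]
    return base64.b64encode(digest).decode("ascii")


first = stable_id("Message", "mailbox-7", "2015-02-19T10:15:00Z")
second = stable_id("Message", "mailbox-7", "2015-02-19T10:15:00Z")
print(first == second)
```

Because the same input always yields the same ID, resending a batch after a failure updates the same objects instead of inserting duplicates.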

The only restrictions on Doradus OLAP updates are:

  • Once assigned a value, a field cannot be set to null.

  • There is no way to remove values from an MV scalar or link field. (However, deleting an object automatically updates affected inverse links.)

Batches are persisted but not processed until the containing shard is merged. When the shard is merged, all adds, updates, and deletes are merged with existing objects in the shard. If the same object is updated in multiple batches, the updates are merged; conflicts, such as setting the same SV scalar field to different values, are resolved by using the update in the most recently added batch.
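The conflict rule described above can be pictured with a toy model. This is not Doradus code: the function merge_batches and its data layout are hypothetical, it covers only the default Overwrite=true case, it ignores deletes, and it assumes MV values behave as a set.

```python
def merge_batches(batches):
    """Merge batches in the order they were added (toy model).

    Each batch is a list of (obj_id, sv_fields, mv_adds) tuples.
    """
    merged = {}
    for batch in batches:
        for obj_id, sv_fields, mv_adds in batch:
            obj = merged.setdefault(obj_id, {"sv": {}, "mv": {}})
            # SV scalar conflict: the value from the later batch wins.
            obj["sv"].update(sv_fields)
            # MV values are only ever added, never removed.
            for name, values in mv_adds.items():
                obj["mv"].setdefault(name, set()).update(values)
    return merged


older = [("msg-1", {"Size": "10334"}, {"Tags": ["Confidential"]})]
newer = [("msg-1", {"Size": "20000"}, {"Tags": ["Sensitive"]})]
result = merge_batches([older, newer])
print(result["msg-1"]["sv"]["Size"])
print(sorted(result["msg-1"]["mv"]["Tags"]))
```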

The ideal batch size is application-specific and depends on several factors:

  • If the number of objects per batch is too large or too small, merging the batches into a single segment will take longer, slowing down the overall load time.

  • Because an object batch temporarily resides in memory, both the client and the Doradus server require memory proportional to the size of the batch. Using REST API compression helps with server memory because object batches are parsed and loaded from the compressed message entity.
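The two factors above suggest splitting a large load into fixed-size batches and compressing each message entity. The following sketch shows one way to do both on the client. The batch size of 1000 is an arbitrary placeholder to be tuned per application, the helper names are hypothetical, and sending the result with a Content-Encoding: gzip header is an assumption that should be checked against the Doradus REST API documentation.

```python
import gzip
import json


def chunked(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def compressed_body(docs):
    """Serialize docs as a JSON batch and gzip the result."""
    raw = json.dumps({"batch": {"docs": docs}}).encode("utf-8")
    return gzip.compress(raw)


# Hypothetical objects; only one chunk is held in serialized form at a time
# if bodies are sent as they are produced rather than collected in a list.
docs = [{"doc": {"_table": "Message", "_ID": "id-%d" % i, "Size": str(i)}}
        for i in range(2500)]
bodies = [compressed_body(chunk) for chunk in chunked(docs, 1000)]
print(len(bodies))
```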

Delete Batch

Objects can be deleted in the Add Batch command, but they can also be deleted en masse with the following REST command:

DELETE /{application}/{shard}

where {application} is the application name and {shard} is the shard from which objects are to be deleted. The command must include an input entity that only contains the _table and _ID of each object to be deleted. Example:

<batch>
	<docs>
		<doc _table="Message">
			<field name="_ID">92XJeDwQ8lD3/RS4yM5gTg==</field>
		</doc>
		<doc _table="Message">
			<field name="_ID">95lfrCiljbiKOQK9UH7LYg==</field>
		</doc>
		<doc _table="Person">
			<field name="_ID">x3OKbjCmKw47wEHaqV0nLQ==</field>
		</doc>
		...
	</docs>
</batch>

In JSON:

{"batch": {
	"docs": [
		{"doc": {
			"_table": "Message",
			"_ID": "92XJeDwQ8lD3/RS4yM5gTg=="
		}},
		{"doc": {
			"_table": "Message",
			"_ID": "95lfrCiljbiKOQK9UH7LYg=="
		}},
		{"doc": {
			"_table": "Person",
			"_ID": "x3OKbjCmKw47wEHaqV0nLQ=="
		}},
		...
	]
}}

Only the _table and _ID of each doc element are required. If any other fields are assigned values, they are ignored. As with the Add Batch command, the delete batch is stored but not processed until the corresponding shard is merged.
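A client might prepare a Delete Batch request as sketched below. The request is built but not sent. The base URL, the application name Email, the shard name, and the function name delete_batch_request are hypothetical, and the Content-Type value is an assumption about how a JSON entity is declared.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1123"  # hypothetical host and port


def delete_batch_request(application, shard, objects):
    """Build a DELETE /{application}/{shard} request for the given objects."""
    # Keep only _table and _ID; other fields would be ignored anyway.
    docs = [{"doc": {"_table": o["_table"], "_ID": o["_ID"]}} for o in objects]
    body = json.dumps({"batch": {"docs": docs}}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/" + application + "/" + shard,
        data=body,
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )


req = delete_batch_request("Email", "2015-02-19", [
    {"_table": "Message", "_ID": "92XJeDwQ8lD3/RS4yM5gTg==", "Size": "10334"},
    {"_table": "Person", "_ID": "x3OKbjCmKw47wEHaqV0nLQ=="},
])
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it to a running server.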
