Lovefield 101: Fundamentals

Lovefield is a relational database written in JavaScript and can be executed on browsers. Relational database is a huge topic and it's not my intention to lecture what it is. For those who do not understand what relational database is and how to use it, Stanford University has generously offered their course materials online:

CS145: Data Management and Data Systems

In this document, it is assumed the readers know the fundamentals of relational databases.

Limitations of Lovefield

Technical limitations

Different database systems are designed for different usages. Lovefield is constrained by available browser heaps and therefore one can assume that Lovefield can operate on less than 2GB of memory (when the browser process will become extremely slow). A safer bet is the database size should be around or below 500MB.

During initialization, Lovefield loads everything from persistent store into memory because there is not a reliable memory swap provided by HTML5/Javascript and the only reliable lock is JavaScript's single-main-thread policy.

Knowing these two technical constraints are crucial for the usage of Lovefield library.

Artificial limitations

All naming in Lovefield must pass the following JavaScript regex check:

/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)

Names that violates this check will result in exceptions.

Schema and builder pattern

Lovefield uses builder pattern to create schema. As a result, every database starts with requesting a schema builder:

const builder = lf.schema.create(
    /* database name */ 'HR',
    /* database version */ 1
);

All database instances are identified by the combination of origin and database name. The version is used to determine whether to trigger a database upgrade or not since Lovefield assumes the schema is the same if the version number of a given database does not change.

One can use the builder to define table schema:

builder.createTable('Asset')
    .addColumn('id', lf.Type.STRING)
    .addColumn('asset', lf.Type.STRING)
    .addColumn('owner', lf.Type.STRING)
    .addColumn('acquisitionDate', lf.Type.DATE_TIME)
    .addColumn('tagId', lf.Type.STRING)
    .addPrimaryKey(['id'])
    .addNullable(['owner'])
    .addUnique(['tagId']);

Column types

A table schema contains column(s) and/or index(es) and constraint(s). A table contains at least one column, which is added using addColumn() API. Each column must associate with a data type:

Type	Default value	Nullable by default	Description
`Type.ARRAY_BUFFER`	`null`	Yes	JavaScript `ArrayBuffer` object
`Type.BOOLEAN`	`false`	No	JavaScript `boolean`
`Type.DATE_TIME`	`Date(0)`	No	JavaScript `Date` object
`Type.INTEGER`	`0`	No	JavaScript `number` assuming to be integer
`Type.NUMBER`	`0`	No	JavaScript `number` assuming to be decimals
`Type.STRING`	`''`	No	JavaScript `string`
`Type.OBJECT`	`null`	Yes	JavaScript `Object` stored as-is

Nullable means a column can be null. In Lovefield, JavaScript null is used to represent NULL value in RDBMS.

You are not supposed to pass undefined or null for not null columns. Doing so will cause undefined behavior and/or errors.

Index and constraints

Lovefield supports the following constraints

Primary key (also serves as index)
Unique key (also serves as index)
Foreign key (indirectly serves as index)
Nullable / Not-nullable

Primary key, unique key, and indices can be added to a table via addPrimaryKey(), addUnique(), addIndex() accordingly. They all accept an array of column names, which means these columns in combination forms the index. For example:

builder.createTable('Asset')
    ...
    .addPrimaryKey(['id']);   // column |id| is the *only* key.

builder.createTable('AssetTag')
    ...
    // column pair (|tagId|, |assetId|) forms the primary key.
    .addPrimaryKey(['tagId', 'assetId']);

addNullable() also accepts an array of column names, but it means every column in this list can be nullable. They do not form indices.

We'll talk more about use of indices in later section.

Connection and database instances

Lovefield is designed to persist data within the browser client. By default, IndexedDB is used as Lovefield's data store. Why do we need Lovefield when IndexedDB is supposed to work by itself? That's another story worth a separate ~~rant~~ article about it, and we'll leave the spicy contents there.

Each database instance (i.e. persisted data within the browser client) is identified by the tuple of (origin, database name, database version). The instance is accessed via the connect() function.

If there were no instances found in the browser, Lovefield will create a new instance for you. If there were matching instance found, Lovefield will load that instance into memory so that you can resume using it from the last session.

If you do not wish to persist your database within the browser storage, you can instruct Lovefield to operate in memory only mode:

const db = await builder.connect({
  storeType: lf.DataStore.MEMORY
});

If you wish all database connections in your session to be in-memory only, you can also use the options API to do so:

lf.options.set({memoryOnly: true});

Note: in node.js, only in-memory is supported and is already enforced in the node.js module offered.

When connect() is called, it will return a connection object that can be used to run queries. At this moment the schema set in builder can no longer change.

Transaction and queries

A query in relational database means a request to manipulate data or schema, it can be insert/delete/update data, or retrieval of data (which is usually named as select). All queries in Lovefield is executed in the context of a transaction. A transaction is an atomic unit of execution: all queries inside a transaction either succeed together (called commit), or fail together (called rollback). If the transaction failed, data remained in their original state untouched.

Insert and update

All queries starts with inserting data into the database. To insert data, one needs to create rows first:

const item = db.getSchema().table('Item');  // get the |item| table schema

// Use table schema to create row
const row = item.createRow({
  id: 1234,
  description: 'a todo item',
  deadline: new Date(),
  done: false
});

// These two APIs are the few Lovefield synchronous API, they return concrete
// objects, not promises.

Once the rows are created, you can use one of the insert() queries to store them into database:

// This writes the |row| into the |item| table.
// If there were an existing row with the same primary key, it will raise
// an exception.
await db.insert().into(item).values([row]).exec();

// This writes the |row| into the |item| table.
// If there were an existing row with the same primary key, it will replace
// that one with this |row|.
await db.insertOrReplace().into(item).values([row]).exec();

// Queries are all promise-based APIs. So either handle the promise, or use
// await syntax.

You can also just update some fields of an existing row without the need to provide a full row:

// Change the done field to true for todo item 1234.
await db.update(item).set(item.done, true).where(item.id.eq(1234)).exec();

// TypeScript users: use col() to avoid annoying typing errors.
const id = item.col('id');
const done = item.col('done');
await db.update(item).set(done, true).where(id.eq(1234)).exec();

// We'll just use col() for the rest of the examples.

Search conditions

The where() function accepts predicates, also known as search conditions, which are conditions for Lovefield to match the rows. To avoid free text parsing and offer better security, Lovefield uses functions to define the search conditions:

Function	Name	Example	SQL Equivalent
`eq`	equals	`column.eq(value)`	`column = value`
`neq`	not equals	`column.neq(value)`	`column <> value`
`lt`	less than	`column.lt(value)`	`column < value`
`lte`	less than or equals to	`column.lte(value)`	`column <= value`
`gt`	greater than	`column.gt(value)`	`column > value`
`gte`	greater than or equals to	`column.gte(value)`	`column >= value`
`match`	regex match	`column.match(/\w*/)`	See Note below
`between`	between	`column.between(100, 200)`	`column BETWEEN 100 AND 200`
`in`	in the array	`column.in([1, 2, 3])`	`column IN (1, 2, 3)`
`isNull`	is null	`column.isNull()`	`column IS NULL`
`isNotNull`	is not null	`column.isNotNull()`	`column IS NOT NULL`

Note: match is similar to SQL SIMILAR, however, due to technical limitations, we can only offer JavaScript regex search instead. Please also note that match will force a full table scan, which means performance penalty.

Delete

You can of course delete the rows that you do not want, for example:

// Remove all done items from the database.
await delete().from(item).where(item.col('done').eq(true)).exec();

Once committed, the data is gone forever and there's no way to get it back.

You might wonder that previously we mentioned

All queries in Lovefield is executed in the context of a transaction.

And where is that transaction? When you call the exec() function, it creates an implicit transaction to run that single query. When the promise is resolved, the implicit transaction is committed.

Select

Select queries are the most used queries in a relational database. Most use cases will have at least 10x more select queries than all other queries combined. Select queries retrieve data from the database.

Lovefield provides the following features for select query:

Filtering via search conditions
Sorting
Paging
Grouping
Aggregation
Join

Filter just the fields needed (a.k.a. Projection)

You can filter the select results by specifying the fields needed instead of getting back the full row. For example:

// List the description of done items.
const descriptions = await db
    .select(item.col('description'))
    .from(item)
    .where(item.col('done').eq(true))
    .exec();

descriptions.forEach(desc => {
  // Returned value are object arrays, each object will have only one property
  // named 'description'.
  console.log(desc.description);
});

// Rename the returned object property to 'd'
const d2 = await db
    .select(item.col('description').as('d'))
    .from(item)
    .where(item.col('done').eq(true))
    .exec();

// Flatten returned value into a string array.
const res = d2.map(d => d['d']);

Lovefield always return an array of objects as the result of select() query, and by default the property names are the column names. The as() function offered in column schema object will allow renaming the returned property name into something else. This is especially useful for join, grouping, and aggregations in later sections.

Multiple search conditions

Search conditions is briefly mentioned previously. In select query, you can also use where() to specify the search conditions:

import * as lf from './node_modules/lovefield-ts/dist/es6/lf';

// List all to-do items.
const res = await db
    .select()
    .from(item)
    .where(item.col('done').eq(false))
    .exec();

// List all to-do items due today.
const start = new Date().setHours(0, 0, 0, 0);
const end = new Date().setHours(23, 59, 59, 999);
const res = await db
    .select()
    .from(item)
    .where(lf.op.and(
      item.col('done').eq(false),
      item.col('deadline').between(start, end)
    ))
    .exec();

There can be only one where() call in query, and thus to combine search conditions, you'll need one of the combining predicates to help:

Function	Example	SQL equivalent
`op.and`	`op.and(sc1, sc2)`	`sc1 AND sc2`
	`op.and(sc1, sc2, sc3)`	`sc1 AND sc2 AND sc3`
`op.or`	`op.or(sc1, sc2)`	`sc1 OR sc2`
	`op.or(sc1, sc2, sc3)`	`sc1 OR sc2 OR sc3`
`op.not`	`op.not(sc)`	`NOT sc`

Note: op.and() and op.or() accept two or more arguments. op.not() accepts only one argument.

Sorting

Lovefield provides sorting for select() query results. For example:

// Return TODO items sorted by their due date.
const res = await db
    .select()
    .from(item)
    .where(item.col('done').eq(false))
    .orderBy(item.col('deadline'))  // default to ascending, i.e. smaller first.
    .exec();

It's also possible to do advanced sorting, for example:

import * as lf from './node_modules/lovefield-ts/dist/es6/lf';

// Return TODO items sort by their due date first,
// then reverse alphabetical order of description.
const res = await db
    .select()
    .from(item)
    .where(item.col('done').eq(false))
    // specifically called out sorting orders for better code maintainability.
    .orderBy(item.col('deadline'), lf.Order.ASC)
    .orderBy(item.col('description'), lf.Order.DESC)
    .exec();

There can be multiple orderBy() calls within a select() query, and the final results will be sorted by the order of their appearance accordingly. One can specify the sorting order by using the constants provided:

Constant	Meaning
`Order.ASC`	ascending order, meaning smaller first, e.g. 1-9 A-Z
`Order.DESC`	descending order, meaning larger first, e.g. 9-1, Z-A

Paging

Paging of select() results are achieved by combining the limit() and skip() functions together:

const res = db
    .select()
    .from(item)
    .where(item.col('done').eq(false))
    .limit(10)  // return at most 10 items
    .skip(20)  // starting from the 21th elements if possible
    .exec();

This is very similar to SQL's LIMIT and SKIP syntax. Users to be aware of the transactions in play:

async function getPage(p) {
  return db
      .select()
      .from(item)
      .orderBy(item.col('deadline'))
      .limit(10)
      .skip(p * 10)
      .exec();
}

async function addData(r) {
  return db.insertOrReplace().into(item).values([r]).exec();
}

const res1 = await getPage(0);
await addData(r);
const res2 = await getPage(0);  // res2 may not be the same as res1!

If you changed the data in between select() calls, the results are picked at the snapshot of the time that select() call is invoked. If you want to keep a consistent snapshot, you'll need to use explicit transactions that will be detailed in Lovefield 102.

Aggregation

A lot of time we are only interested in the aggregated results of target rows, for example, number of rows, or a maximum value of a column in the target rows. Lovefield provides the following aggregation functions for you to use:

Function	SQL Equivalent	What is it?
`fn.avg(order.col('amount'))`	`AVG(order.amount)`	Arithmetic mean of all values in this column (`integer` and `number` types only)
`fn.count(order.col('id'))`	`COUNT(order.id)`	Number of rows matching this column (usually used in join query)
`fn.count()`	`COUNT(*)`	Number of rows of the associated table
`fn.distinct(order.col('type'))`	`DISTINCT order.type`	All unique values of this column
`fn.geomean(order.col('amount'))`	None	Geomean of all values in this column (`integer` and `number` types only)
`fn.max(order.col('amount'))`	`MAX order.amount`	Maximum value of this column, applicable to `integer`, `number`, `string`, and `datetime`
`fn.min(order.col('amount'))`	`MIN order.amount`	Minimum value of this column, applicable to `integer`, `number`, `string`, and `datetime`
`fn.stddev(order.col('amount'))`	`STDDEV order.amount`	Standard deviation of this column (`integer` and `number` types only)
`fn.sum(order.col('amount'))`	`SUM order.amount`	Summation of all values in this column (`integer` and `number` types only)

Here's a quick example showing its usage:

import * as lf from './node_modules/lovefield-ts/dist/es6/lf';

const order = db.getSchema().table('order');
const amt = order.col('amount');
const res = await db
    .select(
      lf.fn.count().as('# of orders'),
      lf.fn.avg(amt).as('average amount')
    )
    .from(order)
    .exec();

// res will be returned as an array with only one element like this:
// {
//   "# of orders": 345,
//   "average amount": 12345.67
// }

Grouping

With aggregators, sometimes we need separate groups to make the number even more meaningful. For example, we usually wanted to know number of orders and average amount per day or per customer. In this case, we can use grouping:

import * as lf from './node_modules/lovefield-ts/dist/es6/lf';

const order = db.getSchema().table('order');
const amt = order.col('amount');
const res = await db
    .select(
      order.col('customerId').as('cid'),
      lf.fn.count().as('cnt'),
      lf.fn.avg(amt).as('avg')
    )
    .from(order)
    .groupBy(order.col('customerId'))
    .orderBy(order.col('customerId'))
    .exec();

// res will be an array like this
// [
//   {cid: 1, cnt: 2, avg: 300.25},
//   {cid: 2, cnt: 4, avg: 172.25}
// ]

There are two important limitations of groupBy() that you need to keep in mind:

You can only groupBy() columns of type integer, number, string, datetime, and boolean (though groupBy boolean is NOT suggested)
You can only project the groupBy() column alongside the aggregated columns. Any other columns will cause Lovefield to throw an exception. For example:

import * as lf from './node_modules/lovefield-ts/dist/es6/lf';

// This will throw an error.
const res = await db
    .select(
      order.col('customerId').as('cid'),
      order.col('clerkId').as('agent'),  // ERROR!
      lf.fn.count().as('cnt'),
      lf.fn.avg(amt).as('avg')
    )
    .from(order)
    .groupBy(order.col('customerId'))

Parameterized query

Parameterized query are very common for RDBMS programming, and Lovefield supports it. The concept is to bind some variables inside query at runtime. For example:

import * as lf from './node_modules/lovefield-ts/dist/es6/lf';

// SQL equivalent:
// SELECT * FROM order where customerId = ?0 AND date BETWEEN ?1 AND ?2
//
// There is a runtime cost to create a query object and its execution plans.
const query = db
    .select()
    .from(order)
    .where(lf.op.and(
      order.col('customerId').eq(lf.bind(0)),
      order.col('date').between(lf.bind(1), lf.bind(2))
    ))
    .limit(lf.bind(3))
    .skip(lf.bind(4));

// We can use the same query object binding to different variables to avoid
// recreating execution plan. This is especially helpful to execute queries
// inside loops.
const res1 = await query
    .bind([
      /* 0 */ customer1,
      /* 1 */ dateStart1,
      /* 2 */ dateStart2,
      /* 3 */ 10,
      /* 4 */ 0])
    .exec();

The bind() function can also bind to a whole array, for example:

import * as lf from './node_modules/lovefield-ts/dist/es6/lf';

// The values expects an array.
const query = db.insert.into(items).values(lf.bind(0));
query.bind([[row]]).exec();  // Therefore we gave it an array as element 0.

So where exactly can I use bind()? The table below summarize it for you.

Query type	Parameterizable places
all	search conditions
`select()`	`limit()`, `skip()`
`insert()`, `insertOrReplace()`	`values()`
`update()`	`set()`

We have covered most of the fundamentals of using Lovefield so far. Lovefield actually offers more features than just these, which we will reveal them in the next session: Lovefield 102.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lf101.md

lf101.md

Lovefield 101: Fundamentals

Limitations of Lovefield

Technical limitations

Artificial limitations

Schema and builder pattern

Column types

Index and constraints

Connection and database instances

Transaction and queries

Insert and update

Search conditions

Delete

Select

Filter just the fields needed (a.k.a. Projection)

Multiple search conditions

Sorting

Paging

Aggregation

Grouping

Parameterized query

Files

lf101.md

Latest commit

History

lf101.md

File metadata and controls

Lovefield 101: Fundamentals

Limitations of Lovefield

Technical limitations

Artificial limitations

Schema and builder pattern

Column types

Index and constraints

Connection and database instances

Transaction and queries

Insert and update

Search conditions

Delete

Select

Filter just the fields needed (a.k.a. Projection)

Multiple search conditions

Sorting

Paging

Aggregation

Grouping

Parameterized query