Lovefield is a relational database written in JavaScript and can be executed on browsers. Relational database is a huge topic and it's not my intention to lecture what it is. For those who do not understand what relational database is and how to use it, Stanford University has generously offered their course materials online:
CS145: Data Management and Data Systems
In this document, it is assumed the readers know the fundamentals of relational databases.
Different database systems are designed for different usages. Lovefield is constrained by available browser heaps and therefore one can assume that Lovefield can operate on less than 2GB of memory (when the browser process will become extremely slow). A safer bet is the database size should be around or below 500MB.
During initialization, Lovefield loads everything from persistent store into memory because there is not a reliable memory swap provided by HTML5/Javascript and the only reliable lock is JavaScript's single-main-thread policy.
Knowing these two technical constraints are crucial for the usage of Lovefield library.
All naming in Lovefield must pass the following JavaScript regex check:
/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)
Names that violates this check will result in exceptions.
Lovefield uses builder pattern to create schema. As a result, every database starts with requesting a schema builder:
const builder = lf.schema.create(
/* database name */ 'HR',
/* database version */ 1
);
All database instances are identified by the combination of origin and database name. The version is used to determine whether to trigger a database upgrade or not since Lovefield assumes the schema is the same if the version number of a given database does not change.
One can use the builder to define table schema:
builder.createTable('Asset')
.addColumn('id', lf.Type.STRING)
.addColumn('asset', lf.Type.STRING)
.addColumn('owner', lf.Type.STRING)
.addColumn('acquisitionDate', lf.Type.DATE_TIME)
.addColumn('tagId', lf.Type.STRING)
.addPrimaryKey(['id'])
.addNullable(['owner'])
.addUnique(['tagId']);
A table schema contains column(s) and/or index(es) and constraint(s). A
table contains at least one column, which is added using addColumn()
API. Each
column must associate with a data type:
Type | Default value | Nullable by default | Description |
---|---|---|---|
Type.ARRAY_BUFFER |
null |
Yes | JavaScript ArrayBuffer object |
Type.BOOLEAN |
false |
No | JavaScript boolean |
Type.DATE_TIME |
Date(0) |
No | JavaScript Date object |
Type.INTEGER |
0 |
No | JavaScript number assuming to be integer |
Type.NUMBER |
0 |
No | JavaScript number assuming to be decimals |
Type.STRING |
'' |
No | JavaScript string |
Type.OBJECT |
null |
Yes | JavaScript Object stored as-is |
Nullable means a column can be null
. In Lovefield, JavaScript null
is
used to represent NULL
value in RDBMS.
You are not supposed to pass
undefined
ornull
for not null columns. Doing so will cause undefined behavior and/or errors.
Lovefield supports the following constraints
- Primary key (also serves as index)
- Unique key (also serves as index)
- Foreign key (indirectly serves as index)
- Nullable / Not-nullable
Primary key, unique key, and indices can be added to a table via
addPrimaryKey()
, addUnique()
, addIndex()
accordingly. They all accept
an array of column names, which means these columns in combination forms the
index. For example:
builder.createTable('Asset')
...
.addPrimaryKey(['id']); // column |id| is the *only* key.
builder.createTable('AssetTag')
...
// column pair (|tagId|, |assetId|) forms the primary key.
.addPrimaryKey(['tagId', 'assetId']);
addNullable()
also accepts an array of column names, but it means every column
in this list can be nullable. They do not form indices.
We'll talk more about use of indices in later section.
Lovefield is designed to persist data within the browser client. By default,
IndexedDB is used as Lovefield's data store. Why do we need Lovefield when
IndexedDB is supposed to work by itself? That's another story worth a separate
rant article about it, and we'll leave the spicy contents there.
Each database instance (i.e. persisted data within the browser client) is
identified by the tuple of (origin, database name, database version)
. The
instance is accessed via the connect()
function.
If there were no instances found in the browser, Lovefield will create a new instance for you. If there were matching instance found, Lovefield will load that instance into memory so that you can resume using it from the last session.
If you do not wish to persist your database within the browser storage, you can instruct Lovefield to operate in memory only mode:
const db = await builder.connect({
storeType: lf.DataStore.MEMORY
});
If you wish all database connections in your session to be in-memory only, you can also use the options API to do so:
lf.options.set({memoryOnly: true});
Note: in node.js, only in-memory is supported and is already enforced in the node.js module offered.
When connect()
is called, it will return a connection object that can be used
to run queries. At this moment the schema set in builder can no longer
change.
A query in relational database means a request to manipulate data or schema, it can be insert/delete/update data, or retrieval of data (which is usually named as select). All queries in Lovefield is executed in the context of a transaction. A transaction is an atomic unit of execution: all queries inside a transaction either succeed together (called commit), or fail together (called rollback). If the transaction failed, data remained in their original state untouched.
All queries starts with inserting data into the database. To insert data, one needs to create rows first:
const item = db.getSchema().table('Item'); // get the |item| table schema
// Use table schema to create row
const row = item.createRow({
id: 1234,
description: 'a todo item',
deadline: new Date(),
done: false
});
// These two APIs are the few Lovefield synchronous API, they return concrete
// objects, not promises.
Once the rows are created, you can use one of the insert()
queries to store
them into database:
// This writes the |row| into the |item| table.
// If there were an existing row with the same primary key, it will raise
// an exception.
await db.insert().into(item).values([row]).exec();
// This writes the |row| into the |item| table.
// If there were an existing row with the same primary key, it will replace
// that one with this |row|.
await db.insertOrReplace().into(item).values([row]).exec();
// Queries are all promise-based APIs. So either handle the promise, or use
// await syntax.
You can also just update some fields of an existing row without the need to provide a full row:
// Change the done field to true for todo item 1234.
await db.update(item).set(item.done, true).where(item.id.eq(1234)).exec();
// TypeScript users: use col() to avoid annoying typing errors.
const id = item.col('id');
const done = item.col('done');
await db.update(item).set(done, true).where(id.eq(1234)).exec();
// We'll just use col() for the rest of the examples.
The where()
function accepts predicates, also known as search
conditions, which are conditions for Lovefield to match the rows. To avoid
free text parsing and offer better security, Lovefield uses functions to
define the search conditions:
Function | Name | Example | SQL Equivalent |
---|---|---|---|
eq |
equals | column.eq(value) |
column = value |
neq |
not equals | column.neq(value) |
column <> value |
lt |
less than | column.lt(value) |
column < value |
lte |
less than or equals to | column.lte(value) |
column <= value |
gt |
greater than | column.gt(value) |
column > value |
gte |
greater than or equals to | column.gte(value) |
column >= value |
match |
regex match | column.match(/\w*/) |
See Note below |
between |
between | column.between(100, 200) |
column BETWEEN 100 AND 200 |
in |
in the array | column.in([1, 2, 3]) |
column IN (1, 2, 3) |
isNull |
is null | column.isNull() |
column IS NULL |
isNotNull |
is not null | column.isNotNull() |
column IS NOT NULL |
Note:
match
is similar to SQLSIMILAR
, however, due to technical limitations, we can only offer JavaScript regex search instead. Please also note thatmatch
will force a full table scan, which means performance penalty.
You can of course delete the rows that you do not want, for example:
// Remove all done items from the database.
await delete().from(item).where(item.col('done').eq(true)).exec();
Once committed, the data is gone forever and there's no way to get it back.
You might wonder that previously we mentioned
All queries in Lovefield is executed in the context of a transaction.
And where is that transaction? When you call the exec()
function, it creates
an implicit transaction to run that single query. When the promise is
resolved, the implicit transaction is committed.
Select queries are the most used queries in a relational database. Most use cases will have at least 10x more select queries than all other queries combined. Select queries retrieve data from the database.
Lovefield provides the following features for select query:
- Filtering via search conditions
- Sorting
- Paging
- Grouping
- Aggregation
- Join
You can filter the select results by specifying the fields needed instead of getting back the full row. For example:
// List the description of done items.
const descriptions = await db
.select(item.col('description'))
.from(item)
.where(item.col('done').eq(true))
.exec();
descriptions.forEach(desc => {
// Returned value are object arrays, each object will have only one property
// named 'description'.
console.log(desc.description);
});
// Rename the returned object property to 'd'
const d2 = await db
.select(item.col('description').as('d'))
.from(item)
.where(item.col('done').eq(true))
.exec();
// Flatten returned value into a string array.
const res = d2.map(d => d['d']);
Lovefield always return an array of objects as the result of select()
query, and by default the property names are the column names. The as()
function offered in column schema object will allow renaming the returned
property name into something else. This is especially useful for join, grouping,
and aggregations in later sections.
Search conditions is briefly mentioned previously. In
select query, you can also use where()
to specify the search conditions:
import * as lf from './node_modules/lovefield-ts/dist/es6/lf';
// List all to-do items.
const res = await db
.select()
.from(item)
.where(item.col('done').eq(false))
.exec();
// List all to-do items due today.
const start = new Date().setHours(0, 0, 0, 0);
const end = new Date().setHours(23, 59, 59, 999);
const res = await db
.select()
.from(item)
.where(lf.op.and(
item.col('done').eq(false),
item.col('deadline').between(start, end)
))
.exec();
There can be only one where()
call in query, and thus to combine search
conditions, you'll need one of the combining predicates to help:
Function | Example | SQL equivalent |
---|---|---|
op.and |
op.and(sc1, sc2) |
sc1 AND sc2 |
op.and(sc1, sc2, sc3) |
sc1 AND sc2 AND sc3 |
|
op.or |
op.or(sc1, sc2) |
sc1 OR sc2 |
op.or(sc1, sc2, sc3) |
sc1 OR sc2 OR sc3 |
|
op.not |
op.not(sc) |
NOT sc |
Note:
op.and()
andop.or()
accept two or more arguments.op.not()
accepts only one argument.
Lovefield provides sorting for select()
query results. For example:
// Return TODO items sorted by their due date.
const res = await db
.select()
.from(item)
.where(item.col('done').eq(false))
.orderBy(item.col('deadline')) // default to ascending, i.e. smaller first.
.exec();
It's also possible to do advanced sorting, for example:
import * as lf from './node_modules/lovefield-ts/dist/es6/lf';
// Return TODO items sort by their due date first,
// then reverse alphabetical order of description.
const res = await db
.select()
.from(item)
.where(item.col('done').eq(false))
// specifically called out sorting orders for better code maintainability.
.orderBy(item.col('deadline'), lf.Order.ASC)
.orderBy(item.col('description'), lf.Order.DESC)
.exec();
There can be multiple orderBy()
calls within a select()
query, and the final
results will be sorted by the order of their appearance accordingly. One can
specify the sorting order by using the constants provided:
Constant | Meaning |
---|---|
Order.ASC |
ascending order, meaning smaller first, e.g. 1-9 A-Z |
Order.DESC |
descending order, meaning larger first, e.g. 9-1, Z-A |
Paging of select()
results are achieved by combining the limit()
and
skip()
functions together:
const res = db
.select()
.from(item)
.where(item.col('done').eq(false))
.limit(10) // return at most 10 items
.skip(20) // starting from the 21th elements if possible
.exec();
This is very similar to SQL's LIMIT
and SKIP
syntax. Users to be aware of
the transactions in play:
async function getPage(p) {
return db
.select()
.from(item)
.orderBy(item.col('deadline'))
.limit(10)
.skip(p * 10)
.exec();
}
async function addData(r) {
return db.insertOrReplace().into(item).values([r]).exec();
}
const res1 = await getPage(0);
await addData(r);
const res2 = await getPage(0); // res2 may not be the same as res1!
If you changed the data in between select()
calls, the results are picked at
the snapshot of the time that select()
call is invoked. If you want to keep
a consistent snapshot, you'll need to use explicit transactions that will be
detailed in Lovefield 102.
A lot of time we are only interested in the aggregated results of target rows, for example, number of rows, or a maximum value of a column in the target rows. Lovefield provides the following aggregation functions for you to use:
Function | SQL Equivalent | What is it? |
---|---|---|
fn.avg(order.col('amount')) |
AVG(order.amount) |
Arithmetic mean of all values in this column (integer and number types only) |
fn.count(order.col('id')) |
COUNT(order.id) |
Number of rows matching this column (usually used in join query) |
fn.count() |
COUNT(*) |
Number of rows of the associated table |
fn.distinct(order.col('type')) |
DISTINCT order.type |
All unique values of this column |
fn.geomean(order.col('amount')) |
None | Geomean of all values in this column (integer and number types only) |
fn.max(order.col('amount')) |
MAX order.amount |
Maximum value of this column, applicable to integer , number , string , and datetime |
fn.min(order.col('amount')) |
MIN order.amount |
Minimum value of this column, applicable to integer , number , string , and datetime |
fn.stddev(order.col('amount')) |
STDDEV order.amount |
Standard deviation of this column (integer and number types only) |
fn.sum(order.col('amount')) |
SUM order.amount |
Summation of all values in this column (integer and number types only) |
Here's a quick example showing its usage:
import * as lf from './node_modules/lovefield-ts/dist/es6/lf';
const order = db.getSchema().table('order');
const amt = order.col('amount');
const res = await db
.select(
lf.fn.count().as('# of orders'),
lf.fn.avg(amt).as('average amount')
)
.from(order)
.exec();
// res will be returned as an array with only one element like this:
// {
// "# of orders": 345,
// "average amount": 12345.67
// }
With aggregators, sometimes we need separate groups to make the number even more meaningful. For example, we usually wanted to know number of orders and average amount per day or per customer. In this case, we can use grouping:
import * as lf from './node_modules/lovefield-ts/dist/es6/lf';
const order = db.getSchema().table('order');
const amt = order.col('amount');
const res = await db
.select(
order.col('customerId').as('cid'),
lf.fn.count().as('cnt'),
lf.fn.avg(amt).as('avg')
)
.from(order)
.groupBy(order.col('customerId'))
.orderBy(order.col('customerId'))
.exec();
// res will be an array like this
// [
// {cid: 1, cnt: 2, avg: 300.25},
// {cid: 2, cnt: 4, avg: 172.25}
// ]
There are two important limitations of groupBy()
that you need to keep in
mind:
- You can only
groupBy()
columns of typeinteger
,number
,string
,datetime
, andboolean
(though groupBy boolean is NOT suggested) - You can only project the
groupBy()
column alongside the aggregated columns. Any other columns will cause Lovefield to throw an exception. For example:
import * as lf from './node_modules/lovefield-ts/dist/es6/lf';
// This will throw an error.
const res = await db
.select(
order.col('customerId').as('cid'),
order.col('clerkId').as('agent'), // ERROR!
lf.fn.count().as('cnt'),
lf.fn.avg(amt).as('avg')
)
.from(order)
.groupBy(order.col('customerId'))
Parameterized query are very common for RDBMS programming, and Lovefield supports it. The concept is to bind some variables inside query at runtime. For example:
import * as lf from './node_modules/lovefield-ts/dist/es6/lf';
// SQL equivalent:
// SELECT * FROM order where customerId = ?0 AND date BETWEEN ?1 AND ?2
//
// There is a runtime cost to create a query object and its execution plans.
const query = db
.select()
.from(order)
.where(lf.op.and(
order.col('customerId').eq(lf.bind(0)),
order.col('date').between(lf.bind(1), lf.bind(2))
))
.limit(lf.bind(3))
.skip(lf.bind(4));
// We can use the same query object binding to different variables to avoid
// recreating execution plan. This is especially helpful to execute queries
// inside loops.
const res1 = await query
.bind([
/* 0 */ customer1,
/* 1 */ dateStart1,
/* 2 */ dateStart2,
/* 3 */ 10,
/* 4 */ 0])
.exec();
The bind()
function can also bind to a whole array, for example:
import * as lf from './node_modules/lovefield-ts/dist/es6/lf';
// The values expects an array.
const query = db.insert.into(items).values(lf.bind(0));
query.bind([[row]]).exec(); // Therefore we gave it an array as element 0.
So where exactly can I use bind()
? The table below summarize it for you.
Query type | Parameterizable places |
---|---|
all | search conditions |
select() |
limit() , skip() |
insert() , insertOrReplace() |
values() |
update() |
set() |
We have covered most of the fundamentals of using Lovefield so far. Lovefield actually offers more features than just these, which we will reveal them in the next session: Lovefield 102.