This repository has been archived by the owner on Mar 12, 2024. It is now read-only.

Time Series

Austin Lee edited this page Mar 6, 2021 · 7 revisions

Time Series data

ArmorWriters provide APIs for writing data in a multi-tenant, time series format. This is achieved through a path hierarchy. For example, both of the provided write stores, File and S3, lay out data in this manner:

/root/tenant/table/interval/intervalStart/
/root/tenant1/tv_table/weekly/2020-09-21T00:00:00Z/....
/root/tenant1/tv_table/weekly/2020-09-28T00:00:00Z/....
/root/tenant1/tv_table/weekly/2020-10-05T00:00:00Z/....
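
The layout above can be sketched as a small path-building helper. This is only an illustration of the hierarchy, not Armor's actual code: the class ArmorPathSketch and the method intervalPath are hypothetical names, and the lowercase interval folder name is taken from the example paths.

```java
import java.time.Instant;

final class ArmorPathSketch {
    // Hypothetical helper, not part of the Armor API: joins the hierarchy
    // root/tenant/table/interval/intervalStart into one path string.
    static String intervalPath(String root, String tenant, String table,
                               String interval, Instant intervalStart) {
        // Instant.toString() emits ISO-8601 in UTC, e.g. 2020-09-21T00:00:00Z
        return String.join("/", root, tenant, table, interval, intervalStart.toString()) + "/";
    }

    public static void main(String[] args) {
        // prints /root/tenant1/tv_table/weekly/2020-09-21T00:00:00Z/
        System.out.println(intervalPath("/root", "tenant1", "tv_table", "weekly",
                Instant.parse("2020-09-21T00:00:00Z")));
    }
}
```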

Let's break down the time series pieces in detail.

Interval

All data for a table is automatically grouped by interval. The interval defines how a table may be broken up by time. For example, weekly intervals span 7 days, so if a write request specifies a weekly interval, the data is written under the weekly path.

armorWriter.write(..., Interval.WEEKLY, ...);

If you want to support more intervals for the same entity data, repeat the call with a different interval parameter, such as Interval.SINGLE or Interval.MONTHLY. The predefined intervals are listed below.

Interval   Description
SINGLE     Single interval. You can also think of this as "none" or "current"; it holds the latest version of each entity.
HOURLY     Hourly interval
DAILY      Daily interval
WEEKLY     Weekly interval
MONTHLY    Monthly interval
YEARLY     Yearly interval

You can also define a custom interval that is based on a given range of minutes. More on this below.
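
As a rough illustration of what a minutes-based interval could look like, the sketch below floors a timestamp to the start of its N-minute bucket. It is not Armor's implementation: the class and method names are hypothetical, and the choice to align buckets to the Unix epoch is an assumption made for this example.

```java
import java.time.Duration;
import java.time.Instant;

final class CustomIntervalSketch {
    // Hypothetical helper: returns the start of the N-minute bucket containing
    // the timestamp. Assumption: buckets are aligned to the Unix epoch.
    static Instant bucketStart(Instant timestamp, long minutes) {
        if (minutes <= 0) {
            throw new IllegalArgumentException("minutes must be positive");
        }
        long widthSeconds = Duration.ofMinutes(minutes).getSeconds();
        // floorDiv keeps the result correct for timestamps before 1970 as well.
        long start = Math.floorDiv(timestamp.getEpochSecond(), widthSeconds) * widthSeconds;
        return Instant.ofEpochSecond(start);
    }

    public static void main(String[] args) {
        // prints 2021-02-02T10:45:00Z
        System.out.println(bucketStart(Instant.parse("2021-02-02T10:47:30Z"), 15));
    }
}
```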

NOTE:

  • These predefined intervals are calendar-based, so they account for leap years, months of different lengths, etc.
  • Weekly intervals start on the first business day, which is Monday

Interval Starts

Within each interval there is an interval start, which defines when that interval begins. Under an interval you should therefore see one or more interval start folders as you write. The interval start is stored in UTC in ISO-8601 format (2020-09-21T00:00:00Z). Just like Interval, you must pass a value for the interval start. However, unlike Interval, the value doesn't have to be the exact interval start time. It can instead be the exact timestamp of the event triggering the write. For example, you can do this:

armorWriter.write(..., Interval.WEEKLY, Instant.now(), ....);
armorWriter.write(..., Interval.WEEKLY, Instant.parse("2021-02-02T00:00:00Z"), ....);

In this case, the first write goes to the current week, while the second goes to the week that contains 2021-02-02, which starts on Monday 2021-02-01. Note that Instant.parse expects a full ISO-8601 instant. The caller isn't required to calculate the interval start value; it is calculated for you.
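
To make the calculation concrete, here is a minimal sketch of how an event timestamp could be mapped to a calendar-based interval start in UTC, with weeks starting on Monday. It uses only java.time and is not Armor's own code; the class IntervalStartSketch and its methods are hypothetical names.

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

final class IntervalStartSketch {
    // Hypothetical helper: midnight UTC of the Monday on or before the timestamp.
    static Instant weeklyStart(Instant timestamp) {
        return timestamp.atZone(ZoneOffset.UTC)
                .truncatedTo(ChronoUnit.DAYS)
                .with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY))
                .toInstant();
    }

    // Hypothetical helper: midnight UTC on the first day of the timestamp's month.
    static Instant monthlyStart(Instant timestamp) {
        return timestamp.atZone(ZoneOffset.UTC)
                .truncatedTo(ChronoUnit.DAYS)
                .withDayOfMonth(1)
                .toInstant();
    }

    public static void main(String[] args) {
        Instant event = Instant.parse("2021-02-02T15:30:00Z"); // a Tuesday
        System.out.println(weeklyStart(event));  // prints 2021-02-01T00:00:00Z
        System.out.println(monthlyStart(event)); // prints 2021-02-01T00:00:00Z
    }
}
```

Any timestamp inside the same week yields the same weekly start, which is why callers can pass the raw event time.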

Table

Going back to the table: keep in mind that a table is a collection of intervals tied together, so there is no separate table_a_monthly or table_a_weekly. However, even though the table appears as one, the intervals underneath it are all stored individually.

Performance

Outside of writing and building the entities, performance is determined by the underlying storage technology. Currently the only stores are S3 and the file system, so performance is tied to how fast S3 can store data or to the kind of disks the file system store uses.

Timeseries storage patterns

Event based

Snapshot point in time

Data retention

Reading intervals

Currently, only Presto lets you see and query the interval and interval start values. For other APIs, you can probably use the Store interfaces to list or filter based on interval and interval start.
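
For the file system layout specifically, filtering by interval start can be approximated with plain java.nio, without going through Armor's Store interfaces (whose methods are not shown here). The sketch below is an assumption-laden illustration: the class and method names are hypothetical, and it assumes interval start folders are named with ISO-8601 instants as in the paths shown earlier.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class IntervalStartListingSketch {
    // Keeps names that parse as ISO-8601 instants within [from, to), sorted ascending.
    static List<Instant> startsInRange(List<String> names, Instant from, Instant to) {
        List<Instant> result = new ArrayList<>();
        for (String name : names) {
            try {
                Instant start = Instant.parse(name);
                if (!start.isBefore(from) && start.isBefore(to)) {
                    result.add(start);
                }
            } catch (DateTimeParseException e) {
                // Not an interval start folder; skip it.
            }
        }
        Collections.sort(result);
        return result;
    }

    // Lists the sub-folders of an interval directory, e.g. /root/tenant1/tv_table/weekly,
    // and returns the interval starts that fall in the requested range.
    static List<Instant> listStarts(Path intervalDir, Instant from, Instant to) throws IOException {
        try (Stream<Path> children = Files.list(intervalDir)) {
            List<String> names = children
                    .filter(Files::isDirectory)
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toList());
            return startsInRange(names, from, to);
        }
    }
}
```

An S3-backed store would need the equivalent prefix listing through an S3 client instead of Files.list.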