This repository has been archived by the owner on Mar 12, 2024. It is now read-only.

Armor Performance Metrics

Austin Lee edited this page Jul 28, 2021 · 6 revisions

Write Performance Tests

Our first set of write benchmarks focuses on burst writing at different levels of lock granularity.

Goal: write a large number of rows in the shortest possible time.

  • Tables were initially empty in all tests.
  • The input is 840K JSON messages stored on S3.
  • Each message is converted into an Armor entity object of 1 or more rows of 3 columns.
  • Writes are done in batches of at most 1K messages.
  • Ingestion runs on 5 r5a.4xlarge EC2 instances, each with 16 threads allocated for ingestion.
  • The target table is a 51-million-row table of 3 string columns, all of similar length (approximately 20 bytes).
  • Queues are defined in DynamoDB.

| Test number | Description | Number of messages | Rows | Duration |
| --- | --- | --- | --- | --- |
| 1 | Table lock | 840,032 | 50,630,411 | 3 hours 50 minutes |
| 2 | Shard lock | 841,332 | 51,026,184 | 30 minutes |
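
To make the two runs easier to compare, the durations can be converted into row throughput. The sketch below only does arithmetic on the figures in the results table; the function names are ours and are not part of Armor. It works out to roughly 3,700 rows per second under the table lock and roughly 28,300 rows per second under the shard lock, a speedup of about 7.7x.

```python
# Figures copied from the results table; nothing here is measured by this script.
RESULTS = {
    "table lock": {"messages": 840_032, "rows": 50_630_411, "seconds": 3 * 3600 + 50 * 60},
    "shard lock": {"messages": 841_332, "rows": 51_026_184, "seconds": 30 * 60},
}


def rows_per_second(run):
    """Average row throughput over the whole run."""
    return run["rows"] / run["seconds"]


def rows_per_message(run):
    """Average number of rows produced by one JSON message."""
    return run["rows"] / run["messages"]


def speedup(baseline, candidate):
    """Ratio of row throughput, candidate over baseline."""
    return rows_per_second(candidate) / rows_per_second(baseline)


for name, run in RESULTS.items():
    print(f"{name}: {rows_per_second(run):,.0f} rows/s, "
          f"{rows_per_message(run):.1f} rows/message")
print(f"speedup: {speedup(RESULTS['table lock'], RESULTS['shard lock']):.1f}x")
```

Both runs average about 60 rows per message, so the difference in duration comes from the locking strategy rather than from the input.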

Test 1 (Lock on a table)

Our first test took a lock on the whole table. The lock was held externally, in Redis. Once the lock was acquired, each run downloaded 1K messages and triggered the write. This effectively allowed only 1 instance to process the table at a time. However, because that instance held the lock, all 16 of its threads were used to write to the table.

Duration: 3 hours 50 minutes
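
The effect of a table-level lock can be illustrated with a small sketch. It is not the code used in the test: `InMemoryLockStore` is a hypothetical in-process stand-in for Redis, and its `acquire` method only imitates "set the key if it does not exist" semantics. A real setup would also need an expiry on the lock so that a crashed instance does not hold it forever.

```python
import threading


class InMemoryLockStore:
    """Hypothetical stand-in for an external lock service such as Redis."""

    def __init__(self):
        self._owners = {}
        self._guard = threading.Lock()

    def acquire(self, key, owner):
        # Imitates "set if not exists": succeeds only when nobody holds the key.
        with self._guard:
            if key in self._owners:
                return False
            self._owners[key] = owner
            return True

    def release(self, key, owner):
        # Only the current owner may release the lock.
        with self._guard:
            if self._owners.get(key) == owner:
                del self._owners[key]
                return True
            return False


def instances_allowed_to_write(store, table, instances):
    """Return the instances that managed to take the table-level lock."""
    return [name for name in instances if store.acquire(table, name)]


store = InMemoryLockStore()
instances = [f"instance-{i}" for i in range(5)]
winners = instances_allowed_to_write(store, "target_tablename", instances)
print(winners)  # a single instance holds the table lock
```

With one lock per table, four of the five instances have nothing to do until the lock is released, which matches the behaviour described for Test 1.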

Test 2 (Lock on a shard)

We altered the test to group and lock at the shard level. Consider a 16-shard table whose shards are labeled with letters from the range A-Z. Our group/lock strategy used a key that combines table and shard, e.g. table_A, table_B, etc. Unlike Test 1, where we could download at most 1K messages at a time, the download rate rose to a potential maximum of 16K messages (16 shards at 1K each). With this new approach, building the table became much faster. Also, with the shard approach the load was spread evenly across all instances for greater efficiency.

Duration: 30 minutes
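
The shard-level key scheme and the 16K figure can be sketched as follows. The key format follows the table_A, table_B examples in the text; the consecutive-letter labels and the round-robin assignment are assumptions made for illustration only. In the actual test, instances would compete for locks rather than being assigned shards up front.

```python
import string

BATCH_SIZE = 1_000    # messages downloaded per lock holder, from the text
SHARD_COUNT = 16      # shards in the example table, from the text
INSTANCE_COUNT = 5    # ingestion instances, from the test setup


def shard_labels(count):
    # Assumption: shards are labeled with consecutive uppercase letters.
    return list(string.ascii_uppercase[:count])


def lock_key(table, shard):
    # Key format follows the examples in the text: table_A, table_B, ...
    return f"{table}_{shard}"


def assign_round_robin(keys, instance_count):
    """Hypothetical even spread of shard locks over instances."""
    assignment = {i: [] for i in range(instance_count)}
    for position, key in enumerate(keys):
        assignment[position % instance_count].append(key)
    return assignment


keys = [lock_key("table", shard) for shard in shard_labels(SHARD_COUNT)]
assignment = assign_round_robin(keys, INSTANCE_COUNT)
max_in_flight = len(keys) * BATCH_SIZE

print(keys[:2])                                          # ['table_A', 'table_B']
print({i: len(held) for i, held in assignment.items()})  # locks held per instance
print(max_in_flight)                                     # 16000
```

Because there are 16 independent locks instead of one, every instance can hold at least 3 of them, so all 5 instances stay busy.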

Real world multi-tenant ingestion

These metrics measure how a steady stream of real-world data flowing across many tenants performs.

WIP.

Read performance metrics

1-billion-row table query read test

For the tests below, we hooked Armor up to Presto and ran queries to measure both individual query performance and performance under concurrent queries. The results follow.

SELECT avg(compliance)
FROM (
    SELECT assetid, (passing / total) AS compliance
    FROM (
        SELECT assetid,
               SUM(CASE result WHEN 'P' THEN 1.0 WHEN 'F' THEN 0.0 ELSE 0.0 END) AS passing,
               count(assetid) AS total
        FROM armor."org111222".target_tablename
        GROUP BY assetid
    )
)

| Concurrent queries | Query time (seconds) |
| --- | --- |
| 1 | 7 |
| 5 | 30 |
| 10 | 55 |
| 20 | 105 |

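
The page does not say how the concurrent runs were driven. One way to take this kind of measurement is sketched below, assuming that each concurrency level starts all queries at once and records how long each one takes. `fake_query` is a placeholder that just sleeps; a real harness would submit the SQL to Presto at that point.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(run_query, concurrency):
    """Start `concurrency` copies of run_query at once; return per-query seconds."""

    def timed(_):
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed, range(concurrency)))


def fake_query():
    # Placeholder: a real harness would send the query to Presto here.
    time.sleep(0.01)


for level in (1, 5, 10, 20):
    durations = run_concurrently(fake_query, level)
    print(level, f"slowest query: {max(durations):.3f}s")
```

Whether to report the slowest query, the average, or the total wall-clock time is a design choice; the sketch prints the slowest one.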
SELECT avg(compliance)
FROM (
    SELECT policyId, ruleId, (passing / total) AS compliance
    FROM (
        SELECT policyId, ruleId,
               SUM(CASE result WHEN 'P' THEN 1.0 WHEN 'F' THEN 0.0 ELSE 0.0 END) AS passing,
               count(*) AS total
        FROM armor."org111222".target_tablename
        GROUP BY policyId, ruleId
    )
)

| Concurrent queries | Query time (seconds) |
| --- | --- |
| 1 | 9 |
| 5 | 40 |
| 10 | 77 |
| 20 | 156 |
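
To read the two sets of timings side by side, it helps to normalise them. The sketch below only does arithmetic on the numbers reported above; the helper names are ours. It suggests that query time grows roughly in proportion to the number of concurrent queries: at 20 concurrent queries the first query takes 15x as long as a single run and the second about 17.3x.

```python
# Measurements copied from the two result tables: concurrency -> seconds.
QUERY_1 = {1: 7, 5: 30, 10: 55, 20: 105}   # grouped by assetid
QUERY_2 = {1: 9, 5: 40, 10: 77, 20: 156}   # grouped by policyId, ruleId


def seconds_per_query(timings):
    """Reported query time divided by the number of concurrent queries."""
    return {n: t / n for n, t in timings.items()}


def slowdown(timings):
    """Query time at each concurrency level relative to a single query."""
    base = timings[1]
    return {n: t / base for n, t in timings.items()}


for name, timings in (("query 1", QUERY_1), ("query 2", QUERY_2)):
    per_query = {n: round(v, 2) for n, v in seconds_per_query(timings).items()}
    relative = {n: round(v, 2) for n, v in slowdown(timings).items()}
    print(name, "seconds per query:", per_query, "slowdown:", relative)
```

The per-query cost falls slightly as concurrency rises (from 7 to 5.25 seconds for the first query), so the system loses little efficiency under load but does not run the queries in parallel for free either.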