Filter Expression support #440

Open
wants to merge 46 commits into
base: develop-3.4
46 commits
45f6fe5
Extend perf_SUITE
martinsumner Apr 12, 2024
4950700
Switch to re2 library
martinsumner Apr 12, 2024
be6cdc7
Load chunk in spawned processes
martinsumner Apr 15, 2024
60a4f78
Correctly account for pause
martinsumner Apr 15, 2024
ebc79f4
Merge branch 'mas-d31-i433perfSUITE' into mas-d31-i433re2
martinsumner Apr 15, 2024
3b98077
Support both re2 and pcre
martinsumner Apr 16, 2024
c4f428f
Add microstate accounting to profile
martinsumner Apr 18, 2024
56b2e1f
Add memory tracking during test phases
martinsumner Apr 18, 2024
31f2aa4
Merge branch 'mas-d31-i433perfSUITE' into mas-d31-i433re2
martinsumner Apr 18, 2024
611b8ac
Use macros instead (#437)
ThomasArts Apr 23, 2024
ce5db70
Don't print memory to screen in standard ct test
martinsumner Apr 23, 2024
8fe29eb
Merge branch 'mas-d31-i433perfSUITE' into mas-d31-i433re2
martinsumner Apr 23, 2024
287e8be
Initial support for capture in regex
martinsumner Apr 23, 2024
f54d75c
Create leveled_filter.erl
martinsumner Apr 24, 2024
4a4a7ca
Make binary comparisons in leveled_filter
martinsumner Apr 24, 2024
2b61a13
Add Filter Expression support
martinsumner Apr 27, 2024
bb7c889
Initial lexer/parser for eval pipeline
martinsumner Apr 29, 2024
80ce3c7
Update and extend the eval expression
martinsumner Apr 30, 2024
1563a7c
Extend testing
martinsumner Apr 30, 2024
59c556a
Add regex eval function to pipeline
martinsumner May 1, 2024
2680bf2
Remove re2
martinsumner May 17, 2024
2f13237
Modify filter
martinsumner May 20, 2024
1e22841
First version of QuickCheck properties for setop lang and filter lang
ThomasArts May 16, 2024
852e06a
Update filterlang eqc property
ThomasArts May 21, 2024
6a24913
Updated quickcheck for filter language
ThomasArts May 22, 2024
bb8f33d
improvements to implementation
ThomasArts May 22, 2024
a1ae05c
Also improve eval lexer
ThomasArts May 22, 2024
90c8d7b
Be more specific re integer types
martinsumner May 22, 2024
fcd195f
Basic unicode testing
martinsumner May 22, 2024
31b476a
Expand unicode test
martinsumner May 22, 2024
0b7df92
small shrink improvement for filterlang generator
ThomasArts May 23, 2024
2753de8
fix lexer to get rid of '<>'
ThomasArts May 23, 2024
13a1760
Add eqc property for evallang
ThomasArts May 23, 2024
1b320c6
runtime type errors in eval
ThomasArts May 23, 2024
9353f11
Update src/leveled_evallexer.xrl
martinsumner May 23, 2024
afee352
Update src/leveled_eval.erl
martinsumner May 23, 2024
787c424
Merge remote-tracking branch 'quviq/mas-d31-mas.i1433-filterexpressio…
martinsumner May 23, 2024
9984943
Revert "runtime type errors in eval"
martinsumner May 23, 2024
ac75fa9
Non-empty strings and type handling
martinsumner May 24, 2024
bf69dc5
Add support for combination queries on same snapshot point
martinsumner May 7, 2024
9c7ff44
Setop parser to use only set operation names
martinsumner May 24, 2024
044537a
Proposed changes to handle indexed sets
ThomasArts May 27, 2024
3afa892
quickcheck properties for setop language
ThomasArts May 27, 2024
c7dd6e7
Update setop to use maps as input
martinsumner May 28, 2024
1f97828
Merge branch 'develop-3.4' into mas-d31-mas.i1433-filterexpression
martinsumner Sep 27, 2024
8075de5
Remove duplications following merge
martinsumner Sep 27, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -9,3 +9,6 @@ cover
cover_*
.eqc-info
leveled_data/*
compile_commands.json
*parser.erl
*lexer.erl
2 changes: 1 addition & 1 deletion README.md
@@ -86,4 +86,4 @@ To have rebar3 execute the full set of tests, run:

For those with a Quickcheck license, property-based tests can also be run using:

```./rebar3 as eqc do eunit --module=leveled_simpleeqc, eunit --module=leveled_statemeqc```
```./rebar3 as eqc do eunit```
7 changes: 7 additions & 0 deletions include/leveled.hrl
@@ -70,6 +70,13 @@
%% Inker key type used for tombstones
%%%============================================================================

%%%============================================================================
%%% Test
%%%============================================================================

-define(EQC_TIME_BUDGET, 120).

%%%============================================================================

%%%============================================================================
%%% Shared records
6 changes: 4 additions & 2 deletions rebar.config
@@ -2,11 +2,13 @@

{xref_checks,
[undefined_function_calls,undefined_functions,
locals_not_used,
deprecated_function_calls, deprecated_functions]}.

{cover_excl_mods,
[testutil,
[leveled_filterlexer, leveled_filterparser,
leveled_evallexer, leveled_evalparser,
leveled_setoplexer, leveled_setopparser,
testutil,
appdefined_SUITE, basic_SUITE, iterator_SUITE,
perf_SUITE, recovery_SUITE, riak_SUITE, tictac_SUITE]}.

250 changes: 161 additions & 89 deletions src/leveled_bookie.erl
@@ -82,6 +82,7 @@
-export([
book_returnfolder/2,
book_indexfold/5,
book_multiindexfold/5,
book_bucketlist/4,
book_keylist/3,
book_keylist/4,
@@ -684,7 +685,7 @@ book_returnfolder(Pid, RunnerType) ->
Constraint:: {Bucket, StartKey},
FoldAccT :: {FoldFun, Acc},
Range :: {IndexField, Start, End},
TermHandling :: {ReturnTerms, TermRegex}) ->
TermHandling :: {ReturnTerms, TermExpression}) ->
{async, Runner::fun(() -> term())}
when Bucket::term(),
Key :: term(),
@@ -696,8 +697,8 @@ book_returnfolder(Pid, RunnerType) ->
IndexVal::term(),
Start::IndexVal,
End::IndexVal,
ReturnTerms::boolean(),
TermRegex :: leveled_codec:regular_expression().
ReturnTerms::boolean()|binary(),
TermExpression :: leveled_codec:term_expression().

book_indexfold(Pid, Constraint, FoldAccT, Range, TermHandling)
when is_tuple(Constraint) ->
@@ -713,6 +714,23 @@ book_indexfold(Pid, Bucket, FoldAccT, Range, TermHandling) ->
leveled_log:log(b0019, [Bucket]),
book_indexfold(Pid, {Bucket, null}, FoldAccT, Range, TermHandling).

-type query()
:: {binary(), binary(), binary(), leveled_codec:term_expression()}.
-type combo_fun()
:: fun((list(sets:set(leveled_codec:key())))
-> sets:set(leveled_codec:key())).

-spec book_multiindexfold(
pid(),
leveled_codec:key(),
fun((leveled_codec:key(), leveled_codec:key(), term()) -> term()),
list({non_neg_integer(), query()}),
combo_fun())
-> {async, fun(() -> term())}.
book_multiindexfold(Pid, Bucket, FoldAccT, Queries, ComboFun) ->
RunnerType =
{multi_index_query, Bucket, FoldAccT, Queries, ComboFun},
book_returnfolder(Pid, RunnerType).

%% @doc list buckets. Folds over the ledger only. Given a `Tag' folds
%% over the keyspace calling `FoldFun' from `FoldAccT' for each
@@ -815,7 +833,7 @@ book_keylist(Pid, Tag, Bucket, KeyRange, FoldAccT) ->
StartKey :: Key,
EndKey :: Key,
Key :: term(),
TermRegex :: leveled_codec:regular_expression(),
TermRegex :: leveled_codec:term_expression(),
Runner :: fun(() -> Acc).
book_keylist(Pid, Tag, Bucket, KeyRange, FoldAccT, TermRegex) ->
RunnerType = {keylist, Tag, Bucket, KeyRange, FoldAccT, TermRegex},
@@ -1956,22 +1974,50 @@ snaptype_by_presence(false) ->
%% Get an {async, Runner} for a given fold type. Fold types have different
%% tuple inputs
get_runner(State, {index_query, Constraint, FoldAccT, Range, TermHandling}) ->
{IdxFld, StartT, EndT} = Range,
{Bucket, ObjKey0} =
case Constraint of
{B, SK} ->
{B, SK};
B ->
{B, null}
end,
StartKey =
leveled_codec:to_ledgerkey(Bucket, ObjKey0, ?IDX_TAG, IdxFld, StartT),
EndKey =
leveled_codec:to_ledgerkey(Bucket, null, ?IDX_TAG, IdxFld, EndT),
{StartKey, EndKey} = index_range(Constraint, Range),
SnapFun = return_snapfun(State, ledger, {StartKey, EndKey}, false, false),
leveled_runner:index_query(SnapFun,
{StartKey, EndKey, TermHandling},
FoldAccT);
leveled_runner:index_query(
SnapFun, {StartKey, EndKey, TermHandling}, FoldAccT);
get_runner(
State,
{multi_index_query, Bucket, FoldAccT, Queries, ComboFun}) ->
{FoldFun, InitAcc} = FoldAccT,
KeyFolder = fun(_B, K, Acc) -> [K|Acc] end,
QueryRunners =
lists:map(
fun({SetId, {IdxFld, StartTerm, EndTerm, Expr}}) ->
{SK, EK} =
index_range(
{Bucket, null}, {IdxFld, StartTerm, EndTerm}),
SnapFun =
return_snapfun(State, ledger, {SK, EK}, false, true),
{async, Runner} =
leveled_runner:index_query(
SnapFun, {SK, EK, {false, Expr}}, {KeyFolder, []}
),
{SetId, Runner}
end,
Queries
),
OverallRunner =
fun() ->
FinalSet =
ComboFun(
maps:from_list(
lists:map(
fun({SetId, R}) ->
{SetId, sets:from_list(R())}
end,
QueryRunners)
)
),
lists:foldl(
fun(K, Acc) -> FoldFun(Bucket, K, Acc) end,
InitAcc,
sets:to_list(FinalSet)
)
end,
{async, OverallRunner};
get_runner(State, {keylist, Tag, FoldAccT}) ->
SnapFun = return_snapfun(State, ledger, no_lookup, true, true),
leveled_runner:bucketkey_query(SnapFun, Tag, null, FoldAccT);
@@ -1980,91 +2026,102 @@ get_runner(State, {keylist, Tag, Bucket, FoldAccT}) ->
leveled_runner:bucketkey_query(SnapFun, Tag, Bucket, FoldAccT);
get_runner(State, {keylist, Tag, Bucket, KeyRange, FoldAccT, TermRegex}) ->
SnapFun = return_snapfun(State, ledger, no_lookup, true, true),
leveled_runner:bucketkey_query(SnapFun,
Tag, Bucket, KeyRange,
FoldAccT, TermRegex);
leveled_runner:bucketkey_query(
SnapFun, Tag, Bucket, KeyRange, FoldAccT, TermRegex);
%% Set of runners for object or metadata folds
get_runner(State,
{foldheads_allkeys,
Tag, FoldFun,
JournalCheck, SnapPreFold, SegmentList,
LastModRange, MaxObjectCount}) ->
get_runner(
State,
{foldheads_allkeys,
Tag, FoldFun,
JournalCheck, SnapPreFold, SegmentList,
LastModRange, MaxObjectCount}) ->
SnapType = snaptype_by_presence(JournalCheck),
SnapFun = return_snapfun(State, SnapType, no_lookup, true, SnapPreFold),
leveled_runner:foldheads_allkeys(SnapFun,
Tag, FoldFun,
JournalCheck, SegmentList,
LastModRange, MaxObjectCount);
get_runner(State,
{foldobjects_allkeys, Tag, FoldFun, SnapPreFold}) ->
get_runner(State,
{foldobjects_allkeys, Tag, FoldFun, SnapPreFold, key_order});
get_runner(State,
{foldobjects_allkeys, Tag, FoldFun, SnapPreFold, key_order}) ->
SnapFun = return_snapfun(State, store, no_lookup, true, SnapPreFold),
leveled_runner:foldobjects_allkeys(SnapFun, Tag, FoldFun, key_order);
get_runner(State,
{foldobjects_allkeys, Tag, FoldFun, SnapPreFold, sqn_order}) ->
SnapFun = return_snapfun(State, store, undefined, true, SnapPreFold),
leveled_runner:foldobjects_allkeys(SnapFun, Tag, FoldFun, sqn_order);
get_runner(State,
{foldheads_bybucket,
Tag,
BucketList, bucket_list,
FoldFun,
JournalCheck, SnapPreFold,
SegmentList, LastModRange, MaxObjectCount}) ->
leveled_runner:foldheads_allkeys(
SnapFun,
Tag,
FoldFun,
JournalCheck,
SegmentList,
LastModRange,
MaxObjectCount);
get_runner(State, {foldobjects_allkeys, Tag, FoldFun, SnapPreFold}) ->
get_runner(
State, {foldobjects_allkeys, Tag, FoldFun, SnapPreFold, key_order});
get_runner(State, {foldobjects_allkeys, Tag, FoldFun, SnapPreFold, Order}) ->
case Order of
key_order ->
SnapFun =
return_snapfun(State, store, no_lookup, true, SnapPreFold),
leveled_runner:foldobjects_allkeys(
SnapFun, Tag, FoldFun, key_order);
sqn_order ->
SnapFun =
return_snapfun(State, store, undefined, true, SnapPreFold),
leveled_runner:foldobjects_allkeys(
SnapFun, Tag, FoldFun, sqn_order)
end;
get_runner(
State,
{foldheads_bybucket,
Tag,
BucketList, bucket_list,
FoldFun,
JournalCheck, SnapPreFold,
SegmentList, LastModRange, MaxObjectCount}) ->
KeyRangeFun =
fun(Bucket) ->
{StartKey, EndKey, _} = return_ledger_keyrange(Tag, Bucket, all),
{StartKey, EndKey}
end,
SnapType = snaptype_by_presence(JournalCheck),
SnapFun = return_snapfun(State, SnapType, no_lookup, true, SnapPreFold),
leveled_runner:foldheads_bybucket(SnapFun,
Tag,
lists:map(KeyRangeFun, BucketList),
FoldFun,
JournalCheck,
SegmentList,
LastModRange, MaxObjectCount);
get_runner(State,
{foldheads_bybucket,
Tag,
Bucket, KeyRange,
FoldFun,
JournalCheck, SnapPreFold,
SegmentList, LastModRange, MaxObjectCount}) ->
leveled_runner:foldheads_bybucket(
SnapFun,
Tag,
lists:map(KeyRangeFun, BucketList),
FoldFun,
JournalCheck,
SegmentList,
LastModRange, MaxObjectCount);
get_runner(
State,
{foldheads_bybucket,
Tag,
Bucket, KeyRange,
FoldFun,
JournalCheck, SnapPreFold,
SegmentList, LastModRange, MaxObjectCount}) ->
{StartKey, EndKey, SnapQ} = return_ledger_keyrange(Tag, Bucket, KeyRange),
SnapType = snaptype_by_presence(JournalCheck),
SnapFun = return_snapfun(State, SnapType, SnapQ, true, SnapPreFold),
leveled_runner:foldheads_bybucket(SnapFun,
Tag,
[{StartKey, EndKey}],
FoldFun,
JournalCheck,
SegmentList,
LastModRange, MaxObjectCount);
get_runner(State,
{foldobjects_bybucket,
Tag, Bucket, KeyRange,
FoldFun,
SnapPreFold}) ->
leveled_runner:foldheads_bybucket(
SnapFun,
Tag,
[{StartKey, EndKey}],
FoldFun,
JournalCheck,
SegmentList,
LastModRange, MaxObjectCount);
get_runner(
State,
{foldobjects_bybucket,
Tag, Bucket, KeyRange,
FoldFun,
SnapPreFold}) ->
{StartKey, EndKey, SnapQ} = return_ledger_keyrange(Tag, Bucket, KeyRange),
SnapFun = return_snapfun(State, store, SnapQ, true, SnapPreFold),
leveled_runner:foldobjects_bybucket(SnapFun,
Tag,
[{StartKey, EndKey}],
FoldFun);
get_runner(State,
{foldobjects_byindex,
Tag, Bucket, {Field, FromTerm, ToTerm},
FoldObjectsFun,
SnapPreFold}) ->
leveled_runner:foldobjects_bybucket(
SnapFun, Tag, [{StartKey, EndKey}], FoldFun);
get_runner(
State,
{foldobjects_byindex,
Tag, Bucket, {Field, FromTerm, ToTerm},
FoldObjectsFun,
SnapPreFold}) ->
SnapFun = return_snapfun(State, store, no_lookup, true, SnapPreFold),
leveled_runner:foldobjects_byindex(SnapFun,
{Tag, Bucket, Field, FromTerm, ToTerm},
FoldObjectsFun);
leveled_runner:foldobjects_byindex(
SnapFun, {Tag, Bucket, Field, FromTerm, ToTerm}, FoldObjectsFun);
get_runner(State, {bucket_list, Tag, FoldAccT}) ->
{FoldBucketsFun, Acc} = FoldAccT,
SnapFun = return_snapfun(State, ledger, no_lookup, false, false),
@@ -2078,6 +2135,21 @@ get_runner(State, DeprecatedQuery) ->
get_deprecatedrunner(State, DeprecatedQuery).


index_range(Constraint, Range) ->
{IdxFld, StartT, EndT} = Range,
{Bucket, ObjKey0} =
case Constraint of
{B, SK} ->
{B, SK};
B ->
{B, null}
end,
StartKey =
leveled_codec:to_ledgerkey(Bucket, ObjKey0, ?IDX_TAG, IdxFld, StartT),
EndKey =
leveled_codec:to_ledgerkey(Bucket, null, ?IDX_TAG, IdxFld, EndT),
{StartKey, EndKey}.

-spec get_deprecatedrunner(book_state(), tuple()) ->
{async, fun(() -> term())}.
%% @doc
@@ -2707,7 +2779,7 @@ ttl_test() ->
KeyList = IndexFolder(),
?assertMatch(20, length(KeyList)),

{ok, Regex} = re:compile("f8"),
{ok, Regex} = leveled_util:regex_compile("f8"),
{async,
IndexFolderTR} = book_returnfolder(Bookie1,
{index_query,
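As a reading aid for the `multi_index_query` runner in `src/leveled_bookie.erl`, the sketch below shows one way a `ComboFun` could be written. In this diff the runner builds its argument with `maps:from_list/1`, so the fun appears to receive a map of `SetId => sets:set()`, although the `combo_fun()` type is declared over a list of sets. The module name `combo_sketch`, the function `intersect_combo/1`, and the sample keys are hypothetical; no leveled store is started, and plain key lists stand in for the results of each index query.

```erlang
-module(combo_sketch).
-export([intersect_combo/1, run/0]).

%% Hypothetical ComboFun: keep only the keys present in every result set.
%% Assumes the map holds at least one set (sets:intersection/1 requires a
%% non-empty list).
intersect_combo(SetMap) when is_map(SetMap), map_size(SetMap) > 0 ->
    sets:intersection(maps:values(SetMap)).

run() ->
    %% Stand-ins for the key lists that each per-query runner would return,
    %% keyed by the SetId given in the Queries list.
    SetMap =
        #{1 => sets:from_list([<<"K1">>, <<"K2">>, <<"K3">>]),
          2 => sets:from_list([<<"K2">>, <<"K3">>, <<"K4">>])},
    FinalSet = intersect_combo(SetMap),
    %% Mirror the final step of the runner: FoldFun(Bucket, Key, Acc) is
    %% applied to every key in the combined set.
    FoldFun = fun(_Bucket, K, Acc) -> [K | Acc] end,
    Keys =
        lists:foldl(
            fun(K, Acc) -> FoldFun(<<"B">>, K, Acc) end,
            [],
            sets:to_list(FinalSet)),
    %% Set iteration order is not defined, so sort for a stable result.
    lists:sort(Keys).
```

Calling `combo_sketch:run()` returns the sorted keys common to both sets. A union or subtraction could be expressed the same way with `sets:union/1` or `sets:subtract/2`, provided the fun still returns a single set.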