feat: Allow `row_range` to be treated as a clause #864
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
This is equivalent, but it has the benefit of allowing default values to be defined in Python 3.6, which aren't available via the `namedtuple` factory (they were only added in Python 3.7).
See: https://docs.python.org/3/library/typing.html#typing.NamedTuple
See: https://docs.python.org/3/library/collections.html#collections.namedtuple
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
This is suboptimal, but for now slicing on columns makes the composition with other clauses (like FilterClause) impossible. Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
class PythonRowRangeClause(NamedTuple):
    row_range_type: _RowRangeType = None
    n: int = None
    start: int = None
    end: int = None
Note to reviewers: using the `NamedTuple` class allows specifying default values, so any of the following forms can be used:
PythonRowRangeClause(row_range_type=_RowRangeType.HEAD, n=n)
PythonRowRangeClause(row_range_type=_RowRangeType.TAIL, n=n)
PythonRowRangeClause(start=start, end=end)
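For readers following along, here is a minimal, self-contained sketch of how class-style `NamedTuple` defaults make the three construction forms above possible. `RowRangeType` and `RowRangeClause` are hypothetical stand-ins for `_RowRangeType` and `PythonRowRangeClause`, and the `Optional[...]` annotations are an assumption added for clarity; they are not taken from the PR.

```python
from enum import Enum
from typing import NamedTuple, Optional


class RowRangeType(Enum):
    # Hypothetical stand-in for the PR's _RowRangeType.
    HEAD = 0
    TAIL = 1


class RowRangeClause(NamedTuple):
    # Hypothetical stand-in for PythonRowRangeClause.
    # Every field defaults to None, so callers only pass what they need.
    row_range_type: Optional[RowRangeType] = None
    n: Optional[int] = None
    start: Optional[int] = None
    end: Optional[int] = None


# The three forms mentioned in the review note.
head_clause = RowRangeClause(row_range_type=RowRangeType.HEAD, n=5)
tail_clause = RowRangeClause(row_range_type=RowRangeType.TAIL, n=5)
range_clause = RowRangeClause(start=10, end=20)
```

Fields that are not passed simply stay `None`, which is what lets a single class represent both the head/tail form and the explicit start/end form.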
@@ -1498,7 +1520,8 @@ def _get_read_query(self, date_range: Optional[DateRangeInput], row_range, colum
        read_query.row_filter = _normalize_dt_range(date_range)

    if row_range is not None:
        read_query.row_range = row_range
        total_n_rows = self.get_num_rows(symbol=symbol, as_of=as_of, **kwargs)
Note to reviewers: `get_description` is an alternative, but it is unreachable in this scope.
The logic in `_normalize_row_range` was in the C++ layer because methods like `get_num_rows` involve reading the index key, so with this implementation the index key will be read twice. Can we please refactor back to the state where it is only read once?
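As context for why the total row count matters here, the following is a rough sketch of turning a head/tail request into an absolute `[start, end)` row range once the total is known. The function names and the pandas-like treatment of negative `n` are assumptions for illustration only; this is not the actual `_normalize_row_range` from the codebase.

```python
from typing import Tuple


def head_to_range(n: int, total_rows: int) -> Tuple[int, int]:
    """Map head(n) to a half-open [start, end) range (hypothetical helper).

    Assumes pandas-like semantics: a negative n means
    "everything except the last |n| rows".
    """
    if n >= 0:
        return 0, min(n, total_rows)
    return 0, max(total_rows + n, 0)


def tail_to_range(n: int, total_rows: int) -> Tuple[int, int]:
    """Map tail(n) to a half-open [start, end) range (hypothetical helper).

    Assumes pandas-like semantics: a negative n means
    "everything except the first |n| rows".
    """
    if n >= 0:
        return max(total_rows - n, 0), total_rows
    return min(-n, total_rows), total_rows
```

Both helpers need `total_rows`, which is why the normalization has to happen somewhere the row count is available; doing it where the index key has already been read avoids fetching it a second time.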
I made this change to get rid of `calculate_row_filter` and `SignedRowRange` in `cpp/arcticdb/pipeline/query.hpp`.
I first tried to keep this logic on the C++ side, but unfortunately I do not know the best way to get the total number of rows as early as possible.
> it will be read twice with this implementation

How costly is this? Is there a way to get the total number of rows without this cost?
Reverted back with e2b97e9.
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz> Co-authored-by: Alex Owens <alex.owens@man.com>
Reference Issues/PRs
Fixes #747.
What does this implement/fix? How does it work (high level)? Highlight notable design decisions.
Any other comments?
Checklist
Checklist for code changes...