Support pagination in V2 engine, phase 1 #226

Merged
merged 47 commits into from
Mar 28, 2023
Changes from 42 commits
7e2dc80
Fixing integration tests broken during POC
Jan 28, 2023
d359751
Comment to clarify an exception.
Jan 31, 2023
9c3f7fe
Add support for paginated scroll request, first page.
Feb 6, 2023
1ee718b
Progress on paginated scroll request, subsequent page.
Feb 6, 2023
f3cade6
Move `ExpressionSerializer` from `opensearch` to `core`.
Yury-Fridlyand Feb 8, 2023
db342b8
Rename `Cursor` `asString` to `toString`.
Yury-Fridlyand Feb 8, 2023
c8f0935
Disable scroll cleaning.
Yury-Fridlyand Feb 8, 2023
fffc36d
Add full cursor serialization and deserialization.
Yury-Fridlyand Feb 8, 2023
d844977
Misc fixes.
Yury-Fridlyand Feb 8, 2023
333432e
Further work on pagination.
Yury-Fridlyand Feb 10, 2023
85c8825
Pagination fix for empty indices.
Yury-Fridlyand Feb 10, 2023
484a8fe
Fix error reporting on wrong cursor.
Yury-Fridlyand Feb 10, 2023
cccce53
Minor comments and error reporting improvement.
Yury-Fridlyand Feb 14, 2023
2895883
Add an end-to-end integration test.
Yury-Fridlyand Feb 15, 2023
dd6fcd6
Add `explain` request handlers.
Yury-Fridlyand Feb 16, 2023
2a19e56
Add IT for explain.
Yury-Fridlyand Feb 16, 2023
a3ef2bf
Address issues flagged by checkstyle build step (#229)
Feb 17, 2023
2d29549
Pagination, phase 1: Add unit tests for `:core` module with coverage.…
Yury-Fridlyand Mar 8, 2023
70ccfcb
Pagination, phase 1: Add unit tests for SQL module with coverage. (#239)
Yury-Fridlyand Mar 10, 2023
803f50e
Pagination, phase 1: Add unit tests for `:opensearch` module with cov…
Yury-Fridlyand Mar 10, 2023
27e1793
Fix the merges.
Yury-Fridlyand Mar 10, 2023
304616d
Fix explain.
Yury-Fridlyand Mar 10, 2023
1b5ab7e
Fix scroll cleaning.
Yury-Fridlyand Mar 10, 2023
f4ea4ad
Store `TotalHits` and use it to report `total` in response.
Yury-Fridlyand Mar 11, 2023
7f0acdd
Add missing UT for `:protocol` module.
Yury-Fridlyand Mar 11, 2023
2ce1626
Fix PPL UTs damaged in f4ea4ad8c.
Yury-Fridlyand Mar 11, 2023
b2e6e56
Minor checkstyle fixes.
Yury-Fridlyand Mar 11, 2023
c7ad219
Fallback to v1 engine for pagination (#245)
MaxKsyunz Mar 13, 2023
981bc25
Add UT with coverage for `toCursor` serialization.
Yury-Fridlyand Mar 13, 2023
960c039
Fix broken tests in `legacy`.
Yury-Fridlyand Mar 13, 2023
4f0c176
Fix getting `total` from non-paged requests and from queries without …
Yury-Fridlyand Mar 14, 2023
bdd52a0
Fix scroll cleaning.
Yury-Fridlyand Mar 15, 2023
a16332f
Fix cursor request processing.
Yury-Fridlyand Mar 15, 2023
9f9e873
Update ITs.
Yury-Fridlyand Mar 15, 2023
3340e38
Fix (again) TotalHits feature.
Yury-Fridlyand Mar 16, 2023
524f220
Fix typo in prometheus config.
Yury-Fridlyand Mar 16, 2023
281f3cd
Recover commented logging.
Yury-Fridlyand Mar 16, 2023
ca76e1b
Move `test_pagination_blackbox` to a separate class and add logging.
Yury-Fridlyand Mar 16, 2023
c8fcd4e
Address some PR feedbacks: rename some classes and revert unnecessary…
Yury-Fridlyand Mar 16, 2023
ec5fb40
Minor commenting.
Yury-Fridlyand Mar 16, 2023
4213388
Address PR comments.
Yury-Fridlyand Mar 24, 2023
ca20d16
Minor missing changes.
Yury-Fridlyand Mar 25, 2023
a58733e
Integration tests for fetch_size, max_result_window, and query.size_l…
Mar 27, 2023
75b8140
Remove `PaginatedQueryService`, extend `QueryService` to hold two pla…
Yury-Fridlyand Mar 27, 2023
ba81407
Move push down functions from request builders to a new interface.
Yury-Fridlyand Mar 28, 2023
cec012b
Some file moves.
Yury-Fridlyand Mar 28, 2023
9032b3b
Minor clean-up according to PR review.
Yury-Fridlyand Mar 28, 2023
1 change: 1 addition & 0 deletions core/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ dependencies {
testImplementation('org.junit.jupiter:junit-jupiter:5.6.2')
testImplementation group: 'org.hamcrest', name: 'hamcrest-library', version: '2.1'
testImplementation group: 'org.mockito', name: 'mockito-core', version: '3.12.4'
testImplementation group: 'org.mockito', name: 'mockito-inline', version: '3.12.4'
testImplementation group: 'org.mockito', name: 'mockito-junit-jupiter', version: '3.12.4'
}

Expand Down
8 changes: 8 additions & 0 deletions core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@
import org.opensearch.sql.ast.tree.Kmeans;
import org.opensearch.sql.ast.tree.Limit;
import org.opensearch.sql.ast.tree.ML;
import org.opensearch.sql.ast.tree.Paginate;
import org.opensearch.sql.ast.tree.Parse;
import org.opensearch.sql.ast.tree.Project;
import org.opensearch.sql.ast.tree.RareTopN;
Expand Down Expand Up @@ -85,6 +86,7 @@
import org.opensearch.sql.planner.logical.LogicalLimit;
import org.opensearch.sql.planner.logical.LogicalML;
import org.opensearch.sql.planner.logical.LogicalMLCommons;
import org.opensearch.sql.planner.logical.LogicalPaginate;
import org.opensearch.sql.planner.logical.LogicalPlan;
import org.opensearch.sql.planner.logical.LogicalProject;
import org.opensearch.sql.planner.logical.LogicalRareTopN;
Expand Down Expand Up @@ -529,6 +531,12 @@ public LogicalPlan visitML(ML node, AnalysisContext context) {
return new LogicalML(child, node.getArguments());
}

@Override
public LogicalPlan visitPaginate(Paginate paginate, AnalysisContext context) {
LogicalPlan child = paginate.getChild().get(0).accept(this, context);
return new LogicalPaginate(paginate.getPageSize(), List.of(child));
}

/**
* The first argument is always "asc", others are optional.
* Given nullFirst argument, use its value. Otherwise just use DEFAULT_ASC/DESC.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@
import org.opensearch.sql.ast.tree.Kmeans;
import org.opensearch.sql.ast.tree.Limit;
import org.opensearch.sql.ast.tree.ML;
import org.opensearch.sql.ast.tree.Paginate;
import org.opensearch.sql.ast.tree.Parse;
import org.opensearch.sql.ast.tree.Project;
import org.opensearch.sql.ast.tree.RareTopN;
Expand Down Expand Up @@ -289,4 +290,8 @@ public T visitQuery(Query node, C context) {
public T visitExplain(Explain node, C context) {
return visitStatement(node, context);
}

public T visitPaginate(Paginate paginate, C context) {
return visitChildren(paginate, context);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
public class Query extends Statement {

protected final UnresolvedPlan plan;
protected final int fetchSize;

@Override
public <R, C> R accept(AbstractNodeVisitor<R, C> visitor, C context) {
Expand Down
48 changes: 48 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/tree/Paginate.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.ast.tree;

import java.util.List;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.ToString;
import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.Node;

/**
* AST node that represents a pagination operation.
* It is a wrapper around the rest of the AST.
*/
@RequiredArgsConstructor
@EqualsAndHashCode(callSuper = false)
@ToString
public class Paginate extends UnresolvedPlan {
@Getter
private final int pageSize;
private UnresolvedPlan child;

public Paginate(int pageSize, UnresolvedPlan child) {
this.pageSize = pageSize;
this.child = child;
}

@Override
public List<? extends Node> getChild() {
return List.of(child);
}

@Override
public <T, C> T accept(AbstractNodeVisitor<T, C> nodeVisitor, C context) {
return nodeVisitor.visitPaginate(this, context);
}

@Override
public UnresolvedPlan attach(UnresolvedPlan child) {
this.child = child;
return this;
}
}
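The dispatch between `Paginate.accept`, the default `AbstractNodeVisitor.visitPaginate`, and the `Analyzer` override can be sketched in isolation. The following is a minimal illustration only: every `Sketch*` type is a hypothetical stand-in written for this example, not one of the plugin's classes, and the "logical plan" is reduced to a string.

```java
import java.util.List;

// Hypothetical, simplified stand-ins for the AST node and visitor types in the diff.
abstract class SketchNode {
  abstract List<SketchNode> children();

  <T> T accept(SketchVisitor<T> visitor) {
    return visitor.visitChildren(this);
  }
}

class SketchRelation extends SketchNode {
  final String table;

  SketchRelation(String table) {
    this.table = table;
  }

  @Override
  List<SketchNode> children() {
    return List.of();
  }

  @Override
  <T> T accept(SketchVisitor<T> visitor) {
    return visitor.visitRelation(this);
  }
}

// Wraps a child plan together with a page size, like Paginate in the diff.
class SketchPaginate extends SketchNode {
  final int pageSize;
  final SketchNode child;

  SketchPaginate(int pageSize, SketchNode child) {
    this.pageSize = pageSize;
    this.child = child;
  }

  @Override
  List<SketchNode> children() {
    return List.of(child);
  }

  @Override
  <T> T accept(SketchVisitor<T> visitor) {
    return visitor.visitPaginate(this);
  }
}

abstract class SketchVisitor<T> {
  // Default behavior: visit every child and return the last result.
  T visitChildren(SketchNode node) {
    T result = null;
    for (SketchNode child : node.children()) {
      result = child.accept(this);
    }
    return result;
  }

  T visitRelation(SketchRelation node) {
    return visitChildren(node);
  }

  T visitPaginate(SketchPaginate node) {
    return visitChildren(node);
  }
}

// Plays the role of the analyzer: turns the wrapper into a "logical" node.
class SketchAnalyzer extends SketchVisitor<String> {
  @Override
  String visitRelation(SketchRelation node) {
    return "LogicalRelation(" + node.table + ")";
  }

  @Override
  String visitPaginate(SketchPaginate node) {
    String child = node.child.accept(this);
    return "LogicalPaginate(" + node.pageSize + ", " + child + ")";
  }
}

class PaginateSketchDemo {
  static String analyze(int pageSize, String table) {
    return new SketchPaginate(pageSize, new SketchRelation(table)).accept(new SketchAnalyzer());
  }

  public static void main(String[] args) {
    System.out.println(analyze(5, "accounts"));
  }
}
```

A visitor that does not override `visitPaginate` falls through to `visitChildren`, which is why adding the new node type does not force changes in every existing visitor.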
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.exception;

/**
* Thrown by the V2 engine to signal that a cursor request should fall back to the legacy engine.
*/
public class UnsupportedCursorRequestException extends RuntimeException {
}
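The fallback scenario mentioned in the Javadoc can be pictured roughly as follows. This is a sketch under assumptions: the call site that catches the exception is not part of the diff shown here, so `FallbackSketch`, `v2Engine`, `legacyEngine`, and the exception stand-in are all hypothetical names.

```java
import java.util.Locale;
import java.util.function.Function;

// Hypothetical stand-in for UnsupportedCursorRequestException.
class UnsupportedCursorSketchException extends RuntimeException {
}

class FallbackSketch {
  // Try the new engine first; hand the query to the legacy engine if it refuses.
  static String execute(String query,
                        Function<String, String> v2,
                        Function<String, String> legacy) {
    try {
      return v2.apply(query);
    } catch (UnsupportedCursorSketchException e) {
      return legacy.apply(query);
    }
  }

  // Assumed behavior for illustration: the "V2" side accepts only `select * from ...`.
  static String v2Engine(String query) {
    if (!query.toLowerCase(Locale.ROOT).startsWith("select * from ")) {
      throw new UnsupportedCursorSketchException();
    }
    return "v2:" + query;
  }

  static String legacyEngine(String query) {
    return "legacy:" + query;
  }

  public static void main(String[] args) {
    System.out.println(
        execute("SELECT * FROM accounts", FallbackSketch::v2Engine, FallbackSketch::legacyEngine));
    System.out.println(
        execute("SELECT name FROM accounts", FallbackSketch::v2Engine, FallbackSketch::legacyEngine));
  }
}
```

Using a dedicated unchecked exception type lets the caller distinguish "this engine cannot handle the request" from genuine failures, which should not trigger a fallback.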
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.executor;

import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.Node;
import org.opensearch.sql.ast.expression.AllFields;
import org.opensearch.sql.ast.tree.Project;
import org.opensearch.sql.ast.tree.Relation;

/**
* Use this unresolved plan visitor to check whether a plan can be serialized by PaginatedPlanCache.
* If plan.accept(new CanPaginateVisitor(...)) returns true,
* then PaginatedPlanCache.convertToCursor will succeed. Otherwise, it will fail.
* The purpose of this visitor is to activate the legacy engine fallback mechanism.
* Currently, the conditions are:
* - Only a projection of a relation is supported.
* - The projection contains only * (a.k.a. allFields).
* - The relation scans only one table.
* - The table is an OpenSearch index.
* So it accepts only queries like `select * from $index`.
* See PaginatedPlanCache.canConvertToCursor for usage.
*/
public class CanPaginateVisitor extends AbstractNodeVisitor<Boolean, Object> {

@Override
public Boolean visitRelation(Relation node, Object context) {
if (!node.getChild().isEmpty()) {
// Relation instance should never have a child, but check just in case.
return Boolean.FALSE;
}

return Boolean.TRUE;
}

@Override
public Boolean visitChildren(Node node, Object context) {
return Boolean.FALSE;
}

@Override
public Boolean visitProject(Project node, Object context) {
// Allow queries with 'SELECT *' only. This restriction could be removed, but consider
// the in-memory aggregation performed by window functions (see WindowOperator), e.g.
// SELECT max(age) OVER (PARTITION BY city) ...
var projections = node.getProjectList();
if (projections.size() != 1) {
return Boolean.FALSE;
}

if (!(projections.get(0) instanceof AllFields)) {
return Boolean.FALSE;
}

var children = node.getChild();
if (children.size() != 1) {
return Boolean.FALSE;
}

return children.get(0).accept(this, context);
}
}
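The conditions checked by `CanPaginateVisitor` can be restated as a plain recursive check. Note that this sketch deliberately swaps the visitor for a `switch` over a node kind to stay short; `PlanSketch` and its factory methods are hypothetical and do not exist in the plugin.

```java
import java.util.List;

// Hypothetical, flattened plan node: a kind, an optional projection list, and children.
class PlanSketch {
  final String kind;
  final List<String> fields;
  final List<PlanSketch> children;

  PlanSketch(String kind, List<String> fields, List<PlanSketch> children) {
    this.kind = kind;
    this.fields = fields;
    this.children = children;
  }

  static PlanSketch relation() {
    return new PlanSketch("relation", List.of(), List.of());
  }

  static PlanSketch project(List<String> fields, PlanSketch child) {
    return new PlanSketch("project", fields, List.of(child));
  }

  static PlanSketch filter(PlanSketch child) {
    return new PlanSketch("filter", List.of(), List.of(child));
  }
}

class CanPaginateSketch {
  static boolean canPaginate(PlanSketch plan) {
    switch (plan.kind) {
      case "relation":
        // A relation must be a leaf.
        return plan.children.isEmpty();
      case "project":
        // Exactly one projection, it must be "*", and the single child must qualify.
        return plan.fields.size() == 1
            && plan.fields.get(0).equals("*")
            && plan.children.size() == 1
            && canPaginate(plan.children.get(0));
      default:
        // Every other node type is rejected, mirroring visitChildren returning FALSE.
        return false;
    }
  }

  public static void main(String[] args) {
    PlanSketch selectStar = PlanSketch.project(List.of("*"), PlanSketch.relation());
    PlanSketch selectName = PlanSketch.project(List.of("name"), PlanSketch.relation());
    System.out.println(canPaginate(selectStar));
    System.out.println(canPaginate(selectName));
  }
}
```

Rejecting by default means that any node type added later is excluded from pagination until someone explicitly allows it.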
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import org.opensearch.sql.common.response.ResponseListener;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.data.type.ExprType;
import org.opensearch.sql.opensearch.executor.Cursor;
import org.opensearch.sql.planner.physical.PhysicalPlan;

/**
Expand Down Expand Up @@ -53,6 +54,8 @@ void execute(PhysicalPlan plan, ExecutionContext context,
class QueryResponse {
private final Schema schema;
private final List<ExprValue> results;
private final long total;
private final Cursor cursor;
}

@Data
Expand Down
162 changes: 162 additions & 0 deletions core/src/main/java/org/opensearch/sql/executor/PaginatedPlanCache.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,162 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.executor;

import com.google.common.hash.HashCode;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import lombok.RequiredArgsConstructor;
import org.opensearch.sql.ast.tree.UnresolvedPlan;
import org.opensearch.sql.expression.NamedExpression;
import org.opensearch.sql.expression.serialization.DefaultExpressionSerializer;
import org.opensearch.sql.opensearch.executor.Cursor;
import org.opensearch.sql.planner.physical.PaginateOperator;
import org.opensearch.sql.planner.physical.PhysicalPlan;
import org.opensearch.sql.planner.physical.ProjectOperator;
import org.opensearch.sql.storage.StorageEngine;
import org.opensearch.sql.storage.TableScanOperator;

/**
* This class is the entry point for paged requests. It is responsible for cursor serialization
* and deserialization.
*/
@RequiredArgsConstructor
public class PaginatedPlanCache {
public static final String CURSOR_PREFIX = "n:";
private final StorageEngine storageEngine;

public boolean canConvertToCursor(UnresolvedPlan plan) {
return plan.accept(new CanPaginateVisitor(), null);
}

/**
* Converts a physical plan tree to a cursor. May cache plan-related data somewhere.
*/
public Cursor convertToCursor(PhysicalPlan plan) throws IOException {
if (plan instanceof PaginateOperator) {
var cursor = plan.toCursor();
if (cursor == null) {
return Cursor.None;
}
var raw = CURSOR_PREFIX + compress(cursor);
return new Cursor(raw.getBytes());
}
return Cursor.None;
}

/**
* Compress a serialized query plan.
* @param str string representing a query plan
* @return str compressed with gzip and encoded as a hexadecimal string.
*/
String compress(String str) throws IOException {
if (str == null || str.length() == 0) {
return "";
}
ByteArrayOutputStream out = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(out);
gzip.write(str.getBytes());
gzip.close();
return HashCode.fromBytes(out.toByteArray()).toString();
}

/**
* Decompresses a query plan that was compressed with {@link PaginatedPlanCache#compress}.
* @param input compressed query plan
* @return decompressed string
*/
String decompress(String input) throws IOException {
if (input == null || input.length() == 0) {
return "";
}
GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(
HashCode.fromString(input).asBytes()));
return new String(gzip.readAllBytes());
}

/**
* Parse `NamedExpression`s from cursor.
* @param listToFill List to fill with data.
* @param cursor Cursor to parse.
* @return Remaining part of the cursor.
*/
private String parseNamedExpressions(List<NamedExpression> listToFill, String cursor) {
var serializer = new DefaultExpressionSerializer();
if (cursor.startsWith(")")) { //empty list
return cursor.substring(cursor.indexOf(',') + 1);
}
while (!cursor.startsWith("(")) {
listToFill.add((NamedExpression)
serializer.deserialize(cursor.substring(0,
Math.min(cursor.indexOf(','), cursor.indexOf(')')))));
cursor = cursor.substring(cursor.indexOf(',') + 1);
}
return cursor;
}

/**
* Converts a cursor to a physical plan tree.
*/
public PhysicalPlan convertToPlan(String cursor) {
if (!cursor.startsWith(CURSOR_PREFIX)) {
throw new UnsupportedOperationException("Unsupported cursor");
}
try {
cursor = cursor.substring(CURSOR_PREFIX.length());
cursor = decompress(cursor);

// TODO Parse with ANTLR or serialize as JSON/XML
if (!cursor.startsWith("(Paginate,")) {
throw new UnsupportedOperationException("Unsupported cursor");
}
// TODO add checks for > 0
cursor = cursor.substring(cursor.indexOf(',') + 1);
final int currentPageIndex = Integer.parseInt(cursor, 0, cursor.indexOf(','), 10);

cursor = cursor.substring(cursor.indexOf(',') + 1);
final int pageSize = Integer.parseInt(cursor, 0, cursor.indexOf(','), 10);

cursor = cursor.substring(cursor.indexOf(',') + 1);
if (!cursor.startsWith("(Project,")) {
throw new UnsupportedOperationException("Unsupported cursor");
}
cursor = cursor.substring(cursor.indexOf(',') + 1);
if (!cursor.startsWith("(namedParseExpressions,")) {
throw new UnsupportedOperationException("Unsupported cursor");
}

cursor = cursor.substring(cursor.indexOf(',') + 1);
List<NamedExpression> namedParseExpressions = new ArrayList<>();
cursor = parseNamedExpressions(namedParseExpressions, cursor);

List<NamedExpression> projectList = new ArrayList<>();
if (!cursor.startsWith("(projectList,")) {
throw new UnsupportedOperationException("Unsupported cursor");
}
cursor = cursor.substring(cursor.indexOf(',') + 1);
cursor = parseNamedExpressions(projectList, cursor);

if (!cursor.startsWith("(OpenSearchPagedIndexScan,")) {
throw new UnsupportedOperationException("Unsupported cursor");
}
cursor = cursor.substring(cursor.indexOf(',') + 1);
var indexName = cursor.substring(0, cursor.indexOf(','));
cursor = cursor.substring(cursor.indexOf(',') + 1);
var scrollId = cursor.substring(0, cursor.indexOf(')'));
TableScanOperator scan = storageEngine.getTableScan(indexName, scrollId);

return new PaginateOperator(new ProjectOperator(scan, projectList, namedParseExpressions),
pageSize, currentPageIndex);
} catch (Exception e) {
throw new UnsupportedOperationException("Unsupported cursor", e);
}
}
}
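The cursor round trip in `PaginatedPlanCache` (prefix, gzip, hexadecimal text, then comma-delimited parsing) can be tried out with the JDK alone. The sketch below makes several assumptions: it replaces Guava's `HashCode` with hand-written hex encoding, pins the charset to UTF-8, wraps `IOException` in `UncheckedIOException`, and uses a sample plan string whose shape is inferred from the parser in `convertToPlan` rather than taken from real output. `CursorCodecSketch` and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CursorCodecSketch {
  static final String CURSOR_PREFIX = "n:";

  // gzip the plan text, hex-encode the bytes, and prepend the prefix.
  static String encode(String plan) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
      gzip.write(plan.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : out.toByteArray()) {
      hex.append(String.format("%02x", b & 0xff));
    }
    return CURSOR_PREFIX + hex;
  }

  // Reverse of encode: check the prefix, hex-decode, then gunzip.
  static String decode(String cursor) {
    if (!cursor.startsWith(CURSOR_PREFIX)) {
      throw new UnsupportedOperationException("Unsupported cursor");
    }
    String hex = cursor.substring(CURSOR_PREFIX.length());
    byte[] bytes = new byte[hex.length() / 2];
    for (int i = 0; i < bytes.length; i++) {
      bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
    }
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
      return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Reads the two integers that follow "(Paginate,": page index first, then page size.
  static int[] parseHeader(String plan) {
    if (!plan.startsWith("(Paginate,")) {
      throw new UnsupportedOperationException("Unsupported cursor");
    }
    String rest = plan.substring(plan.indexOf(',') + 1);
    int pageIndex = Integer.parseInt(rest, 0, rest.indexOf(','), 10);
    rest = rest.substring(rest.indexOf(',') + 1);
    int pageSize = Integer.parseInt(rest, 0, rest.indexOf(','), 10);
    return new int[] {pageIndex, pageSize};
  }

  public static void main(String[] args) {
    // Assumed plan shape, for illustration only.
    String plan = "(Paginate,1,2,(Project,(namedParseExpressions,),(projectList,),"
        + "(OpenSearchPagedIndexScan,myindex,scroll123)))";
    String cursor = encode(plan);
    int[] header = parseHeader(decode(cursor));
    System.out.println("page index " + header[0] + ", page size " + header[1]);
  }
}
```

Because the format is positional and comma-delimited, a value that itself contains a comma or parenthesis would break parsing, which is consistent with the TODO in `convertToPlan` about moving to ANTLR or a JSON/XML serialization.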
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
* Query id of {@link AbstractPlan}.
*/
public class QueryId {
public static final QueryId None = new QueryId("");
/**
* Query id.
*/
Expand Down