[Improve] Gremlin query ids batch query to backend such as HBase/HStore #2674

Open
1 task done
JackyYangPassion opened this issue Oct 10, 2024 · 0 comments
Labels: bug (Something isn't working), gremlin (TinkerPop gremlin), hbase (HBase backend), improvement (General improvement)

Comments

JackyYangPassion (Contributor) commented Oct 10, 2024

Bug Type

None

Before submitting

  • I have searched the existing Issues and FAQ and confirmed that there is no identical or duplicate issue.

Environment

  • Server Version: master
  • Backend: HBase/HStore
  • OS: CentOS 7.x

Expected & Actual behavior

Motivation

1. In actual use, batch point lookups by id perform poorly and cannot meet business requirements.
2. Improve batch query performance.

Example

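1. Query statement: g.V('id1','id2','id3')

Problem in the current version: the time complexity is O(n) in the number of ids.
When the backend is queried over RPC, performance is very poor; the usual optimization is to push the batch query down to the storage layer.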
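
To make the cost difference concrete, here is a minimal sketch that counts simulated round trips for per-id lookups versus one batched lookup. `CountingStore`, `get`, and `multiGet` are hypothetical stand-ins invented for this illustration; they are not HugeGraph, HBase, or HStore APIs, and the in-memory map only models the number of calls, not real latency.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RoundTripSketch {

    /** Hypothetical in-memory stand-in for a remote store; counts simulated RPCs. */
    static class CountingStore {
        private final Map<String, String> rows = new HashMap<>();
        int roundTrips = 0;

        void put(String id, String value) {
            rows.put(id, value);
        }

        /** One simulated RPC per id. */
        String get(String id) {
            roundTrips++;
            return rows.get(id);
        }

        /** One simulated RPC for the whole batch; ids with no row are skipped. */
        Map<String, String> multiGet(List<String> ids) {
            roundTrips++;
            Map<String, String> found = new HashMap<>();
            for (String id : ids) {
                String value = rows.get(id);
                if (value != null) {
                    found.put(id, value);
                }
            }
            return found;
        }
    }

    /** Per-id lookups: n ids cost n simulated round trips. */
    static int roundTripsOneByOne(CountingStore store, List<String> ids) {
        int before = store.roundTrips;
        for (String id : ids) {
            store.get(id);
        }
        return store.roundTrips - before;
    }

    /** Batched lookup: n ids cost one simulated round trip. */
    static int roundTripsBatched(CountingStore store, List<String> ids) {
        int before = store.roundTrips;
        store.multiGet(ids);
        return store.roundTrips - before;
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        List<String> ids = Arrays.asList("id1", "id2", "id3");
        for (String id : ids) {
            store.put(id, "vertex-" + id);
        }
        System.out.println("one by one: " + roundTripsOneByOne(store, ids)); // 3
        System.out.println("batched:    " + roundTripsBatched(store, ids));  // 1
    }
}
```

With three ids the per-id path issues three simulated calls and the batched path issues one, which is the gap the issue wants to close for RPC-backed stores.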

Optimization plan

Call query.query(ids) directly here.

TODO: PR to be submitted.

GraphTransaction#queryVerticesByIds

    protected Iterator<Vertex> queryVerticesByIds(Object[] vertexIds,
                                                  boolean adjacentVertex,
                                                  boolean checkMustExist,
                                                  HugeType type) {
        Query.checkForceCapacity(vertexIds.length);

        // NOTE: allowed duplicated vertices if query by duplicated ids
        List<Id> ids = InsertionOrderUtil.newList();
        Map<Id, HugeVertex> vertices = new HashMap<>(vertexIds.length);

        IdQuery query = new IdQuery(type);
        for (Object vertexId : vertexIds) {
            HugeVertex vertex;
            Id id = HugeVertex.getIdValue(vertexId);
            if (id == null || this.removedVertices.containsKey(id)) {
                // The record has been deleted
                continue;
            } else if ((vertex = this.addedVertices.get(id)) != null ||
                       (vertex = this.updatedVertices.get(id)) != null) {
                if (vertex.expired()) {
                    continue;
                }
                // Found from local tx
                vertices.put(vertex.id(), vertex);
            } else {
                // Prepare to query from backend store
                query.query(id);
            }
            ids.add(id);
        }

        if (!query.empty()) {
            // Query from backend store
            query.mustSortByInput(false);
            Iterator<HugeVertex> it = this.queryVerticesFromBackend(query);
            QueryResults.fillMap(it, vertices);
        }

        return new MapperIterator<>(ids.iterator(), id -> {
            HugeVertex vertex = vertices.get(id);
            if (vertex == null) {
                if (checkMustExist) {
                    throw new NotFoundException(
                            "Vertex '%s' does not exist", id);
                } else if (adjacentVertex) {
                    assert !checkMustExist;
                    // Return undefined if adjacentVertex but !checkMustExist
                    vertex = HugeVertex.undefined(this.graph(), id);
                } else {
                    // Return null
                    assert vertex == null;
                }
            }
            return vertex;
        });
    }
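
The control flow of the method above, combined with the idea of handing the backend all remaining ids in a single call, can be modeled in a standalone sketch. This is an illustration under stated assumptions, not the planned PR: plain `String` ids and values stand in for `Id` and `HugeVertex`, `batchFetch` is a hypothetical hook representing one batched backend call, and de-duplicating the backend ids with a `LinkedHashSet` is a choice made here rather than something the issue specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class BatchByIdsSketch {

    /**
     * Simplified model of a lookup by ids. Results follow the input order,
     * duplicates included; an id that is found nowhere yields null.
     */
    static List<String> queryByIds(List<String> inputIds,
                                   Set<String> removed,
                                   Map<String, String> localTx,
                                   Function<Set<String>, Map<String, String>> batchFetch) {
        List<String> ordered = new ArrayList<>();     // input order, duplicates kept
        Map<String, String> found = new HashMap<>();
        Set<String> toFetch = new LinkedHashSet<>();  // ids left for the backend (assumed de-duplicated)

        for (String id : inputIds) {
            if (id == null || removed.contains(id)) {
                continue;                             // deleted in this transaction
            }
            String local = localTx.get(id);
            if (local != null) {
                found.put(id, local);                 // served from the local transaction
            } else {
                toFetch.add(id);                      // deferred to the single batch call
            }
            ordered.add(id);
        }

        if (!toFetch.isEmpty()) {
            found.putAll(batchFetch.apply(toFetch));  // one call for all remaining ids
        }

        List<String> results = new ArrayList<>();
        for (String id : ordered) {
            results.add(found.get(id));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> backend = new HashMap<>();
        backend.put("a", "A");
        backend.put("b", "B");
        Set<String> removed = new HashSet<>(Arrays.asList("c"));
        Map<String, String> localTx = new HashMap<>();
        localTx.put("d", "D-local");

        int[] calls = {0};
        List<String> out = queryByIds(Arrays.asList("a", "d", "c", "a", "x"), removed, localTx, keys -> {
            calls[0]++;
            Map<String, String> hit = new HashMap<>();
            for (String key : keys) {
                if (backend.containsKey(key)) {
                    hit.put(key, backend.get(key));
                }
            }
            return hit;
        });
        System.out.println(out + " using " + calls[0] + " backend call(s)");
    }
}
```

In this model the removed id `c` is dropped, `d` comes from the local transaction, the duplicate `a` is returned twice, and the unknown id `x` maps to null, all with a single invocation of the batch hook.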
JackyYangPassion added the bug label on Oct 10, 2024
dosubot (bot) added the gremlin, hbase, and improvement labels on Oct 10, 2024
JackyYangPassion changed the title from "【Improve】 gremlin query ids batch query to backend such as HBase/HStore" to "[Improve] gremlin query ids batch query to backend such as HBase/HStore" on Oct 10, 2024
JackyYangPassion changed the title from "[Improve] gremlin query ids batch query to backend such as HBase/HStore" to "[Improve] Gremlin query ids batch query to backend such as HBase/HStore" on Oct 10, 2024
JackyYangPassion self-assigned this on Oct 10, 2024