MATCH...LIMIT n can't query the correct data #5436

Closed
thewkgithub opened this issue Mar 24, 2023 · 6 comments
Labels
type/question Type: question about the product

Comments

@thewkgithub
thewkgithub commented Mar 24, 2023

Version: 3.3.0

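I have a MATCH query with the following requirement:

The graph space contains several Tags, and each Tag has a property, say NAME. I need to run a fuzzy (substring) match on NAME and return the first 10 results.

This is what I wrote:

MATCH (v) WITH v, properties(v) AS props, keys(properties(v)) AS keys LIMIT 10 WHERE [key in keys where key == 'NAME' AND props[key] CONTAINS '张三']

What actually happens is that this statement only checks the first 10 records for a NAME containing 张三.

What I want is to search the whole space for records containing 张三 and then return the first 10 of them.

In relational databases such as MySQL, a query like this normally applies the condition first and paginates afterwards.

Why does Nebula paginate first and filter afterwards?

Is there something wrong with the way I wrote the query?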

@Sophie-Xie Sophie-Xie changed the title from "A question about paginated queries" to "MATCH...LIMIT n can't query the correct data" Mar 24, 2023
@yixinglu
Contributor

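Move LIMIT 10 to the very end of the statement. In the query as written, LIMIT modifies the WITH clause, so, as you observed, WITH outputs only 10 rows and the WHERE filter then runs on just those 10.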
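To make the ordering concrete, here is a minimal Python sketch of the two pipelines. It is not NebulaGraph code: the in-memory `vertices` list and the helper names `matches`, `limit_then_filter`, and `filter_then_limit` are hypothetical stand-ins for a scan of the graph space and for the two placements of LIMIT.

```python
from itertools import islice

# Hypothetical stand-in for the vertices a full scan would return.
vertices = [{"NAME": f"user{i}"} for i in range(50)]
vertices[30]["NAME"] = "张三丰"
vertices[42]["NAME"] = "张三"


def matches(vertex, needle="张三"):
    """Mimic the `props['NAME'] CONTAINS '张三'` predicate."""
    return needle in vertex.get("NAME", "")


def limit_then_filter(rows, n):
    """WITH ... LIMIT n WHERE ...: keep the first n rows, then filter them."""
    return [v for v in islice(rows, n) if matches(v)]


def filter_then_limit(rows, n):
    """WHERE ... with a trailing LIMIT n: filter every row, then keep n hits."""
    return list(islice((v for v in rows if matches(v)), n))


early = limit_then_filter(vertices, 10)
late = filter_then_limit(vertices, 10)
print(len(early), len(late))
```

With this sample data, the first pipeline finds nothing because neither matching vertex is among the first 10 rows, while the second finds both.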

@Sophie-Xie Sophie-Xie added the type/question Type: question about the product label Mar 27, 2023
@thewkgithub
Author

When I put it at the end, I get this error:
"Scan vertices or edges need to specify a limit number, or limit number can not push down."

@thewkgithub
Author

MATCH (v) WITH v, properties(v) AS props, keys(properties(v)) AS keys LIMIT 10000 WHERE [key in keys where key == 'NAME' AND props[key] CONTAINS '张三'] LIMIT 10

This seems to work. The first LIMIT bounds how many records are scanned, and the second LIMIT bounds the size of the returned result set.

@yixinglu
Contributor

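When I put it at the end, I get this error: "Scan vertices or edges need to specify a limit number, or limit number can not push down."

This restriction is removed in the new version that is about to be released. Your workaround above is also a valid approach 😂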

@wey-gu
Contributor

wey-gu commented Mar 27, 2023

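We put this clumsy pattern in the documentation because, with distributed deployments and massive datasets in mind, NebulaGraph used to forbid full-scan queries that had neither an index (or tag) nor a LIMIT. A scan with a LIMIT can be pushed down to storage when there is no filter, so that form was allowed.

As a result, to query by a property condition with no index and no tag to narrow things down, you first have to take M records with LIMIT and then filter within those M. The right value of M can only be found by trial and error until the query returns the "expected" result.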
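The dependence on M can be illustrated with a small Python sketch. This is a simulation under stated assumptions, not NebulaGraph behavior: the `names` list, the positions of the matching rows, and the helper `scan_limit_filter_limit` are all hypothetical, and the sketch assumes a fixed scan order.

```python
from itertools import islice

# Hypothetical data: 1,000 scanned vertices, 3 of which match the filter.
names = [f"user{i}" for i in range(1000)]
for position in (15, 420, 870):
    names[position] = "张三"


def scan_limit_filter_limit(rows, scan_limit, result_limit, needle="张三"):
    """Take the first `scan_limit` rows, filter them, then cap the result."""
    scanned = islice(rows, scan_limit)
    hits = (name for name in scanned if needle in name)
    return list(islice(hits, result_limit))


# The number of hits depends on how large M (scan_limit) is.
hits_by_m = {
    m: len(scan_limit_filter_limit(names, m, 10)) for m in (10, 100, 500, 1000)
}
print(hits_by_m)
```

Only when M covers every scanned row does the filter see all three matches, which is why M has to be guessed or set generously.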

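Fortunately, recent master already allows full scans, so from now on this query can be written naturally (the next official release is 3.5.0; you can also test with a nightly build):

MATCH (v) WHERE properties(v).name == "张三" RETURN v

Note that a better way to write this query is:

MATCH (v:player) WHERE v.player.name == "张三" RETURN v.player.age, v.player.address

Because the latter gives the planner more explicit information, it is cheaper to execute. If many of your graph queries start from a condition on player.name, I strongly recommend creating an index on player.name.

Recommended reading: https://www.siwei.io/ngql-execution-plan/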

@wey-gu
Contributor

wey-gu commented Mar 27, 2023

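When I put it at the end, I get this error: "Scan vertices or edges need to specify a limit number, or limit number can not push down."

This restriction is removed in the new version that is about to be released. Your workaround above is also a valid approach 😂

While I was writing this, I saw that Mr. Yi had already replied ❤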
