
Significant slowdown when changing the return record limit from 1000 to 2000 #8703

Closed
ASXHOLE opened this issue Dec 17, 2018 · 5 comments
Assignees
Milestone

Comments

@ASXHOLE

ASXHOLE commented Dec 17, 2018

OrientDB Version: 3.10

Java Version: JDK1.8.0_111

OS: CentOs 7

Expected behavior

Actual behavior

Why is the query very slow (more than 33 seconds) when limiting the result to 2000 records, but very fast (within 1 second) when limiting it to 1000 records?

Steps to reproduce

MATCH
  {class: Con, as: Con1, where: (CON_NO = '3620067544')}
  .out('hasTag')
  .in('hasTag')
  {as: Con2, where: (CON_NO != '3620067544')}
RETURN Con1.CON_NO, Con2.CON_NO
LIMIT 5000

@luigidellaquila
Member

Hi @ASXHOLE

Any chance to have a dataset to reproduce the problem?

Thanks

Luigi

@luigidellaquila luigidellaquila self-assigned this Dec 17, 2018
@ASXHOLE
Author

ASXHOLE commented Dec 18, 2018

Hi, @luigidellaquila
I'm sorry, I made a mistake: my OrientDB version is actually 3.0.10.

My dataset is large, so I split it into 10 MB zip parts. (You need to remove the trailing ".zip" suffix before unzipping, e.g. "OrientDB_data.zip.001".)

OrientDB_data.zip.001.zip
OrientDB_data.zip.002.zip
OrientDB_data.zip.003.zip
OrientDB_data.zip.004.zip
OrientDB_data.zip.005.zip
OrientDB_data.zip.006.zip
OrientDB_data.zip.007.zip
OrientDB_data.zip.008.zip
OrientDB_data.zip.009.zip
OrientDB_data.zip.010.zip

@Tracy0231

Is there any progress on this? I have the same problem...

@luigidellaquila
Copy link
Member

Hi @ASXHOLE @Tracy0231

I checked your DB and came to the conclusion that the main problem is that the MATCH executor does some early loading of the connected patterns, and in your specific case the situation is as follows:

  • the node with CON_NO = '3620067544' is connected to four other nodes: TAG_ID = 1001058, 1003050, 1003113, 1003118
  • the node with TAG_ID = 1001058 is connected to ~1600 other nodes, so with LIMIT 1000 all of them are loaded and checked, and you get the result in a few seconds
  • the other TAG_IDs have more than two million connected nodes each, so with LIMIT 2000 both TAG_ID = 1001058 and TAG_ID = 1003050 are loaded, and the executor eagerly loads around 2,500,000 other nodes.
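
A fan-out check along these lines makes the asymmetry visible (a sketch, assuming the Con/hasTag schema from the query above; the class, edge, and property names come from the issue, the query shape is illustrative):

```sql
-- For each tag vertex reachable from the starting contract,
-- count how many vertices point back to it via 'hasTag'.
SELECT TAG_ID, in('hasTag').size() AS fanOut
FROM (SELECT expand(out('hasTag')) FROM Con WHERE CON_NO = '3620067544')
ORDER BY fanOut DESC
```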

All this deserves some optimization: we can easily do some lazy loading and save a lot of resources (both execution time and memory consumption). I added it to my TODO list and hopefully I'll manage to do it soon.
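
Until that optimization lands, a possible workaround (a sketch, not a verified fix) is to express the traversal with a plain SELECT and expand(), which streams results so the LIMIT can cut the work short without MATCH's eager pattern expansion:

```sql
-- Same traversal as the MATCH above, written as a streaming SELECT.
-- Names (Con, hasTag, CON_NO) are taken from the issue; behavior
-- depends on the execution plan your OrientDB version produces.
SELECT CON_NO
FROM (SELECT expand(out('hasTag').in('hasTag'))
      FROM Con WHERE CON_NO = '3620067544')
WHERE CON_NO <> '3620067544'
LIMIT 2000
```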

In general, please also consider that having many supernodes (i.e. vertices with millions of connected edges) in a graph is not considered good practice.

Thanks

Luigi

@luigidellaquila
Member

Hi @ASXHOLE @Tracy0231

I just pushed a fix to the 3.0.x branch; now the query takes around ten seconds even with LIMIT 2000.

The fix will be released with v 3.0.13

Thanks

Luigi

@luigidellaquila luigidellaquila added this to the 3.0.13 milestone Dec 27, 2018