Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addEdge with orid to minimize dependency on referenced Vertex. #2132

Closed
acmeguy opened this issue Mar 17, 2014 · 47 comments
Closed

addEdge with orid to minimize dependency on referenced Vertex. #2132

acmeguy opened this issue Mar 17, 2014 · 47 comments
Assignees
Milestone

Comments

@acmeguy
Copy link

acmeguy commented Mar 17, 2014

Allow edges to be created with ORID:

  • someVertex.addvertex(type, orid)
  • graph.addvertex(type,source_orid,target_orid)

This would mean that I'm only referencing the target vertex and that NO other update is requested for the target vertex

Resolved during during commit:

  • The is resolved to the latest version of the target vertex (JIT)
  • The edge is created (JIT)
  • Retry this X times if the target vertex is locked or on concurrent exception

See this discussion:
https://groups.google.com/forum/?fromgroups=#!topic/orient-database/qbXlDJbK8m8

@lvca
Copy link
Member

lvca commented Mar 17, 2014

It's pretty easy to implement the first part, but for transactions this is harder because all the operations happen at client-side and at commit() time we flush the transaction to the server/storage. So OrientDB is MVCC by design. We should support a concept of "operation" that the client should execute at server side level only.

@andrii0lomakin
Copy link
Member

Yes, I thought abut the same.
Even without tx, it will be resource consuming to retry on client side. So
it should be special server side operation.

On Mon, Mar 17, 2014 at 12:44 PM, Luca Garulli notifications@github.comwrote:

It's pretty easy to implement the first part, but for transactions this is
harder because all the operations happen at client-side and at commit()
time we flush the transaction to the server/storage. So OrientDB is MVCC by
design. We should support a concept of "operation" that the client should
execute at server side level only.


Reply to this email directly or view it on GitHubhttps://github.com//issues/2132#issuecomment-37802753
.

Best regards,
Andrey Lomakin.

Orient Technologies
the Company behind OrientDB

@acmeguy
Copy link
Author

acmeguy commented Mar 17, 2014

I'm running in embed mode but that should only affect this slightly, if at all.

If the retry happens within the scope of the commit() and minimizes the "lock" time for the references vertex, then I hope we will still see improvement.

The best way would be to have the server add edges - for these light cases - with minimum overhead.

Regards,
-Stefán

@lvca
Copy link
Member

lvca commented Mar 22, 2014

We could create a new version of the method with an additional parameter "lockingStrategy" of type OStorage.LOCKING_STRATEGY :

public enum LOCKING_STRATEGY {
  NONE, DEFAULT, KEEP_SHARED_LOCK, KEEP_EXCLUSIVE_LOCK
}

In case it's KEEP_EXCLUSIVE_LOCK we should load the record by ID, by using this strategy (we recently added this feature) and let to the commit() the unlock of locked records. So we'd need an additional map<rid,lock_type> in the transaction object.

This would work only with local/plocal/memory connections. For the remote case the locking would need more work (new lock/unlock commands, etc).

WDYT about this solution? Are you working with plocal?

@lvca
Copy link
Member

lvca commented Mar 22, 2014

Or maybe we should support the PESSIMISTIC transaction (now we only have optimistic) where all the records are locked:

  • reads, lock record in shared mode
  • updates and deletes lock records in shared mode

@andrii0lomakin
Copy link
Member

Guys, a question.
Why we push lock instead of lock free approach ? Usually writes are only 30% of operations and reads the rest 70%. Also lets suppose that we use locks it means that we use single thread approach in nutshell. Lets suppose than that we have 8 cores CPU it means that all 70% of operations will be 8 times slower, because we would like to speed up 30% of operations.
Is it reasonable way where should evolve our design ? From my point of view usage of such approach means lost of scalabiltiy feature of database. Which means that our DB will have the same speed if we run on developer PC and on million dollars mainframe with 40 cores.

WDYT ?

@lvca
Copy link
Member

lvca commented Mar 24, 2014

In general I agree with you but:

  1. not all the use cases can be retried (CAS). I'm thinking to X operations where you should redo the entire tx in case of conflict.
  2. on high-contention against the same resource, lock is faster

That's why I'd like the user can choose the best approach for his use case.

@andrii0lomakin
Copy link
Member

Yes, but support of pessimistic transactions means such kind of behavior as I described above.
So if introduction of pessimistic transaction will lead to single thread behavior we can not remove it because users use it, and can support it because it means low scalabiltiy, so we may be out of the market.

@acmeguy
Copy link
Author

acmeguy commented Mar 25, 2014

Hi,

Please excuse me for the lack of knowledge required to participate in this
discussion.

I do how ever understand the need for the 3 sided updates on bi-direction
edge creations and also understand the benefits of embedded indexes.

It seems to me that the drawback of this can be minimized by treating the
vertex-record and the vertex-edge-lists as two slightly separated entities
and that the source of this, and perhaps other things as well, is the close
coupling of the two.

Regards,
-Stefán

On Tue, Mar 25, 2014 at 6:47 AM, Andrey Lomakin notifications@github.comwrote:

Yes, but support of pessimistic transactions means such kind of behavior
as I described above.
So if introduction of pessimistic transaction will lead to single thread
behavior we can not remove it because users use it, and can support it
because it means low scalabiltiy, so we may be out of the market.

Reply to this email directly or view it on GitHubhttps://github.com//issues/2132#issuecomment-38535289
.

@lvca
Copy link
Member

lvca commented Mar 25, 2014

So what's the solution? To retry the entire tx? How if the tx is lead by the client? Think to this case:

tx.begin()
try{
  Vertex client = loadVertexClient( name );
  if( client.getProperty("amount") < 1000 )
    client.setProperty( "amount", client.amount() + 100 );
  tx.commit();
} catch ( Exception e )
  tx.rollback();
} 

@kowalot
Copy link
Contributor

kowalot commented Mar 25, 2014

I agree that in theory lock free is better (especially for scalability), but I draw my conclusion on empirical usage of the OrientDB as user. Unfortunately lack of pessimistc locking (at least for updates in COMMIT ) makes it extremely hard to build fast business solution especially with remote database. If you want to have No.1 popular and fast (from user perspective) graph database, it shouldn't be.
If you can't do pessimistic locking in transaction please reconsider at least partial COMMIT updates with retrying. It will boost addEdge and couple of other scenarios.

@lvca
Copy link
Member

lvca commented Mar 25, 2014

@kowalot I agree with you, it's much better to provide a tool the user decide or not to use. This is the sake of OrientDB and NoSQL in general. What @Laa is referring, maybe, is the worrying about some new constraints that will avoid to scale up, but I cannot see them right now. I've a separate branch "tx-pessimistic" where the OIdentifiable has 2 new methods:

lock(boolean exclusive);
unlock();

So now the user have control of locking, but it's more for OrientDB internal APis. All th elocks are stored in the tx, so at close all the locks are freed.

This is a POC yet to introduce a pessimistic or at least hybrid approach to support something like addEdge() at the server side.

lvca added a commit that referenced this issue Mar 31, 2014
…mmand: [RETRY <retry> [WAIT <pauseBetweenRetriesInMs]]
@lvca
Copy link
Member

lvca commented Mar 31, 2014

Good news. I've fixed this issue by introducing 2 new keywords in create edge SQL command:

[RETRY <retry> [WAIT <pauseBetweenRetriesInMs]]

Where:

  • retry is the number of time it retries in case of conflict, default is 0
  • pauseBetweenRetriesInMs, is the ms to wait in case of conflict before retry again

@acmeguy This command can be executed in transactions, so I think covers 100% of your needs. Let me know the results of it ASAP, in the meanwhile I'll write a couple of test cases before to close it.

@acmeguy
Copy link
Author

acmeguy commented Mar 31, 2014

Hi,

How do I use this from the Java API?
And does it re-fetch the target vertexes during retry? (The problem is the
old version number)

Regards,
-Stefán

On Mon, Mar 31, 2014 at 12:21 AM, Luca Garulli notifications@github.comwrote:

Good news. I've fixed this issue by introducing 2 new keywords in create
edge SQL command:

[RETRY [WAIT <pauseBetweenRetriesInMs]]

Where:

  • retry is the number of time it retries in case of conflict,
    default is 0
  • pauseBetweenRetriesInMs, is the ms to wait in case of conflict
    before retry again

@acmeguy https://github.com/acmeguy This command can be executed in
transactions, so I think covers 100% of your needs. Let me know the results
of it ASAP, in the meanwhile I'll write a couple of test cases before to
cose it.

Reply to this email directly or view it on GitHubhttps://github.com//issues/2132#issuecomment-39045293
.

@lvca
Copy link
Member

lvca commented Mar 31, 2014

It reloads the vertices in case of conflict. Use it with g.command ():

g.command( new OCommandSQL("create edge #10:20 to #10:39 retry 100") ).execute ();

@acmeguy
Copy link
Author

acmeguy commented Mar 31, 2014

Hi,

This is understood.

Does it also work with temporary vertex IDs from within the transaction?

Regards,
-Stefán

On Mon, Mar 31, 2014 at 7:45 AM, Luca Garulli notifications@github.comwrote:

It reloads the vertices in case of conflict. Use it with g.command ():

g.command( new OSQLCommand("create edge #10:20 to #10:39 retry 100") ).execute ();

Reply to this email directly or view it on GitHubhttps://github.com//issues/2132#issuecomment-39061198
.

@lvca
Copy link
Member

lvca commented Mar 31, 2014

Nope because temporary rids scope are transaction only, and commands are executed on the server side, so out of transaction. But you could execute this command on the server side:

g.command( new OCommandSQL("create edge e
         from (create vertex v set name = 'Luca')
         to #10:39 retry 100") ).execute();

So the new created vertex rid will go as parameter of create edge.

@acmeguy
Copy link
Author

acmeguy commented Mar 31, 2014

? OCommandSQL ?

On Mon, Mar 31, 2014 at 12:01 PM, Luca Garulli notifications@github.comwrote:

Nope because temporary rids scope are transaction only, and commands are
executed on the server side, so out of transaction. But you could execute
this command on the server side:

g.command( new OSQLCommand("create edge e from (create vertex v set name = 'Luca') to #10:39 retry 100") ).execute ();

So the new created vertex rid will go as parameter of create edge.

Reply to this email directly or view it on GitHubhttps://github.com//issues/2132#issuecomment-39080333
.

@acmeguy
Copy link
Author

acmeguy commented Mar 31, 2014

How do I set the Edge Type in this command?

On Mon, Mar 31, 2014 at 12:13 PM, Stefan Baxter stebax@gmail.com wrote:

? OCommandSQL ?

On Mon, Mar 31, 2014 at 12:01 PM, Luca Garulli notifications@github.comwrote:

Nope because temporary rids scope are transaction only, and commands are
executed on the server side, so out of transaction. But you could execute
this command on the server side:

g.command( new OSQLCommand("create edge e from (create vertex v set name = 'Luca') to #10:39 retry 100") ).execute ();

So the new created vertex rid will go as parameter of create edge.

Reply to this email directly or view it on GitHubhttps://github.com//issues/2132#issuecomment-39080333
.

@lvca
Copy link
Member

lvca commented Mar 31, 2014

Look at sql create edge command guide:
https://github.com/orientechnologies/orientdb/wiki/SQL-Create-Edge

@ppeddi
Copy link

ppeddi commented Mar 31, 2014

Is there a blueprints alternative to this? We are using blueprints and I am running into the same concurrent modification due to version mismatch. I tried to use locks around graph.addEdge() but that didn't work since the version is incremented at the time of commit? Concurrent modification exception is also occuring at the time of commit.

Caused by: com.orientechnologies.orient.core.exception.OConcurrentModificationException: Cannot UPDATE the record #12:213 because the version is not the latest. Probably you are updating an old record or it has been modified by another user (db=v6 your=v5)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.updateRecord(OLocalPaginatedStorage.java:1328)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.commitEntry(OLocalPaginatedStorage.java:1668)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.commit(OLocalPaginatedStorage.java:1571)
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:185)
at com.orientechnologies.orient.core.db.record.ODatabaseRecordTx.commit(ODatabaseRecordTx.java:116)

@lvca
Copy link
Member

lvca commented Mar 31, 2014

This command is to overcome this problem: it takes care of reloading of both vertices in case of conflict.

@ppeddi
Copy link

ppeddi commented Mar 31, 2014

OK thanks. Does it mean I have to use SQL instead of blueprints for adding edges? We are trying to keep our code as much agnostics to underlying datastore as possible so we are sticking with blueprints.

@acmeguy
Copy link
Author

acmeguy commented Mar 31, 2014

Hi,

This seems to introduces another problem for me.

On vertex is most commonly a new one and using this approach means I have
to save it first (a batch of events) and then re-iterate the whole patch to
create light edges.

Is it not possible to have the server add "empty edges" this way by default
so that I can keep adding them in the transaction as the rest of the data?

Regards,
-Stefán

On Mon, Mar 31, 2014 at 3:49 PM, ppeddi notifications@github.com wrote:

OK thanks. Does it mean I have to use SQL instead of blueprints for adding
edges? We are trying to keep our code as much agnostics to underlying
datastore as possible so we are sticking with blueprints.

Reply to this email directly or view it on GitHubhttps://github.com//issues/2132#issuecomment-39104597
.

@lvca
Copy link
Member

lvca commented Mar 31, 2014

Could you execute everything at the server side with a SQL script?

    String cmd = "";
    cmd += "begin";
    cmd += "\nlet a = create vertex set script = true";
    cmd += "\nlet b = select from v limit 1";
    cmd += "\ncreate edge from $a to $b retry 100";
    cmd += "\ncommit";
    cmd += "\nreturn $a";

    Object result = database.command(new OCommandScript("sql", cmd)).execute();

Look at issue #2176.

@ppeddi
Copy link

ppeddi commented Apr 1, 2014

do you know if I can use the combination of blueprints graph and SQL with in a single transaction. For example, I can use blueprints to create vertices and use SQL with retry feature for edges.

//start transaction
graph.addVertex(v1);
//we know vertex v2 is already in the database so no need to add
database.command(new OCommandScript("sql", "create edge from v1 to v2 retry 100"));

//commit the transaction. WHAT AND HOW DO I COMMIT? I HAVE graph object and database object.

@lvca
Copy link
Member

lvca commented Apr 2, 2014

I suggest you to use last 1.7-SNAPSHOT (few minutes ago) and do this supposing v2 you already know has a rid = #11:232:

String sql = "begin;let v1 = create vertex V;create edge from $v1 to (select from #11:232);commit retry 100";
database.command(new OCommandScript("sql", sql));

In this way you apply the retry to the commit and the entire tx will be repeated in case of conflict. Let me know how it performs under heavy load. Or you could also do:

String sql = "begin;let v1 = create vertex V;let v2 = select from #11:232 lock record;create edge from $v1 to $v2;commit";
database.command(new OCommandScript("sql", sql));

@lvca
Copy link
Member

lvca commented Apr 2, 2014

@lvca lvca added this to the 1.7 milestone Apr 2, 2014
@lvca lvca self-assigned this Apr 2, 2014
@lvca lvca closed this as completed Apr 2, 2014
@lvca
Copy link
Member

lvca commented Apr 17, 2014

@enisher can you provide an ETA for this issue?

@acmeguy
Copy link
Author

acmeguy commented Feb 2, 2015

Hi, Can this be done using the Java API?

@lvca
Copy link
Member

lvca commented Feb 2, 2015

Ys, it's already there.

@acmeguy
Copy link
Author

acmeguy commented Feb 2, 2015

already where? :)

@lvca
Copy link
Member

lvca commented Feb 2, 2015

@acmeguy We implemented it in SQL command. Did you mean the addEdge() in Java API? You can do:

Edge e = g.addEdge( null, new OrientVertex( g, "#12:33" ) ), new OrientVertex( g, "#134:9348" ), "Friend" );

@acmeguy
Copy link
Author

acmeguy commented Feb 2, 2015

Luca, this line makes no sense :)

I'm trying to add edges to existing vertices from a new vertex without updating the version number of the existing vertex. I thought I had this working but it's updating version numbers like crazy.

My line is:
newVertex.addEdge(type, existingVertex, properties);

@lvca
Copy link
Member

lvca commented Feb 2, 2015

This is another problem: set -Dindex.embeddedToSbtreeBonsaiThreshold=-1 and you always have a separate structure to handle edges and the vertices will be not updated anymore.

@acmeguy
Copy link
Author

acmeguy commented Feb 2, 2015

I have that set already: OGlobalConfiguration.RID_BAG_EMBEDDED_TO_SBTREEBONSAI_THRESHOLD.setValue(1);

Sorry, I thought the two were linked.

is this irrelevant:
RID_BAG_SBTREEBONSAI_TO_EMBEDDED_THRESHOLD

@lvca
Copy link
Member

lvca commented Feb 2, 2015

Sorry, copy and paste: you should set it to -1 that means ALWAYS. The value RID_BAG_SBTREEBONSAI_TO_EMBEDDED_THRESHOLD is for the opposite: when downscale to embedded from a tree.

@acmeguy
Copy link
Author

acmeguy commented Feb 2, 2015

yeah, I had it at -1 :)

@stevehu
Copy link

stevehu commented Mar 24, 2015

2.1-SNAPSHOT I have OGlobalConfiguration.RID_BAG_EMBEDDED_TO_SBTREEBONSAI_THRESHOLD.setValue(-1);

in my factory when getting OrientGraph and the vertex version still increasing and constantly getting concurrency modification exceptions and transaction rollback. I am using Java API and I don't think it does retry if it fails. In my use case, I consider vertex changed only when properties are changed but not edges. Could anyone point me to the right place on how the above configuration works? I will try to use -D in command line but it is not ideal. Thanks.

@lvca
Copy link
Member

lvca commented Mar 24, 2015

Are you using OrientDB embedded or remote? In case of remote, where did you set that -D ?

@stevehu
Copy link

stevehu commented Mar 24, 2015

Embedded using plocal. Thank.
On Mar 24, 2015 11:17 AM, "Luca Garulli" notifications@github.com wrote:

Are you using OrientDB embedded or remote? In case of remote, where did
you set that -D ?


Reply to this email directly or view it on GitHub
#2132 (comment)
.

@lvca
Copy link
Member

lvca commented Mar 24, 2015

Are you working with distributed server (replication)? However, could you provide the code responsible of the exception?

@stevehu
Copy link

stevehu commented Mar 25, 2015

No. I only have one server with embedded plocal server. I tried to put
-Dindex.embeddedToSbtreeBonsaiThreshold=-1
as VM Options but it still not working.
"@type": "d",
"@Rid": "#14:0",
"@Version": 276, // should be 0
"@Class": "User",

My code is complicated and there are several classes involved. I will create a test case based on 2.1-SNAPSHOT. Thanks.

@stevehu
Copy link

stevehu commented Mar 25, 2015

Here is the three line of code which is called by passing in different data with the same userId. The third line increase the user version for each edge creation.

        Vertex createUser = graph.getVertexByKey("User.userId", data.remove("createUserId"));
        rule = graph.addVertex("class:Rule", data);
        createUser.addEdge("Create", rule);

@lvca
Copy link
Member

lvca commented Mar 27, 2015

I've just added 2 test cases in ec23c69 commit and adding elements in tree ridbag doesn't increment version. Please could you send us a running test case to demonstrate the problem?

@stevehu
Copy link

stevehu commented Mar 29, 2015

@lvca Thanks for the quick response. I think your test case only focuses on ridbag but it doesn't mean version won't be changed by other functions when edge is added. I have created a test case here to show you that version is increased to 3. Thanks.

@lukeasrodgers
Copy link

Thanks for this fix -- it appears to have helped the issue we were having as well (using HTTP API).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

7 participants