-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
has() not working until transaction is committed #3841
Comments
Somewhere in those commits last year, there was one which removed an overlay of uncommitted transactions that we were managing on top of the committed data set. It was causing us performance issues to maintain that. With those changes, has and most indices would only work once the txn gets committed, not before. So, this is expected behavior. |
Coming from relational databases with ACID transactions, I find this behavior odd. When you are in a transaction, reads and writes should be consistent. If I can't see the data I just wrote in the same transaction, I think it can lead to fragile code, especially in large projects with many developpers. Suppose I have a function that checks if a promotional code has been applied to a basket, and applies it if it has never been applied. For some reason, because there are many developpers on the same project, this function is called multiple times in the same transaction: this will lead to the promotional code being applied multiple times. This example is not the best in terms of best practices, but it illustrates the risk of having duplicated data or incoherent state. When you compose multiple functions in a same transaction, there is always a risk that some of the functions may execute database queries to verify preconditions, possibly the same ones, and act accordingly. In my case, every query is at risk, because I filter every query by type (e.g. I don't know if I make sense, what do you think about it? In any case, I would have liked to have been informed of this change in behaviour more clearly before proceeding with the upgrade, because it is not nothing :). When I encountered this problem, I consulted the documentation and the changelog to find out more about the behavior of |
related to #3721 |
Thanks for the link. I read #3721 and #3103, and I'm kind of puzzled by your stance on transactions. I would like to open a debate on this subject and discuss it with any of you, because I think it is important. What you call "an inconvenience caused by the internal design of Dgraph", for someone who has experience with relational databases supporting ACID transactions, I would call it a bug. When I read, for example, "Multiple read - mutate pairs of operations on the same data inside the same transaction can lead to stuff like this" or "For now only do one read and mutate per transaction just to be safe", it is really surprising for me because I didn't have this kind of information when I started using Dgraph, and that explains a whole bunch of strange behaviors I've had to deal with, especially having made my own Object Graph Mapper. The documentation should really emphasize this kind of behaviour, and the practices that result from it, because for someone who comes from the world of SQL, these are not expected behaviours for an ACID-compliant database. Personally, I do a lot of mutations and queries in a same transaction, sometimes on the same data, because the transaction is useful for 2 things:
And that, I can't do it if I can't do all the queries and mutations corresponding to a certain workflow in one and the same transaction. By dividing the workflow into several transactions, this adds to the developer's responsibility to manage data recovery in the event of an error. Badly handled, this can result in incoherent states in the database. I had already been a little surprised by your position when you suggested to users who encounter a TxnTooBig error to divide the workload into several transactions, because this prevents them from benefiting from point #2 mentioned above. This also joins my previous message: when you start to have a complex application, divided into several modules that interact with each other, if you try to perform several actions in a workflow that execute each of the queries and mutations within a single transaction, it is difficult to know who did what, and if I can't rely on the read state of the database to know the state of my transaction at a time T, it can cause design problems. That being said, I understand that Dgraph is intended for heavy use requiring good performance, and that this may influence the design of the underlying application. But in this case, it should be better highlighted, because for someone who uses Dgraph as a day-to-day database for moderate workloads, I don't expect to have to change the way I architecture my application, as I could do with a SQL database for example. I'm sorry if this message seems to be offensive, that's not the goal and English is not my native language. I am interested in the development of Dgraph, I use it in production in a large project, and I want to bring my point of view, in relation to my experience. :) |
@alexandredasilva Dgraph also had performance issues with big transactions. See #3046. I haven't retested so I don't know if performance has been improved since. I also came from a SQL world and agree with most of what @alexandredasilva said. I understand Dgraph is still very young project and have other priorities but I do hope Dgraph team can put more love into Dgraph's transaction system. |
Github issues have been deprecated. |
What version of Dgraph are you using?
Dgraph version : v1.0.16
Commit SHA-1 : 0590ee9
Commit timestamp : 2019-07-11 11:52:54 -0700
Branch : HEAD
Go version : go1.12.5
Have you tried reproducing the issue with latest release?
Yes, v1.1.0-rc1, and the behavior is the same.
What is the hardware spec (RAM, OS)?
Windows 10 Pro, version 1903
Intel Core i7-9750H
16GB of RAM
SSD 512GB
Steps to reproduce the issue (command/config used to run Dgraph).
From a clean install of Dgraph, I create the following schema:
I use the
Profile
predicate to type my nodes.In the same transaction, I try to create a node, then verify it exists.
The above code snippet should return one result.
has()
function should work during a transaction and behave like the others functions (for example,eq
). Currently, this code snippet doesn't work, and I get no result.If I replace
has(Profile)
witheq(Profile, "__type")
, I can find my node and I get the expected result.If I execute the query using
has(Profile)
after the transaction has been committed, I can find my node.After some digging, I found that the last version where
has()
works as I think it should during a transaction is v1.0.9. Starting from v1.0.10, the behavior is different.#2724 was introduced in v1.0.10, it may be related.
The text was updated successfully, but these errors were encountered: