Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attributes with unthawable values won't retract or overwrite #208

Closed
den1k opened this issue May 1, 2023 · 14 comments
Closed

attributes with unthawable values won't retract or overwrite #208

den1k opened this issue May 1, 2023 · 14 comments

Comments

@den1k
Copy link
Contributor

den1k commented May 1, 2023

When an attribute value is unthawable :db/add or :db/retract on the same attribute cease to work. This ends up causing a tricky situation where the entire entity must be retracted using :db/retractEntity. It's fine behavior for the db to not serialize/deserialize or thaw/unthaw everything but to not allow overwriting and retraction of the attribute is a bug IMO.

Repro:

(defonce test-conn (d/create-conn))

(d/transact! test-conn [{:foo 1}])
; =>
{:datoms-transacted 1}

(d/touch (d/entity @test-conn 1))
; =>
{:foo 1, :db/id 1}

(d/transact! test-conn [{:db/id 1
                         :foo   (type [])}])
; =>
{:datoms-transacted 2}

(d/touch (d/entity @test-conn 1))
; =>
{:foo   #:nippy{:unthawable {:type       :serializable,
                             :cause      :quarantined,
                             :class-name "java.lang.Class",
                             :content    #datalevin/bytes"rO0ABXZyAB1jbG9qdXJlLmxhbmcuUGVyc2lzdGVudFZlY3RvcpJrsYGlVq0zAgAFSQADY250SQAFc2hpZnRMAAVfbWV0YXQAHUxjbG9qdXJlL2xhbmcvSVBlcnNpc3RlbnRNYXA7TAAEcm9vdHQAJExjbG9qdXJlL2xhbmcvUGVyc2lzdGVudFZlY3RvciROb2RlO1sABHRhaWx0ABNbTGphdmEvbGFuZy9PYmplY3Q7eHIAHmNsb2p1cmUubGFuZy5BUGVyc2lzdGVudFZlY3RvckDGjt5Zq+ubAgACSQAFX2hhc2hJAAdfaGFzaGVxeHA"}},
 :db/id 1}

;; retract attribute :foo
(d/transact! test-conn [[:db/retract 1 :foo]])
; =>
{:datoms-transacted 1}

;; :foo is not retracted!

(d/touch (d/entity @test-conn 1))
; =>
{:foo   #:nippy{:unthawable {:type       :serializable,
                             :cause      :quarantined,
                             :class-name "java.lang.Class",
                             :content    #datalevin/bytes"rO0ABXZyAB1jbG9qdXJlLmxhbmcuUGVyc2lzdGVudFZlY3RvcpJrsYGlVq0zAgAFSQADY250SQAFc2hpZnRMAAVfbWV0YXQAHUxjbG9qdXJlL2xhbmcvSVBlcnNpc3RlbnRNYXA7TAAEcm9vdHQAJExjbG9qdXJlL2xhbmcvUGVyc2lzdGVudFZlY3RvciROb2RlO1sABHRhaWx0ABNbTGphdmEvbGFuZy9PYmplY3Q7eHIAHmNsb2p1cmUubGFuZy5BUGVyc2lzdGVudFZlY3RvckDGjt5Zq+ubAgACSQAFX2hhc2hJAAdfaGFzaGVxeHA"}},
 :db/id 1}

;; overwriting also does not work

(d/transact! test-conn [[:db/add 1 :foo :bar]])
;=> 
{:datoms-transacted 2}

(d/touch (d/entity @test-conn 1))
;=>
{:foo   #:nippy{:unthawable {:type       :serializable,
                             :cause      :quarantined,
                             :class-name "java.lang.Class",
                             :content    #datalevin/bytes"rO0ABXZyAB1jbG9qdXJlLmxhbmcuUGVyc2lzdGVudFZlY3RvcpJrsYGlVq0zAgAFSQADY250SQAFc2hpZnRMAAVfbWV0YXQAHUxjbG9qdXJlL2xhbmcvSVBlcnNpc3RlbnRNYXA7TAAEcm9vdHQAJExjbG9qdXJlL2xhbmcvUGVyc2lzdGVudFZlY3RvciROb2RlO1sABHRhaWx0ABNbTGphdmEvbGFuZy9PYmplY3Q7eHIAHmNsb2p1cmUubGFuZy5BUGVyc2lzdGVudFZlY3RvckDGjt5Zq+ubAgACSQAFX2hhc2hJAAdfaGFzaGVxeHA"}},
 :db/id 1}
@huahaiy huahaiy added the bug Something isn't working label May 2, 2023
@huahaiy
Copy link
Contributor

huahaiy commented May 4, 2023

If you retract with the value, it works. (d/transact! test-conn [[:db/retract 1 :foo (type [])]])

The methods you tried are difficult to make them work, because the only way to get the exact form of the saved wrong bytes is to use the original data when retracting it.

Added a test to show this. https://github.com/juji-io/datalevin/blob/master/test/datalevin/test/transact.cljc#L800-L814

So this is a wontfix.

@huahaiy huahaiy added wontfix This will not be worked on and removed bug Something isn't working labels May 4, 2023
@huahaiy huahaiy closed this as completed in 674de23 May 4, 2023
@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

good to have a workaround!

the only way to get the exact form of the saved wrong bytes is to use the original data when retracting it

I’m not sure I’m understanding. Why does retraction for other values work providing EA and not V but not in this case?

Since V is queryable using EA and the attribute is not a ref but arbitrary EDN/bytes maybe V could be implicit in the retract to achieve consistent behavior?

@huahaiy
Copy link
Contributor

huahaiy commented May 4, 2023

It is possible, but it is difficult, as there are many issues to consider, such as multi-value, single-value, other data types, and so on. Since it is retractable with the value given, I don't see a need to change the behavior.

@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

I see. (type []) was an example and is easily reproducible. But in a real application where arbitrary values which fail to serialize can not be retracted this remains an issue especially because the following does not work:

(d/transact! test-conn [[:db/retract 1 :foo (:foo (d/entity @test-conn 1))]])
; => {:datoms-transacted 0} ;; should be 1

Removing everything under the attribute also does not work:

(d/transact! test-conn [[:db.fn/retractAttribute 1 :foo]])

I guess the only way around this is to either retract the entire entity or to try to serialize and deserialize the value and not write it to the DB on fail.

@huahaiy
Copy link
Contributor

huahaiy commented May 4, 2023

This all stems from the asymmetry of nippy serialization, allow serialization of unknown but failed to deserialize. If we make it symmetric, i.e. don't allow serialization of unknown and throw exception instead, this problem disappears.

So we can change it such that un-thawable things would not be able to be transacted, would this be a better resolution?

@huahaiy huahaiy removed the wontfix This will not be worked on label May 4, 2023
huahaiy added a commit that referenced this issue May 4, 2023
@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

Yes 100%. There's no point to transact values that can't be read (unthawed). Thanks @huahaiy

@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

just saw the fix. It's not just classes that have this issue:

(d/transact! test-conn [{:foo (defn foo [] :foo)}])
(d/touch (d/entity @test-conn 2))
{:foo #:nippy{:unthawable {:type :serializable,
                           :cause :quarantined,
                           :class-name "clojure.lang.Var",
                           :content #datalevin/bytes"rO0ABXNyABtjbG9qdXJlLmxhbmcuVmFyJFNlcmlhbGl6ZWS2pCcBU/2jLQIAAkwABm5zTmFtZXQAFUxjbG9qdXJlL2xhbmcvU3ltYm9sO0wAA3N5bXEAfgABeHBzcgATY2xvanVyZS5sYW5nLlN5bWJvbBCHbBnxuSwjAgAESQAHX2hhc2hlcUwABV9tZXRhdAAdTGNsb2p1cmUvbGFuZy9JUGVyc2lzdGVudE1hcDtMAARuYW1ldAASTGphdmEvbGFuZy9TdHJpbmc7TAACbnNxAH4ABXhwUE0/xnB0AAt0ZXNzZXJhZS5kYnBzcQB+AAOtak+bcHQAA2Zvb3A"}},
 :db/id 2}

(d/transact! test-conn [[:db/retract 2 :foo]])
=> {:datoms-transacted 1}

(d/touch (d/entity @test-conn 2))
{:foo #:nippy{:unthawable {:type :serializable,
                           :cause :quarantined,
                           :class-name "clojure.lang.Var",
                           :content #datalevin/bytes"rO0ABXNyABtjbG9qdXJlLmxhbmcuVmFyJFNlcmlhbGl6ZWS2pCcBU/2jLQIAAkwABm5zTmFtZXQAFUxjbG9qdXJlL2xhbmcvU3ltYm9sO0wAA3N5bXEAfgABeHBzcgATY2xvanVyZS5sYW5nLlN5bWJvbBCHbBnxuSwjAgAESQAHX2hhc2hlcUwABV9tZXRhdAAdTGNsb2p1cmUvbGFuZy9JUGVyc2lzdGVudE1hcDtMAARuYW1ldAASTGphdmEvbGFuZy9TdHJpbmc7TAACbnNxAH4ABXhwUE0/xnB0AAt0ZXNzZXJhZS5kYnBzcQB+AAOtak+bcHQAA2Zvb3A"}},
 :db/id 2}

(d/transact! test-conn [[:db/retract 2 :foo (:foo (d/entity @test-conn 2))]])
(d/touch (d/entity @test-conn 2))

@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

The issue might be the following in nippy. It freezes everything but thaws only some types.

taoensso.nippy/*freeze-serializable-allowlist*
=> #{"*"}
taoensso.nippy/*thaw-serializable-allowlist*
=>
#{"java.lang.NullPointerException"
  "java.time.YearMonth"
  "clojure.lang.ExceptionInfo"
  "java.time.ZoneOffset"
  "[I"
  "java.time.ZoneId"
  "clojure.lang.ArityException"
  "java.lang.ArithmeticException"
  "java.lang.Throwable"
  "java.net.URI"
  "java.lang.RuntimeException"
  "java.time.Year"
  "[B"
  "java.time.ZonedDateTime"
  "java.time.DateTimeException"
  "[D"
  "java.lang.Exception"
  "[S"
  "java.time.MonthDay"
  "java.lang.IndexOutOfBoundsException"
  "[J"
  "java.lang.IllegalArgumentException"
  "java.time.LocalTime"
  "java.time.OffsetTime"
  "java.time.LocalDate"
  "org.joda.time.DateTime"
  "java.util.UUID"
  "[F"
  "java.time.OffsetDateTime"
  "java.util.Date"
  "java.time.LocalDateTime"
  "[Z"
  "java.time.Clock"
  "[C"}

after running

(alter-var-root
  (var taoensso.nippy/*freeze-serializable-allowlist*)
  (constantly taoensso.nippy/*thaw-serializable-allowlist*))

Now this throws properly:

(d/transact! test-conn [{:bar (defn bar [] :bar)}])
Execution error (ExceptionInfo) at taoensso.nippy/throw-unfreezable (nippy.clj:1005).
Unfreezable type: class clojure.lang.Namespace

While other transactions work fine. I think de7a85d can be rolled back. I will use the above fix in all of my apps. Seems sensible to me to enforce this for the DB (only freeze what can be thawed) since the thawed values are useless and storing them leads to bugs in the DB api.

@huahaiy
Copy link
Contributor

huahaiy commented May 4, 2023

That's what de7a85d is supposed to do. We bind it instead of alter-var-root, because we don't want to disrupt other people's use of nippy.

The current code actually does throw on {:bar (defn bar [] :bar)}.

java.lang.Class is a special case that we have to single it out. Others are fine.

huahaiy added a commit that referenced this issue May 4, 2023
@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

ah okay. Great!

@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

Seems like there would be a performance hit since the binding and the into run each time serialize is invoked.

@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

@huahaiy
Copy link
Contributor

huahaiy commented May 4, 2023

I wouldn't worry about performance hit of dynamic binding for now, there are many other places that have much greater impact.

If one worry about performance, it is better to specify a typed data instead of using an Edn blob.

@den1k
Copy link
Contributor Author

den1k commented May 4, 2023

Got it. It's probably a non issue then. In my case I can't type the data because it's an arbitrary return value of code running in cells in https://github.com/lumberdev/tesserae

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants