
Editable schema #174

Open
metasoarous opened this issue Aug 19, 2016 · 9 comments

Comments

@metasoarous

I have an implementation of this in Datsync (here), and I'd really rather have it just be a function in datascript.core if you're amenable. More or less, it creates a new database from the old datoms and the updated schema.

There's a question about what should be sent to listeners when this happens. It could perhaps be a more or less empty transaction report with keys :schema-changes, :new-schema, and :old-schema.

Happy to clean this up and PR if you like.

@tonsky
Owner

tonsky commented Aug 24, 2016

Replacing the schema is trivial to do in user code. What I don't want to do is provide a schema-altering fn that does that while ignoring all the issues that arise from the fact that data stored in DS might not suit the new schema (unique values, ref types, arities). If I ever provide such a fn, it should deal with all those issues (roughly the same way Datomic deals with them: checking constraints, raising errors, not allowing certain migrations, etc.). I'm not against changing the schema, but for now the user of the DS library is responsible for all the inconsistencies. That's why I think such a fn should not be included in DS yet (but all the tools to build it yourself are already there)

@metasoarous
Author

Fair enough. In my case (datsync), the only schema changes that come through are ones that have already passed through Datomic, so they've more or less been "vetted". But you're right that there shouldn't be surprises in something that's a core part of DataScript, and it would be easy for someone not thinking about the consequences to goof. Here are my thoughts on these issues:

  • ref vs. other types: Datomic actually doesn't let you change :db/valueType. So we could scan the datoms and make sure no attributes are being set as reference attributes when they've already been used for other data (and vice versa).
  • uniqueness: Changing from unique to non-unique is pretty straightforward in Datomic, but the other direction is only allowed in certain cases (the values are already unique and indexed, or there are no values). We could mimic that or simply disallow it, depending on how much work we're willing to put in.
  • cardinality: Going from cardinality one to many is straightforward, but many to one requires deciding which value to keep. This may actually be simple: under the new schema, I think passing in the old datoms would deduplicate all but one of the values. If I'm wrong about that, we could either disallow the change once there are multiple values, or keep only the most recently asserted value.

Here are the relevant Datomic docs for reference: http://docs.datomic.com/schema.html#Schema-Alteration.
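The first check could be sketched roughly like this (a hypothetical helper, not an existing DataScript fn; assumes datascript.core is required as d):

```clojure
(require '[datascript.core :as d])

;; Hypothetical sketch: before switching attr's :db/valueType to
;; :db.type/ref, verify it has only ever held entity ids (numbers).
(defn safe-ref-change? [db attr]
  (every? #(number? (:v %)) (d/datoms db :aevt attr)))
```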

I get that since it's possible for users to deal with these issues themselves when they need to, this isn't a high priority in your book. However, I think for new users especially, it may not be obvious how they might use init-db and swap! to do this themselves, let alone handle all the issues mentioned above. And I think this puts a damper on iterative/interactive development. With that in mind, if someone came up with an implementation that dealt with all these issues sufficiently, would you be open to including it in DataScript?

@tonsky
Owner

tonsky commented Aug 24, 2016

I think schema migration should work one of three possible ways:

  1. Always accept changes that are safe (one to many, unique to non-unique)
    • adjust internals if needed (index to no index and vice versa)
  2. Validate changes that are allowed but can be unsafe (many to one, non-unique to unique)
    • throw if a constraint is not satisfied under the new schema
  3. Deny impossible changes (value type)

The important point here is that DataScript will not change any user data as a result of a schema alteration. That way the user has a chance to clean up the data in whatever way suits them, and DataScript keeps them safe by providing guarantees.
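The three cases above could be sketched roughly as follows (a hypothetical fn, not an existing DataScript API; only a couple of the checks are shown):

```clojure
(require '[datascript.core :as d])

;; Hypothetical sketch of the three-way policy, for a single attribute.
(defn alter-attr [db attr old-props new-props]
  (cond
    ;; 3. deny impossible changes
    (not= (:db/valueType old-props) (:db/valueType new-props))
    (throw (ex-info "Cannot change :db/valueType" {:attr attr}))

    ;; 2. validate allowed-but-unsafe changes against existing data
    (and (= :db.cardinality/many (:db/cardinality old-props))
         (= :db.cardinality/one  (:db/cardinality new-props)))
    (let [per-entity (group-by :e (d/datoms db :aevt attr))]
      (if (every? #(<= (count (val %)) 1) per-entity)
        :ok
        (throw (ex-info "Entities hold multiple values" {:attr attr}))))

    ;; 1. always accept safe changes (one->many, unique->non-unique, ...)
    :else :ok))
```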

This would be an important piece of DataScript and I'll be happy to include it.

@metasoarous
Author

Agreed; sounds good.

@alexandergunnarson

alexandergunnarson commented Jan 19, 2017

@tonsky I have an implementation of what you and @metasoarous describe in posh.sync.schema (mainly in ensure-schema-changes-valid of a Posh PR I'm working on). I'll PR it to DataScript once it's stable. Just thought I'd give you an opportunity to take a look.

@darkleaf
Contributor

darkleaf commented Jan 8, 2021

> Replacing schema is trivial to do in user code.

@tonsky did you mean the init-db function?

If I want to add new attributes and don't touch already existing ones, then can I use a function like this?

;; just an example
(defn patch-schema [db patch]
  (let [schema     (:schema db)
        new-schema (merge patch schema)   ; existing attributes win over the patch
        datoms     (d/datoms db :eavt)]
    (d/init-db datoms new-schema)))

The order of the merge arguments is important. It would probably be better to check for conflicts there.
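Such a conflict check might look like this (a sketch only; patch-schema-checked is a hypothetical name, not an existing fn):

```clojure
(require '[datascript.core :as d])

;; Hypothetical sketch: refuse a patch that redefines existing attributes.
(defn patch-schema-checked [db patch]
  (let [schema    (:schema db)
        conflicts (filter #(contains? schema %) (keys patch))]
    (when (seq conflicts)
      (throw (ex-info "Patch redefines existing attributes"
                      {:conflicts (vec conflicts)})))
    (d/init-db (d/datoms db :eavt) (merge schema patch))))
```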

@tonsky
Owner

tonsky commented Jan 9, 2021

Seems fine, but merge should be something like merge-with merge, with patch applied over schema. The validation would be very tricky, which is the main reason I left it unimplemented. If you know what you are doing is safe, a function like this should be fine
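That suggestion would amount to something like the following (a sketch under the same assumptions as the snippet above; patch-schema-deep is a hypothetical name):

```clojure
(require '[datascript.core :as d])

;; Sketch: per-attribute property maps are merged, with the patch
;; winning over the old schema on conflicting properties.
(defn patch-schema-deep [db patch]
  (d/init-db (d/datoms db :eavt)
             (merge-with merge (:schema db) patch)))
```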

@darkleaf
Contributor

darkleaf commented Jan 10, 2021

Updating an existing database is very tricky and requires support for new features like tuples.
But what if we just recreate the db from scratch?

(defn with-schema [db new-schema]
  (-> (d/empty-db new-schema)
      (with-meta (meta db))
      (d/db-with (d/datoms db :eavt))))

(defn vary-schema [db f & args]
  (with-schema db (apply f (d/schema db) args)))

Validation is done by d/db-with.
This way, we can use reliable but not-so-fast functions, because they recreate the indices.

I use a DB as a value. So I can use prototypes:

(def blank-entity
  (-> (d/empty-db {:error/entity {:db/valueType :db.type/ref}})
      (d/db-with [{:db/id    1
                   :db/ident :root}])))

(def blank-user
  (-> blank-entity
      (vary-schema assoc :user/aka {:db/cardinality :db.cardinality/many})))

(def john (-> blank-user
              (d/db-with [{:db/ident  :root
                           :user/name "Maksim"
                           :user/aka  ["Max Otto von Stierlitz", "Jack Ryan"]}])))

Can I send PR with these functions?

@tonsky
Owner

tonsky commented Jan 11, 2021

This is what I am thinking:

  • This implementation is very inefficient
  • Can be reproduced in user code trivially if needed

If at any point in the future we introduce actual schema migrations, they will probably have a different API that allows for a more efficient implementation (e.g. by providing schema deltas instead of an entirely new schema). So I don’t want to be locked into this API right now
