Transformer between importer and data store #1305

Open
jkiviluo opened this issue Mar 17, 2021 · 12 comments
Labels: Data import/export, enhancement (Enhancement of existing feature), Feature (Possible new feature)

@jkiviluo
Member

jkiviluo commented Mar 17, 2021

Data Transformers cannot currently be placed between an Importer and a Spine Data Store. Support for that would be nice.

@jkiviluo jkiviluo added the Feature Possible new feature label Mar 17, 2021
soininen added a commit to spine-tools/spine-items that referenced this issue Mar 18, 2021
Data Transformer didn't properly notify successor items if it got a new
database URL from predecessor Data store.

Re spine-tools/Spine-Toolbox#1305
@soininen
Contributor

Looks like I broke Data transformer with my latest changes. It should be fixed now, at least when connected between a Data store and an Exporter. Making the transformer work between an Importer and a Data store is a completely different story. While waiting for that to be resolved, you can:

  1. Import data to a temporary database and connect that database to the actual database via a data transformer.
  2. Write a Tool script that does the needed transformations to the source data before feeding it to the importer.
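
As an illustration of the second workaround, here is a minimal sketch of a pre-processing script that a Tool could run before the Importer. It assumes the source data is a CSV file and that the transformation is a simple rename of values in one column. The column name `unit`, the `RENAMES` mapping, and the function names are all hypothetical and not part of Spine Toolbox.

```python
import csv
import io

# Hypothetical mapping from names in the source data to the names
# the importer should see. Not taken from any real project.
RENAMES = {"unit_A": "unit_a", "unit_B": "unit_b"}


def transform_rows(rows, column, renames):
    """Yield copies of the rows with the values in `column` renamed."""
    for row in rows:
        new_row = dict(row)
        new_row[column] = renames.get(row[column], row[column])
        yield new_row


def transform_csv(source, target, column, renames):
    """Read CSV text from `source`, rename values in `column`, write to `target`."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(transform_rows(reader, column, renames))


if __name__ == "__main__":
    # In a real Tool the source and target would be files on disk;
    # in-memory text buffers keep this sketch self-contained.
    source = io.StringIO("unit,capacity\nunit_A,100\nunit_C,50\n")
    target = io.StringIO()
    transform_csv(source, target, "unit", RENAMES)
    print(target.getvalue())
```

The Importer would then read the transformed file instead of the original one.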

@soininen soininen added this to the V1.0 milestone Mar 18, 2021
@jkiviluo
Member Author

In my particular case, I can do the transformation between DB and exporter. I will change the issue name to Importer - Data Store.

@jkiviluo jkiviluo changed the title Transformer, exporter and importer Transformer between importer and DB Mar 18, 2021
@jkiviluo jkiviluo changed the title Transformer between importer and DB Transformer between importer and data store Mar 18, 2021
@soininen
Contributor

Thanks for updating the title and description. The actual feature request here is now much clearer.

I can think of two ways this could be done:

  1. Apply the transformation at import time in import_mappings. How feasible this is, and how much it would complicate the API, needs to be investigated.
  2. Apply transformations in place after import, i.e. import data as-is, then transform the data within the database. This would be nice in that we could do these transformations to any existing database at any time. Problems might arise with name clashes at import, though. No idea if this is even feasible.

@soininen
Contributor

In-place transformation is actually already doable: just connect two data stores pointing to the same database via a Transformer. Case solved.

@jkiviluo
Member Author

I wouldn't put a high priority on this. Quite acceptable functionality can be achieved by having a transformer between two data stores, which is supported.

@jkiviluo
Member Author

jkiviluo commented Mar 18, 2021

And your solution is even nicer. Although how does it play with DAG order?

@soininen
Contributor

> Although how does it play with DAG order?

You have two data stores using the same database. That plays very well with the DAG.

@jkiviluo
Member Author

Ok, right. I thought you meant that there would be a small loop from DS to transformer and back to DS.

@manuelma
Collaborator

How about having the Data Transformer (DT) advertise an in-memory database backwards?

Importer -> DT -> DS

Importer would import data into the in-memory db, DT would apply the 'transformation filter' on that db, and the Data Store (DS) would merge that db into its own physical db.

That could work if in-memory dbs were shareable by URL, but they are only shared by 'connection instance'...
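
The limitation can be reproduced with plain SQLite, independently of Spine. The following is a minimal sketch using Python's standard `sqlite3` module, on the assumption that Spine's in-memory databases behave like ordinary SQLite in-memory databases in this respect. The table name `imported` and the function name are hypothetical.

```python
import sqlite3


def in_memory_is_private():
    """Show that two connections to ':memory:' do not share data."""
    first = sqlite3.connect(":memory:")
    second = sqlite3.connect(":memory:")
    try:
        # Write something through the first connection only.
        first.execute("CREATE TABLE imported (name TEXT)")
        first.execute("INSERT INTO imported VALUES ('unit_a')")
        first.commit()
        rows_in_first = first.execute("SELECT name FROM imported").fetchall()
        # The second connection opened the "same URL" but sees no tables at all.
        tables_in_second = second.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
        return rows_in_first, tables_in_second
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(in_memory_is_private())
```

Each connection gets its own private database, so passing the same in-memory URL from one project item to another would not hand over the imported data.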

@soininen
Contributor

> That could work if in-memory dbs were shareable by URL, but they are only shared by 'connection instance'...

Indeed, that makes them unusable in many scenarios, unfortunately.

@manuelma
Collaborator

I don't know, there might be a way... The double DS pointing to the same URL solution is good, but might be a little bit too clever, don't you think?

On the other hand, Importer -> DT -> DS seems logical. It's only an implementation detail on our part that prevents it from working, right? (that we only share stuff by URL)

@soininen
Contributor

> It's only an implementation detail on our part that prevents it from working, right? (that we only share stuff by URL)

Right. We could make Importer -> DT -> DS work with URLs, for example, if DT passed the DS's URL to Importer together with some clever write-to-temporary-alternative filter. Importer would then write to that alternative. When DT's turn to execute came, it would transform the data from the special alternative in place.
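
The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Spine's actual schema or API: the `parameter_value` table, the `__import_staging__` alternative name, and all function names are hypothetical, and plain `sqlite3` stands in for the data store.

```python
import sqlite3

# Hypothetical name of the temporary alternative the importer writes to.
STAGING = "__import_staging__"


def create_schema(conn):
    """Create a toy table standing in for the data store (not Spine's schema)."""
    conn.execute(
        "CREATE TABLE parameter_value (alternative TEXT, name TEXT, value REAL)"
    )


def import_to_staging(conn, rows):
    """Importer step: write (name, value) rows under the staging alternative."""
    conn.executemany(
        "INSERT INTO parameter_value (alternative, name, value) VALUES (?, ?, ?)",
        [(STAGING, name, value) for name, value in rows],
    )
    conn.commit()


def transform_in_place(conn, renames, target="Base"):
    """Transformer step: rename staged rows and move them to `target`."""
    staged = conn.execute(
        "SELECT rowid, name FROM parameter_value WHERE alternative = ?",
        (STAGING,),
    ).fetchall()
    # Update row by row so that each staged row is renamed at most once.
    for rowid, name in staged:
        conn.execute(
            "UPDATE parameter_value SET name = ?, alternative = ? WHERE rowid = ?",
            (renames.get(name, name), target, rowid),
        )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    import_to_staging(conn, [("unit_A", 100.0)])
    transform_in_place(conn, {"unit_A": "unit_a"})
    print(conn.execute("SELECT * FROM parameter_value").fetchall())
```

Because only rows under the staging alternative are selected, data that already existed in the database is left untouched by the transformation.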

@nnhjy nnhjy added the enhancement Enhancement of existing feature label Sep 14, 2021
@soininen soininen removed their assignment May 15, 2024