Sample Data for Visualisation Groups #21
Comments
Thanks! You've probably noticed that some relationships express the same thing in different words, for example "was born in" and "was born at". Maybe this is already in your feature backlog, but we would also need you to merge these similar relationships. One simple heuristic that would work in most cases is to compare the entities: if two different relationships connect the same entities, the relationships most likely mean the same thing. I'd also like to point out quite a few odd bugs in the outputs: sometimes the quality metric is not a number but some random string, sometimes sentences don't correspond to the extracted relationships, and sometimes the sentences are a single letter. You've probably noticed this too. I realize it's a first output, so please keep us posted with new versions 👍 |
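For illustration, the entity-comparison heuristic could be sketched roughly as below. This is only a sketch under assumptions: the `(subject, relation, object)` tuple layout, the function name `candidate_merges`, the Jaccard overlap measure, and the `min_overlap` threshold are all made up for this example and are not taken from the project's code. It only suggests candidate pairs for review; it does not merge anything by itself.

```python
from collections import defaultdict
from itertools import combinations


def candidate_merges(triples, min_overlap=0.5):
    """Suggest relation pairs whose (subject, object) pairs largely coincide.

    `triples` is an iterable of (subject, relation, object) tuples.
    The tuple layout and the threshold are assumptions for this sketch.
    """
    # Collect the set of entity pairs seen with each relation phrase.
    pairs_by_relation = defaultdict(set)
    for subject, relation, obj in triples:
        pairs_by_relation[relation].add((subject, obj))

    suggestions = []
    for rel_a, rel_b in combinations(sorted(pairs_by_relation), 2):
        pairs_a = pairs_by_relation[rel_a]
        pairs_b = pairs_by_relation[rel_b]
        # Jaccard overlap: shared entity pairs divided by all entity pairs.
        overlap = len(pairs_a & pairs_b) / len(pairs_a | pairs_b)
        if overlap >= min_overlap:
            suggestions.append((rel_a, rel_b, overlap))
    return suggestions


# Toy data, made up for this example.
toy_triples = [
    ("Mozart", "was born in", "Salzburg"),
    ("Mozart", "was born at", "Salzburg"),
    ("Mozart", "died in", "Vienna"),
]
print(candidate_merges(toy_triples))
```

Two relations that share the same entity pairs are not guaranteed to be synonyms, so the output is best treated as a list of candidates to check rather than as final merges.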
Yes, true. I am currently trying to merge those relationships into normalized relationships, which should solve the issue of similar relationship types. However, there can be multiple different relationships between the same two entities, such as <Entity1, Composed a piece for, Entity2>, so simply merging based on the entities would be a problem. And yes, thanks for pointing out the other mistakes. The text used was a direct copy and paste of Wikipedia text without any pre-processing, which is why some sentences consisted of a single word or even just a few letters. We are building the pre-processing step for the text, which should solve this. Hopefully you will get another version of the relationships in a day or so! :) |
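As a rough illustration of the kind of filtering discussed here, the sketch below drops records whose sentence is too short or whose quality metric is not a number. The record layout is an assumption: the field names `sentence` and `confidence`, the helper names, and the `min_tokens` threshold are hypothetical and do not come from the attached files or the group's actual pre-processing.

```python
def is_usable(record, min_tokens=3):
    """Return True if the record has a numeric score and a non-trivial sentence.

    Field names "sentence" and "confidence" are hypothetical.
    """
    sentence = record.get("sentence", "")
    score = record.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    has_numeric_score = isinstance(score, (int, float)) and not isinstance(score, bool)
    has_real_sentence = isinstance(sentence, str) and len(sentence.split()) >= min_tokens
    return has_numeric_score and has_real_sentence


def filter_records(records, min_tokens=3):
    """Keep only the records that pass the basic sanity checks."""
    return [record for record in records if is_usable(record, min_tokens)]


# Toy records, made up for this example.
toy_records = [
    {"sentence": "Mozart was born in Salzburg.", "confidence": 0.9},
    {"sentence": "A", "confidence": 0.8},
    {"sentence": "Mozart composed many symphonies.", "confidence": "n/a"},
]
print(filter_records(toy_records))
```

A token-count threshold is a crude check; it would not catch sentences that are long enough but do not match the extracted relationship.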
@ansjin please supply the sample data by tomorrow's deadline at the very latest - we need this urgently! ⏩ 💨 |
See #27 |
Sample timeline data has been provided here: MusicConnectionMachine/api#22 |
How is this issue progressing? Do you have any unmet dependencies blocking this work? How much data can you provide to the visualization groups today or tomorrow? |
@vviro Currently, everything on the algorithm side is in place.
Once the connection part is done, we can provide the complete data to them! |
@ansjin good! Are there any issues blocking the connection part from working or is it simply a matter of implementing it? |
It's mostly a matter of implementing it. Take a look at our issue tracker in the new Relationships project to see our current progress. |
@kordianbruck Why wait? Does nobody need sample data anymore? |
I think this issue is outdated anyways… Closing for now |
Yea, I wasn't sure about it. Thanks for closing it. |
Attached below are the results of the different algorithms in JSON format (rename the files from .txt to .json). Most of the data is kind of junk, though, since the sentences were picked up directly from the wiki page.
The other.txt file contains some useful results, which we got after training one of the algorithms on some training data. But it also contains false relationships 🤐
All of this data and these relationships are about Mozart.
We will keep posting updated outputs here! Do tell us if you need anything else.
@MusicConnectionMachine/group-5 @MusicConnectionMachine/group-6
input.txt
ollie.txt
open_ie.txt
other.txt