This repository has been archived by the owner on Jul 25, 2018. It is now read-only.

Events #22

Closed
chaoran-chen opened this issue Mar 20, 2017 · 27 comments

Comments

@chaoran-chen
Member

chaoran-chen commented Mar 20, 2017

For the timeline component that we recently created, we would like to be able to fetch an artist's event data.

GET /artists/:artist_id/events
Output:

200 --> Array.<Event>

where Event is defined as follows:

/**
 * @typedef {Object} Event
 * @property {string} date - the date of an event
 * @property {string} title - the title should be short enough to be displayed in the small box inside of the timeline
 * [@property {string} [description] - a longer description that will be displayed in a tooltip] - not available for the moment -
 * [@property {string} [icon] - the URL to an icon] - can not be provided by the API -
 * @property {string} [link] - the URL to a page that provides more information
 */

If other attributes such as category can be provided, we can certainly also use it. Some ideas for events are listed here.
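
As a rough illustration of how a front-end might consume this endpoint, the sketch below builds the request path and checks that each response item has the fields from the typedef. Only the route and the field names come from this issue; `eventsPath`, `isEvent` and `filterEvents` are hypothetical helper names.

```javascript
// Hypothetical helpers; only the route and field names come from the issue text.

/** Builds the path for GET /artists/:artist_id/events. */
function eventsPath(artistId) {
  return `/artists/${encodeURIComponent(artistId)}/events`;
}

/** True if `value` has the required Event fields and, if present, a string link. */
function isEvent(value) {
  if (value === null || typeof value !== 'object') return false;
  if (typeof value.date !== 'string' || typeof value.title !== 'string') return false;
  return value.link === undefined || typeof value.link === 'string';
}

/** Keeps only well-formed events from a parsed response body. */
function filterEvents(body) {
  return Array.isArray(body) ? body.filter(isEvent) : [];
}
```

Filtering instead of throwing is a design choice here: a timeline can still render the valid events if a few items are malformed.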

@chaoran-chen
Member Author

@MusicConnectionMachine/group-3, @MusicConnectionMachine/group-4 Please assign this issue to one of you and tell us by when you can provide the API. Thanks!

@Sandr0x00

Since this involves @MusicConnectionMachine/group-1 and @MusicConnectionMachine/group-2 too, I added them to this conversation.
If I recall correctly, @MusicConnectionMachine/group-3 and @MusicConnectionMachine/group-4 will only provide the data for title and start, link (that's where we get our data from, initially given by G1/G2).
We can't provide anything for:
description: we have only relationship-triplets, no description,
icon: we work on text, not pictures,
linkType: Please explain that in more detail @chaoran-chen.

@chaoran-chen
Member Author

There are two possible values for linkType:

  • "internal": The link points to an IMSLP page. It can be used for events such as "First piece published", in which case the event should contain a link to the IMSLP page of that piece.
  • "external": The link points to another external site that contains more information about the event.

Would it make sense to use the complete sentence or paragraph that is the source of the relationship as the description?

@ansjin
Member

ansjin commented Mar 27, 2017

For the timeline component, we can only provide the data in the form of triplets:

E1: Subject
E2: Date for the event
Relationship: Relationship/ A small description which is mentioned as part of the sentence.

If you want it the other way around, like:

Entity: E1
Date: Some Date
Description: ...

Date: Some other Date
Description: ...

And so on...

Then it's a completely different approach from extracting relationships from the text. I can't guarantee that we could provide data like this. For us, this would mean extracting events from the text, not relationships 🤐

And regarding the other things:

  • start: We cannot tell you whether it's a start date or an end date. For example:
    Entity 1 was born on 5th March 1786.
    Entity 1 died in 1830.
    Here it will always be a single date, not an interval, so it would be better to have date rather than start as the parameter.

  • title, description: For us, title and description will be the same thing (the relationship). It's up to you how you want to use and display them.

  • icon: How can we provide this? It's totally unrelated to our data.

  • link: We can reference the URL from which we extract relationships in our relationship table, so you can get the link from there.

  • [linkType='internal']: We can't provide this information. We will reference the URL table; from there you can ask group 1 or group 2 to store this as an extra parameter.
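
One way to read the points above is that a triplet maps onto the requested Event shape, with the relationship text serving as the title. A minimal sketch, assuming hypothetical triplet field names `subject`, `date`, `relationship` and `sourceUrl` (the actual table columns are not given in this thread):

```javascript
// Assumed triplet shape: { subject, date, relationship, sourceUrl }.
// These field names are illustrative only.
function tripletToEvent(triplet) {
  const event = {
    date: triplet.date,          // a single date, not a start/end interval
    title: triplet.relationship, // title and description are the same text
  };
  if (triplet.sourceUrl) {
    event.link = triplet.sourceUrl; // URL the relationship was extracted from
  }
  return event;
}
```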

@Sandr0x00

As @ansjin said for his Group, G4 also has no information about IMSLP. We can only give you the URL from G1/2.
To clarify (since I wrote it a bit too shallow): We also do not know start or end, we just know "event | happened at | time"

@chaoran-chen
Member Author

Thanks for your answers, I've changed the initial post accordingly.

Can you provide any information about the type of an event? E.g., personal (birth, death, marriage, illness), "career" (composed something, ...), other people speaking about him/her, ... ?

@Sandr0x00

Sorry, can't answer that yet since the extraction of dates and times just came up last week and we (G4) just started with it. This Issue was set to low prio in our project (MusicConnectionMachine/RelationshipsG4#47)
I will inform you, when we continue working on that.

@ansjin
Member

ansjin commented Mar 29, 2017

@chaoran-chen I was able to extract the attached data for the timeline. I think the data looks good for creating a timeline, and it is in the order in which the events happened in the entity's life.

You will find the attached input and output JSON files for Bach and Mozart.

It's in this format :


    {
        "start": "1763",
        "end": "1766",
        "event": "In the years 1763 - 1766, Mozart, along with his father Leopold, a composer and musician, and sister Nannerl, also a musically talented child, toured London, Paris, and other parts of Europe, giving many successful concerts and performing before royalty."
    },
    {
        "start": "November 1766",
        "event": "The Mozart family returned to Salzburg in November 1766."
    },

Note:

  • For some objects the end date might not exist as there are no intervals mentioned.

  • Also, my current logic handles at most two dates per sentence; if a single sentence contains more than two dates, they won't be detected.

Please check this data and see whether we can finalize it and prepare the API for it.

bach_output.txt
mozart.txt
mozart_output.txt
bach.txt

@chaoran-chen
Member Author

chaoran-chen commented Mar 29, 2017

Thanks, @ansjin, it looks great! I've three further questions:

  • Would it be possible to also pass a more easily processable version of the date (e.g., {day: undefined, month: 1, year: 1762} instead of "January, 1762")? (If it is too much work and you don't have time right now, I can also do it in the front-end.)
  • I guess that it is not a problem to provide a link to the source - right?
  • Can you provide any information about the type of an event as I have asked three posts ago?
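
For the first question, the conversion could also be done on the front-end side. The sketch below is one possible approach, assuming the date strings look like the examples in this thread ("1763", "November 1766", "January, 1762", "27 January 1756"); `parseEventDate` is a hypothetical name, and other formats or languages are not handled.

```javascript
// Month names as they appear in the extracted strings (English only).
const MONTHS = [
  'january', 'february', 'march', 'april', 'may', 'june',
  'july', 'august', 'september', 'october', 'november', 'december',
];

/**
 * Parses strings such as "1763", "November 1766", "January, 1762" or
 * "27 January 1756" into { day, month, year }. Parts that are not
 * present stay undefined. Returns null when no year is found.
 */
function parseEventDate(text) {
  const parts = { day: undefined, month: undefined, year: undefined };
  const tokens = text.toLowerCase().split(/[\s,]+/).filter(Boolean);
  for (const token of tokens) {
    const monthIndex = MONTHS.indexOf(token);
    if (monthIndex !== -1) {
      parts.month = monthIndex + 1;           // 1-based month number
    } else if (/^\d{3,4}$/.test(token)) {
      parts.year = parseInt(token, 10);       // three or four digits: a year
    } else if (/^\d{1,2}(st|nd|rd|th)?$/.test(token)) {
      parts.day = parseInt(token, 10);        // "27" or "5th"
    }
  }
  return parts.year === undefined ? null : parts;
}
```

Treating any three- or four-digit number as the year is a simplification that fits the sample data but would misread years below 100.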

@ansjin
Member

ansjin commented Mar 29, 2017

Thanks!

  • I have to check this. Maybe I will add it at a later point, if that's not a problem for you?

  • The link to the source will be added when we create the API. For extracting a relationship I will be using some URL, or some text linked to a URL, so it can easily be referenced from there.

  • I think the event type will take time, as I have to parse the text and extract its meaning, so I can't promise anything on this.

@ansjin ansjin added the Feature label Mar 29, 2017
@chaoran-chen
Member Author

Okay, that's fine! Thanks for your effort, I'm looking forward to seeing the API online :)

@chaoran-chen
Member Author

Hey. Might I ask if you've already started with this task? When can we expect to have it ready?

@chaoran-chen
Member Author

chaoran-chen commented Apr 3, 2017

Please give us an estimate of when it can be implemented. If you can't finish it in the next few days, maybe you could provide a real and roughly complete dataset for Mozart (or Bach) so that we can get a feeling for how many events there will be and how long the texts are.

@ansjin
Member

ansjin commented Apr 5, 2017

@chaoran-chen and @vviro

Currently we take the WET file given to us by group 2 (later on we will get it from the DB) -> pass it to the algorithms -> get the output events/relationships -> store them in the DB (currently a local DB).

But the problem is that either we don't have data to run our algorithms on, or the data is so bad that our algorithms don't produce meaningful results. See here MusicConnectionMachine/Relationships#27

Only if @MusicConnectionMachine/group-2 can provide us with some meaningful data (like the scraped Wikipedia page) can we give you results.

@chaoran-chen
Member Author

As I have seen, you already received a set of scraped Wikipedia pages. Are they suitable for extracting event data?

And one more thing: please provide a unique id to every event.

@Sandr0x00

Sandr0x00 commented Apr 7, 2017

The data has a unique id from the db. So we can ✔️ on that.
We ran our algorithm over the data and got some results.
Currently the data is only in the local DBs, where it is saved like this:
id: b85fbef0-2cf5-4a43-a437-65e6058ee2ce
start: 1100
end:
event: Monophonic chant, also called plainsong or Gregorian chant, was the dominant form until about 1100.[36] Polyphonic (multi-voiced) music developed from monophonic chant throughout the late Middle Ages and into the Renaissance, including the more complex voicings of motets.

I can give you a CSV-file from my local db, with data extracted from the wiki-file, but until now we have no data in the db on Azure. Don't know the status of the API (to give you the real request-response stuff) to be honest. Maybe @Henni has some overview here.
Edit: The api seems to be live on Azure: #88 (comment)

So, at the moment do not expect too much of our date-extraction, because still, this Date-Extraction is only a side-job from our relationship-stuff, therefore still low priority, like @kordianbruck added here (MusicConnectionMachine/RelationshipsG4#47 (comment))

On the other side, the code for the extraction is written.

The only problem is that we have no way of linking an event to a certain person at the moment, because 1: we have no data from @MusicConnectionMachine/group-1 in our db, 2: we have no link from the data of G1 to the blob of G2, and 3: we don't know exactly which person/music piece/instrument/animal/thing the event belongs to; we can only assume it belongs to the entity we are currently processing, and that can cause:
start: 27 January 1756
event: Wolfgang Amadeus Mozart was born on 27 January 1756 to Leopold Mozart
to be linked to Beethoven, if Beethoven is the entity we are currently processing, because we can't link Wolfgang Amadeus Mozart to the WAM already in the DB from G1, because we simply do not know it is the same person.
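
The mis-attribution described above could be partly guarded against by attaching an event to the current entity only when the sentence actually names it. The sketch below is a naive string-matching heuristic for illustration, not the groups' linking approach; `mentionsEntity` and the alias list are hypothetical.

```javascript
/**
 * Returns true if the event text contains at least one of the given
 * names for the entity (case-insensitive substring match).
 * Empty aliases are ignored so they cannot match everything.
 */
function mentionsEntity(eventText, aliases) {
  const haystack = eventText.toLowerCase();
  return aliases.some(
    (alias) => alias.length > 0 && haystack.includes(alias.toLowerCase())
  );
}
```

Note the limits of such a check: matching on the surname "Mozart" alone would also accept a sentence that only mentions Leopold Mozart, so it reduces wrong links but does not resolve which person is meant.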

Maybe @MusicConnectionMachine/group-1 has some "events" from the structured sources.

@kordianbruck
Contributor

@SANDR00 update on this?

@sacdallago
Member

Super needed

@Sandr0x00

Sandr0x00 commented Apr 18, 2017

We still have no connection between G1s entities and our entities, because it was low prio before yesterday.
Since this is now HP, we will start with that. I do not have much time at the moment, but maybe another one of @MusicConnectionMachine/group-3 or @MusicConnectionMachine/group-4 can do that. Otherwise I can still do it in the Hackathon, but maybe this will be too late.

@sacdallago
Member

The Hackathon is too late, unfortunately: if things go wrong, there won't be enough time to fix them and still run everything, considering it's on a Saturday and we are supposed to release on Sunday. So could someone else please try this out before then, i.e. tonight, tomorrow or Thursday?

@Sandr0x00

@ansjin said in the chat, that he has a bit more time from 20.4.
Maybe he's your man 😉

@ansjin
Member

ansjin commented Apr 18, 2017

As @SANDR00 mentioned, yes I will be free from 20th so I will look into this after that!

@chaoran-chen
Member Author

What's the current status here? I've just taken a look into the events table: there is still no entityId.

@ansjin
Member

ansjin commented May 1, 2017

Please check the DB, there is already an entityId associated with events.

This issue can be closed now!

@ansjin ansjin closed this as completed May 1, 2017
@chaoran-chen
Member Author

Thank you very much, @ansjin! I found entityId but it's only in mcmprod and not in mcm. Do you know if the API is already using mcmprod? (or maybe @sacdallago, @kordianbruck?)

@ansjin
Member

ansjin commented May 2, 2017

@chaoran-chen You're welcome. I am not sure about the API. Also, there are currently around 25K relations and 125K events stored in the DB; maybe you can try using this data and give us feedback on it!

@Henni
Contributor

Henni commented May 2, 2017

@chaoran-chen I just switched the API to mcmprod. But there are still a few schema validation errors, thanks to how swagger handles null values.
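
One common workaround when a schema validator rejects null values is to drop null-valued properties before the response is validated. The sketch below illustrates that idea under the assumption that the errors come from optional fields such as `end` being stored as null; it is not necessarily what was done in the API, and `omitNulls` is a hypothetical helper.

```javascript
/** Returns a shallow copy of `record` without null or undefined properties. */
function omitNulls(record) {
  const cleaned = {};
  for (const [key, value] of Object.entries(record)) {
    if (value !== null && value !== undefined) {
      cleaned[key] = value;
    }
  }
  return cleaned;
}
```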


6 participants