cswatt/advanced-db
[001] List of Files
[002] How to Run
[003] High-Level Description
[004] Freebase API Key

==================================
[001] LIST OF FILES
==================================

Retriever.java

[creator package]
Creator.java
EntityBoxCreator.java
QueryBoxCreator.java

[entities package]
Actor.java
Author.java
BusinessPerson.java
Coach.java
Film.java
League.java
Organization.java
Person.java
Player.java
Spouse.java
Team.java

[output package]
Output.java
EntityBox.java
QueryBox.java

[result package]
JSONable.java
MQLResult.java
Result.java
Role.java
SearchResult.java
TopicResult.java
Type.java
Value.java
Values.java

[service package]
Service.java
SearchService.java
TopicService.java
MQLService.java

==================================
[003] HIGH LEVEL DESCRIPTION
==================================

The main method in Retriever prompts the user for input.

[INFOBOX QUERY INPUT]

If the user enters an infobox query, an EntityBoxCreator object will be instantiated. EntityBoxCreator creates a SearchService object using the provided API key and the query. The requestInfo() method in SearchService is then called. This will retrieve a maximum of 20 results using the Freebase Search API, from which TopicResult objects will be created based on the retrieved mIds, using the TopicService methods, in order to see what types of entities each mId matches to.

We construct TopicResult objects for each of these search results to see all of the entities they have performed a role as. It would be more lightweight to simply check the "notable" field that is easily accessed through the Search API, but this would only show people whose most notable role is as a businessperson, actor, or author. Using the TopicResult objects, we are able to more comprehensively see all the entities a person has.

Each SearchService object has a set of the entities the program is looking for, which can be PERSON, BUSINESSPERSON, ACTOR, AUTHOR, LEAGUE or TEAM.
The algorithm first examines the first five Topic Results; if none of those types is contained in them, the next five Topic Results are checked, and so on until a total of 20 Topic Results have been examined. If none of the 20 Topic Results, which are created from the 20 mIds returned by the Freebase Search API, contains any of those types, an error message is printed and the program exits. Otherwise, the first mId that corresponds to at least one type in the above set is chosen, and a SearchResult object is created. SearchResult contains the entity's name, mId, id and notable_id, although essentially only the mId is used by the program.

In the next step, EntityBoxCreator fetches the mId and creates a TopicService object using the API key and the mId. The requestInfo() method in TopicService is then called, which calls the Freebase Topic API with the given mId and creates a TopicResult object by parsing the json document returned by the API. The mapping of the values of the json document happens inside the TopicResult instance, where the different types associated with the topic are identified. Then, given the identified types, the respective mapping methods are called; these parse the json document in search of the fields required to form the output.

Specifically, the values from the json documents mapped to the different entities in the program are the following:

--- PERSON ENTITY

[/type/object/name] --> name of the person
A person can only have one name, therefore one value object for that field exists inside the json document. We use the text field of the value object to retrieve the name of the person.

[/people/person/date_of_birth] --> date of birth of the person
A person can only have one date of birth, therefore one value object for that field exists inside the json document. We use the text field of the value object to retrieve the date of birth of the person.

[/people/person/place_of_birth] --> place of birth of the person
A person can only have one place of birth, therefore one value object for that field exists inside the json document. We use the text field of the value object to retrieve the place of birth of the person.

[/people/deceased_person/date_of_death] --> date of death of the person
A person can only have one date of death, therefore one value object for that field exists inside the json document. We use the text field of the value object to retrieve the date of death of the person.

[/people/deceased_person/place_of_death] --> place of death of the person
A person can only have one place of death, therefore one value object for that field exists inside the json document. We use the text field of the value object to retrieve the place of death of the person.

[/common/topic/description] --> person's description
A person can only have one description, therefore one value object for that field exists inside the json document. Because the text field of the value object contains a truncated version of the description, we use the value field to retrieve the full text.

[/people/deceased_person/cause_of_death] --> causes of death of the person
A person can have multiple causes of death, or have their cause of death described in various ways. Therefore, more than one value object might exist for that field inside the json document. To map the different value objects, we use the Values class, which creates instances of the Value class. Both classes implement the JSONable interface, which uses the Google-Gson library to map annotated fields inside a json document to specific fields in the class. The Value class contains the following fields, which are mapped with the help of the Google-Gson library: text (String), value (String), id (String), timestamp (String), property (JSONObject). The property field might contain more information structured in value objects.
Finally, we use the text field of each mapped value object to retrieve the various causes of death for the person.

[/people/person/sibling_s] --> siblings of the person
A person can have more than one sibling. Therefore, as with the 'cause of death' field, we map the different value objects to instances of the Value class. Because the json document includes more information than we need, further processing is required compared to the previous cases. The required information resides within the property field of each Value instance. We retrieve it by mapping the additional field [/people/sibling_relationship/sibling] and then use the text field of the returned value object to get the name of the sibling. Since a sibling can have only one name, one value object for that field exists inside the json document.

[/people/person/spouse_s] --> spouses of the person
A person can have more than one spouse over the course of their life. Because we require additional information related to the spouse, such as the period of the marriage and its location, we create a Spouse instance for each spouse that contains all this information. Therefore, as with the siblings field, we map the value objects to Value instances and then parse the property json object for the following values: [/people/marriage/spouse], [/people/marriage/from], [/people/marriage/to], [/people/marriage/location_of_ceremony] to retrieve the spouse's name, the period of marriage, and the marriage location respectively. Because each of those values exists only once for each spouse of the person, one value object for each field exists inside the json document. We use the text field to retrieve the required information.

--- BUSINESS PERSON ENTITY

A business person might be associated with an organization in various ways (as founder, leader, or board member).
Once a new organization (a new name) is retrieved from the json document, a new instance of the Organization class is created and the additional information related to it is added to the instance's fields. If the organization has already been detected, because the person is associated with it in more than one way, the existing instance is found and the additional information is added to it.

[/organization/organization_founder/organizations_founded] --> organizations founded by the person
A person can be the founder of more than one organization. We are only interested in the names of those organizations, so after we map the json section of the document to the Values instance, we parse each value object and retrieve its text field to get the name.

[/business/board_member/leader_of] --> organizations the person is a leader of
A person can be the leader of more than one organization. We are interested in the organization's name, the person's title and role, and the period of leadership. Therefore, after mapping the value objects to Value instances, we parse their property field for [/organization/leadership/organization], [/organization/leadership/title], [/organization/leadership/role], [/organization/leadership/from], [/organization/leadership/to] respectively. We use the text fields of those value objects, parsed from the property JSONObject, to retrieve the required information.

[/business/board_member/organization_board_memberships] --> organizations the person is a board member of
A person can be a board member of more than one organization. We are interested in the organization's name, the person's title and role, and the period of membership.
Therefore, after mapping the value objects to Value instances, we parse their property field for [/organization/organization_board_membership/organization], [/organization/organization_board_membership/title], [/organization/organization_board_membership/role], [/organization/organization_board_membership/from], [/organization/organization_board_membership/to] respectively. We use the text fields of those value objects, parsed from the property JSONObject, to retrieve the required information.

--- AUTHOR ENTITY

[/book/author/works_written] --> list of books written by the author
An author might have written more than one book. We map the value objects to Value instances and retrieve the name of each book from their text field.

[/book/book_subject/works] --> list of books about the author
An author might have more than one book written about them. We map the value objects to Value instances and retrieve the name of each book from their text field.

[/influence/influence_node/influenced] --> list of names of people the author has influenced
We map the value objects to Value instances and retrieve the name of each person from their text field.

[/influence/influence_node/influenced_by] --> list of names of people the author was influenced by
We map the value objects to Value instances and retrieve the name of each person from their text field.

--- ACTOR ENTITY

A person is an actor if they have appeared in a movie or a tv show. If they have appeared in both movies and tv shows, we map both types of information. We create a Film class that contains all the required information about the character the actor played in the movie or the tv show, as well as the name of the movie or the tv show.

[/film/actor/film] --> list of movies the person has appeared in
We map the value objects to Value instances. To retrieve the actor's character in the movie and the name of the movie, we parse the property JSONObject for the fields [/film/performance/character] and [/film/performance/film] respectively. We use the text field to retrieve the required information.

[/tv/tv_actor/starring_roles] --> list of tv shows the person has appeared in
We map the value objects to Value instances. To retrieve the actor's character in the tv show and the name of the tv show, we parse the property JSONObject for the fields [/tv/regular_tv_appearance/character] and [/tv/regular_tv_appearance/series] respectively. We use the text field to retrieve the required information.

--- LEAGUE ENTITY

[/type/object/name] --> name of the league
A league can only have one name, therefore one value object for that field exists inside the json document. We use the text field of the value object to retrieve the name of the league.

[/sports/sports_league/sport] --> sport associated with the league
One value object for that field exists inside the json document. We use the text field of the value object to retrieve the sport associated with the league.

[/sports/sports_league/championship] --> championship associated with the league
One value object for that field exists inside the json document. We use the text field of the value object to retrieve the championship associated with the league.

[/organization/organization/slogan] --> league's slogan
One value object for that field exists inside the json document. We use the text field of the value object to retrieve the league's slogan.

[/common/topic/official_website] --> league's website
One value object for that field exists inside the json document. We use the text field of the value object to retrieve the league's website.

[/common/topic/description] --> league's description
One value object for that field exists inside the json document. We use the value field of the value object to retrieve the league's description.

[/sports/sports_league/teams] --> teams that participate in the league
We map the value objects to Value instances and retrieve each team's name by parsing the property JSONObject for the field [/sports/sports_league_participation/team]. We use the text field to retrieve the required information.

--- TEAM ENTITY

[/type/object/name] --> name of the team
A team can only have one name, therefore one value object for that field exists inside the json document. We use the text field of the value object to retrieve the name of the team.

[/sports/sports_team/sport] --> sport associated with the team
One value object for that field exists inside the json document. We use the text field of the value object to retrieve the sport associated with the team.

[/common/topic/description] --> team's description
One value object for that field exists inside the json document. We use the value field of the value object to retrieve the team's description.

[/sports/sports_team/arena_stadium] --> team's arena
One value object for that field exists inside the json document. We use the value field of the value object to retrieve the team's arena.

[/sports/sports_team/founded] --> the year in which the team was founded
One value object for that field exists inside the json document. We use the value field of the value object to retrieve the year the team was founded.

[/sports/sports_team/location] --> the locations the team is associated with
We map the value objects to Value instances and retrieve each location from the text field of each value object.

[/sports/sports_team/championships] --> the championships the team has won
We map the value objects to Value instances and retrieve each championship from the text field of each value object.

[/sports/sports_team/coaches] --> the coaches of the team
We map the value objects to Value instances and retrieve information about each coach by parsing the property JSONObject for the fields [/sports/sports_team_coach_tenure/coach], [/sports/sports_team_coach_tenure/from], [/sports/sports_team_coach_tenure/to], [/sports/sports_team_coach_tenure/position] for the coach's name, the period of coaching the team, and the position respectively. We use the text field to retrieve the required information.

[/sports/sports_team/league] --> the leagues the team participates in
We map the value objects to Value instances and retrieve information about each league by parsing the property JSONObject for the field [/sports/sports_team/league] for each league's name. We use the text field to retrieve the required information.

[/sports/sports_team/roster] --> the players of the team
We map the value objects to Value instances and retrieve information about each player by parsing the property JSONObject for the fields [/sports/sports_team_roster/player], [/sports/sports_team_roster/from], [/sports/sports_team_roster/to], [/sports/sports_team_roster/position], [/sports/sports_team_roster/number] for the player's name, the period of playing in the team, the position, and the number respectively. We use the text field to retrieve the required information.

For each value that is mapped by the algorithm, we first check whether it exists in the document to avoid throwing exceptions. If a crucial value is missing from the document, such as the name of an organization, the respective entry is left out of the output entirely. For the date field that corresponds to the "to" date, if we don't find it inside the json document we assume that the relation continues to the present, so we substitute the value "now".
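
The existence checks and the "now" substitution described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: it assumes the property JSONObject has already been flattened into a Map from property path to a list of value objects, and the class name PropertyReader, the methods firstText and leadershipLine, and the "(to: ...)" output format are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; the real program works on JSONObject instances instead
// of plain maps.
final class PropertyReader {
    private PropertyReader() {}

    // Returns the text field of the first value object stored under the given
    // property path, or an empty Optional when the path or the text is missing.
    static Optional<String> firstText(Map<String, List<Map<String, String>>> properties,
                                      String path) {
        List<Map<String, String>> values = properties.get(path);
        if (values == null || values.isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(values.get(0).get("text"));
    }

    // Formats one leadership entry. The organization name is treated as crucial:
    // without it the whole entry is skipped. A missing to-date becomes "now".
    static Optional<String> leadershipLine(Map<String, List<Map<String, String>>> properties) {
        Optional<String> organization =
                firstText(properties, "/organization/leadership/organization");
        if (!organization.isPresent()) {
            return Optional.empty();
        }
        String to = firstText(properties, "/organization/leadership/to").orElse("now");
        return Optional.of(organization.get() + " (to: " + to + ")");
    }
}
```

Returning an Optional keeps the "skip this entry" decision with the caller, so no exception handling is needed for absent fields.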

Finally, EntityBoxCreator creates an EntityBox object from the TopicResult that was built by parsing the retrieved json document, and calls its print() method to output the desired information to the console in a formatted view.

[QUESTION INPUT]

If the user enters a question in the form "Who created [X]?", a QueryBoxCreator object will be instantiated. QueryBoxCreator creates an MQLService object using the provided API key and the processed query (the X from the question). The requestInfo() method in MQLService is then called. MQLService makes two requests to the API: first, it assumes that X is a book title and searches for authors who wrote books with X in the title; next, it assumes that X is an organization name and searches for businesspeople who founded organizations named X.

For books, we search for

[{"/book/author/works_written": [{"a:name": null, "name~=": X}], "id": null, "name": null, "type": "/book/author"}]

For organizations, we search for

[{"/organization/organization_founder/organizations_founded": [{"a:name": null, "name~=": X}], "id": null, "name": null, "type": "/organization/organization_founder"}]

Each of these queries returns a JSONArray response. MQLService then creates an MQLResult object, passing in the two JSONArray responses as parameters. MQLResult parses the two JSONArrays. For each object in each array, the name of the person and the name of the book/organization are extracted. We simply look for the person's name, designated with $.name, and either

$./book/author/works_written[*].a:name
$./organization/organization_founder/organizations_founded[*].a:name

depending on which JSONArray we are currently parsing. A Role object is created, which holds the person's name and the job through which they created their book/organization. A Map inside the MQLResult instance is populated with each pairing of role and creation. QueryBoxCreator creates a QueryBox object with the MQLResult as a parameter.
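
The two MQL queries above differ only in the property path and the type, so they can be produced by one small helper. The following is a sketch under assumptions: the class name MqlQueries and its methods are hypothetical and are not taken from MQLService, and the search term is inserted as a quoted, escaped JSON string.

```java
// Hypothetical helper that builds the query text shown above.
final class MqlQueries {
    private MqlQueries() {}

    // Escapes backslashes and double quotes so the term is a valid JSON string body.
    static String escape(String term) {
        return term.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a "who created X" query for the given creation property and creator type.
    static String creatorQuery(String property, String type, String term) {
        return "[{\"" + property + "\": [{\"a:name\": null, \"name~=\": \""
                + escape(term) + "\"}], "
                + "\"id\": null, \"name\": null, \"type\": \"" + type + "\"}]";
    }

    // Query that assumes the term is (part of) a book title.
    static String bookQuery(String term) {
        return creatorQuery("/book/author/works_written", "/book/author", term);
    }

    // Query that assumes the term is (part of) an organization name.
    static String organizationQuery(String term) {
        return creatorQuery(
                "/organization/organization_founder/organizations_founded",
                "/organization/organization_founder",
                term);
    }
}
```

Escaping the term before embedding it matters because a question such as one containing a double quote would otherwise produce malformed JSON.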
QueryBox gets the mql_map from the MQLResult.getMQLMap() method and puts the keys into a List, which is then sorted in alphabetical order by the person's first name. The items from this list and their associated values in the map are then printed to the console.
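
The sorting step can be sketched as below. This is an illustration only: the Role class here is a hypothetical stand-in with two assumed fields (personName and job), the class QueryBoxSorting is invented for the example, and the comparison is assumed to be case-insensitive. Because names start with the first name, ordering by the full name string also orders by first name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the project's Role class.
final class Role {
    final String personName;
    final String job;

    Role(String personName, String job) {
        this.personName = personName;
        this.job = job;
    }
}

final class QueryBoxSorting {
    private QueryBoxSorting() {}

    // Copies the map keys into a list and sorts them alphabetically by person name.
    static List<Role> sortedKeys(Map<Role, List<String>> mqlMap) {
        List<Role> keys = new ArrayList<>(mqlMap.keySet());
        keys.sort(Comparator.comparing((Role r) -> r.personName,
                String.CASE_INSENSITIVE_ORDER));
        return keys;
    }

    // Prints each role with its creations in sorted order.
    static void print(Map<Role, List<String>> mqlMap) {
        for (Role role : sortedKeys(mqlMap)) {
            System.out.println(role.personName + " (as " + role.job + ") created "
                    + mqlMap.get(role));
        }
    }
}
```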