Restaurants dataset

We based our experiments on a total of 1828 restaurants from Yelp.com across 4 cities. The list of restaurant IDs used in our study can be found at paper_data/restaurants_list.txt. For instance, the ID NX8VYnWFQ2ZY-H0HwvfPTw corresponds to the restaurant at https://www.yelp.com/biz/NX8VYnWFQ2ZY-H0HwvfPTw. For each restaurant, the top 20 reviews and top 20 popular dishes were used in our experiments.
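The ID-to-URL mapping above can be sketched in a few lines of Python. This assumes restaurants_list.txt contains one business ID per line, which is an assumption about the file layout rather than something the README states:

```python
# Sketch: map Yelp business IDs to their page URLs.
# Assumes paper_data/restaurants_list.txt holds one ID per line
# (file layout not confirmed by this README).
def yelp_url(business_id: str) -> str:
    """Build the Yelp page URL for a business ID."""
    return f"https://www.yelp.com/biz/{business_id}"

def load_restaurant_urls(path: str = "paper_data/restaurants_list.txt") -> list[str]:
    """Read the ID list and return one Yelp URL per restaurant."""
    with open(path) as f:
        return [yelp_url(line.strip()) for line in f if line.strip()]

# Example from the README:
# yelp_url("NX8VYnWFQ2ZY-H0HwvfPTw")
# -> "https://www.yelp.com/biz/NX8VYnWFQ2ZY-H0HwvfPTw"
```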

We collected 100 single-turn questions via Prolific. The results of the SUQL system can be found at paper_data/restaurants_single_turn_SUQL.csv. Here is a breakdown of what the columns mean:

  • Unique ID: a unique row identifier.
  • Utterance: the collected user question.
  • Predicted SUQL: the SUQL predicted with gpt-3.5-turbo-0613.
  • Agent response: the SUQL system's response.
  • 1st entity, 2nd entity, 3rd entity: the up-to-three returned entities.
  • Entity Annotation.
  • Whether Wrong Parse: whether this is a wrong SUQL parse.
  • Prolific ID: the Prolific ID associated with the submission.
  • Structural Unstructural annotation: whether the question requires only structured data or a combination, used for Table 4 in the paper.

The false positives from the SUQL system (based on annotation) can be found in paper_data/restaurants_single_turn_SUQL_fp.txt.
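As a minimal sketch of working with this file, the snippet below computes the fraction of wrong parses from the "Whether Wrong Parse" column. The column names follow the description above, but the exact header spelling and value format (e.g. yes/no) in the released CSV are assumptions; the toy inline sample only illustrates the shape:

```python
import csv
import io

def wrong_parse_rate(csv_text: str) -> float:
    """Fraction of rows annotated as a wrong SUQL parse.

    Assumes a "Whether Wrong Parse" column with yes/no-style values;
    the released file's exact header and value spellings may differ.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    wrong = sum(
        1 for r in rows
        if r["Whether Wrong Parse"].strip().lower() in ("yes", "true", "1")
    )
    return wrong / len(rows)

# Toy two-row sample with only the relevant columns (not real data):
sample = """Unique ID,Utterance,Whether Wrong Parse
1,Find me a cheap sushi place,no
2,Any romantic Italian spots?,yes
"""
# wrong_parse_rate(sample) -> 0.5
```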

We also collected 96 turns across 20 conversations. The collected dialog data and its annotation can be accessed here. Here is a breakdown of what the columns mean:

  • Unique ID: Unique row identifier. Each identifier is of format id,num, where the same id denotes the same conversation, and num denotes the turn number (0-indexed) in this conversation. We excluded turns that do not involve restaurant-related queries (e.g. chit-chatting, asking what the system can do, etc.).
  • Utterance: the collected user question.
  • Predicted SUQL: the SUQL predicted with gpt-3.5-turbo-0613. Green denotes a correct parse, and red denotes a wrong parse.
  • Agent response: the SUQL system's response.
  • 1st/2nd/3rd entity: the entities returned from the SUQL system. Each green cell denotes a true positive. Each red cell denotes a false positive. Alternatively, if a row does not contain returned entities and the row indeed involves searching for restaurants, a drop-down box denoting true or false negative is present. Within each conversation, the same entity is only counted once (the same entity appearing twice would not be colored).
  • Structural Unstructural annotation: whether the question requires only structured data or a combination, used for Table 4 in the paper.
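The "id,num" Unique ID format described above can be parsed to group turns by conversation; a minimal sketch (the example IDs are hypothetical):

```python
from collections import defaultdict

def group_turns(unique_ids: list[str]) -> dict[str, list[int]]:
    """Group turn numbers by conversation, given "id,num" identifiers.

    The same id denotes the same conversation; num is the 0-indexed
    turn number within that conversation.
    """
    convs: dict[str, list[int]] = defaultdict(list)
    for uid in unique_ids:
        conv_id, turn = uid.rsplit(",", 1)
        convs[conv_id].append(int(turn))
    return {cid: sorted(turns) for cid, turns in convs.items()}

# Hypothetical identifiers illustrating the format:
# group_turns(["3,0", "3,1", "7,0"]) -> {"3": [0, 1], "7": [0]}
```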