BESDUI

The Benchmark for End-User Structured Data User Interfaces (BESDUI) is based on the Berlin SPARQL Benchmark (BSBM) but is intended for benchmarking the user experience while exploring a structured dataset, not the performance of the query engine. BSBM is used only to provide the data to be explored.

BESDUI is an inexpensive user interface benchmark: it does not involve end users, but experts who measure how many interaction steps are required to complete each benchmark task. This also makes it easier to compare different tools without the bias that different end-user profiles might introduce. The interaction steps are measured and converted into an estimate of the time required to complete a task using the Keystroke-Level Model (KLM).

Reference

García, Roberto; Gil, Rosa María; Bakke, Eirik; Karger, David R. (2020). A benchmark for end-user structured data exploration and search user interfaces. Journal of Web Semantics, vol. 65, p. 100610. DOI: 10.1016/j.websem.2020.100610. Preprint: http://hdl.handle.net/10459.1/69484

Benchmark Tasks

Task 1. Find products for a given set of features combined

Task 2. Find products for a given set of alternative features

Task 3. Retrieve basic information about a specific product for display purposes

Task 4. Find products having some specific features and not having one feature

Task 5. Find products matching two different sets of features

Task 6. Find products that are similar to a given product

Task 7. Find products having a name that contains some text

Task 8. Retrieve in-depth information about a specific product including offers and reviews

Task 9. Give me recent reviews in English for a specific product

Task 10. Get information about a reviewer

Task 11. Get offers for a given product that fulfill specific requirements

Task 12. Export the chosen offer into another information system which uses a different schema
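
Each task describes an information need, not a particular query; the benchmark measures how well a user interface lets someone satisfy that need. Still, as an illustration of what a task amounts to over the BSBM data, here is a minimal Python sketch for Task 7 using rdflib. It assumes the dataset file from the Datasets folder (see Contributing) has been extracted, that product names are stored as rdfs:label values as in BSBM, and uses a made-up search term:

```python
# Illustrative only: what Task 7 ("products having a name that contains
# some text") amounts to as a query over the BSBM dataset, using rdflib.
# Assumptions: bsbm-1000products.ttl has been extracted from the archive
# in the Datasets folder, and product names are rdfs:label values.
from rdflib import Graph

g = Graph()
g.parse("bsbm-1000products.ttl", format="turtle")

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?product ?name WHERE {
    ?product rdfs:label ?name .
    FILTER CONTAINS(LCASE(STR(?name)), "phone")   # made-up search text
}
"""

for row in g.query(QUERY):
    print(row.product, row.name)
```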

Metrics

Based on the benchmark, three Quality in Use metrics have been proposed, one about effectiveness and two about efficiency:

  • Capability (C) (effectiveness): the proportion of a task that is completed (0% if it cannot be completed, 100% otherwise) or, for the whole benchmark, the percentage of all 12 tasks completed.
  • Operator Count (OC) (efficiency): the number of KLM operators, from Table 1, required to complete a task or, for the whole benchmark, the average count over completed tasks.
  • Time (T) (efficiency): each KLM operator has a corresponding average completion time, as detailed in Table 1. For a task, this metric is computed by multiplying each operator type's time by its count and summing over all operator types. For the whole benchmark, it is the average time over completed tasks.

Additionally, there is a combined effectiveness/efficiency metric:

  • Task Efficiency (TE) (effectiveness/efficiency): the ratio of Capability to Time, in "goals per minute". For the whole benchmark, it is the percentage of all 12 tasks completed divided by the average time for completed tasks (in seconds), multiplied by 60.

| Keystroke-Level Model (KLM) Operator | Time (seconds) |
| --- | --- |
| K: button press or keystroke (keys, not characters) | 0.2 |
| P: pointing to a target on a display with a mouse; time differs depending on the distance to and size of the target, but is held constant | 1.1 |
| H: homing the hand(s) on the keyboard or another device, including movement between any two devices | 0.4 |

Table 1. Interaction steps used to complete a benchmark task and their corresponding average completion times
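
As an illustration of how these metrics fit together, below is a minimal Python sketch of the computation. The per-task operator counts are invented for the example (and there are fewer than the 12 benchmark tasks); the official computation is done with the BESDUI Spreadsheet mentioned under Contributing.

```python
# Sketch of the BESDUI metric computations, using the operator times
# from Table 1. The task data below is made up for illustration.
KLM_TIMES = {"K": 0.2, "P": 1.1, "H": 0.4}  # seconds per operator

def task_time(counts):
    """Time (T) for one task: each operator type's time times its count, summed."""
    return sum(KLM_TIMES[op] * n for op, n in counts.items())

# Hypothetical results: operator counts per completed task, or None
# for tasks that could not be completed (Capability 0%).
tasks = [
    {"K": 10, "P": 4, "H": 2},   # completed task
    {"K": 5, "P": 2, "H": 1},    # completed task
    None,                        # task that could not be completed
]

completed = [t for t in tasks if t is not None]
capability = len(completed) / len(tasks)                          # fraction of tasks completed
operator_count = sum(sum(t.values()) for t in completed) / len(completed)
avg_time = sum(task_time(t) for t in completed) / len(completed)  # seconds
task_efficiency = capability / avg_time * 60                      # goals per minute

print(f"Capability: {capability:.0%}, OC: {operator_count:.1f}, "
      f"T: {avg_time:.1f}s, TE: {task_efficiency:.2f} goals/min")
```

Applying the same formula to the published averages reproduces the Task Efficiency column in Table 2 below, e.g. 0.58 / 16.2 × 60 ≈ 2.2 for Rhizomer.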

Results

Currently, BESDUI has been applied to the following tools:

  • Rhizomer: the previous version of the Rhizomer semantic data exploration tool, now superseded by RhizomerEye.
  • RhizomerEye: the current version of the Rhizomer semantic data exploration tool.
  • Virtuoso FCT: the faceted browser for the Virtuoso RDF data store.
  • Sieuferd: a general-purpose user interface for relational databases.
  • PepeSearch: a search interface for querying SPARQL endpoints.

| Averages per Tool | Capability | K (0.2s) | P (1.1s) | H (0.4s) | Operator Count | Time (s) | Task Efficiency (goals/min) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Rhizomer | 58% | 16.4 | 11 | 2 | 29.4 | 16.2 | 2.2 |
| RhizomerEye | 58% | 13 | 8.3 | 1.7 | 23 | 12.4 | **2.8** |
| Virtuoso FCT | 46% | 23.5 | 14.5 | 3.5 | 41.5 | 22.1 | 1.2 |
| Sieuferd | **96%** | 48.7 | 19.8 | 2.9 | 71.3 | 32.63 | 1.8 |
| PepeSearch | 17% | 7 | 2.5 | 1.5 | **11** | **4.8** | 2.1 |

Table 2. Benchmark results for different end-user structured data user interfaces, showing the average over all completed benchmark tasks. The best results are in bold

Contributing

BESDUI is available under a Creative Commons Attribution-ShareAlike 4.0 International license. To contribute to the Benchmark for End-User Structured Data User Interfaces (BESDUI), fork the BESDUI GitHub repository and, when your evaluation is completed, open a pull request to propose your changes. Alternatively, you can download all the required data to prepare your contribution from http://hdl.handle.net/10459.1/69484.

The data to be loaded to perform the evaluation is available from the Datasets folder. The bsbm-1000products.ttl.tgz file contains the RDF version of the test data.

To apply the benchmark, perform each of the tasks with the evaluated tool and check whether the expected results are obtained. If they are not, or it is not possible to complete the task, report 0% for the Capability metric for that task. Otherwise, Capability is 100%; in this case, also report how many KLM operators are required to complete the task. The operators are the basic user interactions (a worked example follows the list):

  • K: button press or keystroke (keys, not characters).
  • P: pointing to a target on a display with a mouse. Time differs depending on the distance to and size of the target, but KLM simplifies this and holds it constant.
  • H: homing the hand(s) on the keyboard or another device, including movement between any two devices.
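
For example, a hypothetical interaction that points at a search box with the mouse (1 P), homes the hands on the keyboard (1 H), types a six-character term (6 K), and presses Enter (1 K) counts as 1 P + 1 H + 7 K, which Table 1 converts to 1 × 1.1 + 1 × 0.4 + 7 × 0.2 = 2.9 seconds.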

When all tasks have been checked, you can compute the BESDUI metrics using the BESDUI Spreadsheet.

To contribute results, please use the Evaluation Template: create a folder in Results named after the evaluated tool and copy EvaluationTemplate.md there. Then rename it to README.md and fill it in to report the results. You can also place any other relevant files (images, spreadsheets, etc.) in the same folder.

Finally, commit all changes to your fork and open a pull request to get your results published. Alternatively, you can send the file with your results to roberto.garcia@udl.cat.

Acknowledgements

This project has been partially supported by the research project InDAGuS, Infrastructures for Sustainable Open Government Data with Geospatial Features (Spanish Government TIN2012-37826-C02), together with the Universitat de Lleida and the Massachusetts Institute of Technology.

