DataRefine is a Java-based spreadsheet system that lets you load data, understand it, clean it up, reconcile it, and augment it with data from the web, all from a web browser.
- Outlier detection facet using the nearest-neighbor (NN)-based interquartile range (IQR) for numeric data, e.g. time series and image metadata
- Semantic facet via the inference API of the pre-trained BERT model, e.g. for people's names, stocks, book titles, and streets
- Type recommendation results sorting for non-numeric data
- UI renovation
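To make the outlier detection facet more concrete, here is a minimal sketch of the plain IQR rule that such a facet could build on. It is not DataRefine's actual implementation: the class `IqrOutlierSketch`, its method names, the multiplier `k`, and the linear-interpolation quantile are all assumptions made for illustration, and the nearest-neighbor step mentioned above is not reproduced because the README does not describe it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of DataRefine: flags values outside the IQR fences. */
final class IqrOutlierSketch {

    /** Quantile of an already sorted array by linear interpolation, q in [0, 1]. */
    static double quantile(double[] sorted, double q) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
        double pos = q * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Returns the indices of values lying outside [Q1 - k*IQR, Q3 + k*IQR]. */
    static List<Integer> outlierIndices(double[] values, double k) {
        // Sort a copy so the returned indices refer to the caller's original order.
        double[] sorted = values.clone();
        Arrays.sort(sorted);

        double q1 = quantile(sorted, 0.25);
        double q3 = quantile(sorted, 0.75);
        double iqr = q3 - q1;
        double lower = q1 - k * iqr;
        double upper = q3 + k * iqr;

        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (values[i] < lower || values[i] > upper) {
                result.add(i);
            }
        }
        return result;
    }
}
```

Calling `IqrOutlierSketch.outlierIndices(column, 1.5)` on a numeric column returns the row positions that fall outside the fences. The multiplier 1.5 is the conventional choice for the IQR rule; a larger value flags fewer rows.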
- Clone this GitHub repo.
git clone https://github.com/Joannechiao18/DataRefine.git
- Install JDK 8, Apache Maven, and Eclipse.
- Import the cloned project into Eclipse. (Remember to uncheck `extensions` and `packaging` in the import window.) Open the run configuration, set the base directory to `${workspace_loc:/openrefine}`, and set Goals to `exec:java`.
- Click `Run`, and DataRefine will run on localhost at http://127.0.0.1:3333/.
This software was created by Metaweb Technologies, Inc. and originally written and conceived by David Huynh dfhuynh@google.com. Metaweb Technologies, Inc. was acquired by Google, Inc. in July 2010 and the product was renamed Google Refine. In October 2012, it was renamed OpenRefine as it transitioned to a community-supported product.