H2O Flow is a web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media to build machine learning workflows.
Think of Flow as a hybrid GUI + REPL + storytelling environment for exploratory data analysis and machine learning, with async, re-scriptable record/replay capabilities. Flow sandboxes and evals user JavaScript in the browser via static analysis and tree-rewriting. Flow is written in non-standard JavaScript (with compile-time unqualified imports), with a veritable heap of little embedded DSLs for reactive dataflow programming, markup generation, lazy evaluation and multicast signals/slots.
There is a nice user guide for H2O Flow housed over in the h2o-3 repo.
It is recommended that you clone h2o-3 and h2o-flow in the same parent directory.
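A minimal sketch of that layout check, assuming both repositories are cloned from the h2oai organization on GitHub. The `check_layout` helper and the `workspace` directory name are hypothetical and not part of either repository.

```shell
# Hypothetical helper: succeed only if h2o-3 and h2o-flow sit side by side
# under the given parent directory (default: the current directory).
check_layout() {
  parent="${1:-.}"
  for repo in h2o-3 h2o-flow; do
    if [ ! -d "$parent/$repo" ]; then
      echo "missing: $parent/$repo" >&2
      return 1
    fi
  done
  echo "ok: h2o-3 and h2o-flow are siblings under $parent"
}

# Example usage (clone URLs assume the GitHub h2oai organization):
#   mkdir workspace && cd workspace
#   git clone https://github.com/h2oai/h2o-3.git
#   git clone https://github.com/h2oai/h2o-flow.git
#   check_layout .
```

Running the check before the build steps catches a misplaced clone early, since the make-based workflow relies on the two directories being siblings.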
If you develop for Flow from a Java IDE like IntelliJ IDEA or Eclipse, you can see your changes to Flow in the browser immediately after you run the make command, without waiting to build a new H2O binary and restart H2O.
If you have not already, follow these instructions to set up your preferred IDE environment for h2o-3 development.
- First, clean up all built files:
    cd h2o-3 && ./gradlew clean
- Open up h2o-3 in IDEA, then build and launch H2OApp.
- Run:
    cd h2o-flow && make install
  You can now access and debug Flow at http://localhost:54321/
- After each change to the h2o-flow sources, push your changes to the running instance of h2o-3 by running:
    cd h2o-flow && make
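The rebuild step can be wrapped in a small function so it can be run from any directory. This is only a sketch: `push_flow_changes`, `PARENT_DIR`, and `DRY_RUN` are hypothetical names introduced here, and the only command it runs is the documented `make` in the h2o-flow directory.

```shell
# Hypothetical wrapper around the documented `cd h2o-flow && make` step.
# PARENT_DIR is an assumed variable naming the directory that holds both
# clones. With DRY_RUN=1 the command is printed instead of executed.
push_flow_changes() {
  flow_dir="${PARENT_DIR:-.}/h2o-flow"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "cd $flow_dir && make"
    return 0
  fi
  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$flow_dir" && make)
}
```

With `DRY_RUN=1` set, the function only echoes what it would run, which is a cheap way to confirm the path before touching a running H2O instance.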
The task npm run headless requires installing Phantom JS.
Note: Phantom JS refuses to run on OSX Yosemite, and requires this fix:
brew install upx
upx -d bin/phantomjs
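Before starting the headless task, it may help to confirm that the phantomjs binary is actually on the PATH. The `require_cmd` helper below is a hypothetical pre-flight check, not something provided by h2o-flow.

```shell
# Hypothetical pre-flight check: report whether a command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "not found: $1" >&2
    return 1
  fi
}

# Example usage: only start the headless task when phantomjs is installed.
#   require_cmd phantomjs && npm run headless
```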
Flow can also be used with Sparkling Water
Follow this guide to develop and test new Sparkling Water features in Flow.
Adapted from the comments on this PR: h2oai#13
- In the h2o-3 directory, run:
    cp h2o-web/src/main/resources/www/flow/js/* h2o-web/lib/h2o-flow/build/js/
- In the h2o-3 directory, run:
    ./gradlew publishToMavenLocal -x test
- In the sparkling-water directory, run:
    ./gradlew clean build -x test -x integTest
- In the sparkling-water directory, run:
    bin/sparkling-shell
- In the Sparkling Water shell, at the scala> prompt, run:
    import org.apache.spark.h2o._
    H2OContext.getOrCreate(sc)
- Now open Flow at the IP address specified in the Sparkling Water shell.
- Now test your changes in Flow.
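The non-interactive part of the steps above can be sketched as a helper that prints the commands in order for review. `sparkling_steps`, `H2O3_DIR`, and `SW_DIR` are hypothetical names; the defaults assume local clones named h2o-3 and sparkling-water in the current directory.

```shell
# Hypothetical helper that prints the documented steps in order.
# H2O3_DIR and SW_DIR are assumed variables pointing at local clones of
# h2o-3 and sparkling-water.
sparkling_steps() {
  h2o3="${H2O3_DIR:-h2o-3}"
  sw="${SW_DIR:-sparkling-water}"
  # Copy the built Flow JavaScript into the h2o-web library directory.
  echo "(cd $h2o3 && cp h2o-web/src/main/resources/www/flow/js/* h2o-web/lib/h2o-flow/build/js/)"
  # Publish h2o-3 artifacts to the local Maven repository.
  echo "(cd $h2o3 && ./gradlew publishToMavenLocal -x test)"
  # Build Sparkling Water, skipping unit and integration tests.
  echo "(cd $sw && ./gradlew clean build -x test -x integTest)"
  # Start the interactive Sparkling Water shell.
  echo "(cd $sw && bin/sparkling-shell)"
}
```

Printing rather than executing keeps the sketch safe to run anywhere. The last command opens an interactive shell, so the two Scala lines still have to be entered by hand at the scala> prompt.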