[SYSTEMML-1325] Cleanup static variables in DMLScript #832

Closed

Conversation

niketanpansare
Contributor

We use ThreadLocal DMLOptions and DMLConfig instead of static variables in the DMLScript class. This allows different JMLC (or MLContext) instances to run with different options (for example, one with GPU enabled and one CPU-only).
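
For illustration, here is a minimal sketch of the kind of usage this enables, assuming the existing MLContext setters setGPU and setStatistics (the actual change is the ThreadLocal plumbing in DMLScript, not shown here):

// Sketch only: two MLContext instances in the same spark-shell session, each
// with its own options, which becomes possible once the options are
// ThreadLocal rather than static fields on DMLScript.
import org.apache.sysml.api.mlcontext._
import org.apache.sysml.api.mlcontext.ScriptFactory._

val mlGpu = new MLContext(spark)
mlGpu.setGPU(true)          // this instance may run on the GPU

val mlCpu = new MLContext(spark)
mlCpu.setStatistics(true)   // this instance stays on CPU and prints statistics

val s = dml("""
    X = rand(rows=10, cols=10)
    print("TOTAL: " + sum(X))
""")

mlGpu.execute(s)
mlCpu.execute(s)            // the two executions no longer share static state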

@nakul02 @thomas9t @mboehm7 Please let me know if you have any suggestions or concerns regarding this PR.

@thomas9t
Contributor

@niketanpansare - it appears that statistics are no longer printed as expected when run via the command line:

spark-submit --jars systemml-1.3.0-SNAPSHOT-extra.jar SystemML.jar -f test-script.dml -explain -stats

Expected behavior is a printout of detailed SystemML statistics; however, only the total execution time and the number of Spark instructions are printed.

Niketan Pansare added 2 commits August 24, 2018 14:17
@thomas9t
Contributor

thomas9t commented Aug 24, 2018

A couple of issues with MLContext:

spark-shell --jars SystemML.jar,systemml-1.3.0-SNAPSHOT-extra.jar

import org.apache.sysml.api.mlcontext._
import org.apache.sysml.api.mlcontext.ScriptFactory._

val ml = new MLContext(spark)
val script = """
    X = rand(rows=10, cols=10)
    print("TOTAL: " + sum(X))
"""
val dmlScript = dml(script)

ml.setStatistics(true)
ml.execute(dmlScript)

org.apache.sysml.api.mlcontext.MLContextException: Exception when executing script
  at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:346)
  at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:319)
  ... 52 elided
Caused by: java.lang.RuntimeException: Multiple statistics configuration is not supported
  at org.apache.sysml.conf.ConfigurationManager.setStatistics(ConfigurationManager.java:288)
  at org.apache.sysml.conf.DMLOptions.<init>(DMLOptions.java:78)
  at org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:339)
  at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:342)
  ... 53 more

And similarly when explain is enabled:

spark-shell --jars SystemML.jar,systemml-1.3.0-SNAPSHOT-extra.jar

import org.apache.sysml.api.mlcontext._
import org.apache.sysml.api.mlcontext.ScriptFactory._

val ml = new MLContext(spark)
val script = """
    X = rand(rows=10, cols=10)
    print("TOTAL: " + sum(X))
"""
val dmlScript = dml(script)

ml.setExplain(true)
ml.execute(dmlScript)

org.apache.sysml.api.mlcontext.MLContextException: Exception when executing script
  at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:346)
  at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:319)
  ... 52 elided
Caused by: java.lang.NullPointerException
  at org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:337)
  at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:342)
  ... 53 more

@niketanpansare
Contributor Author

Thanks @thomas9t for catching that. I've delivered a fix.

@thomas9t
Contributor

Thanks @niketanpansare - I confirm that statistics are now printed as expected when run via the command line and in MLContext.

@thomas9t
Contributor

I ran a few sanity checks on GPU behavior for MLContext and DMLScript:

  1. extra.jar is included and GPU is not enabled -> no exception and GPU is not used
  2. extra.jar is included and GPU is enabled -> no exception and GPU may be used
  3. extra.jar is included and GPU is forced -> no exception and GPU is used
  4. extra.jar is not included and GPU is not enabled -> no exception and GPU is not used
  5. extra.jar is not included and GPU is enabled -> exception
  6. extra.jar is not included and GPU is forced -> exception

Both APIs behave as expected. We can perform the same checks for JMLC as part of #830.
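
For reference, a sketch of the MLContext switches these checks exercise, assuming the existing setGPU and setForceGPU setters (DMLScript exposes equivalent command-line options):

// Sketch: the MLContext switches exercised by the checks above.
val ml = new MLContext(spark)

ml.setGPU(true)        // cases 2/5: GPU enabled; used opportunistically when the
                       // GPU backend (the extra jar) is available
ml.setForceGPU(true)   // cases 3/6: GPU forced; execution fails without the backend
// leaving both unset corresponds to cases 1/4: CPU-only execution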

@thomas9t
Contributor

@niketanpansare - I confirm that 69b991 fixed the NPE in MLContext. I also ran the standard test suite and didn't encounter any errors. I see no other issues. LGTM 👍

@niketanpansare
Contributor Author

Thanks @thomas9t. Will merge 👍

asfgit closed this in ae268a9 on Aug 27, 2018
thomas9t added a commit to thomas9t/systemml that referenced this pull request Aug 28, 2018
niketanpansare pushed a commit to niketanpansare/systemml that referenced this pull request Oct 14, 2018
asfgit pushed a commit that referenced this pull request Oct 14, 2018
Support to JMLC

This PR adds support for compilation and execution of GPU-enabled
scripts in JMLC and harmonizes the pipeline used to compile and execute
DML programs across the JMLC, MLContext, and DMLScript APIs. Specifically,
the following changes were made:

1. All three APIs now call ScriptExecutorUtils.compileRuntimeProgram to
compile DML scripts. The original logic in MLContext and JMLC for
pinning inputs and persisting outputs has been preserved.
2. All three APIs now use ScriptExecutorUtils.executeRuntimeProgram to
execute the compiled program. Previously, JMLC called the Script.execute
method directly.
3. jmlc.Connection.prepareScript now supports compiling a script to use
the GPU. Note that following #832, the issue noted in #830 has been resolved.
4. A PreparedScript is now statically assigned a GPU context when it is
compiled and instantiated. This has potential performance implications
because it means that a PreparedScript must be executed on a specific
GPU. However, it reduces the overhead of creating a GPU context each time
a script is executed and ensures that a user cannot compile a script to
use the GPU and then forget to assign a GPU context when the script is run.
5. Per (3), I have added a unit test which compiles and executes a GPU-enabled
script in JMLC, both with and without pinned data, and simply
asserts that no errors occur.

Closes #836.
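
For context, a minimal sketch of the plain JMLC flow that this change harmonizes with MLContext and DMLScript, assuming the standard Connection/PreparedScript API with the four-argument prepareScript overload (the new GPU-enabled overload added here is analogous; its exact signature is not shown):

// Sketch: compile once with jmlc.Connection, then execute the PreparedScript.
import org.apache.sysml.api.jmlc.Connection

val conn = new Connection()
val dml = """
    X = rand(rows=10, cols=10)
    Y = X %*% t(X)
"""
// no pinned inputs; "Y" is registered as an output, parsePyDML = false
val ps = conn.prepareScript(dml, Array.empty[String], Array("Y"), false)
val y = ps.executeScript().getMatrix("Y")   // returns a double[][]
println("Y is " + y.length + " x " + y(0).length)
conn.close()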