From e1915994bd69d109f673a0005607e5da898f694e Mon Sep 17 00:00:00 2001 From: Ryan Nett Date: Mon, 25 Jan 2021 15:20:16 -0800 Subject: [PATCH 1/8] Don't force test runs in ndarray and framework Signed-off-by: Ryan Nett --- ndarray/pom.xml | 1 - tensorflow-framework/pom.xml | 1 - 2 files changed, 2 deletions(-) diff --git a/ndarray/pom.xml b/ndarray/pom.xml index d228fbdb32a..4139b8b7929 100644 --- a/ndarray/pom.xml +++ b/ndarray/pom.xml @@ -80,7 +80,6 @@ 1 false -Xmx2G -XX:MaxPermSize=256m - false **/*Test.java diff --git a/tensorflow-framework/pom.xml b/tensorflow-framework/pom.xml index 74c4c2c2084..71cb99bbb95 100644 --- a/tensorflow-framework/pom.xml +++ b/tensorflow-framework/pom.xml @@ -94,7 +94,6 @@ 1 false -Xmx2G -XX:MaxPermSize=256m - false **/*Test.java From 010753bad053be562c1e0f21f5e50a07b5b5408c Mon Sep 17 00:00:00 2001 From: Ryan Nett Date: Mon, 25 Jan 2021 15:20:51 -0800 Subject: [PATCH 2/8] gitignore bazel config files Signed-off-by: Ryan Nett --- .gitignore | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.gitignore b/.gitignore index 098ce71c656..cdbd28eca7c 100644 --- a/.gitignore +++ b/.gitignore @@ -53,3 +53,5 @@ gradleBuild .classpath **/target +.tf_configure.bazelrc +.clwb/ From 9188d29206c875fbe95fc170c1789b7b92aac2a8 Mon Sep 17 00:00:00 2001 From: Ryan Nett Date: Mon, 25 Jan 2021 15:57:09 -0800 Subject: [PATCH 3/8] Add CONTRIBUTING.md Signed-off-by: Ryan Nett --- CONTRIBUTING.md | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ README.md | 27 ++++++++------------------- 2 files changed, 57 insertions(+), 19 deletions(-) create mode 100644 CONTRIBUTING.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000000..98963b40fe5 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,49 @@ +# Building and contributing to TensorFlow Java + +## Building + +To build all the artifacts, simply invoke the command `mvn install` at the root of this repository (or +the Maven command of your choice). 
It is also possible to build artifacts with support for MKL enabled with
`mvn install -Djavacpp.platform.extension=-mkl` or CUDA with `mvn install -Djavacpp.platform.extension=-gpu`
or both with `mvn install -Djavacpp.platform.extension=-mkl-gpu`.

When building this project for the first time in a given workspace, the script will attempt to download
the [TensorFlow runtime library sources](https://github.com/tensorflow/tensorflow) and build all of the native code
for your platform. This requires a valid environment for building TensorFlow, including the [bazel](https://bazel.build/)
build tool and a few Python dependencies (please read the [TensorFlow documentation](https://www.tensorflow.org/install/source)
for more details).

This step can take multiple hours on a regular laptop. It is possible, though, to skip the native build completely if you are
working on a version that already has pre-compiled native artifacts for your platform [available on Sonatype OSS Nexus repository](#Snapshots).
You just need to activate the `dev` profile in your Maven command to use those artifacts instead of building them from scratch
(e.g. `mvn install -Pdev`).

Note that modifying any source files under `tensorflow-core` may impact the low-level TensorFlow bindings, in which case a
complete build could be required to reflect the changes.

## Running Tests

`ndarray` can be tested using the Maven `test` target. `tensorflow-core` and `tensorflow-framework`, however,
should be tested using the `integration-test` target, due to the need to include native binaries.
It will **not** be run when using the `test` target of parent projects, but will be run by `install` or `integration-test`.

## Contributing

### Formatting

Java sources should be formatted according to the [Google style guide](https://google.github.io/styleguide/javaguide.html). 
+It can be included in [IntelliJ](https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml) and +[Eclipse](https://github.com/google/styleguide/blob/gh-pages/eclipse-java-google-style.xml). +[Google's C++ style guide](https://google.github.io/styleguide/cppguide.html) should also be used for C++ code. + +### Working with Bazel generation + +`tensorflow-core-api` uses C++ code generation that is built with Bazel. To get it to build, you will likely need to clone the +`tensorflow` project, run its configuration script (`./configure`), and copy the resulting `.tf_configure.bazelrc` to `tensorflow-core-api`. + +To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help text (viewable in +[op_gen_main.cc](tensorflow-core/tensorflow-core-api/src/bazel/op_generator/op_gen_main.cc#L31-L48)). +Generally, it should be called with arguments that are something like `bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so +--output_dir=src/gen/java --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def` +(from `tensorflow-core-api`). diff --git a/README.md b/README.md index bca18d1fe49..ec97ea648e6 100644 --- a/README.md +++ b/README.md @@ -34,26 +34,17 @@ The following describes the layout of the repository and its different artifacts * Intended audience: any developer who needs a Java n-dimensional array implementation, whether or not they use it with TensorFlow -## Building Sources -To build all the artifacts, simply invoke the command `mvn install` at the root of this repository (or -the Maven command of your choice). It is also possible to build artifacts with support for MKL enabled with -`mvn install -Djavacpp.platform.extension=-mkl` or CUDA with `mvn install -Djavacpp.platform.extension=-gpu` -or both with `mvn install -Djavacpp.platform.extension=-mkl-gpu`. 
+## Communication -When building this project for the first time in a given workspace, the script will attempt to download -the [TensorFlow runtime library sources](https://github.com/tensorflow/tensorflow) and build of all the native code -for your platform. This requires a valid environment for building TensorFlow, including the [bazel](https://bazel.build/) -build tool and a few Python dependencies (please read [TensorFlow documentation](https://www.tensorflow.org/install/source) -for more details). +This repository is maintained by TensorFlow JVM Special Interest Group (SIG). You can easily join the group +by subscribing to the [jvm@tensorflow.org](https://groups.google.com/a/tensorflow.org/forum/#!forum/jvm) +mailing list, or you can simply send pull requests and raise issues to this repository. +There is also a [sig-jvm Gitter channel](https://gitter.im/tensorflow/sig-jvm). -This step can take multiple hours on a regular laptop. It is possible though to skip completely the native build if you are -working on a version that already has pre-compiled native artifacts for your platform [available on Sonatype OSS Nexus repository](#Snapshots). -You just need to activate the `dev` profile in your Maven command to use those artifacts instead of building them from scratch -(e.g. `mvn install -Pdev`). +## Building Sources -Note that modifying any source files under `tensorflow-core` may impact the low-level TensorFlow bindings, in which case a -complete build could be required to reflect the changes. +See [CONTRIBUTING.md](CONTRIBUTING.md#building). ## Using Maven Artifacts @@ -162,6 +153,4 @@ This table shows the mapping between different version of TensorFlow for Java an ## How to Contribute? -This repository is maintained by TensorFlow JVM Special Interest Group (SIG). 
You can easily join the group -by subscribing to the [jvm@tensorflow.org](https://groups.google.com/a/tensorflow.org/forum/#!forum/jvm) -mailing list, or you can simply send pull requests and raise issues to this repository. +Contributions are welcome, guidelines are located in [CONTRIBUTING.md](CONTRIBUTING.md). From 1663cf9698f5d234031846369daaa26354c1abcf Mon Sep 17 00:00:00 2001 From: Ryan Nett Date: Mon, 25 Jan 2021 16:09:08 -0800 Subject: [PATCH 4/8] Add note about code generation Signed-off-by: Ryan Nett --- CONTRIBUTING.md | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 98963b40fe5..014cf4035c6 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -37,13 +37,21 @@ It can be included in [IntelliJ](https://github.com/google/styleguide/blob/gh-pa [Eclipse](https://github.com/google/styleguide/blob/gh-pages/eclipse-java-google-style.xml). [Google's C++ style guide](https://google.github.io/styleguide/cppguide.html) should also be used for C++ code. +### Code generation + +Code generation for `Ops` and related classes is done during `tensorflow-core-api`'s `install`, using the annotation processor in +`tensorflow-core-generator`. If you change or add any operator classes (annotated with `org.tensorflow.op.annotation.Operator`), +endpoint methods (annotated with `org.tensorflow.op.annotation.Endpoint`), or change the annotation processor, be sure to re-run a +full `mvn install` in `tensorflow-core-api`. + ### Working with Bazel generation -`tensorflow-core-api` uses C++ code generation that is built with Bazel. To get it to build, you will likely need to clone the -`tensorflow` project, run its configuration script (`./configure`), and copy the resulting `.tf_configure.bazelrc` to `tensorflow-core-api`. +`tensorflow-core-api` uses Bazel-built C++ code generation to generate most of the `@Operator` classes. 
To get it to build, you will likely need to
clone the [tensorflow](https://github.com/tensorflow/tensorflow) project, run its configuration script (`./configure`), and copy the resulting
`.tf_configure.bazelrc` to `tensorflow-core-api`.

To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help text (viewable in
[op_gen_main.cc](tensorflow-core/tensorflow-core-api/src/bazel/op_generator/op_gen_main.cc#L31-L48)).
Generally, it should be called with arguments that are something like `bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so
--output_dir=src/gen/java --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def`
(from `tensorflow-core-api`).

From 95c32e44451c05a71724b9b256cc123dd7c7dc81 Mon Sep 17 00:00:00 2001
From: Ryan Nett
Date: Mon, 1 Feb 2021 22:18:10 -0800
Subject: [PATCH 5/8] Updates

Signed-off-by: Ryan Nett

---
 CONTRIBUTING.md | 92 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 67 insertions(+), 25 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 014cf4035c6..64f87fe4377 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,57 +1,99 @@
-# Building and contributing to TensorFlow Java
+# Building and Contributing to TensorFlow Java

## Building

To build all the artifacts, simply invoke the command `mvn install` at the root of this repository (or the Maven command of your choice). 
It is also possible to build artifacts with support for MKL enabled with
`mvn install -Djavacpp.platform.extension=-mkl` or CUDA with `mvn install -Djavacpp.platform.extension=-gpu`
or both with `mvn install -Djavacpp.platform.extension=-mkl-gpu`.

When building this project for the first time in a given workspace, the script will attempt to download
the [TensorFlow runtime library sources](https://github.com/tensorflow/tensorflow) and build all of the native code for your platform. This requires a
valid environment for building TensorFlow, including the [bazel](https://bazel.build/)
build tool and a few Python dependencies (please read the [TensorFlow documentation](https://www.tensorflow.org/install/source)
for more details).

This step can take multiple hours on a regular laptop. It is possible, though, to skip the native build completely if you are working on a version that
already has pre-compiled native artifacts for your platform [available on Sonatype OSS Nexus repository](#Snapshots). You just need to activate
the `dev` profile in your Maven command to use those artifacts instead of building them from scratch
(e.g. `mvn install -Pdev`).

Note that modifying any source files under `tensorflow-core` may impact the low-level TensorFlow bindings, in which case a
complete build could be required to reflect the changes. 
Modifying the native op generation code (not the annotation processor) or the JavaCPP configuration (not the abstract Pointers) will require a
complete build to reflect the changes; otherwise `-Pdev` should be fine.

### GPU Support

Currently, due to build-time constraints, the GPU binaries only support compute capabilities 3.5 and 7.0.
To use it with unsupported GPUs, change the value [here](tensorflow-core/tensorflow-core-api/build.sh#L27) and build the binaries yourself. While this
is far from ideal, we are working on getting more build resources, and for now this is the best option.

To build for GPU, pass `-Djavacpp.platform.extension=-gpu` to Maven. By default, the CI options are used for the bazel build.
These options (CUDA locations, for example) can be overridden by running TensorFlow's configure script and copying the resulting
`.tf_configure.bazelrc` to `tensorflow-core-api`. See the [Working with Bazel generation](#working-with-bazel-generation) section for details. If you do this, make sure
the `TF_CUDA_COMPUTE_CAPABILITIES` value in your `.tf_configure.bazelrc` matches the value set in `build.sh`.

## Running Tests

`ndarray` can be tested using the Maven `test` target. `tensorflow-core` and `tensorflow-framework`, however, should be tested using
the `integration-test` target, due to the need to include native binaries. It will **not** be run when using the `test` target of parent projects, but
will be run by `install` or `integration-test`. If you see a `no jnitensorflow in java.library.path` error from tests, it is likely because you're
running the wrong test target. 
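As a concrete sketch of that distinction (only the target names come from this guide; run the commands from the repository root or the module in question):

```shell
# ndarray: plain unit tests are enough
mvn test

# tensorflow-core / tensorflow-framework: the native binaries are only wired
# in for the integration-test phase, so use one of these instead
mvn integration-test
mvn install
```

If `mvn test` on the native-dependent modules reports `no jnitensorflow in java.library.path`, that is the symptom of picking the wrong target described above.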
+ +### Native Crashes + +Occasionally tests will fail with a message like: + +``` +Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.0:test(default-test)on project tensorflow-core-api:There are test failures. + + Please refer to C:\mpicbg\workspace\tensorflow\java\tensorflow-core\tensorflow-core-api\target\surefire-reports for the individual test results. + Please refer to dump files(if any exist)[date]-jvmRun[N].dump,[date].dumpstream and[date]-jvmRun[N].dumpstream. + The forked VM terminated without properly saying goodbye.VM crash or System.exit called? + Command was cmd.exe/X/C"C:\Users\me\.jdks\adopt-openj9-1.8.0_275\jre\bin\java -jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396\surefirebooter5751859365434514212.jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396 2020-12-18T13-57-26_766-jvmRun1 surefire2445852067572510918tmp surefire_05950149004635894208tmp" + Error occurred in starting fork,check output in log + Process Exit Code:-1 + Crashed tests: + org.tensorflow.TensorFlowTest + org.apache.maven.surefire.booter.SurefireBooterForkException:The forked VM terminated without properly saying goodbye.VM crash or System.exit called? 
+ Command was cmd.exe/X/C"C:\Users\me\.jdks\adopt-openj9-1.8.0_275\jre\bin\java -jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396\surefirebooter5751859365434514212.jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396 2020-12-18T13-57-26_766-jvmRun1 surefire2445852067572510918tmp surefire_05950149004635894208tmp" + Error occurred in starting fork,check output in log + Process Exit Code:-1 + Crashed tests: + org.tensorflow.TensorFlowTest + at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:671) + at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:533) + at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:278) + at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:244) +``` + +This is because the native code crashed (i.e. because of a segfault), and it should have created a dump file somewhere in the project that you can use +to tell what caused the issue. ## Contributing ### Formatting -Java sources should be formatted according to the [Google style guide](https://google.github.io/styleguide/javaguide.html). -It can be included in [IntelliJ](https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml) and +Java sources should be formatted according to the [Google style guide](https://google.github.io/styleguide/javaguide.html). It can be included +in [IntelliJ](https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml) and [Eclipse](https://github.com/google/styleguide/blob/gh-pages/eclipse-java-google-style.xml). [Google's C++ style guide](https://google.github.io/styleguide/cppguide.html) should also be used for C++ code. ### Code generation -Code generation for `Ops` and related classes is done during `tensorflow-core-api`'s `install`, using the annotation processor in -`tensorflow-core-generator`. 
If you change or add any operator classes (annotated with `org.tensorflow.op.annotation.Operator`), -endpoint methods (annotated with `org.tensorflow.op.annotation.Endpoint`), or change the annotation processor, be sure to re-run a -full `mvn install` in `tensorflow-core-api`. +Code generation for `Ops` and related classes is done during `tensorflow-core-api`'s `compile` phase, using the annotation processor in +`tensorflow-core-generator`. If you change or add any operator classes (annotated with `org.tensorflow.op.annotation.Operator`), endpoint methods ( +annotated with `org.tensorflow.op.annotation.Endpoint`), or change the annotation processor, be sure to re-run a +`mvn install` in `tensorflow-core-api` (`-Pdev` is fine for this, it just needs to run the annotation processor). ### Working with Bazel generation -`tensorflow-core-api` uses Bazel-built C++ code generation to generate most of the `@Operator` classes. To get it to build, you will likely need to -clone the [tensorflow](https://github.com/tensorflow/tensorflow) project, run its configuration script (`./configure`), and copy the resulting +`tensorflow-core-api` uses Bazel-built C++ code generation to generate most of the `@Operator` classes. +By default, the bazel build is configured for the [CI](.github/workflows/ci.yml), so if you're building locally, you may need to clone +the [tensorflow](https://github.com/tensorflow/tensorflow) project, run its configuration script (`./configure`), and copy the resulting `.tf_configure.bazelrc` to `tensorflow-core-api`. -To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help text (viewable in -[op_gen_main.cc](tensorflow-core/tensorflow-core-api/src/bazel/op_generator/op_gen_main.cc#L31-L48)). 
-Genrally, it should be called with arguments that are something like `bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so ---output_dir=src/gen/java --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def` +To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help text (viewable in +[op_gen_main.cc](tensorflow-core/tensorflow-core-api/src/bazel/op_generator/op_gen_main.cc#L31-L48)). Generally, it should be called with arguments +that are something +like `bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so --output_dir=src/gen/java --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def` (from `tensorflow-core-api`). From 46e67b60cece6e671b891bdd8a93738f2caa88e0 Mon Sep 17 00:00:00 2001 From: Ryan Nett Date: Tue, 2 Feb 2021 11:59:27 -0800 Subject: [PATCH 6/8] Use set TF_CUDA_COMPUTE_CAPABILITIES by default Signed-off-by: Ryan Nett --- CONTRIBUTING.md | 16 +++++++++------- tensorflow-core/tensorflow-core-api/build.sh | 2 +- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 64f87fe4377..08a40ebb683 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -24,13 +24,14 @@ complete build could be required to reflect the changes, otherwise `-Pdev` shoul ### GPU Support Currently, due to build time constraints, the GPU binaries only support compute capacities 3.5 and 7.0. -To use with un-supported GPUs, change the value [here](tensorflow-core/tensorflow-core-api/build.sh#L27) and build the binaries yourself. While this -is far from ideal, we are working on getting more build resources, and for now this is the best option. 
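As a sketch, building for a GPU outside the default list might look like the following (the `TF_CUDA_COMPUTE_CAPABILITIES` variable and the Maven flag come from this guide; the `6.1` capability value is purely an illustrative example for a Pascal-class card):

```shell
# Override the default compute capabilities, then build the CUDA natives
export TF_CUDA_COMPUTE_CAPABILITIES="6.1"
mvn install -Djavacpp.platform.extension=-gpu
```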
+To use with un-supported GPUs, you have to build it yourself, after changing the value [here](tensorflow-core/tensorflow-core-api/build.sh#L27), +setting the environment variable `TF_CUDA_COMPUTE_CAPABILITIES`, or configuring it in a bazel rc file ( +i.e. `build --action_env TF_CUDA_COMPUTE_CAPABILITIES="6.1"`). While this is far from ideal, we are working on getting more build resources, and for +now this is the best option. -To build for GPU, pass `-Djavacpp.platform.extension=-gpu` to maven. By default, the CI options are used for the bazel build. -Using Tensorflow's configure script and copying the resulting `.tf_configure.bazelrc` to `tensorflow-core-api` can be used to override these options ( -like cuda locations). See the [Working with Bazel generation](#working-with-bazel-generation) section for details. If you do this, make sure -the `TF_CUDA_COMPUTE_CAPABILITIES` value in your `.tf_configure.bazelrc` matches the value set in `build.sh`. +To build for GPU, pass `-Djavacpp.platform.extension=-gpu` to maven. By default, the CI options are used for the bazel build. You can override these +options, see the [Working with Bazel generation](#working-with-bazel-generation) section for details. If you do this, make sure +the `TF_CUDA_COMPUTE_CAPABILITIES` value in your `.tf_configure.bazelrc` matches the value set elsewhere, as it will take precedence if present. ## Running Tests @@ -90,7 +91,8 @@ annotated with `org.tensorflow.op.annotation.Endpoint`), or change the annotatio `tensorflow-core-api` uses Bazel-built C++ code generation to generate most of the `@Operator` classes. By default, the bazel build is configured for the [CI](.github/workflows/ci.yml), so if you're building locally, you may need to clone the [tensorflow](https://github.com/tensorflow/tensorflow) project, run its configuration script (`./configure`), and copy the resulting -`.tf_configure.bazelrc` to `tensorflow-core-api`. +`.tf_configure.bazelrc` to `tensorflow-core-api`. 
This overrides the default options, and you can add to it manually (i.e. adding `build --copt="-g"`
to build with debugging info).

To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help text (viewable in
[op_gen_main.cc](tensorflow-core/tensorflow-core-api/src/bazel/op_generator/op_gen_main.cc#L31-L48)). Generally, it should be called with arguments
diff --git a/tensorflow-core/tensorflow-core-api/build.sh b/tensorflow-core/tensorflow-core-api/build.sh
index 356a00db91d..83b80a8ab8a 100755
--- a/tensorflow-core/tensorflow-core-api/build.sh
+++ b/tensorflow-core/tensorflow-core-api/build.sh
@@ -24,7 +24,7 @@ fi

 if [[ "${EXTENSION:-}" == *gpu* ]]; then
   export BUILD_FLAGS="$BUILD_FLAGS --config=cuda"
-  export TF_CUDA_COMPUTE_CAPABILITIES="3.5,7.0"
+  export TF_CUDA_COMPUTE_CAPABILITIES="${TF_CUDA_COMPUTE_CAPABILITIES:-'3.5,7.0'}"
   if [[ -z ${TF_CUDA_PATHS:-} ]] && [[ -d ${CUDA_PATH:-} ]]; then
     # Work around some issue with Bazel preventing it from detecting CUDA on Windows
     export TF_CUDA_PATHS="$CUDA_PATH"

From c8c6ec1a2f7751ebb92a89089157d28805101284 Mon Sep 17 00:00:00 2001
From: Ryan Nett
Date: Wed, 3 Feb 2021 13:40:38 -0800
Subject: [PATCH 7/8] Add dedicated native builds section

Signed-off-by: Ryan Nett

---
 CONTRIBUTING.md | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 08a40ebb683..d13e1c4aedb 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -21,6 +21,15 @@ the `dev` profile in your Maven command to use those artifacts instead of buildi
 Modifying the native op generation code (not the annotation processor) or the JavaCPP configuration (not the abstract Pointers) will require a
 complete build to reflect the changes, otherwise `-Pdev` should be fine.

+### Native Builds
+
+In some cases, like when adding GPU support or re-generating op classes, you will need to re-build the native library. 
99% of this is building +TensorFlow, which by default is configured for the [CI](.github/workflows/ci.yml). The build configuration can be customized using the same methods as +TensorFlow, so if you're building locally, you may need to clone the [tensorflow](https://github.com/tensorflow/tensorflow) project, run its +configuration script (`./configure`), and copy the resulting +`.tf_configure.bazelrc` to `tensorflow-core-api`. This overrides the default options, and you can add to it manually (i.e. adding `build --copt="-g"` +to build with debugging info). + ### GPU Support Currently, due to build time constraints, the GPU binaries only support compute capacities 3.5 and 7.0. @@ -29,9 +38,9 @@ setting the environment variable `TF_CUDA_COMPUTE_CAPABILITIES`, or configuring i.e. `build --action_env TF_CUDA_COMPUTE_CAPABILITIES="6.1"`). While this is far from ideal, we are working on getting more build resources, and for now this is the best option. -To build for GPU, pass `-Djavacpp.platform.extension=-gpu` to maven. By default, the CI options are used for the bazel build. You can override these -options, see the [Working with Bazel generation](#working-with-bazel-generation) section for details. If you do this, make sure -the `TF_CUDA_COMPUTE_CAPABILITIES` value in your `.tf_configure.bazelrc` matches the value set elsewhere, as it will take precedence if present. +To build for GPU, pass `-Djavacpp.platform.extension=-gpu` to maven. By default, the CI options are used for the bazel build, see the above section +for more info. If you add `bazelrc` files, make sure the `TF_CUDA_COMPUTE_CAPABILITIES` value in them matches the value set elsewhere, as it will take +precedence if present. ## Running Tests @@ -88,14 +97,14 @@ annotated with `org.tensorflow.op.annotation.Endpoint`), or change the annotatio ### Working with Bazel generation -`tensorflow-core-api` uses Bazel-built C++ code generation to generate most of the `@Operator` classes. 
-By default, the bazel build is configured for the [CI](.github/workflows/ci.yml), so if you're building locally, you may need to clone -the [tensorflow](https://github.com/tensorflow/tensorflow) project, run its configuration script (`./configure`), and copy the resulting -`.tf_configure.bazelrc` to `tensorflow-core-api`. This overrides the default options, and you can add to it manually (i.e. adding `build --copt="-g"` -to build with debugging info). - -To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help text (viewable in +`tensorflow-core-api` uses Bazel-built C++ code generation to generate most of the `@Operator` classes. See [Native Builds](#native-builds) for +instructions on configuring the bazel build. To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help +text (viewable in [op_gen_main.cc](tensorflow-core/tensorflow-core-api/src/bazel/op_generator/op_gen_main.cc#L31-L48)). Generally, it should be called with arguments -that are something -like `bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so --output_dir=src/gen/java --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def` -(from `tensorflow-core-api`). +that are something like: + +``` +bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so --output_dir=src/gen/java --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def +``` + +(called in `tensorflow-core-api`). 
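Put together, a typical generation run might look like the following sketch. The `bazel build` step and the `bazel-bin/java_op_generator` binary path are assumptions about the local Bazel output layout; the target name and the generator arguments are the ones given above:

```shell
cd tensorflow-core/tensorflow-core-api

# Build the generator binary first
bazel build //:java_op_generator

# Then point it at the freshly built native library and the api_def directories
bazel-bin/java_op_generator \
  bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so \
  --output_dir=src/gen/java \
  --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def
```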
From 003826559f4b217acb36a3d517ec393626c65fd3 Mon Sep 17 00:00:00 2001 From: Ryan Nett Date: Wed, 3 Feb 2021 22:09:50 -0800 Subject: [PATCH 8/8] Fix quoting Signed-off-by: Ryan Nett --- tensorflow-core/tensorflow-core-api/build.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tensorflow-core/tensorflow-core-api/build.sh b/tensorflow-core/tensorflow-core-api/build.sh index 83b80a8ab8a..e94efa850d8 100755 --- a/tensorflow-core/tensorflow-core-api/build.sh +++ b/tensorflow-core/tensorflow-core-api/build.sh @@ -24,7 +24,7 @@ fi if [[ "${EXTENSION:-}" == *gpu* ]]; then export BUILD_FLAGS="$BUILD_FLAGS --config=cuda" - export TF_CUDA_COMPUTE_CAPABILITIES="${TF_CUDA_COMPUTE_CAPABILITIES:-'3.5,7.0'}" + export TF_CUDA_COMPUTE_CAPABILITIES="${TF_CUDA_COMPUTE_CAPABILITIES:-"3.5,7.0"}" if [[ -z ${TF_CUDA_PATHS:-} ]] && [[ -d ${CUDA_PATH:-} ]]; then # Work around some issue with Bazel preventing it from detecting CUDA on Windows export TF_CUDA_PATHS="$CUDA_PATH"
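The quoting fix in this last patch matters because, inside a double-quoted `${VAR:-default}` expansion, single quotes are literal characters rather than quoting syntax. A minimal illustration (the variable name is reused from `build.sh` purely for the demonstration):

```shell
unset TF_CUDA_COMPUTE_CAPABILITIES

# Old form: the inner single quotes are kept verbatim in the value
old="${TF_CUDA_COMPUTE_CAPABILITIES:-'3.5,7.0'}"
echo "$old"    # prints '3.5,7.0' including the quote characters

# Fixed form: nested double quotes are real quoting, so they are removed
new="${TF_CUDA_COMPUTE_CAPABILITIES:-"3.5,7.0"}"
echo "$new"    # prints 3.5,7.0
```

With the old form, whenever the variable was unset, Bazel would have received the compute-capability list wrapped in stray quote characters.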