Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[air/benchmarks] Fix typo in tensorflow_benchmark.py script preventing proper error surfacing #32269

Merged
merged 1 commit into from
Feb 7, 2023

Conversation

krfricke
Copy link
Contributor

@krfricke krfricke commented Feb 7, 2023

Signed-off-by: Kai Fricke kai@anyscale.com

Why are these changes needed?

There is a small typo in the tensorflow_benchmark.py script that does not properly catch when a vanilla TF run failed three times. Because of this, we would previously record a training time of 0.0 for vanilla TF, which skews the calculated average and suggests that vanilla TF outperformed Ray Train. Instead, we should have raised an error message to surface the problem.

Related issue number

Closes #31882

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

…g proper error surfacing

Signed-off-by: Kai Fricke <kai@anyscale.com>
@krfricke krfricke merged commit c83111a into ray-project:master Feb 7, 2023
@krfricke krfricke deleted the air/benchmark-tf-typo branch February 7, 2023 18:31
scv119 pushed a commit to scv119/ray that referenced this pull request Feb 7, 2023
…g proper error surfacing (ray-project#32269)

There is a small typo in the tensorflow_benchmark.py script that does not properly catch when a vanilla TF run failed three times. Because of this, we would previously record a training time of 0.0 for vanilla TF, which skews the calculated average and suggests that vanilla TF outperformed Ray Train. Instead, we should have raised an error message to surface the problem.

Signed-off-by: Kai Fricke <kai@anyscale.com>
cadedaniel pushed a commit to cadedaniel/ray that referenced this pull request Mar 22, 2023
…g proper error surfacing (ray-project#32269)

There is a small typo in the tensorflow_benchmark.py script that does not properly catch when a vanilla TF run failed three times. Because of this, we would previously record a training time of 0.0 for vanilla TF, which skews the calculated average and suggests that vanilla TF outperformed Ray Train. Instead, we should have raised an error message to surface the problem.

Signed-off-by: Kai Fricke <kai@anyscale.com>
edoakes pushed a commit to edoakes/ray that referenced this pull request Mar 22, 2023
…g proper error surfacing (ray-project#32269)

There is a small typo in the tensorflow_benchmark.py script that does not properly catch when a vanilla TF run failed three times. Because of this, we would previously record a training time of 0.0 for vanilla TF, which skews the calculated average and suggests that vanilla TF outperformed Ray Train. Instead, we should have raised an error message to surface the problem.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
scottsun94 pushed a commit to scottsun94/ray that referenced this pull request Mar 28, 2023
…g proper error surfacing (ray-project#32269)

There is a small typo in the tensorflow_benchmark.py script that does not properly catch when a vanilla TF run failed three times. Because of this, we would previously record a training time of 0.0 for vanilla TF, which skews the calculated average and suggests that vanilla TF outperformed Ray Train. Instead, we should have raised an error message to surface the problem.

Signed-off-by: Kai Fricke <kai@anyscale.com>
cassidylaidlaw pushed a commit to cassidylaidlaw/ray that referenced this pull request Mar 28, 2023
…g proper error surfacing (ray-project#32269)

There is a small typo in the tensorflow_benchmark.py script that does not properly catch when a vanilla TF run failed three times. Because of this, we would previously record a training time of 0.0 for vanilla TF, which skews the calculated average and suggests that vanilla TF outperformed Ray Train. Instead, we should have raised an error message to surface the problem.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[release][air] Release test failure: [unstable] air_benchmark_tensorflow_mnist_gpu_4x4 [smoke test]
3 participants