Skip to content

Commit

Permalink
_halide_buffer_crop() needs to check for runtime failures (v2) (#6403)
Browse files Browse the repository at this point in the history
* _halide_buffer_crop() needs to check for runtime failures (v2)

(Alternate to #6402)

We currently assume that _halide_buffer_crop() will never fail. This is a bad assumption, as it can call device_crop(), which can fail due to unexpected runtime errors, or from a backend simply leaving the device_crop field at the default (unimplemented) case (as is currently the case for the OGLC backend).

When this happens, the dst buffer was left in an inconsistent, invalid state (which was what led to the crashes fixed by #6401).

This change modifies _halide_buffer_crop() to return nullptr in the event of an error, and ensure that all cropped buffers are checked for null at the right point. (This is not optimal, of course, since the specific error returned by device_crop is getting dropped on the floor, but the existence of an error is no longer ignored.)

This addresses at least some of the failure issues we are seeing in performance_async_gpu with the OpenGLCompute backend.

(Also: drive-by whitespace fix in CodegenC)

* Oops
  • Loading branch information
steven-johnson authored Nov 11, 2021
1 parent d343e76 commit 9ff87ce
Show file tree
Hide file tree
Showing 6 changed files with 34 additions and 5 deletions.
2 changes: 1 addition & 1 deletion src/CodeGen_C.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2372,7 +2372,7 @@ void CodeGen_C::visit(const Call *op) {
string size = print_expr(simplify((op->args[0] + 7) / 8));
stream << get_indent();
string array_name = unique_name('a');
stream << "uint64_t " << array_name << "[" << size << "];";
stream << "uint64_t " << array_name << "[" << size << "];\n";
rhs << "(" << print_type(op->type) << ")(&" << array_name << ")";
}
} else if (op->is_intrinsic(Call::make_struct)) {
Expand Down
14 changes: 14 additions & 0 deletions src/ScheduleFunctions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -783,6 +783,20 @@ Stmt build_extern_produce(const map<string, Function> &env, Function f, const Ta
Stmt check = AssertStmt::make(EQ::make(result, 0), error);

if (!cropped_buffers.empty()) {
// We need to check that all cropped buffers are non-null (since Call::buffer_crop can return nullptr)
for (const auto &p : cropped_buffers) {
Expr cropped = p.first;
Expr cropped_u64 = reinterpret(UInt(64), cropped);
Expr error = Call::make(Int(32), "halide_error_device_crop_failed", std::vector<Expr>(), Call::Extern);
Stmt assertion = AssertStmt::make(cropped_u64 != 0, error);

if (!is_no_op(pre_call)) {
pre_call = Block::make(pre_call, assertion);
} else {
pre_call = assertion;
}
}

// We need to clean up the temporary crops we made for the
// outputs in case any of them have device allocations.
vector<Expr> cleanup_args;
Expand Down
1 change: 1 addition & 0 deletions src/runtime/HalideRuntime.h
Original file line number Diff line number Diff line change
Expand Up @@ -1250,6 +1250,7 @@ extern int halide_error_buffer_is_null(void *user_context, const char *routine);
extern int halide_error_device_dirty_with_no_device_support(void *user_context, const char *buffer_name);
extern int halide_error_storage_bound_too_small(void *user_context, const char *func_name, const char *var_name,
int provided_size, int required_size);
extern int halide_error_device_crop_failed(void *user_context);
// @}

/** Optional features a compilation Target can have.
Expand Down
5 changes: 5 additions & 0 deletions src/runtime/errors.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -286,4 +286,9 @@ WEAK int halide_error_storage_bound_too_small(void *user_context, const char *fu
return halide_error_code_storage_bound_too_small;
}

WEAK int halide_error_device_crop_failed(void *user_context) {
error(user_context) << "Buffer could not be cropped (runtime error or unimplemented device option).\n";
return halide_error_code_device_crop_failed;
}

} // extern "C"
16 changes: 12 additions & 4 deletions src/runtime/halide_buffer_t.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -157,8 +157,11 @@ halide_buffer_t *_halide_buffer_crop(void *user_context,
dst->device = 0;
if (src->device_interface) {
if (src->device_interface->device_crop(user_context, src, dst) != 0) {
// No way to report it: just ignore it
// debug(user_context) << "_halide_buffer_crop: device_crop failed\n";
// This is uncommon: either a runtime error, or a backend that
// doesn't replace the default definition of device_crop. But it
// does happen, so let's return a nullptr here, and require the caller
// to check the result.
return nullptr;
}
}
return dst;
Expand Down Expand Up @@ -222,8 +225,13 @@ int _halide_buffer_retire_crops_after_extern_stage(void *user_context,
HALIDE_BUFFER_HELPER_ATTRS
halide_buffer_t *_halide_buffer_set_bounds(halide_buffer_t *buf,
int dim, int min, int extent) {
buf->dim[dim].min = min;
buf->dim[dim].extent = extent;
// This can be called with the result of _halide_buffer_crop(), which
// can return nullptr if an error occurs -- so don't crash, just propagate
// the nullptr result to our caller.
if (buf != nullptr) {
buf->dim[dim].min = min;
buf->dim[dim].extent = extent;
}
return buf;
}
}
Expand Down
1 change: 1 addition & 0 deletions src/runtime/runtime_api.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,7 @@ extern "C" __attribute__((used)) void *halide_runtime_api_functions[] = {
(void *)&halide_error_specialize_fail,
(void *)&halide_error_unaligned_host_ptr,
(void *)&halide_error_storage_bound_too_small,
(void *)&halide_error_device_crop_failed,
(void *)&halide_float16_bits_to_double,
(void *)&halide_float16_bits_to_float,
(void *)&halide_free,
Expand Down

0 comments on commit 9ff87ce

Please sign in to comment.