Add GPU autoscheduler #6856

Merged Apr 4, 2023
72 commits
0046184
Add GPU autoscheduler
aekul Apr 5, 2022
d939ea5
clang-format
aekul Jul 28, 2022
73b055a
clang-format 13
aekul Jul 28, 2022
dfe18b8
run clang-tidy, remove MachineParams, use new autoscheduler params an…
aekul Aug 14, 2022
4ad51a2
remove commented code
aekul Aug 14, 2022
9932266
use updated api
aekul Aug 16, 2022
e4672de
use updated api
aekul Aug 16, 2022
efa1d83
clang-format
aekul Aug 16, 2022
937c730
remove MachineParams and fix parallelism parameter
aekul Aug 17, 2022
bb571dd
fix test
aekul Aug 18, 2022
bddfc57
add CMakeLists.txt
aekul Aug 18, 2022
59617c2
remove ASLog.h/cpp
aekul Aug 19, 2022
ec5a6ee
move PerfectHashMap.h to common/
aekul Aug 19, 2022
56d5e71
move test_function_dag.cpp to common/
aekul Aug 19, 2022
6e68cd4
move featurization_to_sample.cpp to common/
aekul Aug 19, 2022
7d3fbb4
move test_perfect_hash_map.cpp to common/
aekul Aug 19, 2022
ae9b216
remove Errors.h
aekul Aug 19, 2022
d7c7e5a
move get_host_target.cpp to common/
aekul Aug 19, 2022
b529aeb
move weightsdir_to_weightsfile.cpp to common/
aekul Aug 19, 2022
0a99054
remove MACHINE_PARAMS
aekul Aug 19, 2022
8a7923b
move demo_generator.cpp to common/
aekul Aug 19, 2022
bee009e
remove files from Adams2019
aekul Aug 19, 2022
1e3fa11
move included_schedule_file_generator.cpp to common/
aekul Aug 19, 2022
489509b
move Weights.h/cpp to common/
aekul Aug 20, 2022
6e8e4c1
tidy up
aekul Aug 21, 2022
1c4e2a6
add input images
aekul Sep 4, 2022
393c7b9
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Sep 9, 2022
ebeccac
add prefix to build targets
aekul Sep 9, 2022
e6fd31a
steven's patch
aekul Sep 9, 2022
df668fb
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Sep 10, 2022
31fad63
fix cmake error
aekul Sep 11, 2022
c4e93d8
fix Weights.cpp path
aekul Sep 12, 2022
7119a0e
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Sep 12, 2022
82da090
Remove usage of include_directories
aekul Nov 5, 2022
fae9934
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Dec 3, 2022
99627a5
Move tests to test directory
aekul Dec 5, 2022
4b6f020
clang-tidy/format
aekul Dec 28, 2022
e5a40ef
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Dec 29, 2022
d29fa84
Update scripts to use CMake build directory structure
aekul Dec 31, 2022
f7fad31
Tidy up scripts/utils.sh
aekul Dec 31, 2022
ae08b76
Check if RunGenMain.o exists
aekul Dec 31, 2022
b44d05c
clang-format
aekul Dec 31, 2022
844755c
Add included_schedule_file.schedule.h
aekul Jan 2, 2023
b8e6ed0
clang-format
aekul Jan 3, 2023
f2928d3
Script usability improvements
aekul Jan 5, 2023
ff2daff
Add include path to Makefile
aekul Jan 12, 2023
9741bb3
Fix include path in Makefile
aekul Jan 13, 2023
b273cb0
Fix include path in Makefile
aekul Jan 14, 2023
d6a3656
Tidy up
aekul Jan 29, 2023
d72060e
clang-tidy
aekul Jan 30, 2023
af18231
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Feb 1, 2023
9ec5b03
include directory
aekul Feb 6, 2023
6b6c7d8
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Feb 6, 2023
c7bd839
Non-constant bounds fix
aekul Feb 9, 2023
ea869ce
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Feb 14, 2023
64148a9
Fix long line
aekul Mar 17, 2023
86c4015
Add braces around if statements
aekul Mar 17, 2023
08c778e
aslog(0) -> aslog(1)
aekul Mar 17, 2023
8468f59
abort -> internal_assert
aekul Mar 17, 2023
940f644
Remove default destructor
aekul Mar 17, 2023
c860696
Fix long line
aekul Mar 17, 2023
1df7dc2
Fix long lines
aekul Mar 18, 2023
cf15af2
Reorder parameters
aekul Mar 18, 2023
79c4713
Fix long line
aekul Mar 18, 2023
50ba6e0
Remove empty clause
aekul Mar 18, 2023
6924d1a
Remove blank line
aekul Mar 18, 2023
5aad419
Uppercase enum, std::vector, constexpr
aekul Mar 26, 2023
5fab295
Remove HALIDE_ALLOW_LEGACY_AUTOSCHEDULER_API code
aekul Mar 31, 2023
766a010
Remove HALIDE_ALLOW_LEGACY_AUTOSCHEDULER_API code
aekul Mar 31, 2023
25da22a
Tidy up test input, move inner_extent
aekul Apr 2, 2023
98ee389
Merge remote-tracking branch 'upstream/main' into gpu-autoscheduler
aekul Apr 3, 2023
1a84237
clang-format
aekul Apr 3, 2023
52 changes: 31 additions & 21 deletions apps/cuda_mat_mul/mat_mul_generator.cpp
@@ -34,28 +34,38 @@ class MatMul : public Halide::Generator<MatMul> {
         Var xi, yi, xio, xii, yii, xo, yo, x_pair, xiio, ty;
         RVar rxo, rxi;

-        out.bound(x, 0, size)
-            .bound(y, 0, size)
-            .tile(x, y, xi, yi, 64, 16)
-            .tile(xi, yi, xii, yii, 4, 8)
-            .gpu_blocks(x, y)
-            .gpu_threads(xi, yi)
-            .unroll(xii)
-            .unroll(yii);
-        prod.compute_at(out, xi)
-            .vectorize(x)
-            .unroll(y)
-            .update()
-            .reorder(x, y, r)
-            .vectorize(x)
-            .unroll(y)
-            .unroll(r, 8);
-        A.in().compute_at(prod, r).vectorize(_0).unroll(_1);
-        B.in().compute_at(prod, r).vectorize(_0).unroll(_1);
+        if (!using_autoscheduler()) {
+            out.bound(x, 0, size)
+                .bound(y, 0, size)
+                .tile(x, y, xi, yi, 64, 16)
+                .tile(xi, yi, xii, yii, 4, 8)
+                .gpu_blocks(x, y)
+                .gpu_threads(xi, yi)
+                .unroll(xii)
+                .unroll(yii);
+            prod.compute_at(out, xi)
+                .vectorize(x)
+                .unroll(y)
+                .update()
+                .reorder(x, y, r)
+                .vectorize(x)
+                .unroll(y)
+                .unroll(r, 8);
+            A.in().compute_at(prod, r).vectorize(_0).unroll(_1);
+            B.in().compute_at(prod, r).vectorize(_0).unroll(_1);

-        set_alignment_and_bounds(A, size);
-        set_alignment_and_bounds(B, size);
-        set_alignment_and_bounds(out, size);
+            set_alignment_and_bounds(A, size);
+            set_alignment_and_bounds(B, size);
+            set_alignment_and_bounds(out, size);
+        } else {
+            A.dim(0).set_estimate(0, size).dim(1).set_estimate(0, size);
+            B.dim(0).set_estimate(0, size).dim(1).set_estimate(0, size);
+        }
+
+        // Always specify bounds for outputs, whether autoscheduled or not
+        out
+            .bound(x, 0, size)
+            .bound(y, 0, size);
     }
 };
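
As a reading aid for the manual schedule in this hunk, the two nested `tile` calls can be pictured as an ordinary loop nest. The sketch below is plain C++ rather than Halide output, and it rests on stated assumptions: the helper names `visit_counts_for_tiled_nest` and `covers_each_point_once` are hypothetical, `size` is assumed to be a multiple of 64, and the loop labels only mirror where the diff applies `gpu_blocks`, `gpu_threads`, and `unroll`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often each (x, y) point is visited by a loop nest shaped like
//   .tile(x, y, xi, yi, 64, 16).tile(xi, yi, xii, yii, 4, 8)
// Hypothetical helper for illustration; assumes size is a multiple of 64.
std::vector<int> visit_counts_for_tiled_nest(int size) {
    assert(size > 0 && size % 64 == 0);
    std::vector<int> visits(static_cast<std::size_t>(size) * size, 0);
    for (int by = 0; by < size / 16; by++) {                // gpu_blocks: y
        for (int bx = 0; bx < size / 64; bx++) {            // gpu_blocks: x
            for (int ty = 0; ty < 16 / 8; ty++) {           // gpu_threads: yi
                for (int tx = 0; tx < 64 / 4; tx++) {       // gpu_threads: xi
                    for (int yii = 0; yii < 8; yii++) {     // unrolled in the schedule
                        for (int xii = 0; xii < 4; xii++) { // unrolled in the schedule
                            int x = bx * 64 + tx * 4 + xii;
                            int y = by * 16 + ty * 8 + yii;
                            visits[static_cast<std::size_t>(y) * size + x]++;
                        }
                    }
                }
            }
        }
    }
    return visits;
}

// True when every output point is produced exactly once.
bool covers_each_point_once(int size) {
    for (int count : visit_counts_for_tiled_nest(size)) {
        if (count != 1) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions each block covers a 64x16 tile using 16x2 threads, and each thread computes a 4x8 patch of the output.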

Binary file added apps/images/low_res_in.png
Review comment (Member): This appears to be a duplicate of apps/images/rgb_small.png
Binary file added apps/images/matrix_3200.mat
Binary file not shown.
Binary file added apps/images/matrix_7000.mat
Binary file not shown.
12 changes: 9 additions & 3 deletions src/Generator.h
@@ -387,7 +387,9 @@ template<typename First, typename... Rest>
 struct select_type : std::conditional<First::value, typename First::type, typename select_type<Rest...>::type> {};

 template<typename First>
-struct select_type<First> { using type = typename std::conditional<First::value, typename First::type, void>::type; };
+struct select_type<First> {
+    using type = typename std::conditional<First::value, typename First::type, void>::type;
+};

 class GeneratorParamInfo;
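
For readers unfamiliar with `select_type`: it yields the `type` of the first argument whose `value` is true and falls back to `void` when none match. The sketch below exercises that behaviour in isolation. The `cond` helper and the `widened` alias are hypothetical names introduced only for this illustration; the diff does not show which arguments Halide actually passes.

```cpp
#include <type_traits>

// Hypothetical helper: pairs a compile-time condition with a candidate type.
template<bool B, typename T>
struct cond {
    static constexpr bool value = B;
    using type = T;
};

// Same shape as the templates in the hunk above.
template<typename First, typename... Rest>
struct select_type : std::conditional<First::value, typename First::type, typename select_type<Rest...>::type> {};

template<typename First>
struct select_type<First> {
    using type = typename std::conditional<First::value, typename First::type, void>::type;
};

// Hypothetical usage: map a scalar type to a wider type, first match wins.
template<typename T>
using widened = typename select_type<
    cond<std::is_floating_point<T>::value, double>,
    cond<std::is_signed<T>::value, long long>,
    cond<std::is_unsigned<T>::value, unsigned long long>>::type;

static_assert(std::is_same<widened<float>, double>::value, "first matching entry wins");
static_assert(std::is_same<widened<void *>, void>::value, "no match falls back to void");
```

Because the entries are tested in order, `float` resolves to `double` even though `std::is_signed<float>` is also true.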

@@ -2155,7 +2157,9 @@ class GeneratorInput_Arithmetic : public GeneratorInput_Scalar<T> {
 };

 template<typename>
-struct type_sink { typedef void type; };
+struct type_sink {
+    typedef void type;
+};

 template<typename T2, typename = void>
 struct has_static_halide_type_method : std::false_type {};
@@ -3770,7 +3774,9 @@ class Generator : public Internal::GeneratorBase {
     // std::is_member_function_pointer will fail if there is no member of that name,
     // so we use a little SFINAE to detect if there are method-shaped members.
     template<typename>
-    struct type_sink { typedef void type; };
+    struct type_sink {
+        typedef void type;
+    };

     template<typename T2, typename = void>
     struct has_configure_method : std::false_type {};
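
The comment in this hunk describes detecting method-shaped members with SFINAE, but the hunk shows only `type_sink` and the primary `has_configure_method` template. The sketch below shows one way such a detector can be completed. The partial specialization is an assumption written for illustration, not a quote from `Generator.h`, and `WithConfigure` and `WithoutConfigure` are hypothetical test types.

```cpp
#include <type_traits>
#include <utility>

// Maps any well-formed type to void; an ill-formed argument removes the
// specialization below from consideration instead of causing an error.
template<typename>
struct type_sink {
    typedef void type;
};

// Primary template: assume there is no configure() method.
template<typename T2, typename = void>
struct has_configure_method : std::false_type {};

// Assumed completion: chosen only when `t.configure()` is a valid expression.
template<typename T2>
struct has_configure_method<T2, typename type_sink<decltype(std::declval<T2>().configure())>::type>
    : std::true_type {};

// Hypothetical types used to exercise the detector.
struct WithConfigure {
    void configure() {
    }
};
struct WithoutConfigure {};

static_assert(has_configure_method<WithConfigure>::value, "configure() is detected");
static_assert(!has_configure_method<WithoutConfigure>::value, "absence is detected");
```

Since `type_sink` always produces `void`, the detector does not depend on the return type of `configure()`, only on whether the call expression compiles.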
1 change: 1 addition & 0 deletions src/autoschedulers/CMakeLists.txt
@@ -27,3 +27,4 @@ add_subdirectory(common)
 add_subdirectory(adams2019)
 add_subdirectory(li2018)
 add_subdirectory(mullapudi2016)
+add_subdirectory(anderson2021)
20 changes: 13 additions & 7 deletions src/autoschedulers/adams2019/CMakeLists.txt
@@ -2,6 +2,8 @@
 # Build rules for the Adams2019 autoscheduler library
 ##

+set(COMMON_DIR "${Halide_SOURCE_DIR}/src/autoschedulers/common")
+
 # =================================================================
 # weights
 set(WF_CPP baseline.cpp)
@@ -75,9 +77,10 @@ endif ()
 if (WITH_UTILS)
     add_executable(adams2019_retrain_cost_model
                    DefaultCostModel.cpp
-                   Weights.cpp
+                   ${COMMON_DIR}/Weights.cpp
                    retrain_cost_model.cpp
                    $<TARGET_OBJECTS:adams2019_weights_obj>)
+    target_include_directories(adams2019_retrain_cost_model PRIVATE "${Halide_SOURCE_DIR}/src/autoschedulers/adams2019")
     target_link_libraries(adams2019_retrain_cost_model PRIVATE ASLog adams2019_cost_model adams2019_train_cost_model Halide::Halide Halide::Plugin)
 endif ()

@@ -95,23 +98,25 @@ add_autoscheduler(
         FunctionDAG.cpp
         LoopNest.cpp
         State.cpp
-        Weights.cpp
+        ${COMMON_DIR}/Weights.cpp
         $<TARGET_OBJECTS:adams2019_weights_obj>
 )

+target_include_directories(Halide_Adams2019 PRIVATE "${Halide_SOURCE_DIR}/src/autoschedulers/adams2019")
 target_link_libraries(Halide_Adams2019 PRIVATE ASLog ParamParser adams2019_cost_model adams2019_train_cost_model)

 # ====================================================
 # Auto-tuning support utilities.
 # TODO(#4053): implement auto-tuning support in CMake?

 if (WITH_UTILS)
-    add_executable(adams2019_featurization_to_sample featurization_to_sample.cpp)
+    add_executable(adams2019_featurization_to_sample ${COMMON_DIR}/featurization_to_sample.cpp)

-    add_executable(adams2019_get_host_target get_host_target.cpp)
+    add_executable(adams2019_get_host_target ${COMMON_DIR}/get_host_target.cpp)
     target_link_libraries(adams2019_get_host_target PRIVATE Halide::Halide)

-    add_executable(adams2019_weightsdir_to_weightsfile weightsdir_to_weightsfile.cpp Weights.cpp)
+    add_executable(adams2019_weightsdir_to_weightsfile ${COMMON_DIR}/weightsdir_to_weightsfile.cpp ${COMMON_DIR}/Weights.cpp)
+    target_include_directories(adams2019_weightsdir_to_weightsfile PRIVATE "${Halide_SOURCE_DIR}/src/autoschedulers/adams2019" ${COMMON_DIR})
     target_link_libraries(adams2019_weightsdir_to_weightsfile PRIVATE Halide::Runtime)
 endif ()

@@ -121,11 +126,12 @@ endif ()

 if (WITH_TESTS)

-    add_executable(adams2019_test_perfect_hash_map test_perfect_hash_map.cpp)
+    add_executable(adams2019_test_perfect_hash_map ${COMMON_DIR}/test_perfect_hash_map.cpp)
     add_test(NAME adams2019_test_perfect_hash_map COMMAND adams2019_test_perfect_hash_map)
     set_tests_properties(adams2019_test_perfect_hash_map PROPERTIES LABELS "adams2019;autoschedulers;auto_schedule")

-    add_executable(adams2019_test_function_dag test_function_dag.cpp FunctionDAG.cpp)
+    add_executable(adams2019_test_function_dag ${COMMON_DIR}/test_function_dag.cpp FunctionDAG.cpp)
+    target_include_directories(adams2019_test_function_dag PRIVATE "${Halide_SOURCE_DIR}/src/autoschedulers/adams2019" ${COMMON_DIR})
     target_link_libraries(adams2019_test_function_dag PRIVATE ASLog Halide::Halide Halide::Tools Halide::Plugin)
     add_test(NAME adams2019_test_function_dag COMMAND adams2019_test_function_dag)
     set_tests_properties(adams2019_test_function_dag PROPERTIES LABELS "adams2019;autoschedulers;auto_schedule")
22 changes: 11 additions & 11 deletions src/autoschedulers/adams2019/Makefile
@@ -68,8 +68,8 @@ $(BIN)/libautoschedule_adams2019.$(PLUGIN_EXT): \
 	$(SRC)/Cache.cpp \
 	$(SRC)/DefaultCostModel.h \
 	$(SRC)/DefaultCostModel.cpp \
-	$(SRC)/Weights.h \
-	$(SRC)/Weights.cpp \
+	$(COMMON_DIR)/Weights.h \
+	$(COMMON_DIR)/Weights.cpp \
 	$(SRC)/FunctionDAG.h \
 	$(SRC)/FunctionDAG.cpp \
 	$(SRC)/LoopNest.h \
@@ -79,38 +79,38 @@ $(BIN)/libautoschedule_adams2019.$(PLUGIN_EXT): \
 	$(SRC)/State.h \
 	$(SRC)/State.cpp \
 	$(SRC)/Timer.h \
-	$(SRC)/PerfectHashMap.h \
+	$(COMMON_DIR)/PerfectHashMap.h \
 	$(AUTOSCHED_WEIGHT_OBJECTS) \
 	$(AUTOSCHED_COST_MODEL_LIBS) \
 	$(BIN)/auto_schedule_runtime.a \
 	| $(LIB_HALIDE)
 	@mkdir -p $(@D)
-	$(CXX) -shared $(USE_EXPORT_DYNAMIC) -fPIC -fvisibility=hidden -fvisibility-inlines-hidden $(CXXFLAGS) $(OPTIMIZE) -I $(BIN)/cost_model $(filter-out %.h $(LIBHALIDE_LDFLAGS),$^) -o $@ $(HALIDE_SYSTEM_LIBS) $(HALIDE_RPATH_FOR_LIB)
+	$(CXX) -shared $(USE_EXPORT_DYNAMIC) -fPIC -fvisibility=hidden -fvisibility-inlines-hidden $(CXXFLAGS) $(OPTIMIZE) -I $(BIN)/cost_model $(filter-out %.h $(LIBHALIDE_LDFLAGS),$^) -o $@ $(HALIDE_SYSTEM_LIBS) $(HALIDE_RPATH_FOR_LIB) -I $(SRC)

 $(BIN)/retrain_cost_model: $(SRC)/retrain_cost_model.cpp \
 	$(COMMON_DIR)/ASLog.cpp \
 	$(SRC)/DefaultCostModel.h \
 	$(SRC)/DefaultCostModel.cpp \
-	$(SRC)/Weights.h \
-	$(SRC)/Weights.cpp \
+	$(COMMON_DIR)/Weights.h \
+	$(COMMON_DIR)/Weights.cpp \
 	$(SRC)/CostModel.h \
 	$(SRC)/NetworkSize.h \
 	$(AUTOSCHED_COST_MODEL_LIBS) \
 	$(AUTOSCHED_WEIGHT_OBJECTS) \
 	$(BIN)/auto_schedule_runtime.a
 	@mkdir -p $(@D)
-	$(CXX) $(CXXFLAGS) -frtti -Wall -I ../support -I $(BIN)/cost_model $(OPTIMIZE) $(filter-out %.h,$^) -o $@ $(LIBHALIDE_LDFLAGS) $(USE_OPEN_MP) $(HALIDE_RPATH_FOR_BIN)
+	$(CXX) $(CXXFLAGS) -frtti -Wall -I ../support -I $(BIN)/cost_model $(OPTIMIZE) $(filter-out %.h,$^) -o $@ $(LIBHALIDE_LDFLAGS) $(USE_OPEN_MP) $(HALIDE_RPATH_FOR_BIN) -I $(SRC)

-$(BIN)/featurization_to_sample: $(SRC)/featurization_to_sample.cpp
+$(BIN)/featurization_to_sample: $(COMMON_DIR)/featurization_to_sample.cpp
 	@mkdir -p $(@D)
 	$(CXX) $(CXXFLAGS) $< $(OPTIMIZE) -o $@

-$(BIN)/get_host_target: $(SRC)/get_host_target.cpp $(LIB_HALIDE) $(HALIDE_DISTRIB_PATH)/include/Halide.h
+$(BIN)/get_host_target: $(COMMON_DIR)/get_host_target.cpp $(LIB_HALIDE) $(HALIDE_DISTRIB_PATH)/include/Halide.h
 	@mkdir -p $(@D)
 	$(CXX) $(CXXFLAGS) $(filter %.cpp,$^) $(LIBHALIDE_LDFLAGS) $(OPTIMIZE) -o $@ $(HALIDE_RPATH_FOR_BIN)
-$(BIN)/weightsdir_to_weightsfile: $(SRC)/weightsdir_to_weightsfile.cpp $(SRC)/Weights.cpp
+$(BIN)/weightsdir_to_weightsfile: $(COMMON_DIR)/weightsdir_to_weightsfile.cpp $(COMMON_DIR)/Weights.cpp
 	@mkdir -p $(@D)
-	$(CXX) $(CXXFLAGS) $^ $(OPTIMIZE) -o $@
+	$(CXX) $(CXXFLAGS) $^ $(OPTIMIZE) -o $@ -I $(SRC)

 .PHONY: clean