Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use natural dispatch syntax #246

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
c2482d1
This switches HIP from its currently convoluted macro + pfe based dis…
AlexVlx Nov 1, 2017
2cacda9
Correctly deal with functions from shared objects, wherein the progra…
AlexVlx Nov 1, 2017
bb11760
Merge remote-tracking branch 'origin/master' into feature_use_module_…
AlexVlx Nov 3, 2017
2932706
Merge remote-tracking branch 'origin/master' into feature_use_module_…
AlexVlx Nov 3, 2017
adaf6b8
Merge remote-tracking branch 'origin/master' into feature_use_module_…
AlexVlx Nov 7, 2017
3d24892
Merge remote-tracking branch 'origin/master' into feature_use_module_…
AlexVlx Nov 8, 2017
f7726cd
Merge remote-tracking branch 'origin/master' into feature_use_module_…
AlexVlx Nov 9, 2017
bd78459
Update new tests so as to make them work with new variadic based laun…
AlexVlx Nov 10, 2017
819e72f
Add omitted changes in CMakeLists.txt.
AlexVlx Nov 10, 2017
f93859c
Merge remote-tracking branch 'origin/master' into feature_use_module_…
AlexVlx Nov 16, 2017
c5f2b22
Re-sync with upstream.
AlexVlx Nov 20, 2017
7d5a45a
Correct ill-formed merge in earlier commit and adjust for differences…
AlexVlx Nov 20, 2017
1824fb7
Clean-up some remaining noise in program_state.cpp.
AlexVlx Nov 20, 2017
9d088d2
Refactor the __device__ versions of memset and memcpy to be less awkw…
AlexVlx Nov 21, 2017
5e16ee0
This corrects how addresses are formed for symbols which reside in sh…
AlexVlx Nov 21, 2017
4131b47
Modify the set component of the memcpy test (unclear why there is a m…
AlexVlx Nov 21, 2017
08f252e
Remove leftover comment.
AlexVlx Nov 22, 2017
9d47a4d
Add hipify mappings for all CUDA headers that have HIP equivalents
ChrisKitching Nov 13, 2017
a401ce6
This fixes some outright quaint choices made when implementing HIP's …
AlexVlx Nov 17, 2017
a6ccaf3
This actually (tries) to do the right thing all the way, by using mem…
AlexVlx Nov 18, 2017
265c3b2
Fix float2int rounding functions
mangupta Nov 23, 2017
dc67ca3
Merge remote-tracking branch 'origin/master' into feature_use_module_…
AlexVlx Nov 28, 2017
02c2bfc
Re-sync with upstream and re-factor platform global management for te…
AlexVlx Nov 28, 2017
6e4ca3f
Change memset kernel to use memcpy instead of placement new. Simplify…
AlexVlx Nov 28, 2017
5aeb5dc
Remove leftover agent allocated globals.
AlexVlx Nov 28, 2017
89e9399
Choose whether or not to use functional grid_launch based on the vers…
AlexVlx Nov 29, 2017
faa546d
Fix oversight in selection mechanism which led to erroneous code to b…
AlexVlx Nov 29, 2017
3ed8897
Add missing file.
AlexVlx Nov 29, 2017
b881cf7
Fix compiler version check.
AlexVlx Nov 29, 2017
d2fd1f5
Revert adoption of CUDA indexing in general - this can only work with…
AlexVlx Nov 29, 2017
fbaf729
Revert "Revert adoption of CUDA indexing in general - this can only w…
AlexVlx Nov 29, 2017
61453e1
Merge branch 'feature_use_module_based_dispatch_instead_of_pfe' of ht…
AlexVlx Nov 29, 2017
7acb1e6
Use a much simpler guard for version 1.6, which allows for direct CUD…
AlexVlx Nov 29, 2017
32e11e7
Revert "Revert adoption of CUDA indexing in general - this can only w…
AlexVlx Nov 29, 2017
20fc68c
Add missing space (the final frontier).
AlexVlx Nov 29, 2017
7c0b9a0
Fix legacy mode detection of the address of an agent allocated variab…
AlexVlx Nov 30, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,7 @@ if(HIP_PLATFORM STREQUAL "hcc")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${HIP_HCC_BUILD_FLAGS}")

set(SOURCE_FILES_RUNTIME
src/code_object_bundle.cpp
src/hip_hcc.cpp
src/hip_context.cpp
src/hip_device.cpp
Expand All @@ -179,7 +180,8 @@ if(HIP_PLATFORM STREQUAL "hcc")
src/hip_db.cpp
src/grid_launch.cpp
src/hip_texture.cpp
src/env.cpp)
src/env.cpp
src/program_state.cpp)

set(SOURCE_FILES_DEVICE
src/device_util.cpp
Expand Down
158 changes: 158 additions & 0 deletions include/hip/hcc_detail/code_object_bundle.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,158 @@
/*
Copyright (c) 2015 - present Advanced Micro Devices, Inc. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#pragma once

#include <hsa/hsa.h>

#include <algorithm>
#include <cstdint>
#include <istream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

namespace hip_impl
{
hsa_isa_t triple_to_hsa_isa(const std::string& triple);

struct Bundled_code {
union {
struct {
std::uint64_t offset;
std::uint64_t bundle_sz;
std::uint64_t triple_sz;
};
std::uint8_t cbuf[
sizeof(offset) + sizeof(bundle_sz) + sizeof(triple_sz)];
};
std::string triple;
std::vector<std::uint8_t> blob;
};

class Bundled_code_header {
// DATA - STATICS
static constexpr const char magic_string_[] =
"__CLANG_OFFLOAD_BUNDLE__";
static constexpr auto magic_string_sz_ = sizeof(magic_string_) - 1;

// DATA
union {
struct {
std::uint8_t bundler_magic_string_[magic_string_sz_];
std::uint64_t bundle_cnt_;
};
std::uint8_t cbuf_[
sizeof(bundler_magic_string_) + sizeof(bundle_cnt_)];
};
std::vector<Bundled_code> bundles_;

// FRIENDS - MANIPULATORS
template<typename RandomAccessIterator>
friend
inline
bool read(
RandomAccessIterator f,
RandomAccessIterator l,
Bundled_code_header& x)
{
if (f == l) return false;

std::copy_n(f, sizeof(x.cbuf_), x.cbuf_);

if (valid(x)) {
x.bundles_.resize(x.bundle_cnt_);

auto it = f + sizeof(x.cbuf_);
for (auto&& y : x.bundles_) {
std::copy_n(it, sizeof(y.cbuf), y.cbuf);
it += sizeof(y.cbuf);

y.triple.insert(y.triple.cend(), it, it + y.triple_sz);

std::copy_n(
f + y.offset, y.bundle_sz, std::back_inserter(y.blob));

it += y.triple_sz;
}

return true;
}

return false;
}
friend
inline
bool read(const std::vector<std::uint8_t>& blob, Bundled_code_header& x)
{
return read(blob.cbegin(), blob.cend(), x);
}
friend
inline
bool read(std::istream& is, Bundled_code_header& x)
{
return read(std::vector<std::uint8_t>{
std::istreambuf_iterator<char>{is},
std::istreambuf_iterator<char>{}},
x);
}

// FRIENDS - ACCESSORS
friend
inline
bool valid(const Bundled_code_header& x)
{
return std::equal(
x.bundler_magic_string_,
x.bundler_magic_string_ + magic_string_sz_,
x.magic_string_);
}
friend
inline
const std::vector<Bundled_code>& bundles(const Bundled_code_header& x)
{
return x.bundles_;
}
public:
// CREATORS
Bundled_code_header() = default;
template<typename RandomAccessIterator>
Bundled_code_header(RandomAccessIterator f, RandomAccessIterator l);
explicit
Bundled_code_header(const std::vector<std::uint8_t>& blob);
Bundled_code_header(const Bundled_code_header&) = default;
Bundled_code_header(Bundled_code_header&&) = default;
~Bundled_code_header() = default;

// MANIPULATORS
Bundled_code_header& operator=(const Bundled_code_header&) = default;
Bundled_code_header& operator=(Bundled_code_header&&) = default;
};

// CREATORS
template<typename I>
Bundled_code_header::Bundled_code_header(I f, I l) : Bundled_code_header{}
{
read(f, l, *this);
}
} // Namespace hip_impl.
2 changes: 1 addition & 1 deletion include/hip/hcc_detail/concepts.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
Copyright (c) 2015-2016 Advanced Micro Devices, Inc. All rights reserved.
Copyright (c) 2015-present Advanced Micro Devices, Inc. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
Expand Down
159 changes: 159 additions & 0 deletions include/hip/hcc_detail/functional_grid_launch.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,159 @@
/*
Copyright (c) 2015 - present Advanced Micro Devices, Inc. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#pragma once

#include "code_object_bundle.hpp"
#include "concepts.hpp"
#include "helpers.hpp"
#include "program_state.hpp"

#include "hc.hpp"
#include "hip/hip_hcc.h"
#include "hip_runtime.h"

#include <cstddef>
#include <cstdint>
#include <functional>
#include <iostream>
#include <mutex>
#include <stdexcept>
#include <string>
#include <tuple>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <vector>

namespace hip_impl
{
template<
typename T,
typename std::enable_if<std::is_integral<T>{}>::type* = nullptr>
inline
T round_up_to_next_multiple_nonnegative(T x, T y)
{
T tmp = x + y - 1;
return tmp - tmp % y;
}

inline
std::vector<std::uint8_t> make_kernarg()
{
return {};
}

inline
std::vector<std::uint8_t> make_kernarg(std::vector<std::uint8_t> kernarg)
{
return kernarg;
}

template<typename T>
inline
std::vector<std::uint8_t> make_kernarg(std::vector<uint8_t> kernarg, T x)
{
kernarg.resize(
round_up_to_next_multiple_nonnegative(kernarg.size(), alignof(T)) +
sizeof(T));

new (kernarg.data() + kernarg.size() - sizeof(T)) T{std::move(x)};

return kernarg;
}

template<typename T, typename... Ts>
inline
std::vector<std::uint8_t> make_kernarg(
std::vector<std::uint8_t> kernarg, T x, Ts... xs)
{
return make_kernarg(
make_kernarg(std::move(kernarg), std::move(x)), std::move(xs)...);
}

template<typename... Ts>
inline
std::vector<std::uint8_t> make_kernarg(Ts... xs)
{
std::vector<std::uint8_t> kernarg;
kernarg.reserve(sizeof(std::tuple<Ts...>));

return make_kernarg(std::move(kernarg), std::move(xs)...);
}

void hipLaunchKernelGGLImpl(
std::uintptr_t function_address,
const dim3& numBlocks,
const dim3& dimBlocks,
std::uint32_t sharedMemBytes,
hipStream_t stream,
void** kernarg);
} // Namespace hip_impl.

template<typename... Args, typename F = void (*)(Args...)>
inline
void hipLaunchKernelGGL(
F kernel,
const dim3& numBlocks,
const dim3& dimBlocks,
std::uint32_t sharedMemBytes,
hipStream_t stream,
Args... args)
{
auto kernarg = hip_impl::make_kernarg(std::move(args)...);
std::size_t kernarg_size = kernarg.size();

void* config[] = {
HIP_LAUNCH_PARAM_BUFFER_POINTER, kernarg.data(),
HIP_LAUNCH_PARAM_BUFFER_SIZE, &kernarg_size,
HIP_LAUNCH_PARAM_END
};

hip_impl::hipLaunchKernelGGLImpl(
reinterpret_cast<std::uintptr_t>(kernel),
numBlocks,
dimBlocks,
sharedMemBytes,
stream,
&config[0]);
}

template<typename... Args, typename F = void (*)(hipLaunchParm, Args...)>
inline
void hipLaunchKernel(
F kernel,
const dim3& numBlocks,
const dim3& dimBlocks,
std::uint32_t groupMemBytes,
hipStream_t stream,
Args... args)
{
hipLaunchKernelGGL(
kernel,
numBlocks,
dimBlocks,
groupMemBytes,
stream,
hipLaunchParm{},
std::move(args)...);
}

Loading