Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'aomp-epsdb' into aomp-epsdb-mainline
Change-Id: I95a6ab1882780ef432c210ced4b65b4b9110eabe
Showing 6 changed files with 668 additions and 0 deletions.
@@ -0,0 +1,54 @@
OMPT target support: Examples to demonstrate how a tool would use the OMPT target APIs
=======================================================================================

These examples show how a tool is expected to use OMPT target
support. The tool registers callbacks and, if device tracing is
required, has an OpenMP thread call OMPT runtime entry points to
start and stop tracing. When certain events occur, the OpenMP runtime
invokes the registered event callbacks so that the tool can establish
the event context. If device tracing has been requested, the OpenMP
runtime collects and manages trace records in buffers. When a buffer
fills up, or when an OpenMP thread explicitly requests a flush of
trace records, an OpenMP runtime helper thread invokes a
buffer-completion callback. The tool implements this callback, which
typically traverses the trace records passed to it. The tool can then
correlate those trace records with the context established earlier
through the event callbacks.
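
The traversal and correlation described above can be sketched roughly
as follows. This is a standalone illustration, not the code of the
examples: every demo_* name is hypothetical, and the two helper
functions only stand in for the roles of ompt_get_record_ompt and
ompt_advance_buffer_cursor so that the sketch runs without an OpenMP
runtime. A real tool would use the record and cursor types from
omp-tools.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real OMPT types. A real tool would use
   ompt_record_ompt_t and ompt_buffer_cursor_t from <omp-tools.h>. */
typedef struct {
  uint64_t target_id; /* id the tool also saw in an event callback */
  uint64_t time;      /* timestamp of the traced operation */
} demo_record_t;
typedef uint64_t demo_cursor_t;

/* Plays the role of ompt_get_record_ompt: inspect the record at a cursor. */
static demo_record_t *demo_get_record(void *buffer, demo_cursor_t cur) {
  return (demo_record_t *)buffer + cur;
}

/* Plays the role of ompt_advance_buffer_cursor: returns 1 and stores the
   next cursor while records remain, 0 once the end has been reached. */
static int demo_advance_cursor(size_t bytes, demo_cursor_t cur,
                               demo_cursor_t *next) {
  if ((cur + 1) * sizeof(demo_record_t) >= bytes)
    return 0;
  *next = cur + 1;
  return 1;
}

/* Context saved by the event callbacks, indexed by target_id. */
#define DEMO_MAX_IDS 8
static const char *demo_context[DEMO_MAX_IDS];

/* Event-callback side: remember what each target_id refers to. */
void demo_on_event(uint64_t target_id, const char *what) {
  if (target_id < DEMO_MAX_IDS)
    demo_context[target_id] = what;
}

/* Buffer-completion side: walk the returned records and count how many
   can be matched to a context saved earlier by demo_on_event. */
size_t demo_buffer_complete(void *buffer, size_t bytes, demo_cursor_t begin) {
  size_t matched = 0;
  demo_cursor_t cur = begin;
  int more = 1;
  assert(buffer != NULL);
  if ((begin + 1) * sizeof(demo_record_t) > bytes)
    return 0; /* no complete record at the starting cursor */
  while (more) {
    demo_record_t *rec = demo_get_record(buffer, cur);
    if (rec->target_id < DEMO_MAX_IDS && demo_context[rec->target_id])
      matched++;
    more = demo_advance_cursor(bytes, cur, &cur);
  }
  return matched;
}
```

The lookup table keyed by an id is only one possible way to carry
context from the event callbacks to the buffer-completion callback. A
real tool would key on whatever identifiers both the callback
arguments and the trace records expose.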

Here are the steps:
(1) The tool must define a function called ompt_start_tool with C
linkage and the signature defined by the OpenMP specification. This
function returns an object that holds 2 function pointers, one for an
initialization function and the other for a finalization function.

(2) The tool must define the initialization and finalization
functions referred to above. The OpenMP runtime invokes the
initialization function with a lookup parameter. Typically, the
initialization function uses the lookup parameter to obtain a handle
to the function ompt_set_callback, which is implemented by the OpenMP
runtime. Using this handle, the tool can then register callbacks. In
our OMPT target examples, commonly registered callbacks include
device initialization, data transfer operations, and target submit.

(3) The device initialize callback, implemented by the tool, is
invoked by the OpenMP device plugin runtime during device
initialization with a lookup parameter. The callback uses this
parameter to look up the device tracing entry points (such as
ompt_start_trace) so that the tool can control which regions are
traced.

(4) The ompt_start_trace entry point expects 2 function pointers. One
is an allocation function that the OpenMP runtime invokes to allocate
space for trace record buffers. The other is a buffer-completion
callback that an OpenMP runtime helper thread invokes to return trace
records to the tool. The tool is expected to use the entry point
ompt_get_record_ompt to inspect a trace record at a given cursor and
the entry point ompt_advance_buffer_cursor to traverse the returned
trace records.

(5) If device tracing is desired, the tool injects calls to the entry
points ompt_set_trace_ompt, ompt_start_trace, ompt_flush_trace, and
ompt_stop_trace into the OpenMP program to control the type and
region of tracing.
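
Steps (1) to (3) can be sketched as below. This is a minimal
illustration under stated assumptions, not the code of the examples.
The ompt_* typedefs are abridged local copies written from memory of
the OpenMP specification so that the sketch compiles without an
OpenMP toolchain; a real tool should include omp-tools.h instead. The
demo_* functions are a hypothetical mock runtime that exists only to
drive the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Abridged local copies (an assumption; use <omp-tools.h> in a real tool). */
typedef union { uint64_t value; void *ptr; } ompt_data_t;
typedef void ompt_device_t;
typedef void (*ompt_interface_fn_t)(void);
typedef void (*ompt_callback_t)(void);
typedef ompt_interface_fn_t (*ompt_function_lookup_t)(const char *name);
typedef int (*ompt_initialize_t)(ompt_function_lookup_t lookup,
                                 int initial_device_num, ompt_data_t *data);
typedef void (*ompt_finalize_t)(ompt_data_t *data);
typedef struct {
  ompt_initialize_t initialize;
  ompt_finalize_t finalize;
  ompt_data_t tool_data;
} ompt_start_tool_result_t;
typedef enum { ompt_set_error = 0, ompt_set_always = 5 } ompt_set_result_t;
typedef enum { ompt_callback_device_initialize = 12 } ompt_callbacks_t;
typedef ompt_set_result_t (*ompt_set_callback_t)(ompt_callbacks_t event,
                                                 ompt_callback_t callback);

static ompt_set_callback_t tool_set_callback; /* obtained in step (2) */
static ompt_interface_fn_t tool_start_trace;  /* obtained in step (3) */
static int tool_devices_seen;

/* Step (3): device initialize callback; look up a tracing entry point. */
static void on_device_initialize(int device_num, const char *type,
                                 ompt_device_t *device,
                                 ompt_function_lookup_t lookup,
                                 const char *documentation) {
  (void)device_num; (void)type; (void)device; (void)documentation;
  if (lookup)
    tool_start_trace = lookup("ompt_start_trace");
  tool_devices_seen++;
}

/* Step (2): initialization function; nonzero keeps the tool active. */
static int tool_initialize(ompt_function_lookup_t lookup,
                           int initial_device_num, ompt_data_t *data) {
  (void)initial_device_num; (void)data;
  assert(lookup != NULL);
  tool_set_callback = (ompt_set_callback_t)lookup("ompt_set_callback");
  if (!tool_set_callback)
    return 0;
  return tool_set_callback(ompt_callback_device_initialize,
                           (ompt_callback_t)on_device_initialize)
         != ompt_set_error;
}

static void tool_finalize(ompt_data_t *data) { (void)data; }

/* Step (1): entry symbol with C linkage (extern "C" in a C++ tool). */
ompt_start_tool_result_t *ompt_start_tool(unsigned int omp_version,
                                          const char *runtime_version) {
  static ompt_start_tool_result_t result = {tool_initialize, tool_finalize,
                                            {0}};
  (void)omp_version; (void)runtime_version;
  return &result;
}

/* ---- Hypothetical mock runtime, only to run the sketch standalone ---- */
static ompt_callback_t demo_registered; /* callback the tool registered */

static ompt_set_result_t demo_set_callback(ompt_callbacks_t event,
                                           ompt_callback_t callback) {
  if (event != ompt_callback_device_initialize)
    return ompt_set_error;
  demo_registered = callback;
  return ompt_set_always;
}

static void demo_start_trace(void) {} /* placeholder entry point */

static ompt_interface_fn_t demo_lookup(const char *name) {
  if (strcmp(name, "ompt_set_callback") == 0)
    return (ompt_interface_fn_t)demo_set_callback;
  if (strcmp(name, "ompt_start_trace") == 0)
    return demo_start_trace;
  return NULL;
}

/* Start the tool, initialize it, then report one device to it. */
int demo_run(void) {
  typedef void (*dev_init_t)(int, const char *, ompt_device_t *,
                             ompt_function_lookup_t, const char *);
  ompt_start_tool_result_t *r = ompt_start_tool(0, "demo");
  if (!r->initialize(demo_lookup, 0, &r->tool_data) || !demo_registered)
    return 0;
  ((dev_init_t)demo_registered)(0, "demo-gpu", NULL, demo_lookup, NULL);
  r->finalize(&r->tool_data);
  return tool_devices_seen;
}
```

With a real OpenMP runtime the demo_* part disappears: the runtime
finds ompt_start_tool in the tool, calls the initialization function
itself, and later invokes the registered device initialize callback
for each device.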
129 changes: 129 additions & 0 deletions
examples/tools/ompt/veccopy-ompt-target-tracing/Makefile
@@ -0,0 +1,129 @@
#-----------------------------------------------------------------------
#
#  Makefile: Cuda clang demo Makefile for both amdgcn and nvptx targets.
#            amdgcn GPU targets begin with "gfx". nvptx targets begin
#            with sm_. Example: To build and run on k4000 do this:
#
#            export AOMP_GPU=sm_30
#            make run
#
#  Run "make help" to see other options for this Makefile

TESTNAME = veccopy-ompt-target-tracing
TESTSRC = veccopy-ompt-target-tracing.c

UNAMEP = $(shell uname -m)
AOMP_CPUTARGET = $(UNAMEP)-pc-linux-gnu
ifeq ($(UNAMEP),ppc64le)
  AOMP_CPUTARGET = ppc64le-linux-gnu
endif

# --- Standard Makefile check for AOMP installation ---
ifeq ("$(wildcard $(AOMP))","")
  ifneq ($(AOMP),)
    $(warning AOMP not found at $(AOMP))
  endif
  AOMP = $(HOME)/rocm/aomp
  ifeq ("$(wildcard $(AOMP))","")
    $(warning AOMP not found at $(AOMP))
    AOMP = /usr/lib/aomp
    ifeq ("$(wildcard $(AOMP))","")
      $(warning AOMP not found at $(AOMP))
      $(error Please install AOMP or correctly set env-var AOMP)
    endif
  endif
endif
# --- End Standard Makefile check for AOMP installation ---
INSTALLED_GPU = $(shell $(AOMP)/bin/mygpu -d gfx900)# Default AOMP_GPU is gfx900 which is vega
AOMP_GPU ?= $(INSTALLED_GPU)
CC = $(AOMP)/bin/clang

ifeq (sm_,$(findstring sm_,$(AOMP_GPU)))
  AOMP_GPUTARGET = nvptx64-nvidia-cuda
else
  AOMP_GPUTARGET = amdgcn-amd-amdhsa
endif

# Sorry, clang openmp requires these complex options
CFLAGS = -O3 -target $(AOMP_CPUTARGET) -fopenmp -fopenmp-targets=$(AOMP_GPUTARGET) -Xopenmp-target=$(AOMP_GPUTARGET) -march=$(AOMP_GPU)

ifeq ($(OFFLOAD_DEBUG),1)
  $(info DEBUG Mode ON)
  CCENV = env LIBRARY_PATH=$(AOMP)/lib-debug
  RUNENV = LIBOMPTARGET_DEBUG=1
endif

ifeq ($(VERBOSE),1)
  $(info Compilation VERBOSE Mode ON)
  CFLAGS += -v
endif

ifeq ($(TEMPS),1)
  $(info Compilation and linking save-temp Mode ON)
  CFLAGS += -save-temps
endif

ifeq (sm_,$(findstring sm_,$(AOMP_GPU)))
  CUDA ?= /usr/local/cuda
  LFLAGS += -L$(CUDA)/targets/$(UNAMEP)-linux/lib -lcudart
endif

CFLAGS += $(EXTRA_CFLAGS)

# ----- Demo compile and link in one step, no object code saved
$(TESTNAME): $(TESTSRC)
	$(CCENV) $(CC) $(CFLAGS) $(LFLAGS) $^ -o $@

run: $(TESTNAME)
	$(RUNENV) ./$(TESTNAME)

# ---- Demo compile and link in two steps, object saved
$(TESTNAME).o: $(TESTSRC)
	$(CCENV) $(CC) -c $(CFLAGS) $^ -o $@

obin: $(TESTNAME).o
	$(CCENV) $(CC) $(CFLAGS) $(LFLAGS) $^ -o $@

run_obin: obin
	$(RUNENV) ./obin

help:
	@echo
	@echo "Source[s]:          $(TESTSRC)"
	@echo "Application binary: $(TESTNAME)"
	@echo "Target GPU:         $(AOMP_GPU)"
	@echo "Target triple:      $(AOMP_GPUTARGET)"
	@echo "AOMP compiler:      $(CC)"
	@echo "Compile flags:      $(CFLAGS)"
ifeq (sm_,$(findstring sm_,$(AOMP_GPU)))
	@echo "CUDA installation:  $(CUDA)"
endif
	@echo
	@echo "This Makefile supports these targets:"
	@echo
	@echo " make                 // Builds $(TESTNAME) "
	@echo " make run             // Executes $(TESTNAME) "
	@echo
	@echo " make $(TESTNAME).o   // build object file "
	@echo " make obin            // Link object file to build binary "
	@echo " make run_obin        // Execute obin "
	@echo
	@echo " make clean"
	@echo " make help"
	@echo
	@echo "Environment variables used by this Makefile:"
	@echo " AOMP_GPU=<GPU>       Target GPU, e.g. sm_30, default=gfx900. To build for"
	@echo "                      Nvidia GPUs, set AOMP_GPU=sm_60 or appropriate sm_"
	@echo " AOMP=<dir>           AOMP install dir, default=/usr/lib/aomp"
	@echo " EXTRA_CFLAGS=<args>  extra arguments for compiler"
	@echo " OFFLOAD_DEBUG=n      if n=1, compile and run in Debug mode"
	@echo " VERBOSE=n            if n=1, add verbose output"
	@echo " TEMPS=1              do not delete intermediate files"
ifeq (sm_,$(findstring sm_,$(AOMP_GPU)))
	@echo " CUDA=<dir>           CUDA install dir, default=/usr/local/cuda"
endif
	@echo

# Cleanup anything this makefile can create
clean:
	rm -f $(TESTNAME) obin *.i *.ii *.bc *.lk a.out-* *.ll *.s *.o *.cubin
@@ -0,0 +1,10 @@
An illustration of how a tool would use OMPT target support for a
simple vector copy OpenMP program.

To compile and run:
make run

For help:
make help

Example output is in example_run.log.