Add non-blocking read/write transfers #68

Merged
merged 1 commit into from
Jun 9, 2020

shomedawae
Contributor

The transfer bandwidth test includes enqueueWriteBuffer and enqueueReadBuffer scenarios, both of which use the blocking versions of the OpenCL calls. On Intel, it seems that some drivers may use the CPU instead of the GPU to perform a blocking transfer, so the reported bandwidth then depends on how busy the CPU is.

A possible solution is to add non-blocking versions of enqueueWriteBuffer and enqueueReadBuffer to the transfer bandwidth test.
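
The measurement pattern for a non-blocking transfer can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `measureGBps`, `enqueue`, `finish`, and `demoBandwidth` are hypothetical names, and the two callables stand in for OpenCL calls (for example, `enqueueWriteBuffer` with the blocking flag set to `CL_FALSE`, followed by a single `finish()` on the queue). A plain `memcpy` plays the role of the transfer so that the sketch runs without an OpenCL device, and 1 GB is assumed to mean 1e9 bytes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical helper: times `iters` enqueued transfers plus one final wait
// and returns the average bandwidth in GB/s (assuming 1 GB = 1e9 bytes).
double measureGBps(std::size_t bytes, unsigned iters,
                   const std::function<void()> &enqueue,
                   const std::function<void()> &finish)
{
    assert(iters > 0);

    auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < iters; i++)
        enqueue();   // a non-blocking enqueue would return right away
    finish();        // wait once for all outstanding transfers to complete
    auto end = std::chrono::steady_clock::now();

    double seconds = std::chrono::duration<double>(end - start).count();
    return (static_cast<double>(bytes) * iters) / seconds / 1e9;
}

// Stand-in demo: memcpy replaces the OpenCL transfer, so `finish` is a no-op.
double demoBandwidth()
{
    std::vector<char> src(1 << 24, 1), dst(1 << 24);
    return measureGBps(src.size(), 20,
        [&] { std::memcpy(dst.data(), src.data(), src.size()); },
        [] {});
}
```

With a real command queue, the only change between the blocking and non-blocking scenarios in this sketch would be the blocking flag passed to the enqueue call; the final wait keeps the timed region comparable in both cases.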

As verification, here is the output from a run on an idle machine:

clpeak.exe --transfer-bandwidth -d 1

Platform: Intel(R) OpenCL
  Device: Intel(R) HD Graphics 520
    Driver version  : 26.20.100.7212 (Win64)
    Compute units   : 24
    Clock frequency : 1000 MHz

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 4.65
      enqueueReadBuffer               : 4.62
      enqueueWriteBuffer non-blocking : 4.85
      enqueueReadBuffer non-blocking  : 4.78
      enqueueMapBuffer(for read)      : 252677.97
        memcpy from mapped ptr        : 4.63
      enqueueUnmap(after write)       : 338588.47
        memcpy to mapped ptr          : 4.18

And here is the output from a run while an external application was fully utilizing the CPU:

clpeak.exe --transfer-bandwidth -d 1

Platform: Intel(R) OpenCL
  Device: Intel(R) HD Graphics 520
    Driver version  : 26.20.100.7212 (Win64)
    Compute units   : 24
    Clock frequency : 1000 MHz

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 3.03
      enqueueReadBuffer               : 3.10
      enqueueWriteBuffer non-blocking : 4.80
      enqueueReadBuffer non-blocking  : 4.74
      enqueueMapBuffer(for read)      : 302311.13
        memcpy from mapped ptr        : 3.13
      enqueueUnmap(after write)       : 513012.84
        memcpy to mapped ptr          : 3.33

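As you can see, non-blocking transfers perform almost identically in both runs (4.85 vs. 4.80 GBPS for writes, 4.78 vs. 4.74 GBPS for reads), while the blocking transfers drop from about 4.6 GBPS to about 3.0 to 3.1 GBPS when the CPU is busy.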
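
To put numbers on the comparison, the relative change between the idle and the loaded run can be computed from the values printed above. The helper below is only an illustrative sketch; `percentChange` is a hypothetical name and not part of clpeak.

```cpp
#include <cassert>
#include <cmath>

// Relative change in percent when going from the idle-machine result to the
// loaded-machine result (a negative value means the loaded run was slower).
double percentChange(double idle, double loaded)
{
    return (loaded - idle) / idle * 100.0;
}

// Values taken from the two runs above (GBPS):
//   percentChange(4.65, 3.03)  -> about -34.8  (enqueueWriteBuffer, blocking)
//   percentChange(4.62, 3.10)  -> about -32.9  (enqueueReadBuffer, blocking)
//   percentChange(4.85, 4.80)  -> about  -1.0  (enqueueWriteBuffer non-blocking)
//   percentChange(4.78, 4.74)  -> about  -0.8  (enqueueReadBuffer non-blocking)
```

In other words, the blocking transfers lose roughly a third of their bandwidth under CPU load, while the non-blocking ones change by about 1%.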

krrishnarraj merged commit 4a5baa7 into krrishnarraj:master on Jun 9, 2020
@krrishnarraj
Owner

Thanks
