System requirements
Hüseyin Tuğrul BÜYÜKIŞIK edited this page Feb 18, 2021 · 13 revisions
-
At least one graphics card that supports OpenCL 1.2 (needed for future features; only OpenCL 1.0 features are used currently) and has dedicated VRAM. Any combination including Nvidia + AMD + Intel should work.
-
- On Windows, WDDM overhead lowers the I/O performance of Nvidia cards. Enabling TCC driver mode through nvidia-smi (available on some Quadro and Tesla cards) can recover some of that performance. Ubuntu benchmarks currently give better results.
-
Currently supported operating systems: Windows and Ubuntu
-
A C++ compiler with the C++14 option enabled (C++1y dialect for the g++ compiler)
-
For the multi-threaded benchmark in main.cpp, OpenMP is needed (the "gomp" library for the g++ linker)
-
Enough RAM to hold the active pages
-
- total RAM consumed by active pages = (number of active pages per gpu instance) * (number of gpus) * (4 instances per gpu, or a custom number set with `memMult`) * (page size) * sizeof(your_object)
-
- if you have 10 gpus, page size = 1024, active pages per instance = 100, and object size = 100 bytes, then 100 * 10 * 4 * 1024 * 100 = 409,600,000 bytes (about 409.6 MB) of RAM will be used
-
-
- so be careful when using the `memMult` parameter (given as `{n1,n2,n3,..}`): `{50,50,50}` means 150 gpu instances
-
-
- VRAM usage depends only on the number of elements in the array (100M elements * 100 bytes per object = 10GB)
-
-
- Equally distributed between graphics cards
-
-
-
-
- Or distributed with a ratio as described by the `memMult` parameter: `{100,10,1}` means the first card serves 100x VRAM, the second card serves 10x VRAM, and the last card serves 1x VRAM, where 111x = total VRAM usage. (PCIe bandwidth is limited in the same way: 111x = total PCIe bandwidth.)