Skip to content

C++ virtual-array implementation that uses all graphics cards in system as storage (with LRU cache eviction on RAM) and uses OpenCL for data transfers. (Random access: faster than HDD) (Sequential access: faster than SSD) (big objects: faster than NVMe)

License

Notifications You must be signed in to change notification settings

tugrul512bit/VirtualMultiArray

Repository files navigation

VirtualMultiArray

Multi graphics card based C++ virtual array implementation that uses VRAM with OpenCL to save gigabytes of RAM.

Wiki: https://github.com/tugrul512bit/VirtualMultiArray/wiki/How-it-works

This tool enables:

  • offloading some data to video ram to save disk from over-use
  • preserving responsiveness of system when more space than the available RAM is needed
    • with better latency than HDD/SSD
    • with higher throughput than NVME (requires multiple graphics cards)
  • finding an element in array by using GPU compute power

Simplest usage:

#include "GraphicsCardSupplyDepot.h"
#include "VirtualMultiArray.h"


// testing
#include <iostream>


int main(int argC, char ** argV)
{
	try
	{
		GraphicsCardSupplyDepot d;
		size_t numElements = 2000;
		int pageSize = 10;
		int activePagesPerGpuInstance = 5;

		// not just "int" but any POD of hundreds of kilobytes size too
		// A "Particle" object with 40-50 bytes is much more efficient for pcie data transfers, than an int, unless page size (cache line) is big enough
		VirtualMultiArray<int> intArr(numElements,d.requestGpus(),pageSize,activePagesPerGpuInstance);

		// thread-safe (scales up to 8 threads per logical core)
		// #pragma omp parallel for num_threads(64)
		for(size_t i=0;i<numElements;i++)
			intArr.set(i,i*2);
					
		for(size_t i=0;i<numElements;i++)
		{
			int var = intArr.get(i);
			std::cout<<var<<std::endl;
			
			// just a single-threaded-access optimization
			// to hide pcie latency of next page(LRU cache-line) of elements
			if((i<numElements-pageSize) && (i%pageSize==0) )
			{
				intArr.prefetch(i+pageSize); // asynchronously load next page into LRU
			}
		}
	}
	catch (std::exception& ex)
	{
		std::cout << ex.what();
	}
	return 0;
}

About

C++ virtual-array implementation that uses all graphics cards in system as storage (with LRU cache eviction on RAM) and uses OpenCL for data transfers. (Random access: faster than HDD) (Sequential access: faster than SSD) (big objects: faster than NVMe)

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages