Saturday, March 10, 2012

InterOp between CUDA and Optix with PBO

CUDA and Optix can both read and write an OpenGL PBO (pixel buffer object), so a PBO is the natural first choice when you want to share data between them. Below is the class I use to set up a PBO shared between CUDA and Optix. It creates an OpenGL PBO and registers it with CUDA. Only the size of the buffer in bytes is needed; the contents can be interpreted as any suitable type later.

class CudaOptixSharedBuffer
{
protected:
    unsigned int _pbo;                           // OpenGL buffer object name
    size_t _devBufSize;                          // size in bytes, filled in while mapped
    void* _devBufAddr;                           // device pointer, valid only while mapped
    struct cudaGraphicsResource *_pbo_resource;  // CUDA handle for the registered PBO

public:
    CudaOptixSharedBuffer() : _pbo(0), _devBufSize(0), _devBufAddr(NULL), _pbo_resource(NULL) {}
    ~CudaOptixSharedBuffer(void) {}

    // Create a PBO of sizeByte bytes and register it with CUDA.
    void createCudaPbo(size_t sizeByte)
    {
        glGenBuffers( 1, &_pbo );
        glBindBuffer( GL_PIXEL_UNPACK_BUFFER, _pbo );
        glBufferData( GL_PIXEL_UNPACK_BUFFER, sizeByte, NULL, GL_STREAM_DRAW );
        glBindBuffer( GL_PIXEL_UNPACK_BUFFER, 0 );
        cutilSafeCall(cudaGraphicsGLRegisterBuffer(&_pbo_resource, _pbo, cudaGraphicsMapFlagsNone));
    }

    // Unregister the PBO from CUDA, then delete it.
    void releaseCudaPbo()
    {
        cutilSafeCall(cudaGraphicsUnregisterResource(_pbo_resource));
        glDeleteBuffers(1, &_pbo);
        _pbo = 0;
    }

    unsigned int getPbo() { return _pbo; }

    // Map the PBO and return a device pointer that CUDA can use.
    void* preCudaOp()
    {
        cutilSafeCall(cudaGraphicsMapResources(1, &_pbo_resource, 0));
        // Both output arguments are passed by address.
        cutilSafeCall(cudaGraphicsResourceGetMappedPointer(&_devBufAddr, &_devBufSize, _pbo_resource));
        return _devBufAddr;
    }

    // Unmap the PBO once the CUDA work is done.
    void postCudaOp()
    {
        cutilSafeCall(cudaGraphicsUnmapResources(1, &_pbo_resource, 0));
    }
}; // CudaOptixSharedBuffer

Optix is strict about data types, so when we share this buffer with Optix we need to specify the element type and the element size. It could look something like this:
struct MyData
{
float var;
};


unsigned int elemNum = 1024; // number of elements

CudaOptixSharedBuffer _coRayBuffer;
_coRayBuffer.createCudaPbo(elemNum * sizeof(MyData));

optix::Buffer buffer = optixContext->createBufferFromGLBO(RT_BUFFER_INPUT, _coRayBuffer.getPbo());
buffer->setFormat(RT_FORMAT_USER);
buffer->setElementSize(sizeof(MyData));
buffer->setSize(elemNum);
optixContext["OptixBufferName"]->setBuffer(buffer);

Here we use a user-defined data structure (RT_FORMAT_USER), so ‘setElementSize’ is required; its argument is the size of MyData in bytes.
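The PBO is allocated in bytes, while Optix is given an element size and an element count, so the two have to agree. Below is a minimal sketch of that bookkeeping in plain C++. The helper names `bufferBytes` and `elementCount` are hypothetical and not part of CUDA or Optix, and the `static_assert` rests on the assumption that user-format elements are handled as raw bytes.

```cpp
#include <cstddef>
#include <type_traits>

// Same element type as in the snippet above.
struct MyData
{
    float var;
};

// Assumption: user-format elements are moved around as raw bytes,
// so the element type should be trivially copyable.
static_assert(std::is_trivially_copyable<MyData>::value,
              "shared buffer elements should be trivially copyable");

// Byte size to allocate for `count` elements of type T.
template <typename T>
constexpr std::size_t bufferBytes(std::size_t count)
{
    return count * sizeof(T);
}

// Number of whole elements of type T in a buffer of `bytes` bytes.
// Returns 0 when the size is not an exact multiple of sizeof(T).
template <typename T>
constexpr std::size_t elementCount(std::size_t bytes)
{
    return (bytes % sizeof(T) == 0) ? bytes / sizeof(T) : 0;
}
```

With these helpers the allocation could be written as `_coRayBuffer.createCudaPbo(bufferBytes<MyData>(elemNum));`, and the same `MyData` type then feeds `setElementSize(sizeof(MyData))`.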


Another interesting thing is that even if we create the shared buffer as RT_BUFFER_INPUT, we can still read it back with cudaMemcpy. In other words, the shared buffer is always both readable and writable, no matter what is specified in ‘createBufferFromGLBO’.
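Reading the buffer back from the CUDA side follows a map, copy, unmap sequence built on preCudaOp and postCudaOp. One way to make sure the unmap is never forgotten is a small scope guard; the sketch below is only an illustration. `ScopedCudaMap`, `FakeSharedBuffer`, and `readBack` are hypothetical names, and the fake buffer and `std::memcpy` stand in for the real class and `cudaMemcpy` so that the sketch runs without a GPU.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical scope guard: maps on construction, unmaps on destruction.
// Buffer is any type with preCudaOp() and postCudaOp(),
// such as CudaOptixSharedBuffer.
template <typename Buffer>
class ScopedCudaMap
{
public:
    explicit ScopedCudaMap(Buffer& buf) : _buf(buf), _ptr(buf.preCudaOp()) {}
    ~ScopedCudaMap() { _buf.postCudaOp(); }

    ScopedCudaMap(const ScopedCudaMap&) = delete;
    ScopedCudaMap& operator=(const ScopedCudaMap&) = delete;

    // Pointer returned by preCudaOp(), valid until this guard is destroyed.
    void* get() const { return _ptr; }

private:
    Buffer& _buf;
    void*   _ptr;
};

// Host-only stand-in with the same two methods (an assumption for
// illustration; the real class returns a device pointer).
struct FakeSharedBuffer
{
    float storage[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    int   mapDepth   = 0;  // greater than 0 while "mapped"

    void* preCudaOp()  { ++mapDepth; return storage; }
    void  postCudaOp() { --mapDepth; }
};

// Copy the first `count` floats out of the buffer while it is mapped.
// With the real class, std::memcpy would become
// cudaMemcpy(dst, mapped.get(), bytes, cudaMemcpyDeviceToHost).
template <typename Buffer>
void readBack(Buffer& buf, float* dst, std::size_t count)
{
    ScopedCudaMap<Buffer> mapped(buf);
    std::memcpy(dst, mapped.get(), count * sizeof(float));
}   // the guard unmaps the buffer here
```

The guard keeps the mapped region as short as possible, which matters because a mapped resource should not be touched by OpenGL (and, presumably, by Optix) until it has been unmapped again.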

1 comment:

chunyun said...

Hi, thanks for this article. With its help I managed to create an interop buffer. Did you try to do this on multiple GPUs? Any idea how to do this?