Sunday, March 11, 2012

Making Optix Render Together with OpenGL


Optix is a ray tracing framework that can generate realistic images in real time. Sometimes, however, we need to combine OpenGL and Optix to render the same image, for example when the application already uses an OpenGL-based rendering engine and adopts Optix only to enhance some effects in the scene. Additionally, in spite of the great advances of the accelerators in Optix 2.5, BVH updates are still inefficient, so ray tracing is not yet good at rendering deformable objects.

As you can imagine, to make Optix and OpenGL work together, both the color buffer and the depth buffer need special treatment.

The color buffer is easy to handle. We can simply declare an output buffer to hold the Optix results and read it back after ray tracing has finished. From there, ordinary OpenGL operations work, such as attaching the buffer to a GL texture and using it in the rendering pipeline, or filling the OpenGL color buffer directly with glDrawPixels. Actually, most Optix SDK examples do it this way.

Combining the depth buffers is probably the most important part, and it is a bit more complicated. As a ray tracing engine, Optix has no need for a depth buffer test, so all it provides is rtIntersectionDistance, the distance from the ray origin to the current ray-surface intersection point. The first problem is therefore to generate an OpenGL-compliant depth buffer by hand. A useful reference is http://www.songho.ca/opengl/gl_projectionmatrix.html
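For reference, the mapping can be derived from the OpenGL perspective projection matrix described on the page above. This is only a sketch, assuming a glFrustum/gluPerspective style projection and the default glDepthRange(0, 1). Here z_e is the (negative) eye-space z coordinate and d = -z_e is the positive depth along the view axis; the symbols d, z_ndc and z_win are introduced just for this derivation.

```latex
\begin{align}
z_{ndc} &= \frac{-\frac{f+n}{f-n}\,z_e - \frac{2fn}{f-n}}{-z_e}
         = \frac{f+n}{f-n} - \frac{2fn}{(f-n)\,d}, \qquad d = -z_e \\
z_{win} &= \tfrac{1}{2}\,z_{ndc} + \tfrac{1}{2}
\end{align}
```

As a sanity check, d = n gives z_win = 0 and d = f gives z_win = 1.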

My implementation of the depth value construction is below:

// eyeDist: distance from the eye to the intersection point
// n:       near clipping plane
// f:       far clipping plane
__device__ float computeClipDepth( float eyeDist, float n, float f )
{
    // NDC depth in [-1, 1] from the OpenGL perspective projection.
    float clipDepth = (f+n)/(f-n) - (1.0f/eyeDist)*2.0f*f*n/(f-n);
    // Remap to the [0, 1] window depth range (default glDepthRange).
    clipDepth = clipDepth*0.5f + 0.5f;
    return clipDepth;
}
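One caveat worth checking: the projection formula expects the eye-space depth, that is, the distance measured along the camera's viewing direction, whereas rtIntersectionDistance is measured along the ray. For off-center pixels the two differ by the cosine of the angle between the ray and the view axis. The host-side sketch below mirrors the function above in plain C++ so it can be checked on the CPU, and adds a hypothetical helper, rayDistToEyeDepth (not part of Optix), that does the conversion. It assumes the ray direction and the camera forward vector are both normalized; Vec3 and computeWindowDepth are names made up for this sketch.

```cpp
#include <cmath>

// Minimal 3D vector, used by this sketch only.
struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical helper: converts a distance along the ray into a depth
// along the view axis. Assumes rayDir and camForward are unit length.
inline float rayDistToEyeDepth(float rayDist, const Vec3& rayDir, const Vec3& camForward)
{
    return rayDist * dot(rayDir, camForward);
}

// Host-side mirror of computeClipDepth; assumes the default glDepthRange(0, 1).
inline float computeWindowDepth(float eyeDepth, float n, float f)
{
    float ndc = (f + n) / (f - n) - (2.0f * f * n) / ((f - n) * eyeDepth);
    return ndc * 0.5f + 0.5f;
}
```

A point on the near plane should map to a depth of 0 and a point on the far plane to a depth of 1, which is a quick way to validate the values before handing them to OpenGL.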

The second problem is feeding the depth buffer generated by Optix into OpenGL. This involves nothing but OpenGL operations, but it is not an everyday task like drawing a triangle or shading a scene object, so few resources can be found on the web.
My implementation of this step is below, where depthImg contains the per-pixel depth values and coloredImg contains the per-pixel color values.

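glPushAttrib(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glPushClientAttrib(GL_CLIENT_PIXEL_STORE_BIT);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

// Pass 1: write the Optix colors only and leave the depth buffer untouched.
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_FALSE);
glWindowPos2i(0, 0);
glDrawPixels(w, h, GL_RGBA, GL_FLOAT, coloredImg);

// Pass 2: write the Optix depths only. GL_ALWAYS overwrites the stored depth.
// Depth writes happen only while the depth test is enabled; the enable bit
// is restored by glPopAttrib below.
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glDepthMask(GL_TRUE);
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_ALWAYS);
glWindowPos2i(0, 0);
glDrawPixels(w, h, GL_DEPTH_COMPONENT, GL_FLOAT, depthImg);

glPopClientAttrib();
glPopAttrib(); // GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT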

A screenshot of a test scene is shown below, where the transparent balls and the cube are rendered by Optix, and the frame and axes are rendered by OpenGL. Note the z-buffer based depth test between the balls, the box and the axes.
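To make the interplay concrete, here is a CPU-side sketch of what the depth test does once the Optix depth values are in the z-buffer. It is only an emulation for illustration, assuming the default GL_LESS depth function for the OpenGL geometry drawn afterwards. Pixel, seedFromOptix and drawFragment are hypothetical names, not OpenGL calls.

```cpp
#include <cstddef>
#include <vector>

struct Pixel {
    float r, g, b, a;  // color written so far
    float depth;       // window-space depth in [0, 1]
};

// Seeds a framebuffer from the Optix results, mimicking the two
// glDrawPixels passes (RGBA colors, then depths).
inline std::vector<Pixel> seedFromOptix(const std::vector<float>& rgba,
                                        const std::vector<float>& depth)
{
    std::vector<Pixel> fb(depth.size());
    for (std::size_t i = 0; i < depth.size(); ++i) {
        fb[i] = Pixel{rgba[4*i], rgba[4*i + 1], rgba[4*i + 2], rgba[4*i + 3], depth[i]};
    }
    return fb;
}

// Emulates one OpenGL fragment arriving at pixel p under GL_LESS: the
// fragment is written only if it is strictly closer than the stored depth.
// Returns true if the fragment was written.
inline bool drawFragment(Pixel& p, float r, float g, float b, float a, float depth)
{
    if (!(depth < p.depth)) {
        return false;  // hidden behind the ray-traced surface
    }
    p = Pixel{r, g, b, a, depth};
    return true;
}
```

An axis fragment at depth 0.5 is rejected at a pixel where Optix stored 0.3 (the ball is in front) and accepted where Optix stored 1.0 (background), which is the behavior described for the screenshot.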


Saturday, March 10, 2012

Interop between CUDA and Optix with a PBO

CUDA and Optix can both read and write an OpenGL PBO, so a PBO is the first choice when you want to share data between them. Below is the class I use to set up a shared PBO between CUDA and Optix. It creates an OpenGL PBO and also registers it with CUDA. Only the size of the buffer in bytes is needed, and the mapped pointer can be cast to a suitable type at any time.

class CudaOptixSharedBuffer
{
protected:
    unsigned int _pbo;
    size_t _devBufSize;
    void* _devBufAddr;
    struct cudaGraphicsResource *_pbo_resource;

public:
    CudaOptixSharedBuffer() : _pbo(0), _devBufSize(0), _devBufAddr(NULL), _pbo_resource(NULL) {}
    ~CudaOptixSharedBuffer(void) {}

    // Create the PBO and register it with CUDA.
    void createCudaPbo(size_t sizeByte)
    {
        glGenBuffers( 1, &_pbo );
        glBindBuffer( GL_PIXEL_UNPACK_BUFFER, _pbo );
        glBufferData( GL_PIXEL_UNPACK_BUFFER, sizeByte, NULL, GL_STREAM_DRAW );
        glBindBuffer( GL_PIXEL_UNPACK_BUFFER, 0 );
        cutilSafeCall(cudaGraphicsGLRegisterBuffer(&_pbo_resource, _pbo, cudaGraphicsMapFlagsNone));
    }

    // Unregister from CUDA before deleting the GL buffer.
    void releaseCudaPbo()
    {
        cutilSafeCall(cudaGraphicsUnregisterResource(_pbo_resource));
        glDeleteBuffers(1, &_pbo);
        _pbo = 0;
    }

    unsigned int getPbo() { return _pbo; }

    // Map the PBO and return its device pointer for use in CUDA.
    void* preCudaOp()
    {
        cutilSafeCall(cudaGraphicsMapResources(1, &_pbo_resource, 0));
        cutilSafeCall(cudaGraphicsResourceGetMappedPointer(&_devBufAddr, &_devBufSize, _pbo_resource));
        return _devBufAddr;
    }

    // Unmap the PBO so that OpenGL and Optix can use it again.
    void postCudaOp()
    {
        cutilSafeCall(cudaGraphicsUnmapResources(1, &_pbo_resource, 0));
    }
}; // CudaOptixSharedBuffer

Optix is very sensitive to data types, which means that when we want to share this buffer with Optix, we need to specify the element type and size. It could look something like this:
struct MyData
{
    float var;
};

unsigned int elemNum = 1024; // number of elements
CudaOptixSharedBuffer _coRayBuffer;
_coRayBuffer.createCudaPbo(elemNum * sizeof(MyData));

optix::Buffer buffer = optixContext->createBufferFromGLBO(RT_BUFFER_INPUT, _coRayBuffer.getPbo());
buffer->setFormat(RT_FORMAT_USER);
buffer->setElementSize(sizeof(MyData));
buffer->setSize(elemNum);
optixContext["OptixBufferName"]->setBuffer(buffer);

Here we use a user-defined data structure, so ‘setElementSize’ is necessary; its argument should be the size of MyData in bytes.


Another interesting thing is that even if we create a shared buffer as RT_BUFFER_INPUT, we can still read it back with cudaMemcpy. In other words, the shared buffer is always readable and writable from CUDA, no matter what is specified in ‘createBufferFromGLBO’.
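Since every preCudaOp has to be matched by a postCudaOp before OpenGL or Optix touches the PBO again, it can help to wrap the pair in a small scope guard. The sketch below is one possible design and not part of the original class: ScopedCudaMap is a hypothetical helper, templated on the buffer type so that it works with CudaOptixSharedBuffer or with any stand-in that offers the same two methods.

```cpp
// Hypothetical RAII helper: maps the buffer on construction and unmaps it
// on destruction. BufferT only needs `void* preCudaOp()` and
// `void postCudaOp()`, as CudaOptixSharedBuffer provides.
template <typename BufferT>
class ScopedCudaMap {
public:
    explicit ScopedCudaMap(BufferT& buf) : _buf(buf), _ptr(buf.preCudaOp()) {}
    ~ScopedCudaMap() { _buf.postCudaOp(); }

    // Non-copyable, so there is exactly one unmap per map.
    ScopedCudaMap(const ScopedCudaMap&) = delete;
    ScopedCudaMap& operator=(const ScopedCudaMap&) = delete;

    // Views the mapped pointer as the element type the caller expects.
    template <typename T>
    T* as() const { return static_cast<T*>(_ptr); }

private:
    BufferT& _buf;
    void*    _ptr;
};
```

Typical use would be to open a block, construct a ScopedCudaMap<CudaOptixSharedBuffer> from _coRayBuffer, pass as<MyData>() to your own kernel or to cudaMemcpy, and let the guard unmap the buffer when the block ends.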