I actually tried the same thing, because "C++ on CPU for complex rendering algorithms and shading and GPU for chewing through large intersection batches" sounded really nice.
Also, PCIe throughput seems reasonable enough these days to try this if you're not after real-time.
In practice it (or at least my approach) didn't work so well.
There was some improvement from using the GPU as an additional 'coprocessor', but it was in no way using the GPU to its full potential.
I didn't use async transfers to the GPU as jbikker suggested, but at least the CPU was doing some parallel work while waiting for the results from the GPU.
Async or not, I found it very hard to keep the workload balanced; the devices were constantly waiting for each other.
Another (smaller) problem was the huge memory consumption with large batches (it's not only the rays/results but also the 'interim results' you have to save for each ray/path so the integrator can continue once the results arrive).
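To give a feeling for the memory side, here is a rough back-of-the-envelope sketch. The struct layouts and field names (`Ray`, `Hit`, `PathState`) are assumptions made up for illustration, not the layout of any particular renderer; a real integrator will likely carry more state per path.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layouts, chosen only to make the arithmetic concrete.
struct Ray  { float origin[3]; float tmin; float dir[3]; float tmax; };  // 32 bytes
struct Hit  { float t; float u; float v; std::uint32_t prim_id; };       // 16 bytes

// 'Interim results': what the integrator needs to resume a path
// once the intersection result for its ray comes back.
struct PathState {
    float throughput[3];    // accumulated path weight
    float radiance[3];      // radiance gathered so far
    std::uint32_t pixel;    // target pixel index
    std::uint32_t depth;    // current bounce count
    std::uint32_t rng[2];   // per-path RNG state
};                          // 40 bytes

// Bytes kept alive per path while its ray is in flight.
constexpr std::size_t bytes_per_path() {
    return sizeof(Ray) + sizeof(Hit) + sizeof(PathState);
}

// Total footprint of one batch with the given number of paths in flight.
constexpr std::size_t batch_bytes(std::size_t paths_in_flight) {
    return paths_in_flight * bytes_per_path();
}

static_assert(sizeof(Ray) == 32 && sizeof(Hit) == 16 && sizeof(PathState) == 40,
              "sizes assume 4-byte float and no padding");
```

With these assumed layouts a batch of 4M paths (4 * 1024 * 1024) already takes 352 MiB, and that is before any per-path shading data such as BSDF parameters or light samples.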
Maybe I just wasn't putting enough thought into it or doing something stupid, but I decided to ditch the approach (I'm mainly after interactivity, not flexibility and complex scenes).
What I'm trying now is writing the renderer (integrator/shader) as a C kernel and then compiling that to CUDA/OpenCL/ISPC.
Then there are intersection engines for each compilation target that allow the whole thing to run on a single device.
So it's similar to what Cycles does, except that it isn't a megakernel: the renderer kernels and the intersection kernels are kept separate.
You can then use multiple devices to render a single image, but the devices are more decoupled and can run more concurrently (it's actually almost the same as cluster rendering over the network).
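A minimal sketch of what such a split between a renderer kernel and an intersection kernel could look like, written as plain CPU C++ so it runs anywhere. Everything here is an assumption for illustration: the names (`intersect_kernel`, `shade_kernel`, `render`), the toy scene (a unit sphere at the origin under a constant white sky), and the fixed 0.5 albedo. It is not how Cycles or my own code is structured in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Ray  { float o[3]; float d[3]; };                 // d is assumed normalized
struct Hit  { float t; };                                // t < 0 means "miss"
struct Path { float throughput; float radiance; bool alive; };

// Intersection kernel: knows only geometry (here: unit sphere at the origin).
void intersect_kernel(const std::vector<Ray>& rays, std::vector<Hit>& hits) {
    assert(rays.size() == hits.size());
    for (std::size_t i = 0; i < rays.size(); ++i) {
        const Ray& r = rays[i];
        float b = r.o[0]*r.d[0] + r.o[1]*r.d[1] + r.o[2]*r.d[2];
        float c = r.o[0]*r.o[0] + r.o[1]*r.o[1] + r.o[2]*r.o[2] - 1.0f;
        float disc = b*b - c;
        hits[i].t = (disc < 0.0f) ? -1.0f : (-b - std::sqrt(disc));
    }
}

// Renderer kernel: consumes hits, updates path state, writes the next ray in place.
// Returns the number of paths that are still alive.
std::size_t shade_kernel(std::vector<Ray>& rays, const std::vector<Hit>& hits,
                         std::vector<Path>& paths) {
    std::size_t alive = 0;
    for (std::size_t i = 0; i < paths.size(); ++i) {
        Path& p = paths[i];
        if (!p.alive) continue;
        if (hits[i].t < 0.0f) {              // escaped: pick up the sky and terminate
            p.radiance += p.throughput * 1.0f;
            p.alive = false;
            continue;
        }
        p.throughput *= 0.5f;                // assumed constant albedo
        for (int k = 0; k < 3; ++k) {        // bounce along the surface normal
            float pos = rays[i].o[k] + hits[i].t * rays[i].d[k];
            rays[i].o[k] = pos;
            rays[i].d[k] = pos;              // normal of a unit sphere at the origin
        }
        ++alive;
    }
    return alive;
}

// Host loop: alternate the two kernels until every path has terminated.
std::vector<float> render(std::vector<Ray> rays, int max_bounces = 8) {
    std::vector<Hit> hits(rays.size());
    std::vector<Path> paths(rays.size(), Path{1.0f, 0.0f, true});
    std::size_t alive = rays.size();
    for (int bounce = 0; alive > 0 && bounce < max_bounces; ++bounce) {
        intersect_kernel(rays, hits);
        alive = shade_kernel(rays, hits, paths);
    }
    std::vector<float> image;
    for (const Path& p : paths) image.push_back(p.radiance);
    return image;
}
```

The only thing the two kernels share is the ray and hit buffers, so the intersection side can be swapped per compilation target without touching the integrator. A real version would also compact dead paths instead of re-intersecting them, and with several devices each one could presumably run this same loop on its own set of tiles.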
This is all in the early stages and I don't have any reliable numbers, but it seems to work much better.
Of course, you can't use all the nice C++ features and existing code.
(Which is really sad, because most of this stuff is just a compilation problem in the end -- not having virtual, templates, and operator overloading can be quite annoying...)
Well, after reading this again I realize this little experience report probably won't help you much with your actual problem, but you said any comment was welcome.