Steven Robertson a439bf671d Fix occupancy issues (1 block/SM when shuffle was on).
There are 16 bar.sync() registers available per *chip*, not per block, and I
was using number 8 in the shuffle code. Evidently the driver rewrites them per
SM, but does not compact their range. Good to know.
2010-09-12 11:09:47 -04:00
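The arithmetic behind the commit message can be sketched as follows. This is a simplified model based only on the description above, not on documented driver behavior: it assumes that the 16 barrier slots are shared by all blocks resident on an SM, and that each block reserves every ID from 0 up to the highest one it uses because the range is not compacted. The function name `max_blocks_per_sm` is hypothetical, and other occupancy limits (registers, shared memory, the hardware cap on resident blocks) are ignored.

```python
# Simplified model of the barrier-ID limit described in the commit message.
# Assumption: 16 barrier slots are shared by all blocks resident on one SM,
# and a block that uses barrier ID n reserves the whole range 0..n.
BARRIERS_PER_SM = 16  # figure taken from the commit message


def max_blocks_per_sm(highest_barrier_id, barriers_per_sm=BARRIERS_PER_SM):
    """Return how many blocks fit on one SM under the barrier limit alone."""
    if not 0 <= highest_barrier_id < barriers_per_sm:
        raise ValueError("barrier ID must be in the range 0..%d"
                         % (barriers_per_sm - 1))
    reserved_per_block = highest_barrier_id + 1
    return barriers_per_sm // reserved_per_block


if __name__ == "__main__":
    # Compare a few barrier IDs, including the ID 8 used by the shuffle code.
    for barrier_id in (0, 1, 3, 8):
        print("highest barrier ID %d: at most %d block(s) per SM"
              % (barrier_id, max_blocks_per_sm(barrier_id)))
```

Under these assumptions, using barrier 8 reserves 9 of the 16 slots, so only one block fits per SM. That matches the reported symptom, and it suggests why a low barrier ID restores occupancy.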
2010-09-09 11:36:14 -04:00
Description
PyCUDA implementation of a GPU-accelerated fractal flame renderer.
2.3 MiB
Languages
Python 92.8%
Cuda 6%
Shell 0.6%
C 0.6%