cuburn/cuburn
Steven Robertson a439bf671d Fix occupancy issues (1 block/SM when shuffle was on).
There are 16 bar.sync() registers available per *chip*, not per block, and I
was using number 8 in the shuffle code. Evidently the driver rewrites them per
SM, but does not compact their range. Good to know.
2010-09-12 11:09:47 -04:00
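One way to read the occupancy note in the commit message above is as simple arithmetic. The sketch below assumes a fixed pool of 16 barrier resources per SM and assumes each block reserves the contiguous range from 0 up to the highest barrier number it uses (the range is not compacted). The function and parameter names are hypothetical and are not part of cuburn. It gives only the limit imposed by barriers; registers, shared memory, and hardware block limits can lower occupancy further.

```python
# Sketch only: models the barrier limit described in the commit message.
# Assumptions (not confirmed by the source): 16 barrier resources per SM,
# and a block that uses barrier number N reserves barriers 0..N.
# Names here are hypothetical, not cuburn's API.

def max_blocks_per_sm(highest_barrier_id, barriers_per_sm=16):
    """Upper bound on resident blocks per SM from barrier usage alone."""
    if not 0 <= highest_barrier_id < barriers_per_sm:
        raise ValueError("barrier id out of range")
    # The range is not compacted, so using only barrier 8 still costs 9.
    barriers_per_block = highest_barrier_id + 1
    return barriers_per_sm // barriers_per_block


if __name__ == "__main__":
    for barrier_id in (0, 1, 7, 8):
        print(barrier_id, max_blocks_per_sm(barrier_id))
```

Under these assumptions, a block that uses barrier number 8 reserves 9 of the 16 barriers, so only one block fits per SM, which matches the "1 block/SM" symptom in the commit title. Moving the shuffle code to a low-numbered barrier raises that bound.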
__init__.py Rename "cuburnlib" (stupid) to "cuburn" (stupid but shorter) 2010-09-10 14:48:34 -04:00
cuda.py Use variations. This works, but is still fragile. 2010-09-11 13:15:36 -04:00
device_code.py Fix occupancy issues (1 block/SM when shuffle was on). 2010-09-12 11:09:47 -04:00
ptx.py Fix occupancy issues (1 block/SM when shuffle was on). 2010-09-12 11:09:47 -04:00
render.py Experiments with larger CTAs for IterThread 2010-09-12 02:01:03 -04:00
variations.py Add xforms and variations. 2010-09-11 13:10:41 -04:00