PyCUDA implementation of a GPU-accelerated fractal flame renderer.
Steven Robertson 618b51b1b1 Speed enhancement: alpha packing.
When the alpha channel is used in a color palette, the code now replaces
the blue channel in the accumulation buffer with a pair of U16s that
encode the blue and alpha channel values as fractions of the density.
When the alpha channel is always 1.0, the blue channel works as normal.
Density is now always the last element in the accumulation buffer.
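
The packing idea can be sketched on the host side in plain Python. This is only an illustration, not the kernel code from this commit: the function names, the bit layout (alpha in the high 16 bits, blue in the low 16 bits), and the clamping are all assumptions. It relies on the accumulated blue and alpha values never exceeding the density, so each can be stored as a 16-bit fraction of it.

```python
U16_MAX = 0xFFFF

def pack_blue_alpha(blue, alpha, density):
    """Pack blue and alpha into one 32-bit word as 16-bit fractions of density.

    Hypothetical layout: alpha in the high half, blue in the low half.
    """
    if density <= 0.0:
        # Nothing accumulated yet, so both fractions are zero.
        return 0

    def to_u16(value):
        # Clamp the fraction to [0, 1] before quantizing to 16 bits.
        frac = min(max(value / density, 0.0), 1.0)
        return int(round(frac * U16_MAX))

    return (to_u16(alpha) << 16) | to_u16(blue)

def unpack_blue_alpha(word, density):
    """Recover approximate blue and alpha values from a packed word."""
    blue = (word & U16_MAX) / U16_MAX * density
    alpha = (word >> 16) / U16_MAX * density
    return blue, alpha
```

Both channels then fit in the 32-bit slot that blue alone used to occupy, at the cost of quantizing each to one part in 65535 of the density.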

Eliminating the separate IO operations improved total runtime by more
than 30% on my card, while the extra calculations reduced that to 20%
when alpha was present (though that can be optimized further).
2011-10-11 09:57:37 -04:00
cuburn Speed enhancement: alpha packing. 2011-10-11 09:57:37 -04:00
helpers Use much more accurate filtsum estimation polynomials 2011-06-12 17:37:57 -04:00
bench.py Simultaneous occupancy microbenchmark 2010-09-12 16:23:24 -04:00
main.py Add quick debug option 2011-10-03 17:10:38 -04:00
README.md Add README 2011-10-03 17:37:32 -04:00
sortbench.cu Done. The Boost version is much faster, alas. 2011-08-31 13:24:44 -04:00
sortbench.py Done. The Boost version is much faster, alas. 2011-08-31 13:24:44 -04:00

Cuburn

This project is a fractal flame renderer. It is still under development, but it already implements most of the genome parameters that flam3 supports and outperforms CPU rendering by a healthy margin (20-40x in most cases).

This project is licensed under the GPL version 3.