cuburn

mirror of https://github.com/stevenrobertson/cuburn.git synced 2026-07-21 11:13:16 -04:00

Files

Steven Robertson 094890c324 Use shared memory for iter_count and have each CP processed by only one CTA.

Slower, but the code is a bit simpler conceptually, and the difference will be
more than accounted for by better scheduling towards the end of the process.

2010-09-07 14:54:50 -04:00

File            Last commit                                                                   Date
__init__.py     Splitting things up a bit                                                     2010-08-28 16:56:05 -04:00
cuda.py         Use shared memory for iter_count and have each CP processed by only one CTA.  2010-09-07 14:54:50 -04:00
device_code.py  Use shared memory for iter_count and have each CP processed by only one CTA.  2010-09-07 14:54:50 -04:00
ptx.py          Use shared memory for iter_count and have each CP processed by only one CTA.  2010-09-07 14:54:50 -04:00
render.py       Use shared memory for iter_count and have each CP processed by only one CTA.  2010-09-07 14:54:50 -04:00