Commit Graph

241 Commits

Author SHA1 Message Date
Steven Robertson
6c2df777b0 Remove a TODO 2011-10-16 13:52:01 -04:00
Steven Robertson
8ce2470dfb Relax FUSE a little (no visible impact so far) 2011-10-16 13:45:27 -04:00
Steven Robertson
c4ce3cf4c2 Don't crash on empty render(times) 2011-10-16 13:44:22 -04:00
Steven Robertson
0cc904c4f1 Do post affine transforms. How did I miss this? 2011-10-16 13:43:46 -04:00
Steven Robertson
5111a0f05c Eliminate needless pre_ var separation 2011-10-16 13:42:37 -04:00
Steven Robertson
9bafbda81a Refactor host rendering code for better load 2011-10-15 22:22:43 -04:00
Steven Robertson
9ff018de87 Actually fix dithering. (I've seen this before...) 2011-10-15 19:08:16 -04:00
Steven Robertson
63483480d0 Bias the radius to avoid very large dither offsets 2011-10-15 00:50:24 -04:00
Steven Robertson
3be14547ea Use 3*256 instead of 2*512 blocks; faster on GF104 2011-10-15 00:33:37 -04:00
Steven Robertson
c7728d3507 Add faster no-L1 accum 2011-10-15 00:32:30 -04:00
Steven Robertson
dd645bcbf6 Use one dither offset per block. 2011-10-15 00:29:22 -04:00
Steven Robertson
83670df2c7 Fix random seeds. 2011-10-14 11:56:58 -04:00
Steven Robertson
b168a2431e 32-bit compatibility (I think?) 2011-10-13 16:56:20 -04:00
Steven Robertson
14872ee6ed Add --sleep for slightly more usable system 2011-10-13 16:55:26 -04:00
Steven Robertson
e6e2c4a8d7 Add --sync option. 2011-10-13 07:53:55 -04:00
Steven Robertson
4834c9fdfa Change synchronization model. 2011-10-12 14:08:13 -04:00
Steven Robertson
81f61d4d5d Improve asynchrony; improve palette interp perf. 2011-10-12 14:07:28 -04:00
Steven Robertson
7b9bb165ac Change the way compile options are handled 2011-10-12 14:02:32 -04:00
Steven Robertson
f04ad7ab68 Performance improvements in Genome.__init__() 2011-10-12 13:57:43 -04:00
Steven Robertson
0f615bd98b Performance improvements in affine helpers 2011-10-12 13:56:34 -04:00
Steven Robertson
d409f02e4a Precompile accessors.
This improves packing speed by 8x, which is visible on small or
low-quality frames.
2011-10-12 11:50:07 -04:00
Steven Robertson
a12714f4c4 Fix MWC test 2011-10-12 07:36:07 -04:00
Steven Robertson
9b03f557c2 Fix missing control points in async version.
The allocation pool was reallocating the same frame as soon as it had
left the current scope, before it had been copied. We just reallocate
the same chunks. I don't think this has any real performance impact but
this can be verified.
2011-10-11 20:54:33 -04:00
Steven Robertson
b081bc9378 Remove a sync from iter.
A small but consistent improvement.
2011-10-11 14:56:23 -04:00
Steven Robertson
095936666e Actually asynchronous rendering.
This change didn't affect GPU performance at all, but it did improve CPU
startup time, and should also improve time for long-running animations.
2011-10-11 11:27:40 -04:00
Steven Robertson
8c7e86c7c7 Fixed fraction to not exceed range 2011-10-11 11:26:38 -04:00
Steven Robertson
618b51b1b1 Speed enhancement: alpha packing.
When the alpha channel is used in a color palette, the code now replaces
the blue channel in the accumulation buffer with a pair of two U16s,
which encode the values of the blue and alpha channels as a fraction of
the value of the density. When the alpha channel is always 1.0, the blue
channel works as normal. Density is now always the last element in the
accumulation buffer.

Eliminating the separate IO operations improved total runtime by more
than 30% on my card, while the extra calculations reduced that to 20%
when alpha was present (though that can be optimized further).
2011-10-11 09:57:37 -04:00
Steven Robertson
46c6074b92 Use C++ pass-by-reference to explicitly share. 2011-10-03 16:53:29 -04:00
Erik Reckase
851980f682 mobius d params were missing 2011-07-06 12:47:03 -06:00
Steven Robertson
18a60ec066 Major bugfix. Also include thread-swapping that works. 2011-06-25 20:37:08 -04:00
Erik Reckase
44f897f28e fixed enable/disable chaos 'if chain' in iter kernel function 2011-06-24 09:59:14 -06:00
Erik Reckase
b732a3c244 now the chaos 'if chain' is only used if there are non-unity chaos entries in the genome. 2011-06-24 08:18:08 -06:00
Erik Reckase
50b664b1f9 chaos support \0/ 2011-06-24 06:09:04 -06:00
Erik Reckase
bc2aa00e2a removed stray debug printfs 2011-06-21 11:24:06 -06:00
Erik Reckase
addad052b1 closes 11 - all black pixels with non-zero density were not being handled properly. 2011-06-21 11:22:20 -06:00
Erik Reckase
981de94be5 added <stdio.h> for printf support in cuda code 2011-06-21 11:17:11 -06:00
Erik Reckase
746185ce4d added support for pre_blur. all variations that start with 'pre_' will be applied to the
output of the affine transform before the other variation contributions are calculated.
2011-06-20 14:05:00 -06:00
Steven Robertson
c66cb463d4 Add background color support, and tentatively disable density blurring. 2011-06-19 00:30:54 -04:00
Steven Robertson
883de380fc Did check. It is right. 2011-06-18 22:30:09 -04:00
Erik Reckase
98fb376545 fixed extra ) in var59 2011-06-17 16:36:27 -06:00
Erik Reckase
f684f90956 fixed a few more variations 2011-06-17 13:00:57 -06:00
Erik Reckase
3ee437d9b2 more fixes for variations...just about have all of the written ones validated. 2011-06-17 10:24:13 -06:00
Erik Reckase
6cd4f328f0 fixes for various variations 2011-06-16 21:25:06 -06:00
Steven Robertson
9e74ff57ce Fix julia variation. Closes issue 10. 2011-06-16 13:42:17 -04:00
Erik Reckase
8a3365712c fixed super_shape 2011-06-16 10:23:47 -06:00
Erik Reckase
e05d43fc57 fixed pie variation. 2011-06-16 06:50:26 -06:00
Erik Reckase
e83e67b440 fixed waves variation. 2011-06-16 05:34:46 -06:00
Erik Reckase
842efb6317 more variation fixes, syntax errors and so on. 2011-06-15 20:21:40 -06:00
Steven Robertson
702e303509 Gaussian dither 2011-06-13 23:20:39 -04:00
Steven Robertson
e579c837ce Missed a double in the filtering 2011-06-13 00:50:41 -04:00
Steven Robertson
5ebf62b1a3 Reduce some double-precision constants 2011-06-13 00:48:31 -04:00
Erik Reckase
131ce96263 fixed some missing ; in the variation code. 2011-06-12 21:53:33 -06:00
Steven Robertson
ae914d0b81 Fix some animation bugs 2011-06-12 20:20:36 -04:00
Steven Robertson
89b6732752 Skip the final xform when (re)joining the attractor 2011-06-12 19:29:10 -04:00
Steven Robertson
9a8e57cbc6 Fix a type error when gamma linearization == 0 2011-06-12 19:18:47 -04:00
Steven Robertson
9f2bc49009 Clean up some leftover debugging code in filtering 2011-06-12 19:17:02 -04:00
Steven Robertson
f872baf844 Use much more accurate filtsum estimation polynomials 2011-06-12 17:37:57 -04:00
Steven Robertson
e9998c28da Re-add a hard clamp for an estimator minimum.
The value 0.4 is above what it should be (0.3 is the theoretical minimum), and
having the harder clamp threshold causes some problems, but fixes others.
There's a deeper bug here.
2011-06-11 23:39:51 -04:00
Steven Robertson
299b5d5dab Fix filtering - or at least make it less broken 2011-06-11 23:28:32 -04:00
Steven Robertson
6b09e162a3 Make DE use current center CP instead of anim-wide CP; start debugging color 2011-06-11 22:51:16 -04:00
Steven Robertson
7ff0b65d81 Fix improper gutter offset in camera computation 2011-06-11 17:58:08 -04:00
Steven Robertson
38fbc391e8 Add gamma linearization (may be incorrect) 2011-06-11 17:50:15 -04:00
Steven Robertson
5b67ed7c33 Fix gutter-trim and compilation keeping 2011-06-11 17:23:29 -04:00
Steven Robertson
6c7d0270ad A few variation fixups 2011-06-11 17:21:34 -04:00
Steven Robertson
e79df46c66 Refactor API
--HG--
rename : cuburn/code/filter.py => cuburn/code/filtering.py
2011-06-11 15:59:10 -04:00
Steven Robertson
6f3c27007a Remove outdated MemBench stuff 2011-06-11 15:58:37 -04:00
Steven Robertson
94c453d153 Filter adjustments (density prefilter, gutter) 2011-06-11 15:58:15 -04:00
Steven Robertson
cd1f905ca3 Fix assembly (don't think this bug ever bit, but it could) 2011-05-29 15:20:58 -04:00
Steven Robertson
daf56ffc53 Split thread group up along warp boundary (this is useful later) 2011-05-29 15:15:06 -04:00
Steven Robertson
923d471e0e Merge memory transaction for slightly less smashing 2011-05-29 15:06:57 -04:00
Steven Robertson
78835085e8 A few trivial syntax corrections in the vars 2011-05-29 14:52:10 -04:00
Steven Robertson
7556d79ae5 Merge Erik's variations (TY!) 2011-05-05 23:36:40 -04:00
Steven Robertson
3d94c256a9 Another non-working checkin 2011-05-05 23:35:54 -04:00
Erik Reckase
0d88608b16 vars basically done. whew. a few are missing, but they're a pain and
i don't feel like doing them right now ;)
2011-05-05 21:16:47 -06:00
Erik Reckase
7d46b0d1db Done with 77. Will pick and choose the rest. 2011-05-05 18:09:23 -06:00
Erik Reckase
ec01cbfc43 up through var 69! 2011-05-05 13:44:12 -06:00
Steven Robertson
fac6f838a4 Saving unsuccessful separable filtering code 2011-05-05 10:40:22 -04:00
Erik Reckase
4f5d7efe27 vars up to 67 complete 2011-05-05 05:48:53 -06:00
Erik Reckase
9f78f5225a vars done through 57 2011-05-04 16:18:29 -06:00
Erik Reckase
d0084aab17 vars done through 49. skipping twintrian for now. 2011-05-04 14:43:17 -06:00
Erik Reckase
645222af47 added some blur variations, up to 36. 2011-05-04 12:05:51 -06:00
Erik Reckase
7a680efa1b added a few more vars, also fixed waves since it was referring to
precalc'd parameters
2011-05-04 11:51:04 -06:00
Steven Robertson
1aafe4a93c Some light performance optimizations 2011-05-04 09:52:20 -04:00
Steven Robertson
be66f80641 Final xforms 2011-05-04 08:13:39 -04:00
Erik Reckase
85ef8e7005 vars added up to 29. 2011-05-04 05:48:50 -06:00
Steven Robertson
e8a31bb4a5 Arbitrary camera, part 2 2011-05-04 01:30:22 -04:00
Steven Robertson
b2ee583b08 Arbitrary camera, part 1 2011-05-04 01:06:18 -04:00
Erik Reckase
765cf6b2e0 up to var22 (fan) completed. 2011-05-03 21:49:14 -06:00
Erik Reckase
d1137e8e89 lots more f's added. vars up to 19 complete. 2011-05-03 21:34:24 -06:00
Erik Reckase
f599685676 added missing f's and removed EPS 1e-20 thingies 2011-05-03 21:19:56 -06:00
Erik Reckase
8b6c6f462e added variations up to #16. 1/6 done! 2011-05-03 21:16:02 -06:00
Steven Robertson
c605815130 Make code more portable 2011-05-03 17:12:12 -04:00
Steven Robertson
eeff0a4d4f Oh, missed some 'f' suffixes on numbers 2011-05-03 16:15:16 -04:00
Steven Robertson
9d969476ec Be a little more particular about how mwc_11 is implemented 2011-05-03 15:26:44 -04:00
Steven Robertson
28c3c72bb8 Dithering 2011-05-03 15:26:05 -04:00
Steven Robertson
8ee5d3edd8 Add a few vars, and support for rendering single flames 2011-05-03 14:36:20 -04:00
Steven Robertson
84c2583ba8 A memory benchmark (temporary) 2011-05-03 13:02:15 -04:00
Steven Robertson
810b263aa2 Fix tex lookups. That was pretty dumb. 2011-05-03 03:23:25 +00:00
Steven Robertson
4612d753cc Switch to 1024x1024 (still fixed, tho) 2011-05-03 01:15:51 +00:00
Steven Robertson
9f3604b670 Fix my idiotic misalignment bug 2011-05-03 01:14:00 +00:00
Steven Robertson
972cd9f9ea Add image writing, and revert the buffer flip 2011-05-02 19:30:14 +00:00
Steven Robertson
1dad09fc03 Um, missed this file. Also, just fixed an obvious memory bug. 2011-05-02 19:29:07 +00:00
Steven Robertson
cd803cb3af Log scaling and color clipping (in a sense) 2011-05-02 16:19:55 +00:00
Steven Robertson
b710de4865 Color palette (sort of) 2011-05-01 15:23:45 -04:00
Steven Robertson
a43973f0ff Motion blur (a bit) 2011-05-01 09:53:36 -04:00
Steven Robertson
a7900f187d Add support for variations.
--HG--
rename : cuburn/variations.py => cuburn/code/variations.py
2011-05-01 09:36:29 -04:00
Steven Robertson
088299423e Some amount of dynamic rendering 2011-04-30 16:40:16 -04:00
Steven Robertson
1302f31ec7 "Crappy whatever I hate it" edition of Sierpinski triangle 2011-04-29 17:25:51 -04:00
Steven Robertson
fe6367821f RNG, again. Hooray. 2011-04-29 11:00:18 -04:00
Steven Robertson
bd1a943914 Start ripping stuff out 2011-04-28 11:24:58 -04:00
Steven Robertson
04351d6582 A final checkin before restarting the project 2011-04-28 10:47:42 -04:00
Steven Robertson
97180003a4 Broken: Variations, CP stream implemented 2010-10-09 11:18:58 -04:00
Steven Robertson
576d2fa683 Switch to pyptx. 2010-10-07 11:21:43 -04:00
Steven Robertson
c0e3c1d599 Known broken checkin because I'm nervous. 2010-10-01 01:20:20 -04:00
Steven Robertson
b938c320a8 Last touchups before ripping out the DSL 2010-09-13 12:22:08 -04:00
Steven Robertson
e4aac6993f A few touchups 2010-09-13 00:20:15 -04:00
Steven Robertson
e0b218feba A new (somewhat experimental) approach to fusing 2010-09-12 23:45:38 -04:00
Steven Robertson
5a5fcf5bb9 Fix the unbelievably stupid bug I've been chasing for days. 2010-09-12 18:42:52 -04:00
Steven Robertson
2f48d01aa9 Fix linear variation typo 2010-09-12 17:38:51 -04:00
Steven Robertson
5c5122e8c8 Optimization doubles performance... but breaks the output (even more) 2010-09-12 17:17:08 -04:00
Steven Robertson
3e4e1d88a2 Allow device call exceptions to propagate after cleanup 2010-09-12 16:22:56 -04:00
Steven Robertson
70ca6d7729 Fix RNG test 2010-09-12 16:22:22 -04:00
Steven Robertson
a6141f492d A byte is *8* bits 2010-09-12 15:48:31 -04:00
Steven Robertson
7ef0d334ca ...except I missed the file that actually contained the new method 2010-09-12 14:06:07 -04:00
Steven Robertson
6ed8907fcb LaunchContext.get_per_thread 2010-09-12 13:45:55 -04:00
Steven Robertson
3265982fec Change 'ctx.threads' to 'ctx.nthreads', as it should have been from the start 2010-09-12 11:13:53 -04:00
Steven Robertson
a439bf671d Fix occupancy issues (1 block/SM when shuffle was on).
There are 16 bar.sync() registers available per *chip*, not per block, and I
was using number 8 in the shuffle code. Evidently the driver rewrites them per
SM, but does not compact their range. Good to know.
2010-09-12 11:09:47 -04:00
Steven Robertson
c13f6a06cf Experiments with larger CTAs for IterThread 2010-09-12 02:01:03 -04:00
Steven Robertson
e2b1c161cf More readable memory allocations 2010-09-12 01:13:22 -04:00
Steven Robertson
802ca1d585 Allow swapping out store methods for easier testing of performance 2010-09-12 01:09:04 -04:00
Steven Robertson
f368a99a16 Shuffle points between threads of a CTA 2010-09-12 00:17:18 -04:00
Steven Robertson
40a5ceafde Use a somewhat better writeback mechanism for now 2010-09-12 00:16:35 -04:00
Steven Robertson
aa688564f1 Add Timeouter, for timing out infinite loops so data can be recovered. 2010-09-11 13:18:40 -04:00
Steven Robertson
a5d7c2cc1a Use variations. This works, but is still fragile. 2010-09-11 13:15:36 -04:00
Steven Robertson
860d7b2fad Add xforms and variations. 2010-09-11 13:10:41 -04:00
Steven Robertson
56404b629f Add device assertions to standard library. 2010-09-11 00:12:02 -04:00
Steven Robertson
3932412539 Test to make sure floating point numbers were in the right range. 2010-09-10 19:36:39 -04:00
Steven Robertson
e71a8422e5 Make store_per_thread reuse gtid in multiple calls when possible 2010-09-10 18:45:32 -04:00
Steven Robertson
943e92b80c Use pycuda SourceModule to work around crashes, and a few invocation touchups. 2010-09-10 18:02:37 -04:00
Steven Robertson
c3d12d07c2 Fix MWCRNGTest. 2010-09-10 18:01:50 -04:00
Steven Robertson
36f1c1c056 Rename "cuburnlib" (stupid) to "cuburn" (stupid but shorter)
--HG--
rename : cuburnlib/__init__.py => cuburn/__init__.py
rename : cuburnlib/cuda.py => cuburn/cuda.py
rename : cuburnlib/device_code.py => cuburn/device_code.py
rename : cuburnlib/ptx.py => cuburn/ptx.py
rename : cuburnlib/render.py => cuburn/render.py
2010-09-10 14:48:34 -04:00