Commit Graph

241 Commits

Author SHA1 Message Date
Steven Robertson
df8100d1f4 Use new mad.cc instruction in MWC 2011-12-08 11:49:31 -05:00
Steven Robertson
094df0ae21 Name the variation templates for debugging 2011-12-08 11:48:38 -05:00
Steven Robertson
e79d9a58fd Fix cschden, cothden variations 2011-12-07 13:41:45 -05:00
erik
5ce5763da7 changed sense (and name) of values for affine xforms to match smoulder renders. 2011-11-23 07:54:48 -07:00
Steven Robertson
22fdc98128 Fix point swapping. 2011-11-20 10:08:13 -05:00
erik
efd261bd5b fixes related to interpolation of palettes; hsv interpolation now goes
the 'short way' around the hue circle, and the correct palette is now
chosen when > 2 palettes are present in the knots.
2011-11-14 19:12:41 -07:00
Steven Robertson
0f848b8bb8 Dither color when packing for deferred write. 2011-11-12 11:06:44 -05:00
Steven Robertson
6d1c81486c Don't inline catmull_rom for much faster compiles. 2011-11-12 11:05:44 -05:00
Steven Robertson
24c0c8ee56 Fix some color foibles (more yet remain) 2011-11-12 10:42:02 -05:00
Steven Robertson
9ef5363652 Fix dumb overflow bug 2011-11-11 17:54:33 -05:00
Steven Robertson
eb43b151dc Deferred writeback. 2011-11-11 17:37:27 -05:00
Steven Robertson
05e1d08681 Add -1-skipping to sort. 2011-11-11 17:34:43 -05:00
Steven Robertson
54f411878b Experiments with multi-pass sort (still has bugs) 2011-11-10 10:49:35 -05:00
Steven Robertson
13842196ea Generalize the sort. 2011-11-09 12:00:59 -05:00
Steven Robertson
3147fd40d2 Support CUDA 4.1. Split filtering into new module.
The new toolkit generates code for filtering which uses too many
registers, so this change splits filtering into its own module so that
it can have separate register usage limits during compiling. As a bonus,
this should improve startup time in general, since the filtering code
is now fixed and does not need to be recompiled.
2011-11-08 14:38:45 -05:00
Steven Robertson
cea91d75bf A very fast key-only radix sort. 2011-11-07 23:23:20 -05:00
Steven Robertson
7815c13ba4 Fix camera offset WRT flam3 2011-11-06 10:01:26 -05:00
erik
5179c98254 fixed flawed lazysusan variation. added whorl variation. 2011-11-03 13:31:32 -06:00
erik
3badf0f826 Merge branch 'master' of git://github.com/stevenrobertson/cuburn
Conflicts:
	cuburn/code/filtering.py
2011-11-03 13:27:31 -06:00
erik
8ea057ff96 fixed highlight_power functionality difference between cuburn and smoulder 2011-11-03 13:18:43 -06:00
Erik Reckase
d382e0f14a Fix highlight power 2011-10-31 17:20:13 -04:00
Steven Robertson
b240fc8459 Use custom "cross" filtering.
Sobel was giving too many false positives. This cross seems to detect
the kinds of edges we care about and avoids the rest of the image, and
it does so on pretty much everything I've tried it on. Very satisfying.
2011-10-29 17:36:06 -04:00
Steven Robertson
0936e34b88 Fix cards stalling out on degenerate flames 2011-10-29 11:14:11 -04:00
Steven Robertson
6204f36ebc Fix spline derivative calculation. 2011-10-29 10:51:33 -04:00
Steven Robertson
bfff915b7e Two very obvious spline bugs fixed. 2011-10-28 21:34:42 -04:00
Steven Robertson
28e73d08ee Add derivative support to SplWrap. 2011-10-28 18:51:33 -04:00
Steven Robertson
a2c4c90cb2 Upgrade fuse, because, well, why not? 2011-10-28 08:41:20 -04:00
Steven Robertson
a6177edc0d Drop the RNG mult shuffle.
It's surprisingly time-consuming, and until I have data showing that we
need it, I'm going to leave this bit of extra randomness voodoo out.
2011-10-28 08:36:27 -04:00
Steven Robertson
185823ad55 Rearrange the main render loop... again.
Using one stream with two pagelocked host buffers allows us to keep the
GPU work queue full without pegging the CPU, and also reduces the
incidence of a host buffer being overwritten before it can be
written. devtid() was flaky, so this patch also introduces a ringbuffer
to handle the 'slots' concept. It also introduces an adaptive number of
temporal samples, which improves efficiency but also killed the
assumption that (ntemporal_samples % 256 == 0), which required some
additional fixes.
2011-10-28 08:30:36 -04:00
Steven Robertson
15f88383b1 Experimental: real Sobel gradient detection 2011-10-28 08:25:00 -04:00
Steven Robertson
9b2b3ba011 Fix buffer overrun in filtering 2011-10-28 08:24:16 -04:00
Steven Robertson
6b2cb024ac Expand max filter radius to 21 pixels 2011-10-28 08:23:33 -04:00
Steven Robertson
f3a79b200c New badvals mechanism. 2011-10-27 12:59:58 -04:00
Steven Robertson
cac9b691a8 Add a missing semicolon in disc2 2011-10-27 10:37:12 -04:00
Steven Robertson
77daf5e639 Correct blur radius after Box-Muller 2011-10-27 10:36:44 -04:00
Steven Robertson
1faffa1d14 'fill_dptr' instead of 'zero_dptr' 2011-10-27 10:35:01 -04:00
Steven Robertson
3c1dac530b Updates to run_job.py 2011-10-27 10:26:30 -04:00
Steven Robertson
5368a9254a Clamp DE radius further.
The maximum standard deviation pushes far too hard into the limits of
the filter width, giving discrete points a weird boxy blur. The filter
slice width needs to be expanded, but that's a whole lot of coefficient
debugging, and I'm putting it off by just reducing the maximum DE width
for now.
2011-10-27 08:58:51 -04:00
Steven Robertson
9049902b4f Add a crap gradient detect to make DE less bad.
Use the vertical and horizontal gradients to "detect" when a pixel is
part of an edge that has been softened by grid-shift AA, and avoid
blurring it further. This causes occasional 1px artifacts in stills, but
fixes the truly grotesque DE bleed-out for a net win. A better edge
detector is still needed.
2011-10-27 08:51:40 -04:00
Steven Robertson
7c84c6a7a9 Final xform color *is* used after all 2011-10-27 08:46:55 -04:00
Steven Robertson
f650844cb9 Fix two variations 2011-10-26 08:11:10 -04:00
Steven Robertson
a8528a9e1d Fix rgb2hsv 2011-10-26 08:10:57 -04:00
Steven Robertson
376cd752d6 Palette interpolation on device 2011-10-25 22:56:19 -04:00
Steven Robertson
e793527c29 A few harmless const modifiers 2011-10-25 22:49:26 -04:00
Steven Robertson
3436291eb6 Improve spline loading 2011-10-25 19:03:35 -04:00
Steven Robertson
fb5bdc2a9f Remove now-unused pyflam3_hacks 2011-10-25 19:03:10 -04:00
Steven Robertson
8939a6343a New genome representation, and device interp. 2011-10-25 15:44:39 -04:00
Steven Robertson
be31708c09 Fix memory corruption bug (overshoot in colorclip) 2011-10-25 15:43:05 -04:00
Steven Robertson
efc2ac23e2 Fix rendering at insane resolutions 2011-10-19 14:17:01 -04:00
Steven Robertson
20520d2f69 Open primes.bin in binary mode. 2011-10-17 19:31:09 -04:00
Steven Robertson
6c2df777b0 Remove a TODO 2011-10-16 13:52:01 -04:00
Steven Robertson
8ce2470dfb Relax FUSE a little (no visible impact so far) 2011-10-16 13:45:27 -04:00
Steven Robertson
c4ce3cf4c2 Don't crash on empty render(times) 2011-10-16 13:44:22 -04:00
Steven Robertson
0cc904c4f1 Do post affine transforms. How did I miss this? 2011-10-16 13:43:46 -04:00
Steven Robertson
5111a0f05c Eliminate needless pre_ var separation 2011-10-16 13:42:37 -04:00
Steven Robertson
9bafbda81a Refactor host rendering code for better load 2011-10-15 22:22:43 -04:00
Steven Robertson
9ff018de87 Actually fix dithering. (I've seen this before...) 2011-10-15 19:08:16 -04:00
Steven Robertson
63483480d0 Bias the radius to avoid very large dither offsets 2011-10-15 00:50:24 -04:00
Steven Robertson
3be14547ea Use 3*256 instead of 2*512 blocks; faster on GF104 2011-10-15 00:33:37 -04:00
Steven Robertson
c7728d3507 Add faster no-L1 accum 2011-10-15 00:32:30 -04:00
Steven Robertson
dd645bcbf6 Use one dither offset per block. 2011-10-15 00:29:22 -04:00
Steven Robertson
83670df2c7 Fix random seeds. 2011-10-14 11:56:58 -04:00
Steven Robertson
b168a2431e 32-bit compatibility (I think?) 2011-10-13 16:56:20 -04:00
Steven Robertson
14872ee6ed Add --sleep for slightly more usable system 2011-10-13 16:55:26 -04:00
Steven Robertson
e6e2c4a8d7 Add --sync option. 2011-10-13 07:53:55 -04:00
Steven Robertson
4834c9fdfa Change synchronization model. 2011-10-12 14:08:13 -04:00
Steven Robertson
81f61d4d5d Improve asynchrony; improve palette interp perf. 2011-10-12 14:07:28 -04:00
Steven Robertson
7b9bb165ac Change the way compile options are handled 2011-10-12 14:02:32 -04:00
Steven Robertson
f04ad7ab68 Performance improvements in Genome.__init__() 2011-10-12 13:57:43 -04:00
Steven Robertson
0f615bd98b Performance improvements in affine helpers 2011-10-12 13:56:34 -04:00
Steven Robertson
d409f02e4a Precompile accessors.
This improves packing speed by 8x, which is visible on small or
low-quality frames.
2011-10-12 11:50:07 -04:00
Steven Robertson
a12714f4c4 Fix MWC test 2011-10-12 07:36:07 -04:00
Steven Robertson
9b03f557c2 Fix missing control points in async version.
The allocation pool was reallocating the same frame as soon as it had
left the current scope, before it had been copied. We just reallocate
the same chunks. I don't think this has any real performance impact but
this can be verified.
2011-10-11 20:54:33 -04:00
Steven Robertson
b081bc9378 Remove a sync from iter.
A small but consistent improvement.
2011-10-11 14:56:23 -04:00
Steven Robertson
095936666e Actually asynchronous rendering.
This change didn't affect GPU performance at all, but it did improve CPU
startup time, and should also improve time for long-running animations.
2011-10-11 11:27:40 -04:00
Steven Robertson
8c7e86c7c7 Fixed fraction to not exceed range 2011-10-11 11:26:38 -04:00
Steven Robertson
618b51b1b1 Speed enhancement: alpha packing.
When the alpha channel is used in a color palette, the code now replaces
the blue channel in the accumulation buffer with a pair of two U16s,
which encode the values of the blue and alpha channels as a fraction of
the value of the density. When the alpha channel is always 1.0, the blue
channel works as normal. Density is now always the last element in the
accumulation buffer.

Eliminating the separate IO operations improved total runtime by more
than 30% on my card, while the extra calculations reduced that to 20%
when alpha was present (though that can be optimized further).
2011-10-11 09:57:37 -04:00
Steven Robertson
46c6074b92 Use C++ pass-by-reference to explicitly share. 2011-10-03 16:53:29 -04:00
Erik Reckase
851980f682 mobius d params were missing 2011-07-06 12:47:03 -06:00
Steven Robertson
18a60ec066 Major bugfix. Also include thread-swapping that works. 2011-06-25 20:37:08 -04:00
Erik Reckase
44f897f28e fixed enable/disable chaos 'if chain' in iter kernel function 2011-06-24 09:59:14 -06:00
Erik Reckase
b732a3c244 now the chaos 'if chain' is only used if there are non-unity chaos entries in the genome. 2011-06-24 08:18:08 -06:00
Erik Reckase
50b664b1f9 chaos support \0/ 2011-06-24 06:09:04 -06:00
Erik Reckase
bc2aa00e2a removed stray debug printfs 2011-06-21 11:24:06 -06:00
Erik Reckase
addad052b1 closes 11 - all black pixels with non-zero density were not being handled properly. 2011-06-21 11:22:20 -06:00
Erik Reckase
981de94be5 added <stdio.h> for printf support in cuda code 2011-06-21 11:17:11 -06:00
Erik Reckase
746185ce4d added support for pre_blur. all variations that start with 'pre_' will be applied to the
output of the affine transform before the other variation contributions are calculated.
2011-06-20 14:05:00 -06:00
Steven Robertson
c66cb463d4 Add background color support, and tentatively disable density blurring. 2011-06-19 00:30:54 -04:00
Steven Robertson
883de380fc Did check. It is right. 2011-06-18 22:30:09 -04:00
Erik Reckase
98fb376545 fixed extra ) in var59 2011-06-17 16:36:27 -06:00
Erik Reckase
f684f90956 fixed a few more variations 2011-06-17 13:00:57 -06:00
Erik Reckase
3ee437d9b2 more fixes for variations...just about have all of the written ones validated. 2011-06-17 10:24:13 -06:00
Erik Reckase
6cd4f328f0 fixes for various variations 2011-06-16 21:25:06 -06:00
Steven Robertson
9e74ff57ce Fix julia variation. Closes issue 10. 2011-06-16 13:42:17 -04:00
Erik Reckase
8a3365712c fixed super_shape 2011-06-16 10:23:47 -06:00
Erik Reckase
e05d43fc57 fixed pie variation. 2011-06-16 06:50:26 -06:00
Erik Reckase
e83e67b440 fixed waves variation. 2011-06-16 05:34:46 -06:00
Erik Reckase
842efb6317 more variation fixes, syntax errors and so on. 2011-06-15 20:21:40 -06:00
Steven Robertson
702e303509 Gaussian dither 2011-06-13 23:20:39 -04:00
Steven Robertson
e579c837ce Missed a double in the filtering 2011-06-13 00:50:41 -04:00