Commit Graph

33 Commits

Author SHA1 Message Date
Steven Robertson
caf9014a29 Run a bit more without CUDA 2013-01-06 22:03:03 -08:00
Steven Robertson
3a6490825b Fix an interpolation code-gen issue.
A bug in the hacky use of OrderedDict to function as an OrderedSet meant
that any value accessed in a precalculation function which had already
been accessed from another precalculation function would use an
incorrect index.
2012-07-21 13:14:09 -07:00
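
The message above describes an OrderedDict standing in for an ordered set, where each value's index must stay fixed once assigned. A minimal sketch of that pattern follows. It is not the cuburn code: the class name `OrderedSet` and its `add` method are hypothetical, and the sketch only illustrates the property the fix restores, namely that a second access to the same value returns the index from its first access.

```python
from collections import OrderedDict


class OrderedSet:
    """Hypothetical ordered set that hands out stable indices."""

    def __init__(self):
        # Maps each key to the index it received on first insertion.
        self._items = OrderedDict()

    def add(self, key):
        # len() is evaluated before insertion, so a new key gets the next
        # free index; setdefault leaves an existing key's index untouched.
        return self._items.setdefault(key, len(self._items))

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)
```

With this shape, two precalculation functions that both request the same value would receive the same index, regardless of which one asks first.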
Steven Robertson
805742d4ce Add arch as compiler option 2012-05-20 12:53:23 -07:00
Steven Robertson
f5ef30bc1b Clean up some filter code 2012-04-16 00:39:22 -07:00
Steven Robertson
c57917abe6 Use a unified block and grid addressing scheme. 2012-04-06 21:24:25 -07:00
Steven Robertson
88abefa4f4 Fix rb_incr() when blockDim.y == 1. 2012-02-15 10:06:35 -05:00
Steven Robertson
60a45c9a20 Sweeping refactor. More bugs undoubtedly remain. 2012-02-14 07:40:58 -08:00
Steven Robertson
c572f62d7d Use YUV during accumulation 2012-01-22 23:56:16 -05:00
Steven Robertson
a803216551 Move argset to code.util 2012-01-21 00:03:28 -05:00
Steven Robertson
de56383a61 Add new palette modes; use 'yuv' by default. 2011-12-23 09:50:03 -05:00
Steven Robertson
c80b8a07a7 Another incompatible update to the genome format 2011-12-17 09:23:39 -05:00
erik
efd261bd5b fixes related to interpolation of palettes; hsv interpolation now goes
the 'short way' around the hue circle, and the correct palette is now
chosen when > 2 palettes are present in the knots.
2011-11-14 19:12:41 -07:00
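
Going the 'short way' around the hue circle can be sketched as below. This is an illustration only, assuming hue is normalized to [0, 1); the function name `lerp_hue` is hypothetical and the commit's actual implementation may differ.

```python
def lerp_hue(h0, h1, t):
    """Interpolate between two hues along the shorter arc.

    Assumes hues are normalized to [0, 1) and 0 <= t <= 1.
    """
    # Signed distance from h0 to h1, first wrapped into [0, 1).
    delta = (h1 - h0) % 1.0
    # If the forward arc is longer than half the circle, go backward.
    if delta > 0.5:
        delta -= 1.0
    # Wrap the result back onto the circle.
    return (h0 + t * delta) % 1.0
```

For example, interpolating from 0.9 to 0.1 passes through 0.95 and wraps past 1.0, rather than sweeping back through 0.5.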
Steven Robertson
eb43b151dc Deferred writeback. 2011-11-11 17:37:27 -05:00
Steven Robertson
185823ad55 Rearrange the main render loop... again.
Using one stream with two pagelocked host buffers allows us to keep the
GPU work queue full without pegging the CPU, and also reduces the
incidences where a host buffer will get overwritten before it can be
written. devtid() was flaky, so this patch also introduces a ringbuffer
to handle the 'slots' concept. It also introduces an adaptive number of
temporal samples, which improves efficiency but also killed the
assumption that (ntemporal_samples % 256 == 0), which required some
additional fixes.
2011-10-28 08:30:36 -04:00
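
The buffer-reuse concern in the message above can be sketched on the host side without any CUDA calls. The sketch below is an assumption-laden illustration, not the render loop itself: the class `RotatingBuffers` and its `write_out` callback are hypothetical, and real code would use pagelocked memory and stream synchronization. It shows only the invariant that a buffer is written out before it is overwritten.

```python
class RotatingBuffers:
    """Hypothetical rotation over a fixed set of host buffers."""

    def __init__(self, write_out, nbuf=2):
        self.write_out = write_out      # callback that persists a frame
        self.bufs = [None] * nbuf       # stand-ins for host buffers
        self.pending = [False] * nbuf   # True if a buffer holds unwritten data
        self.idx = 0                    # next buffer to fill

    def submit(self, frame):
        i = self.idx
        # Flush the old contents before reusing this buffer.
        if self.pending[i]:
            self.write_out(self.bufs[i])
        self.bufs[i] = frame
        self.pending[i] = True
        self.idx = (i + 1) % len(self.bufs)

    def flush(self):
        # Write remaining frames, oldest first.
        n = len(self.bufs)
        for k in range(n):
            i = (self.idx + k) % n
            if self.pending[i]:
                self.write_out(self.bufs[i])
                self.pending[i] = False
```

Frames come out in submission order, and at most `nbuf` frames are held at once.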
Steven Robertson
f3a79b200c New badvals mechanism. 2011-10-27 12:59:58 -04:00
Steven Robertson
1faffa1d14 'fill_dptr' instead of 'zero_dptr' 2011-10-27 10:35:01 -04:00
Steven Robertson
a8528a9e1d Fix rgb2hsv 2011-10-26 08:10:57 -04:00
Steven Robertson
376cd752d6 Palette interpolation on device 2011-10-25 22:56:19 -04:00
Steven Robertson
8939a6343a New genome representation, and device interp. 2011-10-25 15:44:39 -04:00
Steven Robertson
efc2ac23e2 Fix rendering at insane resolutions 2011-10-19 14:17:01 -04:00
Steven Robertson
c7728d3507 Add faster no-L1 accum 2011-10-15 00:32:30 -04:00
Steven Robertson
dd645bcbf6 Use one dither offset per block. 2011-10-15 00:29:22 -04:00
Steven Robertson
d409f02e4a Precompile accessors.
This improves packing speed by 8x, which is visible on small or
low-quality frames.
2011-10-12 11:50:07 -04:00
Steven Robertson
8c7e86c7c7 Fixed fraction to not exceed range 2011-10-11 11:26:38 -04:00
Steven Robertson
618b51b1b1 Speed enhancement: alpha packing.
When the alpha channel is used in a color palette, the code now replaces
the blue channel in the accumulation buffer with a pair of two U16s,
which encode the values of the blue and alpha channels as a fraction of
the value of the density. When the alpha channel is always 1.0, the blue
channel works as normal. Density is now always the last element in the
accumulation buffer.

Eliminating the separate IO operations improved total runtime by more
than 30% on my card, while the extra calculations reduced that to 20%
when alpha was present (though that can be optimized further).
2011-10-11 09:57:37 -04:00
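
The encoding described above, two U16 values holding blue and alpha as fractions of density, can be sketched as follows. The bit layout (blue in the high half), the rounding, and the function names are assumptions for illustration; the commit message does not specify them, and the sketch does not cover how accumulation updates the packed value.

```python
U16_MAX = 65535


def pack_fractions(blue, alpha, density):
    """Pack blue/density and alpha/density into one 32-bit word."""
    if density <= 0:
        return 0

    def quantize(value):
        # Scale the fraction to the U16 range and clamp it.
        return min(U16_MAX, max(0, int(round(value / density * U16_MAX))))

    # Assumed layout: blue in the high 16 bits, alpha in the low 16 bits.
    return (quantize(blue) << 16) | quantize(alpha)


def unpack_fractions(word, density):
    """Recover approximate blue and alpha values from a packed word."""
    blue = (word >> 16) / U16_MAX * density
    alpha = (word & 0xFFFF) / U16_MAX * density
    return blue, alpha
```

Round-tripping loses at most about half a U16 step of the density per channel, the precision traded for one fewer buffer element.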
Erik Reckase
981de94be5 added <stdio.h> for printf support in cuda code 2011-06-21 11:17:11 -06:00
Steven Robertson
5ebf62b1a3 Reduce some double-precision constants 2011-06-13 00:48:31 -04:00
Steven Robertson
e79df46c66 Refactor API
--HG--
rename : cuburn/code/filter.py => cuburn/code/filtering.py
2011-06-11 15:59:10 -04:00
Steven Robertson
1aafe4a93c Some light performance optimizations 2011-05-04 09:52:20 -04:00
Steven Robertson
be66f80641 Final xforms 2011-05-04 08:13:39 -04:00
Steven Robertson
b2ee583b08 Arbitrary camera, part 1 2011-05-04 01:06:18 -04:00
Steven Robertson
9f3604b670 Fix my idiotic misalignment bug 2011-05-03 01:14:00 +00:00
Steven Robertson
1dad09fc03 Um, missed this file. Also, just fixed an obvious memory bug. 2011-05-02 19:29:07 +00:00