cuburn

mirror of https://github.com/stevenrobertson/cuburn.git synced 2025-08-01 13:05:25 -04:00

Author	SHA1	Message	Date
Steven Robertson	112a674520	Redesign distribution: now based on ssh, not zmq	2017-05-15 12:04:16 -07:00
Steven Robertson	c7654357a6	Move naming code into a common place	2017-05-15 12:01:59 -07:00
Steven Robertson	04702d7903	Add --list-devices option	2017-05-15 12:01:25 -07:00
Steven Robertson	29c595ddc5	Move most warning/info statements to stderr	2017-05-15 12:00:11 -07:00
Steven Robertson	9bcfc36b7a	Retrieve out suffix without creating a renderer	2017-05-15 11:56:37 -07:00
Steven Robertson	636efcd059	Drop GL mode in main.py; sleep to reduce load	2017-05-15 00:44:15 -07:00
Steven Robertson	7dc58a0e1c	Grow launch sizes and synchronize if they pile up	2017-05-15 00:43:10 -07:00
Steven Robertson	5402838a74	Disable ill-thought-through form of antialiasing	2017-05-15 00:41:30 -07:00
Steven Robertson	3528cd1da4	Force use of clang for compilation for Debian	2017-05-15 00:38:52 -07:00
Steven Robertson	f58289af53	Hotspot writeback. 10x performance increase. Create a map assigning two bits to every output bin. During the atomic flush, compute a threshold for discarding writes altogether that would keep us under 2% error - discard 1 of every 2 writes if we've already accumulated 64 writes (hotspot value 1), 7 of 8 if we're above 256 (hotspot value 2), or 31 of 32 at 2048 (hotspot value 3). Pack this value into a read-only buffer that can often be cached at L2, and for particularly concentrated flames (which historically choke cuburn), L1. During writeback, discard writes at the appropriate rate. During the flush of the integer accumulator to the float, scale the integer accumulators by the discard rate. This works because for most flames, there's not a lot of interesting stuff in the middle regimes; either stuff is very well defined, in which case we pretty much know exactly what the color is going to be (remember, the max 2% relative error gets log-scaled as well), or it's loosely defined so we should keep it at full accuracy. Of course, a 10x boost is best-case-ish - a long, high-res render. I realized though that I really didn't care about low quality stuff and should go for broke optimizing this for my use case, which is ridiculously high res HDR stuff. (On pathological flames, on the other hand, 10x is conservative; this easily gives us 100x.)	2017-05-09 21:16:43 -07:00
Steven Robertson	0bcde947b5	Go to 1024 contexts on Pascal	2017-05-09 21:15:03 -07:00
Steven Robertson	d1502e3b79	rings2 is not identity at high precision	2017-05-09 21:09:58 -07:00
Steven Robertson	d759d675be	Always flush status lines	2017-05-09 21:09:40 -07:00
Steven Robertson	5af90b01a2	Fix a silly 'except e' (too much yavascrip in my life)	2017-05-09 21:09:00 -07:00
Steven Robertson	8fe4fbec1c	Use yield scheduling to reduce CPU load	2017-05-09 21:07:58 -07:00
Steven Robertson	77afb2f4b5	Turns out spread is period 180 but spin 1/2	2017-05-09 19:59:21 -07:00
Steven Robertson	8f21ffd4c3	Add right buffer. 2x allocation of uchar buffer.	2017-05-02 00:11:03 -07:00
Steven Robertson	3a3b3b33d1	Rename d_side to d_left (to add d_right later)	2017-05-02 00:07:08 -07:00
Steven Robertson	f83e36d948	Add prores as an option on the command line	2017-04-24 16:39:44 -07:00
Steven Robertson	9892acbc7f	Populate arch by default; add --keep	2017-04-24 16:39:15 -07:00
Steven Robertson	582221dd0f	Always spit out lineinfo when possible	2017-04-24 16:38:51 -07:00
Steven Robertson	6bf428caee	Move 'mktref' to util	2017-04-24 16:38:39 -07:00
Steven Robertson	bdcaca1f97	Initial draft of hotspot deferral. Build an array of one-bit flags for every pixel (addressed as u32 data). If we have accumulated at least 64 points for that pixel, set the flag; thereafter only write 1/16 (and multiply subsequent points that do get written by 16). The theory is, after 64 points, the color is pretty much locked in; this lets us crank SPP up to get excellent coverage in dark areas but the bright ones don't matter so much since they're fully resolved. Still needs a lot of tuning to get peak performance, and the trigger threshold may need to be scaled along with the render size. It also will likely not scale as well to higher resolutions, because we rely on L2 cache to make this fast.	2017-04-24 16:33:39 -07:00
Steven Robertson	6b2b72a3fe	Remove unused texture reference	2017-04-23 01:15:51 -07:00
Steven Robertson	c79db04490	Choose a GPU in main.py	2017-04-21 13:08:16 -07:00
Steven Robertson	b507c9d604	Make tiffs 16-bit using tifffile	2017-04-20 18:22:27 -07:00
Steven Robertson	c6fcaf472f	Be sure to close the output files in main.py.	2017-04-20 17:52:37 -07:00
Steven Robertson	746aee9a75	Add a 'plainclip' filter. This is useful for doing color manipulation on output renders, either by hand in color grading software or automatically using renormalization based on image statistics.	2017-04-20 17:51:05 -07:00
Steven Robertson	14f755e434	Add an FFmpeg ProRes handler.	2017-04-20 17:50:33 -07:00
Steven Robertson	96585e2ca5	Update YUV444p12 to be Rec. 709, studio swing.	2017-04-20 17:42:08 -07:00
Steven Robertson	f64cf79d8d	Fixes to parse all gen198 genomes. Named palettes.	2017-04-20 13:54:46 -07:00
Steven Robertson	d1228ac303	Add a simple graph-walker for playback	2015-10-26 01:35:09 -07:00
Steven Robertson	36e3b7aca9	Compiler made register restrictions unnecessary	2015-10-26 01:34:35 -07:00
Steven Robertson	17b5a1a96f	Spit out raw content. Previewing using an Intensity Pro 4K and secret monitor-sauce.	2015-10-11 00:52:26 -07:00
Steven Robertson	cfe815f0d6	Min/max YUV output for 444p12. Again, better solution forthcoming.	2015-10-11 00:51:58 -07:00
Steven Robertson	5f5e69f3a3	Crank up the dispatch params again after fix.	2015-10-11 00:51:22 -07:00
Steven Robertson	0e91d01528	Fix the noise issues on Maxwell GPUs. AAAAUUUUUUGGGGGGHHH	2015-10-11 00:49:37 -07:00
Steven Robertson	37e6642d37	Add logencode filter.	2015-10-10 18:04:27 -07:00
Steven Robertson	37245085a9	Add pre-YUV output clamping to YUV444P12. A better solution would cover all the output transforms, but... later.	2015-10-10 18:03:17 -07:00
Steven Robertson	0476bbfdce	Fix default width/height	2015-10-10 16:01:58 -07:00
Steven Robertson	f93b4dbf23	Add YUV444P12 support	2015-10-10 16:01:11 -07:00
Steven Robertson	0ce1b51d16	Convert YUV to RGB before filtering	2015-10-10 15:59:57 -07:00
Steven Robertson	abcb3fa50f	Look up renderers by name, rather than position	2015-10-10 15:58:36 -07:00
Steven Robertson	698d9c2337	Register filters with a class decorator	2015-10-10 15:58:13 -07:00
Steven Robertson	227a6016c2	Enable VP9 ARNR	2015-02-15 10:24:23 -08:00
Steven Robertson	51b1280e1e	Better status messages for main.py	2015-02-14 17:51:13 -08:00
Steven Robertson	e70073175e	Fix codec naming issue for jpeg	2015-02-14 17:50:19 -08:00
Steven Robertson	8dc629d91e	Autoselect number of columns to use for VP9	2015-02-14 17:50:03 -08:00
Steven Robertson	e08444f74b	Work around an overflow condition for now. I'm not sure what's going wrong; the math still holds up at higher densities, but when you crank up the samples-per-pixel count the accumulators start overflowing stochastically, and when they do they dump nonsense into the output. Until I have time, take a small perf hit by flushing much more often.	2015-02-14 17:48:22 -08:00
Steven Robertson	8da1821616	Bump gutter to 12px to align reads	2014-12-25 15:04:31 -08:00
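
The hotspot writeback scheme described in commit f58289af53 can be sketched on the CPU. This is a minimal illustration in plain Python, not cuburn's CUDA code. The thresholds (64, 256, 2048) and keep rates (1 of 2, 1 of 8, 1 of 32) come from the commit message. Everything else is an assumption made for illustration: all function and constant names are hypothetical, the thresholds are treated as inclusive, sixteen 2-bit values are packed into each 32-bit word, and the writes to discard are chosen at random.

```python
import random

# From the commit message: (accumulated writes, hotspot value).
# Treating the thresholds as inclusive (>=) is an assumption.
THRESHOLDS = [(2048, 3), (256, 2), (64, 1)]

# Hotspot value -> keep 1 of every N writes (N is also the flush scale).
KEEP_ONE_IN = {0: 1, 1: 2, 2: 8, 3: 32}


def hotspot_value(count):
    """Map a bin's accumulated write count to its 2-bit hotspot value."""
    for threshold, value in THRESHOLDS:
        if count >= threshold:
            return value
    return 0


def pack_hotspots(values):
    """Pack 2-bit values, 16 per 32-bit word (layout is an assumption)."""
    words = [0] * ((len(values) + 15) // 16)
    for i, v in enumerate(values):
        words[i // 16] |= (v & 3) << (2 * (i % 16))
    return words


def unpack_hotspot(words, i):
    """Read back the 2-bit hotspot value for bin i."""
    return (words[i // 16] >> (2 * (i % 16))) & 3


def should_write(hotspot, rng):
    """Keep 1 of every N writes; random selection is an assumption."""
    return rng.randrange(KEEP_ONE_IN[hotspot]) == 0


def simulate_interval(hotspot, attempts, seed=0):
    """Count the writes that survive discarding, then scale the integer
    accumulator by the discard rate, as done when flushing to float."""
    rng = random.Random(seed)
    kept = sum(1 for _ in range(attempts) if should_write(hotspot, rng))
    return kept * KEEP_ONE_IN[hotspot]


if __name__ == "__main__":
    counts = [10, 64, 300, 5000]
    packed = pack_hotspots([hotspot_value(c) for c in counts])
    for i, c in enumerate(counts):
        h = unpack_hotspot(packed, i)
        print(c, h, simulate_interval(h, 100000))
```

Under these assumptions, a bin with hotspot value 3 performs roughly 1/32 of the writes, while the scaled total stays close to the number of attempted writes. The sketch covers a single flush interval with a fixed map; it does not model the map being recomputed at each flush.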


520 Commits