2010-09-06 14:19:06 -04:00
Status: passes rudimentary tests

Current goals:
- Draw some dang points!
- Allocate buffer (can it be pre-allocated?)
- Direct scatter linear points by GTID from flame number
- Re-enable preview window
- Execute frame, update texture, repeat
- Writeback of points to the buffer
- Define writeback class, args
- Do camera rotation across frameset
- Postpone other kinds of testing and address clamping for now
- Start xforms
- At first, fixed Sierpinski triangle or something
- xform selection, pre- and post-transform in xform
- first of the variations
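
For the "fixed Sierpinski triangle" and "xform selection" goals above, a minimal
host-side sketch of the idea, in plain Python rather than generated device code.
Everything named here (XFORMS, apply_xform, iterate, scatter, the coefficient
layout, the fuse length) is a hypothetical stand-in, not the project's API.

```python
import random

# Three affine xforms with illustrative coefficients (an assumption, not the
# project's data). Layout (a, b, c, d, e, f) maps (x, y) to
# (a*x + b*y + c, d*x + e*y + f). Each one halves the distance to a corner.
XFORMS = [
    (0.5, 0.0, 0.0, 0.0, 0.5, 0.0),  # toward corner (0, 0)
    (0.5, 0.0, 0.5, 0.0, 0.5, 0.0),  # toward corner (1, 0)
    (0.5, 0.0, 0.0, 0.0, 0.5, 0.5),  # toward corner (0, 1)
]


def apply_xform(xf, x, y):
    """Apply one affine xform to a point."""
    a, b, c, d, e, f = xf
    return a * x + b * y + c, d * x + e * y + f


def iterate(n_points, seed=0, fuse=20):
    """Chaos game: pick a random xform each step, skip the first `fuse` points."""
    rng = random.Random(seed)
    x, y = rng.random(), rng.random()
    points = []
    for i in range(n_points + fuse):
        x, y = apply_xform(rng.choice(XFORMS), x, y)
        if i >= fuse:
            points.append((x, y))
    return points


def scatter(points, width, height):
    """Accumulate points into a flat histogram buffer of width * height bins."""
    hist = [0] * (width * height)
    for x, y in points:
        # Clamp so a coordinate of exactly 1.0 still lands in the last bin.
        col = min(int(x * width), width - 1)
        row = min(int(y * height), height - 1)
        hist[row * width + col] += 1
    return hist
```

Calling scatter(iterate(100000), 256, 256) fills a 256x256 buffer with a
right-angled Sierpinski triangle. Pre- and post-transforms and variations would
slot into apply_xform later.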

Things to do (rather severely incomplete):
- LaunchContext thread distribution based on generated code register count and
  shared memory size
- qlocal storage
- Performance implications of different state spaces
- Shared / cache projected usage and its effect on above
- Implement qlocal storage, and hide the complexity
- The `Feature` class
- Transform count and per-transform code layout
- Filter size, oversample, final buffer size
- Buffer allocation, clearing, reading from device
- Preview window
- When/how to sample?
- OpenGL interop worth it?
- Implement
- Implement xforms
- Shuffle
- State space implications, you know the drill
- Implement
- Test effects on quality by masking off writes on all but one lane and
  boosting the sample density to compensate (muuuuuch later on)
- DE
- Clean up code (particularly DSL stuff incl. injector)
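
For the thread distribution item at the top of this list, a rough sketch of how
register count and shared memory size could bound the number of resident blocks.
The function names and the EXAMPLE_LIMITS values are assumptions for
illustration only. Real limits should be queried from the device, and real
hardware rounds allocations to a granularity that this ignores.

```python
# Illustrative per-multiprocessor limits (assumed values, not read from a device).
EXAMPLE_LIMITS = {
    "regs_per_sm": 32768,
    "smem_per_sm": 49152,       # bytes
    "max_threads_per_sm": 1536,
    "max_blocks_per_sm": 8,
}


def blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block, limits):
    """Resident blocks per multiprocessor: the tightest of four limits."""
    bounds = [
        limits["max_blocks_per_sm"],
        limits["max_threads_per_sm"] // threads_per_block,
        limits["regs_per_sm"] // (regs_per_thread * threads_per_block),
    ]
    if smem_per_block > 0:
        bounds.append(limits["smem_per_sm"] // smem_per_block)
    return min(bounds)


def pick_block_size(regs_per_thread, smem_per_thread, limits,
                    candidates=(64, 128, 256, 512)):
    """Pick the candidate block size with the most resident threads.

    Ties go to the smallest candidate, since candidates are tried in order
    and only a strictly better result replaces the current best.
    """
    best_size, best_threads = None, -1
    for size in candidates:
        blocks = blocks_per_sm(size, regs_per_thread,
                               smem_per_thread * size, limits)
        resident = blocks * size
        if resident > best_threads:
            best_size, best_threads = size, resident
    return best_size
```

With the example limits, a kernel using 32 registers per thread and 4096 bytes
of shared memory per 256-thread block is register-bound at 4 resident blocks.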

Things to test:
- DeviceStream allocator and proper handling of corner cases
- Debug flag/dict/whatever for entire project in general
- Iteration counters for IterThread

Things to benchmark:
- Kernel invocation and/or interrupt times (will high load freeze X?)
- 1D/2D texture load+interpolation speeds vs constant memory loading
- Must test under high SFU load
- Tex uses separate cache? Has lower bandwidth penalty for gather?
- MWC float conversion
- The entire scatter process
- Radix sort of writeback coordinates
- Log-copy-histogram approach
- Direct reductions
- Surface loads, stores, reductions
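
For the MWC float conversion item above, a small host-side reference sketch of
a multiply-with-carry step and one way to turn its output into a float in
[0, 1). The multiplier is just an example value and the names are hypothetical.
The device version would be generated code, so this is only something to check
results against.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
MULTIPLIER = 4294883355  # example multiplier (an assumption, pick per stream)


def mwc_next(state):
    """One multiply-with-carry step on a 64-bit state.

    Low 32 bits hold the value x, high 32 bits hold the carry c.
    The new state is a * x + c, which always fits in 64 bits.
    """
    x = state & MASK32
    carry = state >> 32
    return (MULTIPLIER * x + carry) & MASK64


def to_unit_float(state):
    """Map the low word to [0, 1) using only its top 24 bits.

    24 bits fit a single-precision mantissa exactly, so the same arithmetic
    on the device cannot round up to 1.0.
    """
    return ((state & MASK32) >> 8) * (1.0 / (1 << 24))


def sample(seed, n):
    """Generate n floats from a nonzero seed state."""
    state = seed & MASK64
    out = []
    for _ in range(n):
        state = mwc_next(state)
        out.append(to_unit_float(state))
    return out
```

A state of zero is a fixed point of mwc_next, so seeds should be nonzero. The
benchmark question is then whether the shift-and-scale conversion costs more
on the device than a plain integer-to-float convert followed by a multiply.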