mirror of https://github.com/stevenrobertson/cuburn.git, synced 2025-11-04 02:10:45 -05:00

Status: passes rudimentary tests

Current goals:

- Start xforms
    - xform selection, pre- and post-transform in xform
    - first of the variations
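
A rough Python sketch of what the xform goals above could look like on the host side, before any of it is moved into generated device code. Every name here (`XForm`, `select_xform`, `affine`, `linear`) is hypothetical and not taken from cuburn; the sketch assumes xforms are chosen with probability proportional to a weight, that pre- and post-transforms are 2x3 affine maps, and that the "first" variation is the identity-like `linear` one.

```python
from bisect import bisect_right
from itertools import accumulate

# Affine coefficients (a, b, c, d, e, f): x' = a*x + b*y + c, y' = d*x + e*y + f
IDENTITY = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)


def affine(coefs, x, y):
    """Apply a 2x3 affine map to the point (x, y)."""
    a, b, c, d, e, f = coefs
    return a * x + b * y + c, d * x + e * y + f


def linear(x, y):
    """Assumed 'first' variation: returns the point unchanged."""
    return x, y


class XForm:
    """Hypothetical xform: pre-affine, then variation, then post-affine."""

    def __init__(self, weight, pre=IDENTITY, post=IDENTITY, variation=linear):
        self.weight = weight
        self.pre = pre
        self.post = post
        self.variation = variation

    def apply(self, x, y):
        x, y = affine(self.pre, x, y)
        x, y = self.variation(x, y)
        return affine(self.post, x, y)


def select_xform(xforms, u):
    """Pick an xform with probability proportional to its weight.

    u is a uniform random number in [0, 1).
    """
    cumulative = list(accumulate(xf.weight for xf in xforms))
    idx = bisect_right(cumulative, u * cumulative[-1])
    return xforms[min(idx, len(xforms) - 1)]


xforms = [
    XForm(1.0, pre=(0.5, 0.0, 0.0, 0.0, 0.5, 0.0)),
    XForm(3.0, post=(1.0, 0.0, 1.0, 0.0, 1.0, 1.0)),
]
```

Selection takes the random number as an argument so the same routine can be driven by whatever generator the iteration threads end up using.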

Things to do (rather severely incomplete):

- LaunchContext thread distribution based on generated code register count and
  shared memory size
- qlocal storage
    - Performance implications of different state spaces
    - Shared / cache projected usage and its effect on above
    - Implement qlocal storage, and hide the complexity
- The `Feature` class
    - Transform count and per-transform code layout
    - Filter size, oversample, final buffer size
- Buffer allocation, clearing, reading from device
- Preview window
    - When/how to sample?
    - OpenGL interop worth it?
    - Implement
- Implement xforms
- Shuffle
    - State space implications, you know the drill
    - Implement
    - Test effects on quality by masking off writes on all but one lane and
      boosting the sample density to compensate (muuuuuch later on)
- DE (density estimation)
- Clean up code (particularly DSL stuff incl. injector)
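
For the LaunchContext item at the top of this list, a minimal Python sketch of how register count and shared memory size might bound the number of resident blocks per multiprocessor. The function name and all default limits are placeholders, not values for any particular device or anything from cuburn, and the model ignores allocation granularity.

```python
def blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block,
                  regs_per_sm=16384, smem_per_sm=16384,
                  max_threads_per_sm=1024, max_blocks_per_sm=8):
    """Estimate how many blocks fit on one multiprocessor at once.

    All device limits are placeholder defaults; a real launcher would
    query them from the device. Returns 0 if a single block cannot fit.
    """
    # Registers are consumed per thread, so a block needs regs * threads.
    by_regs = regs_per_sm // (regs_per_thread * threads_per_block)
    # Shared memory is consumed per block; zero usage imposes no limit.
    if smem_per_block:
        by_smem = smem_per_sm // smem_per_block
    else:
        by_smem = max_blocks_per_sm
    by_threads = max_threads_per_sm // threads_per_block
    return min(by_regs, by_smem, by_threads, max_blocks_per_sm)
```

With the placeholder limits, a 256-thread block using 20 registers per thread and 4096 bytes of shared memory is register-bound at 3 blocks, which is the sort of result the thread distribution would have to react to.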

Things to test:

- Debug flag/dict/whatever for entire project in general
    - Iteration counters for IterThread

Things to benchmark:

- Kernel invocation and/or interrupt times (will high load freeze X?)
- MWC float conversion
- The entire scatter process
    - Radix sort of writeback coordinates
    - Log-copy-histogram approach
    - Direct reductions
    - Surface loads, stores, reductions
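
As a host-side reference for the MWC float conversion item above, here is a small Python sketch of one multiply-with-carry step and one candidate integer-to-float conversion. The function names are hypothetical, the state layout (carry in the high 32 bits, value in the low 32 bits) is an assumption, and keeping the top 24 bits is only one of the conversions that could be benchmarked.

```python
MASK32 = 0xFFFFFFFF


def mwc_next(state, multiplier):
    """One multiply-with-carry step on a 64-bit state.

    The state is assumed to pack (carry << 32) | x. The low 32 bits of
    the returned state are the next output; the high 32 bits are the
    next carry.
    """
    x = state & MASK32
    carry = state >> 32
    return multiplier * x + carry


def u32_to_unit_float(u):
    """Map a 32-bit integer to [0, 1) using its top 24 bits.

    24 bits matches the precision of a single-precision mantissa, so
    the result is exactly representable as a float32.
    """
    return (u >> 8) * (1.0 / (1 << 24))
```

A device version would do the same arithmetic with 32-bit multiply-wide and shift instructions; this form is mainly useful for checking generated code against known values.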