0.4.1.5 Beta 11/28/2014

--User Changes
 Remove limit on the number of xforms allowable on the GPU. This was previously 21.
 Show the actual strips count to be used in parentheses next to the user-specified strips count on the final render dialog.
 Allow for adjustment of iteration depth and fuse count per ember and save/read these values with the xml.
 Iteration optimizations on both CPU and GPU.
 Automatically adjust default quality spinner value when using CPU/GPU to 10/30, respectively.
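
One way to carry the per-ember iteration depth and fuse count to an iterator is to bundle them in a small struct. The following is a minimal sketch, not the project's code: IterParams, EmberSketch, MakeIterParams(), TotalSteps(), all member names, and the default values are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bundle of iteration settings. Passing one struct instead of
// separate arguments lets fields be added later without changing signatures.
struct IterParams
{
    std::size_t m_Count; // Iterations to keep per sub batch.
    std::size_t m_Skip;  // Fuse: initial iterations to run and discard.
};

// Hypothetical stand-in for an ember that owns its own iteration settings.
// The default values are placeholders, not the project's defaults.
struct EmberSketch
{
    std::size_t m_SubBatchSize = 10240;
    std::size_t m_FuseCount = 15;
};

// Build the parameter struct from the values stored on the ember.
inline IterParams MakeIterParams(const EmberSketch& ember)
{
    return IterParams{ ember.m_SubBatchSize, ember.m_FuseCount };
}

// Toy consumer: the number of steps an iterator would run for one sub batch,
// counting both the discarded fuse steps and the kept iterations.
inline std::size_t TotalSteps(const IterParams& params)
{
    assert(params.m_Count > 0);
    return params.m_Skip + params.m_Count;
}
```

Because the values live on the ember in this sketch, reading or writing them alongside the rest of the ember's data requires no change to the renderer.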

--Bug Fixes
 Fix severe randomization bug with OpenCL.
 Fix undo list off-by-one error when doing a new edit anywhere but the end of the undo list.
 Make integer variation parameters use 4 decimal places in the variations list like all the others.
 New build of the latest Qt to fix scroll bar drawing bug.
 Prevent grid from showing as much when pressing control to increase a spinner's increment speed. Still shows sometimes, but better than before.

--Code Changes
 Pass count and fuse to iterator as a structure now to allow for passing more params in the future.
 Slightly different grid/block logic when running DE filtering on the GPU.
 Attempt a different way of doing DE, but #define it out because it ended up not being faster.
 Restructure some things to allow for a variable length xforms buffer to be passed to the GPU.
 Add sub batch size and fuse count as ember members, and remove them from the renderer classes.
 Remove m_LastPass from Renderer. It should have been removed with passes.
 Pass seeds as a buffer to the OpenCL iteration kernel, rather than a single seed that gets modified.
 Slight optimization on CPU accum.
 Use a case statement instead of if/else for xform choosing in OpenCL for a 2% speedup on params with large numbers of xforms.
 Add SizeOf() wrapper around sizeof(vec[0]) * vec.size().
 Remove LogScaleSum() functions from the CPU and GPU because they're no longer used since passes were removed.
 Make some OpenCLWrapper getters const.
 Better organize RendererCL methods that return grid dimensions.
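
The SizeOf() wrapper mentioned above could look roughly like this. It is a sketch based only on the one-line description in this message; the exact signature in the codebase is not shown here, and CopyBytes() is a hypothetical helper added to show where a byte count is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Byte size of a vector's contents: element size times element count.
// sizeof does not evaluate its operand, so vec[0] is never actually
// accessed and an empty vector is safe to pass.
template <typename T>
inline std::size_t SizeOf(const std::vector<T>& vec)
{
    return sizeof(vec[0]) * vec.size();
}

// Hypothetical use: copy a vector's raw bytes into a destination of known
// capacity, as when filling a device buffer. T must be trivially copyable.
template <typename T>
inline void CopyBytes(const std::vector<T>& src, void* dst, std::size_t dstCapacity)
{
    assert(SizeOf(src) <= dstCapacity);

    if (!src.empty())
        std::memcpy(dst, src.data(), SizeOf(src));
}
```

The point of such a wrapper is to avoid repeating the multiplication at every buffer write, where confusing an element count with a byte count is an easy mistake.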
mfeemster
2014-11-28 01:37:51 -08:00
parent 3f29025f99
commit b29bedec38
39 changed files with 905 additions and 392 deletions


@@ -42,14 +42,27 @@ public:
 //Non-virtual member functions for OpenCL specific tasks.
 bool Init(unsigned int platform, unsigned int device, bool shared, GLuint outputTexID);
 bool SetOutputTexture(GLuint outputTexID);
-inline unsigned int IterCountPerKernel();
-inline unsigned int IterBlocksWide();
-inline unsigned int IterBlocksHigh();
-inline unsigned int IterBlockWidth();
-inline unsigned int IterBlockHeight();
-inline unsigned int IterGridWidth();
-inline unsigned int IterGridHeight();
-inline unsigned int TotalIterKernelCount();
+//Iters per kernel/block/grid.
+inline unsigned int IterCountPerKernel() const;
+inline unsigned int IterCountPerBlock() const;
+inline unsigned int IterCountPerGrid() const;
+//Kernels per block.
+inline unsigned int IterBlockKernelWidth() const;
+inline unsigned int IterBlockKernelHeight() const;
+inline unsigned int IterBlockKernelCount() const;
+//Kernels per grid.
+inline unsigned int IterGridKernelWidth() const;
+inline unsigned int IterGridKernelHeight() const;
+inline unsigned int IterGridKernelCount() const;
+//Blocks per grid.
+inline unsigned int IterGridBlockWidth() const;
+inline unsigned int IterGridBlockHeight() const;
+inline unsigned int IterGridBlockCount() const;
 unsigned int PlatformIndex();
 unsigned int DeviceIndex();
 bool ReadHist();
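
The regrouped getters suggest a simple relationship between per-kernel, per-block and per-grid quantities. Their definitions are not part of this hunk, so the following is only a sketch of how such values commonly compose. IterLaunchDims, its fields and the formulas are assumptions, including the assumption that the grid is a whole number of blocks in each dimension.

```cpp
#include <cassert>

// Hypothetical launch geometry for the iteration kernel, measured in kernels
// (work items). Field names and formulas are illustrative assumptions.
struct IterLaunchDims
{
    unsigned int itersPerKernel;
    unsigned int blockKernelWidth;
    unsigned int blockKernelHeight;
    unsigned int gridKernelWidth;
    unsigned int gridKernelHeight;

    // Iters per kernel/block/grid.
    unsigned int IterCountPerKernel() const { return itersPerKernel; }
    unsigned int IterCountPerBlock() const { return IterCountPerKernel() * IterBlockKernelCount(); }
    unsigned int IterCountPerGrid() const { return IterCountPerKernel() * IterGridKernelCount(); }

    // Kernels per block.
    unsigned int IterBlockKernelCount() const { return blockKernelWidth * blockKernelHeight; }

    // Kernels per grid.
    unsigned int IterGridKernelCount() const { return gridKernelWidth * gridKernelHeight; }

    // Blocks per grid. Assumes the grid divides evenly into blocks.
    unsigned int IterGridBlockWidth() const
    {
        assert(gridKernelWidth % blockKernelWidth == 0);
        return gridKernelWidth / blockKernelWidth;
    }

    unsigned int IterGridBlockHeight() const
    {
        assert(gridKernelHeight % blockKernelHeight == 0);
        return gridKernelHeight / blockKernelHeight;
    }

    unsigned int IterGridBlockCount() const { return IterGridBlockWidth() * IterGridBlockHeight(); }
};
```

Under these assumptions, the iterations per grid equal the iterations per block times the number of blocks, which is a convenient consistency check when changing any one dimension.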
@@ -58,6 +71,9 @@ public:
 bool ClearHist();
 bool ClearAccum();
 bool WritePoints(vector<PointCL<T>>& vec);
+#ifdef TEST_CL
+bool WriteRandomPoints();
+#endif
 string IterKernel();
 //Virtual functions overridden from RendererCLBase.
@@ -98,7 +114,7 @@ private:
 eRenderStatus RunDensityFilter();
 eRenderStatus RunFinalAccum();
 bool ClearBuffer(const string& bufferName, unsigned int width, unsigned int height, unsigned int elementSize);
-bool RunDensityFilterPrivate(unsigned int kernelIndex, unsigned int gridW, unsigned int gridH, unsigned int blockW, unsigned int blockH, unsigned int chunkSizeW, unsigned int chunkSizeH, unsigned int rowParity, unsigned int colParity);
+bool RunDensityFilterPrivate(unsigned int kernelIndex, unsigned int gridW, unsigned int gridH, unsigned int blockW, unsigned int blockH, unsigned int chunkSizeW, unsigned int chunkSizeH, unsigned int chunkW, unsigned int chunkH);
 int MakeAndGetDensityFilterProgram(size_t ss, unsigned int filterWidth);
 int MakeAndGetFinalAccumProgram(T& alphaBase, T& alphaScale);
 int MakeAndGetGammaCorrectionProgram();
@@ -106,7 +122,7 @@ private:
 //Private functions passing data to OpenCL programs.
 DensityFilterCL<T> ConvertDensityFilter();
 SpatialFilterCL<T> ConvertSpatialFilter();
-EmberCL<T> ConvertEmber(Ember<T>& ember);
+void ConvertEmber(Ember<T>& ember, EmberCL<T>& emberCL, vector<XformCL<T>>& xformsCL);
 static CarToRasCL<T> ConvertCarToRas(const CarToRas<T>& carToRas);
 bool m_Init;
@@ -122,7 +138,9 @@ private:
 //Buffer names.
 string m_EmberBufferName;
+string m_XformsBufferName;
 string m_ParVarsBufferName;
+string m_SeedsBufferName;
 string m_DistBufferName;
 string m_CarToRasBufferName;
 string m_DEFilterParamsBufferName;
@@ -146,6 +164,8 @@ private:
 IMAGEGL2D m_AccumImage;
 GLuint m_OutputTexID;
 EmberCL<T> m_EmberCL;
+vector<XformCL<T>> m_XformsCL;
+vector<glm::highp_uvec2> m_Seeds;
 Palette<float> m_Dmap;//Used instead of the base class' m_Dmap because OpenCL only supports float textures.
 CarToRasCL<T> m_CarToRasCL;
 DensityFilterCL<T> m_DensityFilterCL;
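
The new m_Seeds member pairs with the commit note about passing seeds to the iteration kernel as a buffer instead of a single seed that gets modified. How the project fills that buffer is not shown in this diff. As a rough sketch under stated assumptions, a host-side fill with one two-word seed per kernel might look like the following, where Seed2 stands in for glm::highp_uvec2 so the example has no external dependency, and MakeSeeds() and its use of std::mt19937 are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Stand-in for glm::highp_uvec2: two unsigned 32-bit words per seed.
using Seed2 = std::array<std::uint32_t, 2>;

// Hypothetical helper: produce one independent seed per kernel from a single
// master seed, so each kernel starts its own random sequence.
inline std::vector<Seed2> MakeSeeds(std::size_t kernelCount, std::uint32_t masterSeed)
{
    assert(kernelCount > 0);
    std::mt19937 rng(masterSeed);
    std::vector<Seed2> seeds(kernelCount);

    for (auto& s : seeds)
    {
        s[0] = static_cast<std::uint32_t>(rng());
        s[1] = static_cast<std::uint32_t>(rng());
    }

    return seeds;
}
```

The resulting vector would then be written to the device buffer once per launch, with each kernel reading the element at its own global index.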