From 018ba26b5fdcf9d34c8e68fbdede08dc618246c8 Mon Sep 17 00:00:00 2001
From: mfeemster
Date: Sat, 12 Sep 2015 18:33:45 -0700
Subject: [PATCH] --User changes
 -Add support for multiple GPU devices.
  --These options are present in the command line and in Fractorium.
 -Change scheme of specifying devices from platform,device to just total device index.
  --Single number on the command line.
  --Change from combo boxes for device selection to a table of all devices in Fractorium.
 -Temporal samples now default to 100 instead of 1000, which was needless overkill.

--Bug fixes
 -EmberAnimate, EmberRender, FractoriumSettings, FinalRenderDialog: Fix wrong order of arguments to Clamp() when assigning thread priority.
 -VariationsDC.h: Fix NVidia OpenCL compilation error in DCTriangleVariation.
 -FractoriumXformsColor.cpp: Checking for null pixmap pointer is not enough, must also check if the underlying buffer is null via call to QPixmap::isNull().

--Code changes
 -Ember.h: Add case for FLAME_MOTION_NONE and default in ApplyFlameMotion().
 -EmberMotion.h: Call base constructor.
 -EmberPch.h: #pragma once only on Windows.
 -EmberToXml.h:
  --Handle different types of exceptions.
  --Add default cases to ToString().
 -Isaac.h: Remove unused variable in constructor.
 -Point.h: Call base constructor in Color().
 -Renderer.h/cpp:
  --Add bool to Alloc() to only allocate memory for the histogram. Needed for multi-GPU.
  --Make CoordMap() return a const ref, not a pointer.
 -SheepTools.h:
  --Use 64-bit types like the rest of the code already does.
  --Fix some comment misspellings.
 -Timing.h: Make BeginTime(), EndTime(), ElapsedTime() and Format() be const functions.
 -Utils.h:
  --Add new functions Equal() and Split().
  --Handle more exception types in ReadFile().
  --Get rid of most legacy blending of C and C++ argument parsing.
 -XmlToEmber.h:
  --Get rid of most legacy blending of C and C++ code from flam3.
  --Remove some unused variables.
 -EmberAnimate:
  --Support multi-GPU processing that alternates full frames between devices.
  --Use OpenCLInfo instead of OpenCLWrapper for --openclinfo option.
  --Remove bucketT template parameter, and hard code float in its place.
  --If a render fails, exit since there is no point in continuing an animation with a missing frame.
  --Pass variables to threaded save better, which most likely fixes a very subtle bug that existed before.
  --Remove some unused variables.
 -EmberGenome, EmberRender:
  --Support multi-GPU processing that alternates full frames between devices.
  --Use OpenCLInfo instead of OpenCLWrapper for --openclinfo option.
  --Remove bucketT template parameter, and hard code float in its place.
 -EmberRender:
  --Support multi-GPU processing that alternates full frames between devices.
  --Use OpenCLInfo instead of OpenCLWrapper for --openclinfo option.
  --Remove bucketT template parameter, and hard code float in its place.
  --Only print values when not rendering with OpenCL, since they're always 0 in that case.
 -EmberCLPch.h:
  --#pragma once only on Windows.
  --#include .
 -IterOpenCLKernelCreator.h: Add new kernel for summing two histograms. This is needed for multi-GPU.
 -OpenCLWrapper.h:
  --Move all OpenCL info related code into its own class OpenCLInfo.
  --Add members to cache the values of global memory size and max allocation size.
 -RendererCL.h/cpp:
  --Redesign to accommodate multi-GPU.
  --Constructor now takes a vector of devices.
  --Remove DumpErrorReport() function, it's handled in the base.
  --ClearBuffer(), ReadPoints(), WritePoints(), ReadHist() and WriteHist() now optionally take a device index as a parameter.
  --MakeDmap() override and m_DmapCL member removed because it no longer applies since the histogram is always float since the last commit.
  --Add new function SumDeviceHist() to sum histograms from two devices by first copying to a temporary on the host, then a temporary on the device, then summing.
  --m_Calls member removed, as it's now per-device.
  --OpenCLWrapper removed.
  --m_Seeds member is now a vector of vector of seeds, to accommodate a separate and different array of seeds for each device.
  --Added member m_Devices, a vector of unique_ptr of RendererCLDevice.
 -EmberCommon.h:
  --Added Devices() function to convert from a vector of device indices to a vector of platform,device indices.
  --Changed CreateRenderer() to accept a vector of devices to create a single RendererCL which will split work across multiple devices.
  --Added CreateRenderers() function to accept a vector of devices to create multiple RendererCL, each of which renders on a single device.
  --Add more comments to some existing functions.
 -EmberCommonPch.h: #pragma once only on Windows.
 -EmberOptions.h:
  --Remove --platform option, it's just sequential device number now with the --device option.
  --Make --out be OPT_USE_RENDER instead of OPT_RENDER_ANIM since it's an error condition when animating. It makes no sense to write all frames to a single image.
  --Add Devices() function to parse comma separated --device option string and return a vector of device indices.
  --Make int and uint types be 64-bit, so intmax_t and size_t.
  --Make better use of macros.
 -JpegUtils.h: Make string parameters to WriteJpeg() and WritePng() be const ref.
 -All project files: Turn off buffer security check option in Visual Studio (/GS-)
 -deployment.pri: Remove the line OTHER_FILES +=, it's pointless and was causing problems.
 -Ember.pro, EmberCL.pro: Add CONFIG += plugin, otherwise it wouldn't link.
 -EmberCL.pro: Add new files for multi-GPU support.
 -build_all.sh: use -j4 and QMAKE=${QMAKE:/usr/bin/qmake}
 -shared_settings.pri:
  --Add version string.
  --Remove old DESTDIR definitions.
  --Add the following lines or else nothing would build:
    CONFIG(release, debug|release) {
      CONFIG += warn_off
      DESTDIR = ../../../Bin/release
    }
    CONFIG(debug, debug|release) {
      DESTDIR = ../../../Bin/debug
    }
    QMAKE_POST_LINK += $$quote(cp --update ../../../Data/flam3-palettes.xml $${DESTDIR}$$escape_expand(\n\t))
    LIBS += -L/usr/lib -lpthread
 -AboutDialog.ui: Another futile attempt to make it look correct on Linux.
 -FinalRenderDialog.h/cpp:
  --Add support for multi-GPU.
  --Change from combo boxes for device selection to a table of all devices.
  --Ensure device selection makes sense.
  --Remove "FinalRender" prefix of various function names, it's implied given the context.
 -FinalRenderEmberController.h/cpp:
  --Add support for multi-GPU.
  --Change m_FinishedImageCount to be atomic.
  --Move CancelRender() from the base to FinalRenderEmberController.
  --Refactor RenderComplete() to omit any progress related functionality or image saving since it can potentially be run in a thread.
  --Consolidate setting various renderer fields into SyncGuiToRenderer().
 -Fractorium.cpp: Allow for resizing of the options dialog to show the entire device table.
 -FractoriumCommon.h: Add various functions to handle a table showing the available OpenCL devices on the system.
 -FractoriumEmberController.h/cpp: Remove m_FinalImageIndex, it's no longer needed.
 -FractoriumRender.cpp: Scale the interactive sub batch count and quality by the number of devices used.
 -FractoriumSettings.h/cpp:
  --Temporal samples now default to 100 instead of 1000, which was needless overkill.
  --Add multi-GPU support, remove old device,platform pair.
 -FractoriumToolbar.cpp: Disable OpenCL toolbar button if there are no devices present on the system.
 -FractoriumOptionsDialog.h/cpp:
  --Add support for multi-GPU.
  --Consolidate more assignments in DataToGui().
  --Enable/disable CPU/OpenCL items in response to OpenCL checkbox event.
 -Misc: Convert almost everything to size_t for unsigned, intmax_t for signed.
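
For illustration only, the comma separated --device handling and the per-frame device alternation described above could be sketched roughly as follows. This is not the code from EmberOptions.h, EmberCommon.h or EmberAnimate.cpp: ParseDeviceIndices() and DeviceForFrame() are hypothetical names, and the round-robin frame assignment is an assumption about what "alternates full frames between devices" means.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not the Devices() function from EmberOptions.h).
// Parses a comma separated list of device indices such as "0,2,3".
// Tokens that are empty or contain anything other than digits are skipped.
std::vector<std::size_t> ParseDeviceIndices(const std::string& arg)
{
    std::vector<std::size_t> devices;
    std::istringstream stream(arg);
    std::string token;

    while (std::getline(stream, token, ','))
    {
        const bool allDigits = !token.empty() &&
            std::all_of(token.begin(), token.end(),
                [](unsigned char c) { return std::isdigit(c) != 0; });

        if (!allDigits)
            continue;

        try
        {
            devices.push_back(static_cast<std::size_t>(std::stoull(token)));
        }
        catch (const std::out_of_range&)
        {
            // Index too large to represent, skip it.
        }
    }

    return devices;
}

// Hypothetical helper: picks the device that renders a given frame,
// assuming frames are handed out to the selected devices round-robin.
std::size_t DeviceForFrame(std::size_t frame, const std::vector<std::size_t>& devices)
{
    if (devices.empty())
        throw std::invalid_argument("no devices selected");

    return devices[frame % devices.size()];
}
```

With devices parsed from "0,2,3", frames 0, 1, 2, 3 would go to devices 0, 2, 3, 0 under this assumption. Tokens with surrounding whitespace are rejected here to keep the sketch short; a real parser might trim them instead.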
---
 Builds/MSVC/VS2013/Ember.vcxproj              |    1 +
 Builds/MSVC/VS2013/EmberAnimate.vcxproj       |    1 +
 Builds/MSVC/VS2013/EmberCL.vcxproj            |    5 +
 Builds/MSVC/VS2013/EmberCL.vcxproj.filters    |   12 +
 Builds/MSVC/VS2013/EmberGenome.vcxproj        |    1 +
 Builds/MSVC/VS2013/EmberRender.vcxproj        |    1 +
 Builds/MSVC/VS2013/EmberTester.vcxproj        |    1 +
 Builds/MSVC/VS2013/Fractorium.vcxproj         |    1 +
 Builds/QtCreator/Ember/Ember.pro              |    1 +
 Builds/QtCreator/Ember/deployment.pri         |    3 -
 Builds/QtCreator/EmberCL/EmberCL.pro          |   33 +-
 Builds/QtCreator/build_all.sh                 |    4 +-
 Builds/QtCreator/shared_settings.pri          |   18 +-
 Source/Ember/Affine2D.cpp                     |    4 +-
 Source/Ember/Curves.h                         |   14 +-
 Source/Ember/Ember.h                          |   12 +-
 Source/Ember/EmberMotion.h                    |    5 +-
 Source/Ember/EmberPch.h                       |    4 +-
 Source/Ember/EmberToXml.h                     |   24 +-
 Source/Ember/Isaac.h                          |   10 +-
 Source/Ember/Palette.h                        |   18 +-
 Source/Ember/PaletteList.h                    |   10 +-
 Source/Ember/Point.h                          |    1 +
 Source/Ember/Renderer.cpp                     |   17 +-
 Source/Ember/Renderer.h                       |   10 +-
 Source/Ember/RendererBase.h                   |    2 +-
 Source/Ember/SheepTools.h                     |   38 +-
 Source/Ember/Timing.h                         |    8 +-
 Source/Ember/Utils.h                          |  144 +--
 Source/Ember/Variation.h                      |    4 +-
 Source/Ember/VariationList.h                  |   12 +-
 Source/Ember/Variations03.h                   |    4 +-
 Source/Ember/Variations04.h                   |    7 +-
 Source/Ember/Variations05.h                   |   22 +-
 Source/Ember/VariationsDC.h                   |    2 +-
 Source/Ember/XmlToEmber.h                     |  231 ++--
 Source/EmberAnimate/EmberAnimate.cpp          |  351 ++---
 Source/EmberCL/EmberCLPch.h                   |    5 +-
 Source/EmberCL/IterOpenCLKernelCreator.cpp    |   28 +
 Source/EmberCL/IterOpenCLKernelCreator.h      |   19 +-
 Source/EmberCL/OpenCLInfo.cpp                 |  406 ++++++
 Source/EmberCL/OpenCLInfo.h                   |   69 +
 Source/EmberCL/OpenCLWrapper.cpp              |  472 ++-----
 Source/EmberCL/OpenCLWrapper.h                |   72 +-
 Source/EmberCL/RendererCL.cpp                 | 1135 ++++++++++-------
 Source/EmberCL/RendererCL.h                   |   79 +-
 Source/EmberCL/RendererClDevice.cpp           |   60 +
 Source/EmberCL/RendererClDevice.h             |   42 +
 Source/EmberCommon/EmberCommon.h              |  165 ++-
 Source/EmberCommon/EmberCommonPch.h           |    4 +-
 Source/EmberCommon/EmberOptions.h             |  244 ++--
 Source/EmberCommon/JpegUtils.h                |    4 +-
 Source/EmberCommon/SimpleGlob.h               |    2 +-
 Source/EmberGenome/EmberGenome.cpp            |   34 +-
 Source/EmberRender/EmberRender.cpp            |   35 +-
 Source/EmberTester/EmberTester.cpp            |   52 +-
 Source/Fractorium/AboutDialog.ui              |  431 ++++---
 Source/Fractorium/CurvesGraphicsView.cpp      |    4 +-
 Source/Fractorium/DoubleSpinBox.h             |    2 +-
 Source/Fractorium/FinalRenderDialog.cpp       |  207 +--
 Source/Fractorium/FinalRenderDialog.h         |   21 +-
 Source/Fractorium/FinalRenderDialog.ui        |   96 +-
 .../Fractorium/FinalRenderEmberController.cpp |  502 +++++---
 .../Fractorium/FinalRenderEmberController.h   |   28 +-
 Source/Fractorium/Fractorium.cpp              |   10 +-
 Source/Fractorium/Fractorium.h                |    4 +-
 Source/Fractorium/FractoriumCommon.h          |  173 ++-
 .../Fractorium/FractoriumEmberController.cpp  |    4 +-
 Source/Fractorium/FractoriumEmberController.h |   36 +-
 Source/Fractorium/FractoriumInfo.cpp          |    2 +-
 Source/Fractorium/FractoriumLibrary.cpp       |    2 +-
 Source/Fractorium/FractoriumMenus.cpp         |   38 +-
 Source/Fractorium/FractoriumPalette.cpp       |    2 +-
 Source/Fractorium/FractoriumRender.cpp        |   76 +-
 Source/Fractorium/FractoriumSettings.cpp      |  257 ++--
 Source/Fractorium/FractoriumSettings.h        |   30 +-
 Source/Fractorium/FractoriumToolbar.cpp       |    9 +-
 Source/Fractorium/FractoriumXformsColor.cpp   |    6 +-
 .../Fractorium/FractoriumXformsVariations.cpp |   10 +-
 Source/Fractorium/GLEmberController.cpp       |    6 +-
 Source/Fractorium/GLWidget.cpp                |    8 +-
 Source/Fractorium/Main.cpp                    |    4 +
 Source/Fractorium/OptionsDialog.cpp           |  162 ++-
 Source/Fractorium/OptionsDialog.h             |    7 +-
 Source/Fractorium/OptionsDialog.ui            |  286 +++--
 85 files changed, 3869 insertions(+), 2517 deletions(-)
 create mode 100644 Source/EmberCL/OpenCLInfo.cpp
 create mode 100644 Source/EmberCL/OpenCLInfo.h
 create mode 100644 Source/EmberCL/RendererClDevice.cpp
 create mode 100644 Source/EmberCL/RendererClDevice.h

diff --git a/Builds/MSVC/VS2013/Ember.vcxproj b/Builds/MSVC/VS2013/Ember.vcxproj
index 25d86c3..7576e09 100644
--- a/Builds/MSVC/VS2013/Ember.vcxproj
+++ 
b/Builds/MSVC/VS2013/Ember.vcxproj @@ -232,6 +232,7 @@ Precise true false + false Windows diff --git a/Builds/MSVC/VS2013/EmberAnimate.vcxproj b/Builds/MSVC/VS2013/EmberAnimate.vcxproj index 9b36f5a..704de76 100644 --- a/Builds/MSVC/VS2013/EmberAnimate.vcxproj +++ b/Builds/MSVC/VS2013/EmberAnimate.vcxproj @@ -250,6 +250,7 @@ xcopy /F /Y /R /D "$(SolutionDir)..\..\..\Data\flam3-palettes.xml" "$(OutDir)"true true false + false Console diff --git a/Builds/MSVC/VS2013/EmberCL.vcxproj b/Builds/MSVC/VS2013/EmberCL.vcxproj index 6787428..659f9d7 100644 --- a/Builds/MSVC/VS2013/EmberCL.vcxproj +++ b/Builds/MSVC/VS2013/EmberCL.vcxproj @@ -235,6 +235,7 @@ true false true + false Windows @@ -298,8 +299,10 @@ + + @@ -307,9 +310,11 @@ + + diff --git a/Builds/MSVC/VS2013/EmberCL.vcxproj.filters b/Builds/MSVC/VS2013/EmberCL.vcxproj.filters index 7db175f..45bf595 100644 --- a/Builds/MSVC/VS2013/EmberCL.vcxproj.filters +++ b/Builds/MSVC/VS2013/EmberCL.vcxproj.filters @@ -36,6 +36,12 @@ Kernel Creators + + Source Files + + + Source Files + @@ -62,5 +68,11 @@ Kernel Creators + + Header Files + + + Header Files + \ No newline at end of file diff --git a/Builds/MSVC/VS2013/EmberGenome.vcxproj b/Builds/MSVC/VS2013/EmberGenome.vcxproj index 92d2585..e99eb28 100644 --- a/Builds/MSVC/VS2013/EmberGenome.vcxproj +++ b/Builds/MSVC/VS2013/EmberGenome.vcxproj @@ -250,6 +250,7 @@ xcopy /F /Y /R /D "$(SolutionDir)..\..\..\Data\flam3-palettes.xml" "$(OutDir)"true true false + false Console diff --git a/Builds/MSVC/VS2013/EmberRender.vcxproj b/Builds/MSVC/VS2013/EmberRender.vcxproj index 7b32cf8..36d597f 100644 --- a/Builds/MSVC/VS2013/EmberRender.vcxproj +++ b/Builds/MSVC/VS2013/EmberRender.vcxproj @@ -251,6 +251,7 @@ xcopy /F /Y /R /D "$(SolutionDir)..\..\..\Data\flam3-palettes.xml" "$(OutDir)"true false true + false Console diff --git a/Builds/MSVC/VS2013/EmberTester.vcxproj b/Builds/MSVC/VS2013/EmberTester.vcxproj index b51d7c5..fdf9396 100644 --- a/Builds/MSVC/VS2013/EmberTester.vcxproj +++ 
b/Builds/MSVC/VS2013/EmberTester.vcxproj @@ -250,6 +250,7 @@ xcopy /F /Y /R /D "$(SolutionDir)..\..\..\Data\flam3-palettes.xml" "$(OutDir)"true true false + false Console diff --git a/Builds/MSVC/VS2013/Fractorium.vcxproj b/Builds/MSVC/VS2013/Fractorium.vcxproj index 5170ca0..1656e9b 100644 --- a/Builds/MSVC/VS2013/Fractorium.vcxproj +++ b/Builds/MSVC/VS2013/Fractorium.vcxproj @@ -246,6 +246,7 @@ xcopy /F /Y /R /D "$(SolutionDir)..\..\..\Data\flam3-palettes.xml" "$(OutDir)"true false /bigobj -Zm150 %(AdditionalOptions) + false Windows diff --git a/Builds/QtCreator/Ember/Ember.pro b/Builds/QtCreator/Ember/Ember.pro index f4da780..df6ed2f 100644 --- a/Builds/QtCreator/Ember/Ember.pro +++ b/Builds/QtCreator/Ember/Ember.pro @@ -1,4 +1,5 @@ TEMPLATE = lib +CONFIG += plugin CONFIG += shared CONFIG -= app_bundle CONFIG -= qt diff --git a/Builds/QtCreator/Ember/deployment.pri b/Builds/QtCreator/Ember/deployment.pri index f9403df..6376a82 100644 --- a/Builds/QtCreator/Ember/deployment.pri +++ b/Builds/QtCreator/Ember/deployment.pri @@ -188,6 +188,3 @@ export (DEPLOYMENT) export (LIBS) export (QMAKE_EXTRA_TARGETS) } - -OTHER_FILES += - diff --git a/Builds/QtCreator/EmberCL/EmberCL.pro b/Builds/QtCreator/EmberCL/EmberCL.pro index d0dbb39..24277d3 100644 --- a/Builds/QtCreator/EmberCL/EmberCL.pro +++ b/Builds/QtCreator/EmberCL/EmberCL.pro @@ -1,4 +1,5 @@ TEMPLATE = lib +CONFIG += plugin CONFIG += shared CONFIG -= app_bundle CONFIG -= qt @@ -14,23 +15,27 @@ QMAKE_CXXFLAGS += -D_CONSOLE QMAKE_CXXFLAGS += -BUILDING_EMBERCL SOURCES += \ - ../../../Source/EmberCL/DllMain.cpp \ - ../../../Source/EmberCL/FinalAccumOpenCLKernelCreator.cpp \ - ../../../Source/EmberCL/IterOpenCLKernelCreator.cpp \ - ../../../Source/EmberCL/OpenCLWrapper.cpp \ - ../../../Source/EmberCL/RendererCL.cpp \ - ../../../Source/EmberCL/DEOpenCLKernelCreator.cpp + ../../../Source/EmberCL/DllMain.cpp \ + ../../../Source/EmberCL/DEOpenCLKernelCreator.cpp \ + ../../../Source/EmberCL/FinalAccumOpenCLKernelCreator.cpp 
\ + ../../../Source/EmberCL/IterOpenCLKernelCreator.cpp \ + ../../../Source/EmberCL/OpenCLInfo.cpp \ + ../../../Source/EmberCL/OpenCLWrapper.cpp \ + ../../../Source/EmberCL/RendererCL.cpp \ + ../../../Source/EmberCL/RendererCLDevice.cpp include(deployment.pri) qtcAddDeployment() HEADERS += \ - ../../../Source/EmberCL/DEOpenCLKernelCreator.h \ - ../../../Source/EmberCL/EmberCLFunctions.h \ - ../../../Source/EmberCL/EmberCLPch.h \ - ../../../Source/EmberCL/EmberCLStructs.h \ - ../../../Source/EmberCL/FinalAccumOpenCLKernelCreator.h \ - ../../../Source/EmberCL/IterOpenCLKernelCreator.h \ - ../../../Source/EmberCL/OpenCLWrapper.h \ - ../../../Source/EmberCL/RendererCL.h + ../../../Source/EmberCL/DEOpenCLKernelCreator.h \ + ../../../Source/EmberCL/EmberCLFunctions.h \ + ../../../Source/EmberCL/EmberCLPch.h \ + ../../../Source/EmberCL/EmberCLStructs.h \ + ../../../Source/EmberCL/FinalAccumOpenCLKernelCreator.h \ + ../../../Source/EmberCL/IterOpenCLKernelCreator.h \ + ../../../Source/EmberCL/OpenCLInfo.h \ + ../../../Source/EmberCL/OpenCLWrapper.h \ + ../../../Source/EmberCL/RendererCL.h \ + ../../../Source/EmberCL/RendererCLDevice.h diff --git a/Builds/QtCreator/build_all.sh b/Builds/QtCreator/build_all.sh index a71eb9e..11af667 100644 --- a/Builds/QtCreator/build_all.sh +++ b/Builds/QtCreator/build_all.sh @@ -3,8 +3,8 @@ REBUILD='' NVIDIA='' NATIVE='' -CONCURRENCY='-j9' -QMAKE=${QMAKE:-qmake} +CONCURRENCY='-j4' +QMAKE=${QMAKE:/usr/bin/qmake} RELEASE='CONFIG+=release CONFIG-=debug' while test $# -gt 0 diff --git a/Builds/QtCreator/shared_settings.pri b/Builds/QtCreator/shared_settings.pri index b5dc4ce..e2493ed 100644 --- a/Builds/QtCreator/shared_settings.pri +++ b/Builds/QtCreator/shared_settings.pri @@ -1,5 +1,15 @@ -CONFIG += warn_off -VERSION = 0.1.4.7 +VERSION = 0.1.4.9 + +CONFIG(release, debug|release) { + CONFIG += warn_off + DESTDIR = ../../../Bin/release +} + +CONFIG(debug, debug|release) { + DESTDIR = ../../../Bin/debug +} + +QMAKE_POST_LINK += $$quote(cp 
--update ../../../Data/flam3-palettes.xml $${DESTDIR}$$escape_expand(\n\t)) macx { LIBS += -framework OpenGL @@ -33,9 +43,6 @@ native { QMAKE_CXXFLAGS += -march=k8 } -release:DESTDIR = ../../../release -debug:DESTDIR = ../../../debug - OBJECTS_DIR = $$DESTDIR/.obj MOC_DIR = $$DESTDIR/.moc RCC_DIR = $$DESTDIR/.qrc @@ -44,6 +51,7 @@ UI_DIR = $$DESTDIR/.ui LIBS += -L/usr/lib -ljpeg LIBS += -L/usr/lib -lpng LIBS += -L/usr/lib -ltbb +LIBS += -L/usr/lib -lpthread LIBS += -L/usr/lib/x86_64-linux-gnu -lxml2 CMAKE_CXXFLAGS += -DCL_USE_DEPRECATED_OPENCL_1_1_APIS diff --git a/Source/Ember/Affine2D.cpp b/Source/Ember/Affine2D.cpp index b02839a..1d15e3c 100644 --- a/Source/Ember/Affine2D.cpp +++ b/Source/Ember/Affine2D.cpp @@ -293,7 +293,7 @@ template typename m4T Affine2D::ToMat4ColMajor(bool center) const { m4T mat(A(), B(), 0, center ? 0 : C(), //Col0... - D(), E(), 0, center ? 0 : F(), //1 + D(), E(), 0, center ? 0 : F(), //1 0, 0, 1, 0, //2 0, 0, 0, 1);//3 @@ -309,7 +309,7 @@ template typename m4T Affine2D::ToMat4RowMajor(bool center) const { m4T mat(A(), D(), 0, 0, - B(), E(), 0, 0, + B(), E(), 0, 0, 0, 0, 1, 0, center ? 0 : C(), center ? 
0 : F(), 0, 1); diff --git a/Source/Ember/Curves.h b/Source/Ember/Curves.h index 391ad95..0323dba 100644 --- a/Source/Ember/Curves.h +++ b/Source/Ember/Curves.h @@ -69,7 +69,7 @@ public: template Curves& operator = (const Curves& curves) { - for (uint i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { m_Points[i][0].x = T(curves.m_Points[i][0].x); m_Points[i][0].y = T(curves.m_Points[i][0].y); m_Weights[i].x = T(curves.m_Weights[i].x); m_Points[i][1].x = T(curves.m_Points[i][1].x); m_Points[i][1].y = T(curves.m_Points[i][1].y); m_Weights[i].y = T(curves.m_Weights[i].y); @@ -87,7 +87,7 @@ public: /// Reference to updated self Curves& operator += (const Curves& curves) { - for (uint i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { m_Points[i][0] += curves.m_Points[i][0]; m_Points[i][1] += curves.m_Points[i][1]; @@ -107,7 +107,7 @@ public: /// Reference to updated self Curves& operator *= (const Curves& curves) { - for (uint i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { m_Points[i][0] *= curves.m_Points[i][0]; m_Points[i][1] *= curves.m_Points[i][1]; @@ -127,7 +127,7 @@ public: /// Reference to updated self Curves& operator *= (const T& t) { - for (uint i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { m_Points[i][0] *= t; m_Points[i][1] *= t; @@ -145,7 +145,7 @@ public: /// void Init() { - for (uint i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { m_Points[i][0] = v2T(0);//0,0 -> 0,0 -> 1,1 -> 1,1. 
m_Points[i][1] = v2T(0); @@ -173,7 +173,7 @@ public: { bool set = false; - for (uint i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { if ((m_Points[i][0] != v2T(0)) || (m_Points[i][1] != v2T(0)) || @@ -257,7 +257,7 @@ Curves operator * (const Curves& curves, const T& t) { Curves c(curves); - for (uint i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { c.m_Points[i][0] *= t; c.m_Points[i][1] *= t; diff --git a/Source/Ember/Ember.h b/Source/Ember/Ember.h index ccaacaf..abb7238 100644 --- a/Source/Ember/Ember.h +++ b/Source/Ember/Ember.h @@ -1075,14 +1075,15 @@ public: /// /// The type of symmetry to add /// The random context to use for generating random symmetry - void AddSymmetry(int sym, QTIsaac& rand) + void AddSymmetry(intmax_t sym, QTIsaac& rand) { - size_t i, k, result = 0; + intmax_t k; + size_t i, result = 0; T a; if (sym == 0) { - static int symDistrib[] = { + static intmax_t symDistrib[] = { -4, -3, -2, -2, -2, -1, -1, -1, @@ -1356,6 +1357,9 @@ public: case FLAME_MOTION_VIBRANCY: APP_FMP(m_Vibrancy); break; + case FLAME_MOTION_NONE: + default: + break; } } } @@ -1581,7 +1585,7 @@ public: //Whether or not any symmetry was added. This field is in a bit of a state of conflict right now as flam3 has a severe bug. //Xml field: "symmetry". - int m_Symmetry; + intmax_t m_Symmetry; //The number of iterations per pixel of the final output image. Note this is not affected by the increase in pixels in the //histogram and DE filtering buffer due to supersampling. It can be affected by a non-zero zoom value though. 
diff --git a/Source/Ember/EmberMotion.h b/Source/Ember/EmberMotion.h index f3ae001..f672ec1 100644 --- a/Source/Ember/EmberMotion.h +++ b/Source/Ember/EmberMotion.h @@ -36,6 +36,7 @@ public: /// /// The MotionParam object to copy MotionParam(const MotionParam& other) + : pair () { operator=(other); } @@ -70,8 +71,8 @@ public: template MotionParam &operator = (const MotionParam& other) { - this->first = other.first; - this->second = T(other.second); + this->first = other.first; + this->second = T(other.second); return *this; } diff --git a/Source/Ember/EmberPch.h b/Source/Ember/EmberPch.h index 650afbe..3952f53 100644 --- a/Source/Ember/EmberPch.h +++ b/Source/Ember/EmberPch.h @@ -1,4 +1,6 @@ -#pragma once +#ifdef WIN32 + #pragma once +#endif /// /// Precompiled header file. Place all system includes here with appropriate #defines for different operating systems and compilers. diff --git a/Source/Ember/EmberToXml.h b/Source/Ember/EmberToXml.h index 37a0301..b288126 100644 --- a/Source/Ember/EmberToXml.h +++ b/Source/Ember/EmberToXml.h @@ -104,13 +104,19 @@ public: b = false; } } - catch (...) + catch (const std::exception& e) { - if (f.is_open()) - f.close(); - + cout << "Error: Writing flame " << filename << " failed: " << e.what() << endl; b = false; } + catch (...) + { + cout << "Error: Writing flame " << filename << " failed." << endl; + b = false; + } + + if (f.is_open()) + f.close(); return b; } @@ -125,7 +131,7 @@ public: /// If true use integers instead of floating point numbers when embedding a non-hex formatted palette, else use floating point numbers. /// If true, embed a hexadecimal palette instead of Xml Color tags, else use Xml color tags. 
/// The Xml string representation of the passed in ember - string ToString(Ember& ember, string extraAttributes, size_t printEditDepth, bool doEdits, bool intPalette, bool hexPalette = true) + string ToString(Ember& ember, const string& extraAttributes, size_t printEditDepth, bool doEdits, bool intPalette, bool hexPalette = true) { size_t i, j; string s; @@ -313,7 +319,7 @@ public: /// The sheep generation used if > 0. Default: 0. /// The sheep id used if > 0. Default: 0. /// - xmlDocPtr CreateNewEditdoc(Ember* parent0, Ember* parent1, string action, string nick, string url, string id, string comment, int sheepGen = 0, int sheepId = 0) + xmlDocPtr CreateNewEditdoc(Ember* parent0, Ember* parent1, const string& action, const string& nick, const string& url, const string& id, const string& comment, intmax_t sheepGen = 0, intmax_t sheepId = 0) { char timeString[128]; time_t myTime; @@ -750,6 +756,7 @@ private: case MOTION_TRIANGLE: os << "\"triangle\""; break; + default: case MOTION_SAW: os << "\"saw\""; break; @@ -761,7 +768,7 @@ private: T cx = 0.0; T cy = 0.0; - for (int i = 0; i < motion.m_MotionParams.size(); ++i) + for (size_t i = 0; i < motion.m_MotionParams.size(); ++i) { switch(motion.m_MotionParams[i].first) { @@ -816,6 +823,9 @@ private: case FLAME_MOTION_VIBRANCY: os << " vibrancy=\"" << motion.m_MotionParams[i].second << "\""; break; + case FLAME_MOTION_NONE: + default: + break; } } diff --git a/Source/Ember/Isaac.h b/Source/Ember/Isaac.h index 2e4de38..9da06ee 100644 --- a/Source/Ember/Isaac.h +++ b/Source/Ember/Isaac.h @@ -57,7 +57,7 @@ class EMBER_API QTIsaac public: enum { N = (1 << ALPHA) }; UintBytes m_Cache; - int m_LastIndex; + size_t m_LastIndex; /// /// Global ISAAC RNG to be used from anywhere. This is not thread safe, so take caution to only @@ -94,7 +94,7 @@ public: Srand(a, b, c, s); m_LastIndex = 0; m_Cache.Uint = Rand(); - T temp = RandByte();//Need to call at least once so other libraries can link. 
+ RandByte();//Need to call at least once so other libraries can link. } /// @@ -291,12 +291,12 @@ public: { if (s == nullptr)//Default to using time plus index as the seed if s was nullptr. { - for (int i = 0; i < N; i++) - m_Rc.randrsl[i] = static_cast(NowMs()) + i; + for (size_t i = 0; i < N; i++) + m_Rc.randrsl[i] = static_cast(NowMs() + i); } else { - for (int i = 0; i < N; i++) + for (size_t i = 0; i < N; i++) m_Rc.randrsl[i] = s[i]; } diff --git a/Source/Ember/Palette.h b/Source/Ember/Palette.h index f339649..713ca3e 100644 --- a/Source/Ember/Palette.h +++ b/Source/Ember/Palette.h @@ -39,7 +39,7 @@ public: /// The index in the palette file /// The size of the palette which should be 256 /// A pointer to 256 color entries - Palette(const string& name, int index, uint size, v4T* xmlPaletteEntries) + Palette(const string& name, int index, size_t size, v4T* xmlPaletteEntries) { m_Name = name; m_Index = index; @@ -86,7 +86,7 @@ public: 0x00, 0x81, 0x96, 0x8d, 0x00, 0x81, 0x9a, 0x8d, 0x00, 0x85, 0x9a, 0x8d, 0x00, 0x89, 0x9e, 0x8d, 0x00, 0x89, 0x9e, 0x8d, 0x00, 0x8d, 0xa2, 0x97, 0x00, 0x95, 0xa2, 0x97, 0x00, 0x8d, 0xa2, 0x97, 0x00, 0x96, 0xa6, 0x8d, 0x00, 0x9a, 0xa1, 0x8d, 0x00, 0x9e, 0xa9, 0x84, 0x00, 0x9e, 0xa6, 0x7a, 0x00, 0xa2, 0xa5, 0x71, 0x00, 0x9e, 0xa6, 0x71, 0x00, 0x9a, 0xa6, 0x71, 0x00, 0x95, 0x9d, 0x71 }; - for (uint i = 0; i < size; i++) + for (size_t i = 0; i < size; i++) { m_Entries[i].a = T(palette15[i * 4 + 0]); m_Entries[i].r = T(palette15[i * 4 + 1]); @@ -208,7 +208,7 @@ public: palette.m_Filename = m_Filename; palette.m_Entries.resize(Size()); - for (uint i = 0; i < Size(); i++) + for (size_t i = 0; i < Size(); i++) { size_t ii = (i * 256) / COLORMAP_LENGTH; T rgb[3], hsv[3]; @@ -348,7 +348,7 @@ public: if (palette.Size() != Size()) palette.m_Entries.resize(Size()); - for (uint j = 0; j < palette.Size(); j++) + for (size_t j = 0; j < palette.Size(); j++) { palette.m_Entries[j] = m_Entries[j] * colorScalar; palette.m_Entries[j].a = 1; @@ -362,16 
+362,16 @@ public: /// /// The height of the output block /// A vector holding the color values - vector MakeRgbPaletteBlock(uint height) + vector MakeRgbPaletteBlock(size_t height) { size_t width = Size(); vector v(height * width * 3); if (v.size() == (height * Size() * 3)) { - for (uint i = 0; i < height; i++) + for (size_t i = 0; i < height; i++) { - for (uint j = 0; j < width; j++) + for (size_t j = 0; j < width; j++) { v[(width * 3 * i) + (j * 3)] = byte(m_Entries[j][0] * T(255));//Palettes are as [0..1], so convert to [0..255] here since it's for GUI display. v[(width * 3 * i) + (j * 3) + 1] = byte(m_Entries[j][1] * T(255)); @@ -443,7 +443,7 @@ public: /// Blue 0 - 1 static void HsvToRgb(T h, T s, T v, T& r, T& g, T& b) { - int j; + intmax_t j; T f, p, q, t; while (h >= 6) @@ -522,7 +522,7 @@ public: template static void CalcNewRgb(bucketT* cBuf, T ls, T highPow, bucketT* newRgb) { - int rgbi; + size_t rgbi; T newls, lsratio; bucketT newhsv[3]; T maxa, maxc; diff --git a/Source/Ember/PaletteList.h b/Source/Ember/PaletteList.h index 37070e5..8a8d7a5 100644 --- a/Source/Ember/PaletteList.h +++ b/Source/Ember/PaletteList.h @@ -80,7 +80,7 @@ public: Palette* GetRandomPalette() { auto p = m_Palettes.begin(); - int i = 0, paletteFileIndex = QTIsaac::GlobalRand->Rand() % Size(); + size_t i = 0, paletteFileIndex = QTIsaac::GlobalRand->Rand() % Size(); //Move p forward i elements. while (i < paletteFileIndex && p != m_Palettes.end()) @@ -91,7 +91,7 @@ public: if (i < Size()) { - int paletteIndex = QTIsaac::GlobalRand->Rand() % p->second.size(); + size_t paletteIndex = QTIsaac::GlobalRand->Rand() % p->second.size(); if (paletteIndex < p->second.size()) return &p->second[paletteIndex]; @@ -106,11 +106,11 @@ public: /// The filename of the palette to retrieve /// The index of the palette to read. A value of -1 indicates a random palette. /// A pointer to the requested palette if the index was in range, else nullptr. 
- Palette* GetPalette(const string& filename, int i) + Palette* GetPalette(const string& filename, size_t i) { auto& palettes = m_Palettes[filename]; - if (!palettes.empty() && i < int(palettes.size())) + if (!palettes.empty() && i < palettes.size()) return &palettes[i]; return nullptr; @@ -142,7 +142,7 @@ public: /// The hue adjustment to apply /// The palette to store the output /// True if successful, else false. - bool GetHueAdjustedPalette(const string& filename, int i, T hue, Palette& palette) + bool GetHueAdjustedPalette(const string& filename, size_t i, T hue, Palette& palette) { bool b = false; diff --git a/Source/Ember/Point.h b/Source/Ember/Point.h index a60cd62..d4e8274 100644 --- a/Source/Ember/Point.h +++ b/Source/Ember/Point.h @@ -151,6 +151,7 @@ public: /// /// The Color object to copy Color(const Color& color) + : v4T() { Color::operator=(color); } diff --git a/Source/Ember/Renderer.cpp b/Source/Ember/Renderer.cpp index 294f63f..50ba671 100644 --- a/Source/Ember/Renderer.cpp +++ b/Source/Ember/Renderer.cpp @@ -671,7 +671,7 @@ Finish: /// If true, embed a hexadecimal palette instead of Xml Color tags, else use Xml color tags. 
/// The EmberImageComments object with image comments filled out template -EmberImageComments Renderer::ImageComments(EmberStats& stats, size_t printEditDepth, bool intPalette, bool hexPalette) +EmberImageComments Renderer::ImageComments(const EmberStats& stats, size_t printEditDepth, bool intPalette, bool hexPalette) { ostringstream ss; EmberImageComments comments; @@ -708,7 +708,7 @@ void Renderer::MakeDmap(T colorScalar) /// /// True if success, else false template -bool Renderer::Alloc() +bool Renderer::Alloc(bool histOnly) { bool b = true; bool lock = @@ -730,6 +730,14 @@ bool Renderer::Alloc() b &= (m_HistBuckets.size() == m_SuperSize); } + if (histOnly) + { + if (lock) + LeaveResize(); + + return b; + } + if (m_SuperSize != m_AccumulatorBuckets.size()) { m_AccumulatorBuckets.resize(m_SuperSize); @@ -1224,7 +1232,7 @@ EmberStats Renderer::Iterate(size_t iterCount, size_t temporalSample sp.sched_priority = m_Priority; pthread_setschedparam(pthread_self(), SCHED_RR, &sp); #else - pthread_setschedprio(pthread_self(), (int)m_Priority); + pthread_setschedprio(pthread_self(), int(m_Priority)); #endif //Timing t; IterParams params; @@ -1268,7 +1276,6 @@ EmberStats Renderer::Iterate(size_t iterCount, size_t temporalSample if (m_Callback && threadIndex == 0) { percent = 100.0 * - double ( double @@ -1342,7 +1349,7 @@ template T Renderer::PixelsP template T Renderer::PixelsPerUnitY() const { return m_PixelsPerUnitY; } template bucketT Renderer::K1() const { return m_K1; } template bucketT Renderer::K2() const { return m_K2; } -template const CarToRas* Renderer::CoordMap() const { return &m_CarToRas; } +template const CarToRas& Renderer::CoordMap() const { return m_CarToRas; } template tvec4* Renderer::HistBuckets() { return m_HistBuckets.data(); } template tvec4* Renderer::AccumulatorBuckets() { return m_AccumulatorBuckets.data(); } template SpatialFilter* Renderer::GetSpatialFilter() { return m_SpatialFilter.get(); } diff --git a/Source/Ember/Renderer.h 
b/Source/Ember/Renderer.h index c0ded37..6c33fb8 100644 --- a/Source/Ember/Renderer.h +++ b/Source/Ember/Renderer.h @@ -38,9 +38,7 @@ namespace EmberNs /// for every single function in this class, saying it can't find the implementation. This warning /// can be safely ignored. /// Template argument T expected to be float or double. -/// Template argument bucketT was originally used to experiment with different types for the histogram, however -/// the only types that work are float and double, so it's useless and should always match what T is. -/// Mismatched types between T and bucketT are undefined. +/// Template argument bucketT must always be float. /// template class EMBER_API Renderer : public RendererBase @@ -65,12 +63,12 @@ public: virtual bool CreateTemporalFilter(bool& newAlloc) override; virtual size_t HistBucketSize() const override { return sizeof(tvec4); } virtual eRenderStatus Run(vector& finalImage, double time = 0, size_t subBatchCountOverride = 0, bool forceOutput = false, size_t finalOffset = 0) override; - virtual EmberImageComments ImageComments(EmberStats& stats, size_t printEditDepth = 0, bool intPalette = false, bool hexPalette = true) override; + virtual EmberImageComments ImageComments(const EmberStats& stats, size_t printEditDepth = 0, bool intPalette = false, bool hexPalette = true) override; protected: //New virtual functions to be overridden in derived renderers that use the GPU, but not accessed outside. 
virtual void MakeDmap(T colorScalar); - virtual bool Alloc(); + virtual bool Alloc(bool histOnly = false); virtual bool ResetBuckets(bool resetHist = true, bool resetAccum = true); virtual eRenderStatus LogScaleDensityFilter(); virtual eRenderStatus GaussianDensityFilter(); @@ -89,7 +87,7 @@ public: inline T PixelsPerUnitY() const; inline bucketT K1() const; inline bucketT K2() const; - inline const CarToRas* CoordMap() const; + inline const CarToRas& CoordMap() const; inline tvec4* HistBuckets(); inline tvec4* AccumulatorBuckets(); inline SpatialFilter* GetSpatialFilter(); diff --git a/Source/Ember/RendererBase.h b/Source/Ember/RendererBase.h index 670cb32..7c45898 100644 --- a/Source/Ember/RendererBase.h +++ b/Source/Ember/RendererBase.h @@ -120,7 +120,7 @@ public: virtual void ComputeQuality() = 0; virtual void ComputeCamera() = 0; virtual eRenderStatus Run(vector& finalImage, double time = 0, size_t subBatchCountOverride = 0, bool forceOutput = false, size_t finalOffset = 0) = 0; - virtual EmberImageComments ImageComments(EmberStats& stats, size_t printEditDepth = 0, bool intPalette = false, bool hexPalette = true) = 0; + virtual EmberImageComments ImageComments(const EmberStats& stats, size_t printEditDepth = 0, bool intPalette = false, bool hexPalette = true) = 0; virtual DensityFilterBase* GetDensityFilter() = 0; //Non-virtual renderer properties, getters only. diff --git a/Source/Ember/SheepTools.h b/Source/Ember/SheepTools.h index d9472eb..a9b12f0 100644 --- a/Source/Ember/SheepTools.h +++ b/Source/Ember/SheepTools.h @@ -44,7 +44,7 @@ enum eCrossMode /// Most functions in this class perform a particular action and return /// a string describing what it did so it can be recorded in an Xml edit doc /// to be saved with the ember when converting to Xml. 
-/// Since it's members can occupy significant memory space and also have +/// Since its members can occupy significant memory space and also have /// hefty initialization sequences, it's important to declare one instance /// and reuse it for the duration of the program instead of creating and deleting /// them as local variables. @@ -186,11 +186,10 @@ public: /// The type of symmetry to add if random specified. If 0, it will be added randomly. /// The speed to multiply the pre affine transforms by if the mutate mode is MUTATE_ALL_COEFS, else ignored. /// A string describing what was done - string Mutate(Ember& ember, eMutateMode mode, vector& useVars, int sym, T speed) + string Mutate(Ember& ember, eMutateMode mode, vector& useVars, intmax_t sym, T speed) { bool done = false; size_t modXform; - char ministr[32]; T randSelect; ostringstream os; Ember mutation; @@ -285,14 +284,13 @@ public: else if (mode == MUTATE_POST_XFORMS) { bool same = (m_Rand.Rand() & 3) > 0;//25% chance of using the same post for all of them. - uint b = 1 + m_Rand.Rand() % 6; + size_t b = 1 + m_Rand.Rand() % 6; - sprintf_s(ministr, 32, "(%d%s)", b, same ? " same" : ""); - os << "mutate post xforms " << ministr; + os << "mutate post xforms " << b << (same ? " same" : ""); for (size_t i = 0; i < ember.TotalXformCount(); i++) { - int copy = (i > 0) && same; + bool copy = (i > 0) && same; Xform* xform = ember.GetTotalXform(i); if (copy)//Copy the post from the first xform to the rest of them. @@ -605,7 +603,7 @@ public: { vector useVars; - Random(ember, useVars, static_cast(m_Rand.Frand(-2, 2)), 0); + Random(ember, useVars, static_cast(m_Rand.Frand(-2, 2)), 0); } /// @@ -615,7 +613,7 @@ public: /// A list of variations to use. If empty, any variation can be used. /// The symmetry type to use from -2 to 2 /// The number of xforms to use. If 0, a quasi random count is used. 
- void Random(Ember& ember, vector& useVars, int sym, size_t specXforms) + void Random(Ember& ember, vector& useVars, intmax_t sym, size_t specXforms) { bool postid, addfinal = false; int var, samed, multid, samepost; @@ -806,9 +804,9 @@ public: /// The number of test renders to try before giving up /// Change palette if true, else keep trying with the same palette. /// The resolution of the test histogram. This value ^ 3 will be used for the total size. Common value is 10. - void ImproveColors(Ember& ember, int tries, bool changePalette, int colorResolution) + void ImproveColors(Ember& ember, size_t tries, bool changePalette, size_t colorResolution) { - int i; + size_t i; T best, b; Ember bestEmber = ember; @@ -847,7 +845,7 @@ public: /// The ember to render /// The resolution of the test histogram. This value ^ 3 will be used for the total size. Common value is 10. /// The number of histogram cells that weren't black - T TryColors(Ember& ember, int colorResolution) + T TryColors(Ember& ember, size_t colorResolution) { byte* p; size_t i, hits = 0, res = colorResolution; @@ -903,7 +901,7 @@ public: } /// - /// Change around color coordinates. Optionall change out the entire palette. + /// Change around color coordinates. Optionally change out the entire palette. /// /// The ember whose xform's color coordinates will be changed /// Change palette if true, else don't @@ -1071,7 +1069,7 @@ public: /// The result of the spin /// The frame in the sequence to be stored in the m_Time member of result /// The interpolation time - void Spin(Ember& parent, Ember* templ, Ember& result, int frame, T blend) + void Spin(Ember& parent, Ember* templ, Ember& result, size_t frame, T blend) { char temp[50]; @@ -1112,7 +1110,7 @@ public: /// The frame in the sequence to be stored in the m_Time member of result /// True if embers points to the first or last ember in the entire sequence, else false. 
/// The interpolation time - void SpinInter(Ember* parents, Ember* templ, Ember& result, int frame, bool seqFlag, T blend) + void SpinInter(Ember* parents, Ember* templ, Ember& result, size_t frame, bool seqFlag, T blend) { char temp[50]; @@ -1286,8 +1284,8 @@ public: m_Samples.resize(samples); params.m_Count = samples; params.m_Skip = 20; - //params.m_OneColDiv2 = m_Renderer->CoordMap()->OneCol() / 2; - //params.m_OneRowDiv2 = m_Renderer->CoordMap()->OneRow() / 2; + //params.m_OneColDiv2 = m_Renderer->CoordMap().OneCol() / 2; + //params.m_OneRowDiv2 = m_Renderer->CoordMap().OneRow() / 2; size_t bv = m_Iterator->Iterate(ember, params, m_Samples.data(), m_Rand);//Use a special fuse of 20, all other calls to this will use 15, or 100. @@ -1349,7 +1347,7 @@ public: /// The comment to include /// The sheep generation used if > 0. Default: 0. /// The sheep id used if > 0. Default: 0. - void SetSpinParams(bool smooth, T stagger, T offsetX, T offsetY, string nick, string url, string id, string comment, int sheepGen, int sheepId) + void SetSpinParams(bool smooth, T stagger, T offsetX, T offsetY, const string& nick, const string& url, const string& id, const string& comment, intmax_t sheepGen, intmax_t sheepId) { m_Smooth = smooth; m_SheepGen = sheepGen; @@ -1365,8 +1363,8 @@ public: private: bool m_Smooth; - int m_SheepGen; - int m_SheepId; + intmax_t m_SheepGen; + intmax_t m_SheepId; T m_Stagger; T m_OffsetX; T m_OffsetY; diff --git a/Source/Ember/Timing.h b/Source/Ember/Timing.h index ec829aa..bdd5fdf 100644 --- a/Source/Ember/Timing.h +++ b/Source/Ember/Timing.h @@ -62,19 +62,19 @@ public: /// Return the begin time as a double. /// /// - double BeginTime() { return static_cast(m_BeginTime.time_since_epoch().count()); } + double BeginTime() const { return static_cast(m_BeginTime.time_since_epoch().count()); } /// /// Return the end time as a double. 
/// /// - double EndTime() { return static_cast(m_EndTime.time_since_epoch().count()); } + double EndTime() const { return static_cast(m_EndTime.time_since_epoch().count()); } /// /// Return the elapsed time in milliseconds. /// /// The elapsed time in milliseconds as a double - double ElapsedTime() + double ElapsedTime() const { duration elapsed = duration_cast(m_EndTime - m_BeginTime); @@ -89,7 +89,7 @@ public: /// /// The ms /// The formatted string - string Format(double ms) + string Format(double ms) const { stringstream ss; diff --git a/Source/Ember/Utils.h b/Source/Ember/Utils.h index 5af5f82..b225c1a 100644 --- a/Source/Ember/Utils.h +++ b/Source/Ember/Utils.h @@ -213,13 +213,19 @@ static bool ReadFile(const char* filename, string& buf, bool nullTerminate = tru fclose(f); } } - catch (...) + catch (const std::exception& e) { - if (f != nullptr) - fclose(f); - + cout << "Error: Reading file " << filename << " failed: " << e.what() << endl; b = false; } + catch (...) + { + cout << "Error: Reading file " << filename << " failed." << endl; + b = false; + } + + if (f != nullptr) + fclose(f); return b; } @@ -270,7 +276,7 @@ static void CopyVec(vector& dest, const vector& source, std::function static void ClearVec(vector& vec, bool arrayDelete = false) { - for (uint i = 0; i < vec.size(); i++) + for (size_t i = 0; i < vec.size(); i++) { if (vec[i] != nullptr) { @@ -286,6 +292,32 @@ static void ClearVec(vector& vec, bool arrayDelete = false) vec.clear(); } +/// +/// Determine whether all elements in two containers are equal. +/// +/// The first collection to compare +/// The second collection to compare +/// True if the sizes and all elements in both collections are equal, else false. 
+template +static bool Equal(const T& c1, const T& c2) +{ + bool equal = c1.size() == c2.size(); + + if (equal) + { + for (auto it1 = c1.begin(), it2 = c2.begin(); it1 != c1.end(); ++it1, ++it2) + { + if (*it1 != *it2) + { + equal = false; + break; + } + } + } + + return equal; +} + /// /// Thin wrapper around passing a vector to memset() to relieve /// the caller of having to pass the size. @@ -305,15 +337,15 @@ static inline void Memset(vector& vec, int val = 0) /// The value to return the floor of /// The floored value template -static inline int Floor(T val) +static inline intmax_t Floor(T val) { if (val >= 0) { - return static_cast(val); + return static_cast(val); } else { - int i = static_cast(val);//Truncate. + intmax_t i = static_cast(val);//Truncate. return i - (i > val);//Convert trunc to floor. } } @@ -862,25 +894,9 @@ static string GetPath(const string& filename) /// The value of the specified environment variable if found, else default template static inline T Arg(char* name, T def) -{ - T t; - return t; -} - -/// -/// Template specialization for Arg<>() with a type of int. -/// -/// The name of the environment variable to query -/// The default value to return if the environment variable was not present -/// The value of the specified environment variable if found, else default -template <> -#ifdef _WIN32 -static -#endif -int Arg(char* name, int def) { char* ch; - int returnVal; + T returnVal; #ifdef WIN32 size_t len; errno_t err = _dupenv_s(&ch, &len, name); @@ -892,7 +908,17 @@ int Arg(char* name, int def) if (err || !ch) returnVal = def; else - returnVal = atoi(ch); + { + T tempVal; + istringstream istr(ch); + + istr >> tempVal; + + if (!istr.bad() && !istr.fail()) + returnVal = tempVal; + else + returnVal = def; + } #ifdef WIN32 free(ch); @@ -900,21 +926,6 @@ int Arg(char* name, int def) return returnVal; } -/// -/// Template specialization for Arg<>() with a type of uint. 
-/// -/// The name of the environment variable to query -/// The default value to return if the environment variable was not present -/// The value of the specified environment variable if found, else default -template <> -#ifdef _WIN32 -static -#endif -uint Arg(char* name, uint def) -{ - return Arg(name, static_cast(def)); -} - /// /// Template specialization for Arg<>() with a type of bool. /// @@ -930,39 +941,6 @@ bool Arg(char* name, bool def) return (Arg(name, -999) != -999) ? true : def; } -/// -/// Template specialization for Arg<>() with a type of double. -/// -/// The name of the environment variable to query -/// The default value to return if the environment variable was not present -/// The value of the specified environment variable if found, else default -template <> -#ifdef _WIN32 -static -#endif -double Arg(char* name, double def) -{ - char* ch; - double returnVal; -#ifdef WIN32 - size_t len; - errno_t err = _dupenv_s(&ch, &len, name); -#else - int err = 1; - ch = getenv(name); -#endif - - if (err || !ch) - returnVal = def; - else - returnVal = atof(ch); - -#ifdef WIN32 - free(ch); -#endif - return returnVal; -} - /// /// Template specialization for Arg<>() with a type of string. /// @@ -1031,6 +1009,24 @@ static uint FindAndReplace(T& source, const T& find, const T& replace) return replaceCount; } +/// +/// Split a string into tokens and place them in a vector. +/// +/// The string to split +/// The delimiter to split the string on +/// The split strings, each as an element in a vector. +static vector Split(const string& str, char del) +{ + string tok; + vector vec; + stringstream ss(str); + + while (getline(ss, tok, del)) + vec.push_back(tok); + + return vec; +} + /// /// Return a character pointer to a version string composed of the EMBER_OS and EMBER_VERSION values. 
/// diff --git a/Source/Ember/Variation.h b/Source/Ember/Variation.h index 2f44f26..4bc6aed 100644 --- a/Source/Ember/Variation.h +++ b/Source/Ember/Variation.h @@ -1428,7 +1428,7 @@ public: /// The type of the parameter /// The minimum value the parameter can be /// The maximum value the parameter can be - ParamWithName(T* param, string name, T def = 0, eParamType type = REAL, T min = TLOW, T max = TMAX) + ParamWithName(T* param, const string& name, T def = 0, eParamType type = REAL, T min = TLOW, T max = TMAX) { Init(param, name, def, type, min, max); } @@ -1481,7 +1481,7 @@ public: /// The minimum value the parameter can be /// The maximum value the parameter can be /// Whether the parameter is actually a precalculated value. Default: false. - void Init(T* param, string name, T def = 0, eParamType type = REAL, T min = TLOW, T max = TMAX, bool isPrecalc = false) + void Init(T* param, const string& name, T def = 0, eParamType type = REAL, T min = TLOW, T max = TMAX, bool isPrecalc = false) { m_Param = param; m_Def = def; diff --git a/Source/Ember/VariationList.h b/Source/Ember/VariationList.h index 9e53a53..24c8727 100644 --- a/Source/Ember/VariationList.h +++ b/Source/Ember/VariationList.h @@ -355,7 +355,7 @@ public: //Keep a list of which variations derive from ParametricVariation. //Note that these are not new copies, rather just pointers to the original instances in m_Variations. - for (uint i = 0; i < m_Variations.size(); i++) + for (size_t i = 0; i < m_Variations.size(); i++) { if (ParametricVariation* parVar = dynamic_cast*>(m_Variations[i])) m_ParametricVariations.push_back(parVar); @@ -412,7 +412,7 @@ public: /// A pointer to the variation if found, else nullptr. 
const Variation* GetVariation(eVariationId id) const { - for (uint i = 0; i < m_Variations.size() && m_Variations[i] != nullptr; i++) + for (size_t i = 0; i < m_Variations.size() && m_Variations[i] != nullptr; i++) if (id == m_Variations[i]->VariationId()) return m_Variations[i]; @@ -435,7 +435,7 @@ public: /// A pointer to the variation if found, else nullptr. const Variation* GetVariation(const string& name) const { - for (uint i = 0; i < m_Variations.size() && m_Variations[i] != nullptr; i++) + for (size_t i = 0; i < m_Variations.size() && m_Variations[i] != nullptr; i++) if (!_stricmp(name.c_str(), m_Variations[i]->Name().c_str())) return m_Variations[i]; @@ -466,7 +466,7 @@ public: /// The parametric variation with a matching name, else nullptr. const ParametricVariation* GetParametricVariation(const string& name) const { - for (uint i = 0; i < m_ParametricVariations.size() && m_ParametricVariations[i] != nullptr; i++) + for (size_t i = 0; i < m_ParametricVariations.size() && m_ParametricVariations[i] != nullptr; i++) if (!_stricmp(name.c_str(), m_ParametricVariations[i]->Name().c_str())) return m_ParametricVariations[i]; @@ -480,9 +480,9 @@ public: /// The index of the variation with the matching name, else -1 int GetVariationIndex(const string& name) { - for (uint i = 0; i < m_Variations.size() && m_Variations[i] != nullptr; i++) + for (size_t i = 0; i < m_Variations.size() && m_Variations[i] != nullptr; i++) if (!_stricmp(name.c_str(), m_Variations[i]->Name().c_str())) - return i; + return int(i); return -1; } diff --git a/Source/Ember/Variations03.h b/Source/Ember/Variations03.h index 90f39d5..99ba5a7 100644 --- a/Source/Ember/Variations03.h +++ b/Source/Ember/Variations03.h @@ -277,7 +277,7 @@ public: virtual void Func(IteratorHelper& helper, Point& outPoint, QTIsaac& rand) override { T r = Zeps(pow(helper.m_PrecalcSqrtSumSquares, m_Dist)); - int n = Floor(m_Power * rand.Frand01()); + intmax_t n = Floor(m_Power * rand.Frand01()); T alpha = 
helper.m_PrecalcAtanyx + n * M_2PI / Zeps(T(Floor(m_Power))); T sina = sin(alpha); T cosa = cos(alpha); @@ -2571,7 +2571,7 @@ public: virtual void Func(IteratorHelper& helper, Point& outPoint, QTIsaac& rand) override { - int n; + intmax_t n; T z = 4 * m_Dist / m_Power; T r = pow(helper.m_PrecalcSqrtSumSquares, z); diff --git a/Source/Ember/Variations04.h b/Source/Ember/Variations04.h index e5184d0..43faed3 100644 --- a/Source/Ember/Variations04.h +++ b/Source/Ember/Variations04.h @@ -1372,12 +1372,13 @@ public: virtual void Func(IteratorHelper& helper, Point& outPoint, QTIsaac& rand) override { - int i, j, l, k, m, m1, n, n1; + intmax_t l, k; + int i, j, m, m1, n, n1; T r, rMin, offsetX, offsetY, x0 = 0, y0 = 0, x, y; rMin = 20; - m = Floor(helper.In.x / m_Step); - n = Floor(helper.In.y / m_Step); + m = int(Floor(helper.In.x / m_Step)); + n = int(Floor(helper.In.y / m_Step)); for (i = -1; i < 2; i++) { diff --git a/Source/Ember/Variations05.h b/Source/Ember/Variations05.h index 76958f8..0f59d9c 100644 --- a/Source/Ember/Variations05.h +++ b/Source/Ember/Variations05.h @@ -95,8 +95,8 @@ public: virtual void Func(IteratorHelper& helper, Point& outPoint, QTIsaac& rand) override { - int m = Floor(T(0.5) * helper.In.x / m_Sc); - int n = Floor(T(0.5) * helper.In.y / m_Sc); + int m = int(Floor(T(0.5) * helper.In.x / m_Sc)); + int n = int(Floor(T(0.5) * helper.In.y / m_Sc)); T x = helper.In.x - (m * 2 + 1) * m_Sc; T y = helper.In.y - (n * 2 + 1) * m_Sc; T u = Zeps(Hypot(x, y)); @@ -277,7 +277,7 @@ public: virtual void Func(IteratorHelper& helper, Point& outPoint, QTIsaac& rand) override { - int m, n, iters = 0; + intmax_t m, n, iters = 0; T x, y, u; do @@ -293,7 +293,7 @@ public: if (++iters > 10) break; } - while ((DiscreteNoise2(int(m + m_Seed), n) > m_Dens) || (u > (T(0.3) + T(0.7) * DiscreteNoise2(m + 10, n + 3)) * m_Sc)); + while ((DiscreteNoise2(int(m + m_Seed), int(n)) > m_Dens) || (u > (T(0.3) + T(0.7) * DiscreteNoise2(int(m + 10), int(n + 3))) * m_Sc)); 
helper.Out.x = m_Weight * (x + (m * 2 + 1) * m_Sc); helper.Out.y = m_Weight * (y + (n * 2 + 1) * m_Sc); @@ -406,14 +406,14 @@ public: Trans(m_X, m_Y, helper.In.x, helper.In.y, &ux, &uy); - int m = Floor(T(0.5) * ux / m_Sc); - int n = Floor(T(0.5) * uy / m_Sc); + intmax_t m = Floor(T(0.5) * ux / m_Sc); + intmax_t n = Floor(T(0.5) * uy / m_Sc); x = ux - (m * 2 + 1) * m_Sc; y = uy - (n * 2 + 1) * m_Sc; u = Hypot(x, y); - if ((DiscreteNoise2(int(m + m_Seed), n) > m_Dens) || (u > (T(0.3) + T(0.7) * DiscreteNoise2(m + 10, n + 3)) * m_Sc)) + if ((DiscreteNoise2(int(m + m_Seed), int(n)) > m_Dens) || (u > (T(0.3) + T(0.7) * DiscreteNoise2(int(m + 10), int(n + 3))) * m_Sc)) { ux = ux; uy = uy; @@ -548,7 +548,7 @@ private: void CircleR(T* ux, T* vy, QTIsaac& rand) { - int m, n, iters = 0; + intmax_t m, n, iters = 0; T x, y, alpha, u; do @@ -558,14 +558,14 @@ private: m = Floor(T(0.5) * x / m_Sc); n = Floor(T(0.5) * y / m_Sc); alpha = M_2PI * rand.Frand01(); - u = T(0.3) + T(0.7) * DiscreteNoise2(m + 10, n + 3); + u = T(0.3) + T(0.7) * DiscreteNoise2(int(m + 10), int(n + 3)); x = u * cos(alpha); y = u * sin(alpha); if (++iters > 10) break; } - while (DiscreteNoise2(int(m + m_Seed), n) > m_Dens); + while (DiscreteNoise2(int(m + m_Seed), int(n)) > m_Dens); *ux = x + (m * 2 + 1) * m_Sc; *vy = y + (n * 2 + 1) * m_Sc; @@ -2704,7 +2704,7 @@ public: virtual void Func(IteratorHelper& helper, Point& outPoint, QTIsaac& rand) override { - int m, n; + intmax_t m, n; T alpha, beta, offsetAl, offsetBe, offsetGa, x, y; //Transfer to trilinear coordinates, normalized to real distances from triangle sides. 
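Note on the Floor() changes above: Floor() in Utils.h now returns intmax_t, so the variation code either widens its locals to intmax_t or narrows explicitly with int() wherever an int parameter is expected, as at the DiscreteNoise2() call sites. The following is a minimal standalone sketch of that pattern, not code from this patch. The template arguments on static_cast are reconstructed by assumption, and NarrowToInt() is a hypothetical helper that only makes the narrowing step explicit; the patch itself uses bare int() casts with no range check.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using std::intmax_t;

// Same shape as the patched Floor() in Utils.h (template arguments assumed):
// truncate toward zero, then subtract one when truncation rounded a negative
// value upward.
template <typename T>
static inline intmax_t Floor(T val)
{
    if (val >= 0)
        return static_cast<intmax_t>(val);

    intmax_t i = static_cast<intmax_t>(val); // Truncate toward zero.
    return i - (i > val);                    // Convert trunc to floor.
}

// Hypothetical helper, not part of the patch: narrow an intmax_t to int at a
// call site that takes int, asserting in debug builds that the value fits.
static inline int NarrowToInt(intmax_t v)
{
    assert(v >= std::numeric_limits<int>::min());
    assert(v <= std::numeric_limits<int>::max());
    return int(v);
}
```

Under these assumptions, a call such as DiscreteNoise2(int(m + 10), int(n + 3)) could be written with NarrowToInt(m + 10) and NarrowToInt(n + 3) if a debug-time range check were wanted. Whether that check is worth its cost inside a variation's inner loop is a judgment call this sketch does not settle.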
diff --git a/Source/Ember/VariationsDC.h b/Source/Ember/VariationsDC.h index 81070c9..a7464e4 100644 --- a/Source/Ember/VariationsDC.h +++ b/Source/Ember/VariationsDC.h @@ -878,7 +878,7 @@ public: << "\t\t inside = 1;\n" << "\t\t}\n" << "\n" - << "\t\tif (" << zeroEdges << " && !inside)\n" + << "\t\tif (" << zeroEdges << " != 0.0 && !inside)\n" << "\t\t{\n" << "\t\t u = v = 0;\n" << "\t\t}\n" diff --git a/Source/Ember/XmlToEmber.h b/Source/Ember/XmlToEmber.h index 87afb53..1a4d6d4 100644 --- a/Source/Ember/XmlToEmber.h +++ b/Source/Ember/XmlToEmber.h @@ -315,7 +315,7 @@ public: //An adjustment of +/- 360 degrees is made until this is true. if (emberSize > 1) { - for (uint i = 1; i < emberSize; i++) + for (size_t i = 1; i < emberSize; i++) { //Only do this adjustment if not in compat mode.. if (embers[i - 1].m_AffineInterp != INTERP_COMPAT && embers[i - 1].m_AffineInterp != INTERP_OLDER) @@ -362,79 +362,22 @@ public: } /// - /// Convert the string to a floating point value and return a bool indicating success. + /// Thin wrapper around converting the string to a numeric value and return a bool indicating success. /// See error report for errors. /// /// The string to convert /// The converted value /// True if success, else false. - bool Atof(const char* str, T& val) + template + bool Aton(const char* str, valT& val) { bool b = true; - char* endp; const char* loc = __FUNCTION__; + std::istringstream istr(str); - //Reset errno. - errno = 0;//Note that this is not thread-safe. + istr >> val; - //Convert the string using strtod(). - val = T(strtod(str, &endp)); - - //Check errno & return string. - if (endp != str + strlen(str)) - { - m_ErrorReport.push_back(string(loc) + " : Error converting " + string(str) + ", extra chars"); - b = false; - } - - if (errno) - { - m_ErrorReport.push_back(string(loc) + " : Error converting " + string(str)); - b = false; - } - - return b; - } - - /// - /// Thin wrapper around Atoi(). - /// See error report for errors. 
- /// - /// The string to convert - /// The converted uinteger value - /// True if success, else false. - bool Atoi(const char* str, uint& val) - { - return Atoi(str, reinterpret_cast(val)); - } - - /// - /// Convert the string to an uinteger value and return a bool indicating success. - /// See error report for errors. - /// - /// The string to convert - /// The converted uinteger value - /// True if success, else false. - bool Atoi(const char* str, int& val) - { - bool b = true; - char* endp; - const char* loc = __FUNCTION__; - - //Reset errno. - errno = 0;//Note that this is not thread-safe. - - //Convert the string using strtod(). - val = strtol(str, &endp, 10); - - //Check errno & return string. - if (endp != str + strlen(str)) - { - m_ErrorReport.push_back(string(loc) + " : Error converting " + string(str) + ", extra chars"); - b = false; - } - - if (errno) + if (istr.bad() || istr.fail()) { m_ErrorReport.push_back(string(loc) + " : Error converting " + string(str)); b = false; @@ -443,6 +386,7 @@ public: return b; } + /// /// Convert an integer to a string. /// Just a wrapper around _itoa_s() which wraps the result in a std::string. @@ -554,11 +498,11 @@ private: { bool ret = true; bool fromEmber = false; - uint newLinear = 0; + size_t newLinear = 0; char* attStr; const char* loc = __FUNCTION__; int soloXform = -1; - uint i, j, count, index = 0; + size_t i, count, index = 0; double vals[16]; xmlAttrPtr att, curAtt; xmlNodePtr editNode, childNode, motionNode; @@ -577,38 +521,38 @@ private: attStr = reinterpret_cast(xmlGetProp(emberNode, curAtt->name)); //First parse out simple float reads. 
- if (ParseAndAssignFloat(curAtt->name, attStr, "time", currentEmber.m_Time, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "scale", currentEmber.m_PixelsPerUnit, ret)) { currentEmber.m_OrigPixPerUnit = currentEmber.m_PixelsPerUnit; } - else if (ParseAndAssignFloat(curAtt->name, attStr, "rotate", currentEmber.m_Rotate, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "zoom", currentEmber.m_Zoom, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "filter", currentEmber.m_SpatialFilterRadius, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "temporal_filter_width", currentEmber.m_TemporalFilterWidth, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "temporal_filter_exp", currentEmber.m_TemporalFilterExp, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "quality", currentEmber.m_Quality, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "brightness", currentEmber.m_Brightness, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "gamma", currentEmber.m_Gamma, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "highlight_power", currentEmber.m_HighlightPower, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "vibrancy", currentEmber.m_Vibrancy, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "estimator_radius", currentEmber.m_MaxRadDE, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "estimator_minimum", currentEmber.m_MinRadDE, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "estimator_curve", currentEmber.m_CurveDE, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "gamma_threshold", currentEmber.m_GammaThresh, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "cam_zpos", currentEmber.m_CamZPos, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "cam_persp", currentEmber.m_CamPerspective, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, 
"cam_perspective", currentEmber.m_CamPerspective, ret)) { }//Apo bug. - else if (ParseAndAssignFloat(curAtt->name, attStr, "cam_yaw", currentEmber.m_CamYaw, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "cam_pitch", currentEmber.m_CamPitch, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "cam_dof", currentEmber.m_CamDepthBlur, ret)) { } + if (ParseAndAssign(curAtt->name, attStr, "time", currentEmber.m_Time, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "scale", currentEmber.m_PixelsPerUnit, ret)) { currentEmber.m_OrigPixPerUnit = currentEmber.m_PixelsPerUnit; } + else if (ParseAndAssign(curAtt->name, attStr, "rotate", currentEmber.m_Rotate, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "zoom", currentEmber.m_Zoom, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "filter", currentEmber.m_SpatialFilterRadius, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "temporal_filter_width", currentEmber.m_TemporalFilterWidth, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "temporal_filter_exp", currentEmber.m_TemporalFilterExp, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "quality", currentEmber.m_Quality, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "brightness", currentEmber.m_Brightness, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "gamma", currentEmber.m_Gamma, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "highlight_power", currentEmber.m_HighlightPower, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "vibrancy", currentEmber.m_Vibrancy, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "estimator_radius", currentEmber.m_MaxRadDE, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "estimator_minimum", currentEmber.m_MinRadDE, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "estimator_curve", currentEmber.m_CurveDE, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "gamma_threshold", 
currentEmber.m_GammaThresh, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "cam_zpos", currentEmber.m_CamZPos, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "cam_persp", currentEmber.m_CamPerspective, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "cam_perspective", currentEmber.m_CamPerspective, ret)) { }//Apo bug. + else if (ParseAndAssign(curAtt->name, attStr, "cam_yaw", currentEmber.m_CamYaw, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "cam_pitch", currentEmber.m_CamPitch, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "cam_dof", currentEmber.m_CamDepthBlur, ret)) { } //Parse simple int reads. - else if (ParseAndAssignInt(curAtt->name, attStr, "palette", currentEmber.m_Palette.m_Index, ret)) { } - else if (ParseAndAssignInt(curAtt->name, attStr, "oversample", currentEmber.m_Supersample , ret)) { } - else if (ParseAndAssignInt(curAtt->name, attStr, "supersample", currentEmber.m_Supersample , ret)) { } - else if (ParseAndAssignInt(curAtt->name, attStr, "temporal_samples", currentEmber.m_TemporalSamples, ret)) { } - else if (ParseAndAssignInt(curAtt->name, attStr, "sub_batch_size", currentEmber.m_SubBatchSize , ret)) { } - else if (ParseAndAssignInt(curAtt->name, attStr, "fuse", currentEmber.m_FuseCount , ret)) { } - else if (ParseAndAssignInt(curAtt->name, attStr, "soloxform", soloXform , ret)) { } - else if (ParseAndAssignInt(curAtt->name, attStr, "new_linear", newLinear , ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "palette", currentEmber.m_Palette.m_Index, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "oversample", currentEmber.m_Supersample , ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "supersample", currentEmber.m_Supersample , ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "temporal_samples", currentEmber.m_TemporalSamples, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "sub_batch_size", currentEmber.m_SubBatchSize , ret)) { } + else if 
(ParseAndAssign(curAtt->name, attStr, "fuse", currentEmber.m_FuseCount , ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "soloxform", soloXform , ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "new_linear", newLinear , ret)) { } //Parse more complicated reads that have multiple possible values. else if (!Compare(curAtt->name, "interpolation")) @@ -718,7 +662,7 @@ private: for (i = 0; i < 4; i++) { - for (j = 0; j < 4; j++) + for (glm::length_t j = 0; j < 4; j++) { ss >> currentEmber.m_Curves.m_Points[i][j].x; ss >> currentEmber.m_Curves.m_Points[i][j].y; @@ -759,7 +703,7 @@ private: if (!Compare(curAtt->name, "index")) { - Atoi(attStr, index); + Aton(attStr, index); } else if(!Compare(curAtt->name, "rgb")) { @@ -819,7 +763,7 @@ private: if (!Compare(curAtt->name, "count")) { - Atoi(attStr, count); + Aton(attStr, count); } else if (!Compare(curAtt->name, "data")) { @@ -842,11 +786,6 @@ private: //Make sure BOTH are not specified, otherwise either are ok. int numColors = 0; int numBytes = 0; - int index0, index1; - T hue0, hue1; - T blend = 0.5; - index0 = index1 = -1; - hue0 = hue1 = 0.0; //Loop through the attributes of the palette element. 
att = childNode->properties; @@ -863,7 +802,7 @@ private: if (!Compare(curAtt->name, "count")) { - Atoi(attStr, numColors); + Aton(attStr, numColors); } else if (!Compare(curAtt->name, "format")) { @@ -915,7 +854,7 @@ private: if (!Compare(curAtt->name, "kind")) { - Atoi(attStr, symKind); + Aton(attStr, symKind); } else { @@ -1016,8 +955,8 @@ private: { attStr = reinterpret_cast(xmlGetProp(childNode, curAtt->name)); - if (ParseAndAssignFloat(curAtt->name, attStr, "motion_frequency", motion.m_MotionFreq, ret)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "motion_offset", motion.m_MotionOffset, ret)) { } + if (ParseAndAssign(curAtt->name, attStr, "motion_frequency", motion.m_MotionFreq, ret)) { } + else if (ParseAndAssign(curAtt->name, attStr, "motion_offset", motion.m_MotionOffset, ret)) { } else if (!Compare(curAtt->name, "motion_function")) { string func(attStr); @@ -1135,7 +1074,7 @@ private: bool r = false; T val = 0.0; - if (Atof(attStr, val)) + if (Aton(attStr, val)) { motion.m_MotionParams.push_back(MotionParam(param, val)); r = true; @@ -1160,7 +1099,7 @@ private: bool success = true; char* attStr; const char* loc = __FUNCTION__; - uint j; + size_t j; T temp; double a, b, c, d, e, f; double vals[10]; @@ -1180,13 +1119,13 @@ private: attStr = reinterpret_cast(xmlGetProp(childNode, curAtt->name)); //First parse out simple float reads. 
- if (ParseAndAssignFloat(curAtt->name, attStr, "weight", xform.m_Weight, success)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "color_speed", xform.m_ColorSpeed, success)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "animate", xform.m_Animate, success)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "opacity", xform.m_Opacity, success)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "var_color", xform.m_DirectColor, success)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "motion_frequency", xform.m_MotionFreq, success)) { } - else if (ParseAndAssignFloat(curAtt->name, attStr, "motion_offset", xform.m_MotionOffset, success)) { } + if (ParseAndAssign(curAtt->name, attStr, "weight", xform.m_Weight, success)) { } + else if (ParseAndAssign(curAtt->name, attStr, "color_speed", xform.m_ColorSpeed, success)) { } + else if (ParseAndAssign(curAtt->name, attStr, "animate", xform.m_Animate, success)) { } + else if (ParseAndAssign(curAtt->name, attStr, "opacity", xform.m_Opacity, success)) { } + else if (ParseAndAssign(curAtt->name, attStr, "var_color", xform.m_DirectColor, success)) { } + else if (ParseAndAssign(curAtt->name, attStr, "motion_frequency", xform.m_MotionFreq, success)) { } + else if (ParseAndAssign(curAtt->name, attStr, "motion_offset", xform.m_MotionOffset, success)) { } //Parse more complicated reads that have multiple possible values. else if (!Compare(curAtt->name, "name")) @@ -1198,7 +1137,7 @@ private: { //Deprecated, set both color_speed and animate to this value. //Huh? Either set it or not? - Atof(attStr, temp); + Aton(attStr, temp); xform.m_ColorSpeed = (1 - temp) / 2; xform.m_Animate = T(temp > 0 ? 
0 : 1); } @@ -1297,7 +1236,7 @@ private: { auto varCopy = var->Copy(); - Atof(attStr, varCopy->m_Weight); + Aton(attStr, varCopy->m_Weight); xform.AddVariation(varCopy); } //else @@ -1321,7 +1260,7 @@ private: for (j = 0; j < xform.TotalVariationCount(); j++) xform.GetVariation(j)->m_Weight = 0; - if (Atof(attStr, temp)) + if (Aton(attStr, temp)) { uint iTemp = static_cast(temp); @@ -1349,7 +1288,7 @@ private: { attStr = reinterpret_cast(xmlGetProp(childNode, curAtt->name)); - if (Atof(attStr, temp)) + if (Aton(attStr, temp)) { for (j = 0; j < xform.TotalVariationCount(); j++) xform.GetVariation(j)->m_Weight = temp; @@ -1366,7 +1305,7 @@ private: } //Now that all xforms have been parsed, go through and try to find params for the parametric variations. - for (uint i = 0; i < xform.TotalVariationCount(); i++) + for (size_t i = 0; i < xform.TotalVariationCount(); i++) { if (ParametricVariation* parVar = dynamic_cast*>(xform.GetVariation(i))) { @@ -1381,7 +1320,7 @@ private: T val = 0; attStr = CX(xmlGetProp(childNode, curAtt->name)); - if (Atof(attStr, val)) + if (Aton(attStr, val)) { parVar->SetParamVal(name, val); } @@ -1476,14 +1415,14 @@ private: /// The number of colors present /// The number of channels in each color /// True if there were no errors, else false. - bool ParseHexColors(char* colstr, Ember& ember, int numColors, int chan) + bool ParseHexColors(char* colstr, Ember& ember, size_t numColors, intmax_t chan) { - int colorIndex = 0; - int colorCount = 0; + size_t colorIndex = 0; + size_t colorCount = 0; uint r, g, b, a; int ret; char tmps[2]; - int skip = static_cast(abs(chan)); + size_t skip = std::abs(chan); bool ok = true; const char* loc = __FUNCTION__; @@ -1539,47 +1478,25 @@ private: } /// - /// Wrapper to parse a floating point Xml value and convert it to float. + /// Wrapper to parse a numeric Xml string value and convert it. 
/// /// The xml tag to parse /// The name of the Xml attribute /// The name of the Xml tag /// The parsed value - /// Bitwise ANDed with true if name matched str and the call to Atof() succeeded, else false. Used for keeping a running value between successive calls. - /// True if the tag was matched, else false - bool ParseAndAssignFloat(const xmlChar* name, const char* attStr, const char* str, T& val, bool& b) + /// Bitwise ANDed with true if name matched str and the conversion succeeded, else false. Used for keeping a running value between successive calls. + /// True if the tag was matched and the conversion succeeded, else false + template + bool ParseAndAssign(const xmlChar* name, const char* attStr, const char* str, valT& val, bool& b) { bool ret = false; if (!Compare(name, str)) { - b &= Atof(attStr, val); - ret = true;//Means the strcmp() was right, but doesn't necessarily mean the conversion went ok. - } + istringstream istr(attStr); - return ret; - } - - /// - /// Wrapper to parse an int Xml string value and convert it to an int. - /// - /// The xml tag to parse - /// The name of the Xml attribute - /// The name of the Xml tag - /// The parsed value - /// Bitwise ANDed with true if name matched str and the call to Atoi() succeeded, else false. Used for keeping a running value between successive calls. - /// True if the tag was matched, else false - template - bool ParseAndAssignInt(const xmlChar* name, const char* attStr, const char* str, intT& val, bool& b) - { - bool ret = false; - T fval = 0; - - if (!Compare(name, str)) - { - b &= Atof(attStr, fval); - val = static_cast(fval); - ret = true;//Means the strcmp() was right, but doesn't necessarily mean the conversion went ok. + istr >> val; + ret = !istr.bad() && !istr.fail();//Means the Compare() was right, and the conversion succeeded. 
} return ret; diff --git a/Source/EmberAnimate/EmberAnimate.cpp b/Source/EmberAnimate/EmberAnimate.cpp index e36ed1b..c03fc44 100644 --- a/Source/EmberAnimate/EmberAnimate.cpp +++ b/Source/EmberAnimate/EmberAnimate.cpp @@ -8,10 +8,10 @@ /// /// A populated EmberOptions object which specifies all program options to be used /// True if success, else false. -template +template bool EmberAnimate(EmberOptions& opt) { - OpenCLWrapper wrapper; + OpenCLInfo& info(OpenCLInfo::Instance()); std::cout.imbue(std::locale("")); @@ -21,53 +21,93 @@ bool EmberAnimate(EmberOptions& opt) if (opt.OpenCLInfo()) { cout << "\nOpenCL Info: " << endl; - cout << wrapper.DumpInfo(); + cout << info.DumpInfo(); return true; } //Regular variables. Timing t; bool unsorted = false; - bool startXml = false; - bool finishXml = false; - bool appendXml = false; - uint finalImageIndex = 0; - uint i, channels, ftime, padding; - string s, flameName, filename, inputPath = GetPath(opt.Input()); - ostringstream os; + uint channels, padding; + size_t i; + string inputPath = GetPath(opt.Input()); vector> embers; - EmberStats stats; - EmberReport emberReport; - EmberImageComments comments; - Ember centerEmber; XmlToEmber parser; EmberToXml emberToXml; - vector finalImages[2]; - std::thread writeThread; - unique_ptr> progress(new RenderProgress()); - unique_ptr> renderer(CreateRenderer(opt.EmberCL() ? OPENCL_RENDERER : CPU_RENDERER, opt.Platform(), opt.Device(), false, 0, emberReport)); - vector errorReport = emberReport.ErrorReport(); + EmberReport emberReport; + const vector> devices = Devices(opt.Devices()); + std::atomic atomfTime; + vector threadVec; + unique_ptr> progress; + vector>> renderers; + vector errorReport; + CriticalSection verboseCs; - if (!errorReport.empty()) - emberReport.DumpErrorReport(); - - if (!renderer.get()) + if (opt.EmberCL()) { - cout << "Renderer creation failed, exiting." 
<< endl; - return false; + renderers = CreateRenderers(OPENCL_RENDERER, devices, false, 0, emberReport); + errorReport = emberReport.ErrorReport(); + + if (!errorReport.empty()) + emberReport.DumpErrorReport(); + + if (!renderers.size() || renderers.size() != devices.size()) + { + cout << "Only created " << renderers.size() << " renderers out of " << devices.size() << " requested, exiting." << endl; + return false; + } + + if (opt.DoProgress()) + { + progress = unique_ptr>(new RenderProgress()); + renderers[0]->Callback(progress.get()); + } + + cout << "Using OpenCL to render." << endl; + + if (opt.Verbose()) + { + for (auto& device : devices) + { + cout << "Platform: " << info.PlatformName(device.first) << endl; + cout << "Device: " << info.DeviceName(device.first, device.second) << endl; + } + } + + if (opt.ThreadCount() > 1) + cout << "Cannot specify threads with OpenCL, using 1 thread." << endl; + + opt.ThreadCount(1); + + for (auto& r : renderers) + r->ThreadCount(opt.ThreadCount(), opt.IsaacSeed() != "" ? opt.IsaacSeed().c_str() : nullptr); + + if (opt.BitsPerChannel() != 8) + { + cout << "Bits per channel cannot be anything other than 8 with OpenCL, setting to 8." << endl; + opt.BitsPerChannel(8); + } } - - if (opt.EmberCL() && renderer->RendererType() != OPENCL_RENDERER)//OpenCL init failed, so fall back to CPU. - opt.EmberCL(false); - - if (!InitPaletteList(opt.PalettePath())) - return false; - - if (!ParseEmberFile(parser, opt.Input(), embers)) - return false; - - if (!opt.EmberCL()) + else { + unique_ptr> tempRenderer(CreateRenderer(CPU_RENDERER, devices, false, 0, emberReport)); + errorReport = emberReport.ErrorReport(); + + if (!errorReport.empty()) + emberReport.DumpErrorReport(); + + if (!tempRenderer.get()) + { + cout << "Renderer creation failed, exiting." 
<< endl; + return false; + } + + if (opt.DoProgress()) + { + progress = unique_ptr>(new RenderProgress()); + tempRenderer->Callback(progress.get()); + } + if (opt.ThreadCount() == 0) { cout << "Using " << Timing::ProcessorCount() << " automatically detected threads." << endl; @@ -78,30 +118,15 @@ bool EmberAnimate(EmberOptions& opt) cout << "Using " << opt.ThreadCount() << " manually specified threads." << endl; } - renderer->ThreadCount(opt.ThreadCount(), opt.IsaacSeed() != "" ? opt.IsaacSeed().c_str() : nullptr); + tempRenderer->ThreadCount(opt.ThreadCount(), opt.IsaacSeed() != "" ? opt.IsaacSeed().c_str() : nullptr); + renderers.push_back(std::move(tempRenderer)); } - else - { - cout << "Using OpenCL to render." << endl; - if (opt.Verbose()) - { - cout << "Platform: " << wrapper.PlatformName(opt.Platform()) << endl; - cout << "Device: " << wrapper.DeviceName(opt.Platform(), opt.Device()) << endl; - } + if (!InitPaletteList(opt.PalettePath())) + return false; - if (opt.ThreadCount() > 1) - cout << "Cannot specify threads with OpenCL, using 1 thread." << endl; - - opt.ThreadCount(1); - renderer->ThreadCount(opt.ThreadCount(), opt.IsaacSeed() != "" ? opt.IsaacSeed().c_str() : nullptr); - - if (opt.BitsPerChannel() != 8) - { - cout << "Bits per channel cannot be anything other than 8 with OpenCL, setting to 8." << endl; - opt.BitsPerChannel(8); - } - } + if (!ParseEmberFile(parser, opt.Input(), embers)) + return false; if (opt.Format() != "jpg" && opt.Format() != "png" && @@ -196,7 +221,7 @@ bool EmberAnimate(EmberOptions& opt) //Cast to double in case the value exceeds 2^32. double imageMem = double(channels) * double(embers[i].m_FinalRasW) - * double(embers[i].m_FinalRasH) * double(renderer->BytesPerChannel()); + * double(embers[i].m_FinalRasH) * double(renderers[0]->BytesPerChannel()); double maxMem = pow(2.0, double((sizeof(void*) * 8) - 1)); if (imageMem > maxMem)//Ensure the max amount of memory for a process isn't exceeded. 
@@ -236,128 +261,168 @@ bool EmberAnimate(EmberOptions& opt) opt.FirstFrame(int(embers[0].m_Time)); if (opt.LastFrame() == UINT_MAX) - opt.LastFrame(ClampGte(uint(embers.back().m_Time - 1), opt.FirstFrame())); + opt.LastFrame(ClampGte(size_t(embers.back().m_Time - 1), opt.FirstFrame())); } if (!opt.Out().empty()) { - appendXml = true; - filename = opt.Out(); - cout << "Single output file " << opt.Out() << " specified for multiple images. They will be all overwritten and only the last image will remain." << endl; + cout << "Single output file " << opt.Out() << " specified for multiple images. They would all be overwritten and only the last image would remain, exiting." << endl; + return false; } //Final setup steps before running. - os.imbue(std::locale("")); - padding = uint(log10((double)embers.size())) + 1; - renderer->SetEmber(embers); - renderer->EarlyClip(opt.EarlyClip()); - renderer->YAxisUp(opt.YAxisUp()); - renderer->LockAccum(opt.LockAccum()); - renderer->InsertPalette(opt.InsertPalette()); - renderer->PixelAspectRatio(T(opt.AspectRatio())); - renderer->Transparency(opt.Transparency()); - renderer->NumChannels(channels); - renderer->BytesPerChannel(opt.BitsPerChannel() / 8); - renderer->Priority((eThreadPriority)Clamp((int)eThreadPriority::LOWEST, (int)eThreadPriority::HIGHEST, opt.Priority())); - renderer->Callback(opt.DoProgress() ? 
progress.get() : nullptr); + padding = uint(log10(double(embers.size()))) + 1; - std::function saveFunc = [&](uint threadVecIndex) + for (auto& r : renderers) + { + r->SetEmber(embers); + r->EarlyClip(opt.EarlyClip()); + r->YAxisUp(opt.YAxisUp()); + r->LockAccum(opt.LockAccum()); + r->InsertPalette(opt.InsertPalette()); + r->PixelAspectRatio(T(opt.AspectRatio())); + r->Transparency(opt.Transparency()); + r->NumChannels(channels); + r->BytesPerChannel(opt.BitsPerChannel() / 8); + r->Priority(eThreadPriority(Clamp(int(opt.Priority()), int(eThreadPriority::LOWEST), int(eThreadPriority::HIGHEST)))); + } + + std::function&, string, EmberImageComments, size_t, size_t, size_t)> saveFunc = [&](vector& finalImage, + string filename,//These are copies because this will be launched in a thread. + EmberImageComments comments, + size_t w, + size_t h, + size_t chan) { bool writeSuccess = false; - byte* finalImagep = finalImages[threadVecIndex].data(); + byte* finalImagep = finalImage.data(); - if ((opt.Format() == "jpg" || opt.Format() == "bmp") && renderer->NumChannels() == 4) - RgbaToRgb(finalImages[threadVecIndex], finalImages[threadVecIndex], renderer->FinalRasW(), renderer->FinalRasH()); + if ((opt.Format() == "jpg" || opt.Format() == "bmp") && chan == 4) + RgbaToRgb(finalImage, finalImage, w, h); if (opt.Format() == "png") - writeSuccess = WritePng(filename.c_str(), finalImagep, renderer->FinalRasW(), renderer->FinalRasH(), opt.BitsPerChannel() / 8, opt.PngComments(), comments, opt.Id(), opt.Url(), opt.Nick()); + writeSuccess = WritePng(filename.c_str(), finalImagep, w, h, opt.BitsPerChannel() / 8, opt.PngComments(), comments, opt.Id(), opt.Url(), opt.Nick()); else if (opt.Format() == "jpg") - writeSuccess = WriteJpeg(filename.c_str(), finalImagep, renderer->FinalRasW(), renderer->FinalRasH(), opt.JpegQuality(), opt.JpegComments(), comments, opt.Id(), opt.Url(), opt.Nick()); + writeSuccess = WriteJpeg(filename.c_str(), finalImagep, w, h, int(opt.JpegQuality()), 
opt.JpegComments(), comments, opt.Id(), opt.Url(), opt.Nick()); else if (opt.Format() == "ppm") - writeSuccess = WritePpm(filename.c_str(), finalImagep, renderer->FinalRasW(), renderer->FinalRasH()); + writeSuccess = WritePpm(filename.c_str(), finalImagep, w, h); else if (opt.Format() == "bmp") - writeSuccess = WriteBmp(filename.c_str(), finalImagep, renderer->FinalRasW(), renderer->FinalRasH()); + writeSuccess = WriteBmp(filename.c_str(), finalImagep, w, h); if (!writeSuccess) - cout << "Error writing " << filename << endl;/**/ + cout << "Error writing " << filename << endl; }; + + atomfTime.store(opt.FirstFrame()); - //Begin run. - for (ftime = opt.FirstFrame(); ftime <= opt.LastFrame(); ftime += opt.Dtime()) + std::function iterFunc = [&](size_t index) { - T localTime = T(ftime); + size_t ftime, finalImageIndex = 0; + string filename, flameName; + RendererBase* renderer = renderers[index].get(); + ostringstream fnstream, os; + EmberStats stats; + EmberImageComments comments; + Ember centerEmber; + vector finalImages[2]; + std::thread writeThread; - if ((opt.LastFrame() - opt.FirstFrame()) / opt.Dtime() >= 1) - VerbosePrint("Time = " << ftime << " / " << opt.LastFrame() << " / " << opt.Dtime()); + os.imbue(std::locale("")); - renderer->Reset(); - - if ((renderer->Run(finalImages[finalImageIndex], localTime) != RENDER_OK) || renderer->Aborted() || finalImages[finalImageIndex].empty()) + while (atomfTime.fetch_add(opt.Dtime()), ((ftime = atomfTime.load()) <= opt.LastFrame())) { - cout << "Error: image rendering failed, skipping to next image." << endl; - renderer->DumpErrorReport();//Something went wrong, print errors. 
- continue; - } + T localTime = T(ftime) - 1; - if (opt.Out().empty()) - { - ostringstream fnstream; + if (opt.Verbose() && ((opt.LastFrame() - opt.FirstFrame()) / opt.Dtime() >= 1)) + { + verboseCs.Enter(); + cout << "Time = " << ftime << " / " << opt.LastFrame() << " / " << opt.Dtime() << endl; + verboseCs.Leave(); + } + + renderer->Reset(); + + if ((renderer->Run(finalImages[finalImageIndex], localTime) != RENDER_OK) || renderer->Aborted() || finalImages[finalImageIndex].empty()) + { + cout << "Error: image rendering failed, aborting." << endl; + renderer->DumpErrorReport();//Something went wrong, print errors. + atomfTime.store(opt.LastFrame() + 1);//Abort all threads if any of them encounters an error. + break; + } fnstream << inputPath << opt.Prefix() << setfill('0') << setw(padding) << ftime << opt.Suffix() << "." << opt.Format(); filename = fnstream.str(); - } + fnstream.str(""); - if (opt.WriteGenome()) - { - flameName = filename.substr(0, filename.find_last_of('.')) + ".flam3"; - VerbosePrint("Writing " + flameName); - Interpolater::Interpolate(embers, localTime, 0, centerEmber);//Get center flame. - - if (appendXml) + if (opt.WriteGenome()) { - startXml = ftime == opt.FirstFrame(); - finishXml = ftime == opt.LastFrame(); + flameName = filename.substr(0, filename.find_last_of('.')) + ".flam3"; + + if (opt.Verbose()) + { + verboseCs.Enter(); + cout << "Writing " << flameName << endl; + verboseCs.Leave(); + } + + Interpolater::Interpolate(embers, localTime, 0, centerEmber);//Get center flame. 
+ emberToXml.Save(flameName, centerEmber, opt.PrintEditDepth(), true, opt.IntPalette(), opt.HexPalette(), true, false, false); + centerEmber.Clear(); } - emberToXml.Save(flameName, centerEmber, opt.PrintEditDepth(), true, opt.IntPalette(), opt.HexPalette(), true, startXml, finishXml); + stats = renderer->Stats(); + comments = renderer->ImageComments(stats, opt.PrintEditDepth(), opt.IntPalette(), opt.HexPalette()); + os.str(""); + size_t iterCount = renderer->TotalIterCount(1); + os << comments.m_NumIters << " / " << iterCount << " (" << std::fixed << std::setprecision(2) << ((double(stats.m_Iters) / double(iterCount)) * 100) << "%)"; + + if (opt.Verbose()) + { + verboseCs.Enter(); + cout << "\nIters ran/requested: " + os.str() << endl; + if (!opt.EmberCL()) cout << "Bad values: " << stats.m_Badvals << endl; + cout << "Render time: " << t.Format(stats.m_RenderMs) << endl; + cout << "Pure iter time: " << t.Format(stats.m_IterMs) << endl; + cout << "Iters/sec: " << size_t(stats.m_Iters / (stats.m_IterMs / 1000.0)) << endl; + cout << "Writing " << filename << endl << endl; + verboseCs.Leave(); + } + + //Run image writing in a thread. Although doing it this way duplicates the final output memory, it saves a lot of time + //when running with OpenCL. Call join() to ensure the previous thread call has completed. + if (writeThread.joinable()) + writeThread.join(); + + auto threadVecIndex = finalImageIndex;//Cache before launching thread. + + if (opt.ThreadedWrite())//Copies of all but the first parameter are passed to saveFunc(), so they do not conflict with those values changing when the render for the next image starts. + { + writeThread = std::thread(saveFunc, std::ref(finalImages[threadVecIndex]), filename, comments, renderer->FinalRasW(), renderer->FinalRasH(), renderer->NumChannels()); + finalImageIndex ^= 1;//Toggle the index. 
+ } + else + saveFunc(finalImages[threadVecIndex], filename, comments, renderer->FinalRasW(), renderer->FinalRasH(), renderer->NumChannels());//Will always use the first index, thereby not requiring more memory. } - stats = renderer->Stats(); - comments = renderer->ImageComments(stats, opt.PrintEditDepth(), opt.IntPalette(), opt.HexPalette()); - os.str(""); - size_t iterCount = renderer->TotalIterCount(1); - os << comments.m_NumIters << " / " << iterCount << " (" << std::fixed << std::setprecision(2) << ((double(stats.m_Iters) / double(iterCount)) * 100) << "%)"; - - VerbosePrint("\nIters ran/requested: " + os.str()); - VerbosePrint("Bad values: " << stats.m_Badvals); - VerbosePrint("Render time: " + t.Format(stats.m_RenderMs)); - VerbosePrint("Pure iter time: " + t.Format(stats.m_IterMs)); - VerbosePrint("Iters/sec: " << size_t(stats.m_Iters / (stats.m_IterMs / 1000.0)) << endl); - VerbosePrint("Writing " + filename); - - //Run image writing in a thread. Although doing it this way duplicates the final output memory, it saves a lot of time - //when running with OpenCL. Call join() to ensure the previous thread call has completed. - if (writeThread.joinable()) + if (writeThread.joinable())//One final check to make sure all writing is done before exiting this thread. writeThread.join(); + }; - uint threadVecIndex = finalImageIndex;//Cache before launching thread. + threadVec.reserve(renderers.size()); - if (opt.ThreadedWrite()) - writeThread = std::thread(saveFunc, threadVecIndex); - else - saveFunc(threadVecIndex); - - centerEmber.Clear(); - finalImageIndex ^= 1;//Toggle the index. 
+ for (size_t r = 0; r < renderers.size(); r++) + { + threadVec.push_back(std::thread([&](size_t dev) + { + iterFunc(dev); + }, r)); } - if (writeThread.joinable()) - writeThread.join(); + for (auto& th : threadVec) + if (th.joinable()) + th.join(); - VerbosePrint("Done.\n"); - - if (opt.Verbose()) - t.Toc("\nTotal time: ", true); + t.Toc("\nFinished in: ", true); return true; } @@ -387,18 +452,18 @@ int _tmain(int argc, _TCHAR* argv[]) #ifdef DO_DOUBLE if (opt.Bits() == 64) { - b = EmberAnimate(opt); + b = EmberAnimate(opt); } else #endif if (opt.Bits() == 33) { - b = EmberAnimate(opt); + b = EmberAnimate(opt); } else if (opt.Bits() == 32) { cout << "Bits 32/int histogram no longer supported. Using bits == 33 (float)." << endl; - b = EmberAnimate(opt); + b = EmberAnimate(opt); } } diff --git a/Source/EmberCL/EmberCLPch.h b/Source/EmberCL/EmberCLPch.h index 7fd5a95..2de2606 100644 --- a/Source/EmberCL/EmberCLPch.h +++ b/Source/EmberCL/EmberCLPch.h @@ -1,4 +1,6 @@ -#pragma once +#ifdef WIN32 + #pragma once +#endif /// /// Precompiled header file. Place all system includes here with appropriate #defines for different operating systems and compilers. 
@@ -37,6 +39,7 @@ #include #include +#include #include #include #include diff --git a/Source/EmberCL/IterOpenCLKernelCreator.cpp b/Source/EmberCL/IterOpenCLKernelCreator.cpp index d634152..7de6067 100644 --- a/Source/EmberCL/IterOpenCLKernelCreator.cpp +++ b/Source/EmberCL/IterOpenCLKernelCreator.cpp @@ -15,7 +15,9 @@ IterOpenCLKernelCreator::IterOpenCLKernelCreator() { m_IterEntryPoint = "IterateKernel"; m_ZeroizeEntryPoint = "ZeroizeKernel"; + m_SumHistEntryPoint = "SumHisteKernel"; m_ZeroizeKernel = CreateZeroizeKernelString(); + m_SumHistKernel = CreateSumHistKernelString(); } /// @@ -24,6 +26,8 @@ IterOpenCLKernelCreator::IterOpenCLKernelCreator() template const string& IterOpenCLKernelCreator::ZeroizeKernel() const { return m_ZeroizeKernel; } template const string& IterOpenCLKernelCreator::ZeroizeEntryPoint() const { return m_ZeroizeEntryPoint; } +template const string& IterOpenCLKernelCreator::SumHistKernel() const { return m_SumHistKernel; } +template const string& IterOpenCLKernelCreator::SumHistEntryPoint() const { return m_SumHistEntryPoint; } template const string& IterOpenCLKernelCreator::IterEntryPoint() const { return m_IterEntryPoint; } /// @@ -703,6 +707,30 @@ string IterOpenCLKernelCreator::CreateZeroizeKernelString() return os.str(); } +template +string IterOpenCLKernelCreator::CreateSumHistKernelString() +{ + ostringstream os; + + os << + ConstantDefinesString(typeid(T) == typeid(double)) <= width || GLOBAL_ID_Y >= height)\n" + " return;\n" + "\n" + " dest[(GLOBAL_ID_Y * width) + GLOBAL_ID_X] += source[(GLOBAL_ID_Y * width) + GLOBAL_ID_X];\n"//Can't use INDEX_IN_GRID_2D here because the grid might be larger than the buffer to make even dimensions. + "\n" + " if (clear)\n" + " source[(GLOBAL_ID_Y * width) + GLOBAL_ID_X] = 0;\n" + "\n" + " barrier(CLK_GLOBAL_MEM_FENCE);\n"//Just to be safe. + "}\n" + "\n"; + + return os.str(); +} + /// /// Create the string for 3D projection based on the 3D values of the ember. 
/// Projection is done on the second point. diff --git a/Source/EmberCL/IterOpenCLKernelCreator.h b/Source/EmberCL/IterOpenCLKernelCreator.h index 9054ce1..fdb5db5 100644 --- a/Source/EmberCL/IterOpenCLKernelCreator.h +++ b/Source/EmberCL/IterOpenCLKernelCreator.h @@ -26,6 +26,8 @@ public: IterOpenCLKernelCreator(); const string& ZeroizeKernel() const; const string& ZeroizeEntryPoint() const; + const string& SumHistKernel() const; + const string& SumHistEntryPoint() const; const string& IterEntryPoint() const; string CreateIterKernelString(Ember& ember, string& parVarDefines, bool lockAccum = false, bool doAccum = true); static void ParVarIndexDefines(Ember& ember, pair>& params, bool doVals = true, bool doString = true); @@ -33,18 +35,21 @@ public: private: string CreateZeroizeKernelString(); + string CreateSumHistKernelString(); string CreateProjectionString(Ember& ember); string m_IterEntryPoint; string m_ZeroizeKernel; string m_ZeroizeEntryPoint; + string m_SumHistKernel; + string m_SumHistEntryPoint; }; #ifdef OPEN_CL_TEST_AREA -typedef void (*KernelFuncPointer) (uint gridWidth, uint gridHeight, uint blockWidth, uint blockHeight, - uint BLOCK_ID_X, uint BLOCK_ID_Y, uint THREAD_ID_X, uint THREAD_ID_Y); +typedef void (*KernelFuncPointer) (size_t gridWidth, size_t gridHeight, size_t blockWidth, size_t blockHeight, + size_t BLOCK_ID_X, size_t BLOCK_ID_Y, size_t THREAD_ID_X, size_t THREAD_ID_Y); -static void OpenCLSim(uint gridWidth, uint gridHeight, uint blockWidth, uint blockHeight, KernelFuncPointer func) +static void OpenCLSim(size_t gridWidth, size_t gridHeight, size_t blockWidth, size_t blockHeight, KernelFuncPointer func) { cout << "OpenCLSim(): " << endl; cout << " Params: " << endl; @@ -53,13 +58,13 @@ static void OpenCLSim(uint gridWidth, uint gridHeight, uint blockWidth, uint blo cout << " blockW: " << blockWidth << endl; cout << " blockH: " << blockHeight << endl; - for (uint i = 0; i < gridHeight; i += blockHeight) + for (size_t i = 0; i < gridHeight; 
i += blockHeight) { - for (uint j = 0; j < gridWidth; j += blockWidth) + for (size_t j = 0; j < gridWidth; j += blockWidth) { - for (uint k = 0; k < blockHeight; k++) + for (size_t k = 0; k < blockHeight; k++) { - for (uint l = 0; l < blockWidth; l++) + for (size_t l = 0; l < blockWidth; l++) { func(gridWidth, gridHeight, blockWidth, blockHeight, j / blockWidth, i / blockHeight, l, k); } diff --git a/Source/EmberCL/OpenCLInfo.cpp b/Source/EmberCL/OpenCLInfo.cpp new file mode 100644 index 0000000..2e70d44 --- /dev/null +++ b/Source/EmberCL/OpenCLInfo.cpp @@ -0,0 +1,406 @@ +#include "EmberCLPch.h" +#include "OpenCLInfo.h" + +namespace EmberCLns +{ +/// +/// Initialize and return a reference to the one and only OpenCLInfo object. +/// +/// A reference to the only OpenCLInfo object. +OpenCLInfo& OpenCLInfo::Instance() +{ + static OpenCLInfo instance; + + return instance; +} + +/// +/// Initialize all platforms and devices and keep information about them in lists. +/// +OpenCLInfo::OpenCLInfo() +{ + cl_int err; + vector platforms; + vector> devices; + intmax_t workingPlatformIndex = -1; + + m_Init = false; + cl::Platform::get(&platforms); + devices.resize(platforms.size()); + m_Platforms.reserve(platforms.size()); + m_Devices.reserve(platforms.size()); + m_DeviceNames.reserve(platforms.size()); + m_AllDeviceNames.reserve(platforms.size()); + m_DeviceIndices.reserve(platforms.size()); + + for (size_t i = 0; i < platforms.size(); i++) + platforms[i].getDevices(CL_DEVICE_TYPE_ALL, &devices[i]); + + for (size_t platform = 0; platform < platforms.size(); platform++) + { + bool platformOk = false; + bool deviceOk = false; + cl::Context context; + + if (CreateContext(platforms[platform], context, false))//Platform is ok, now do context. Unshared by default. + { + size_t workingDeviceIndex = 0; + + for (size_t device = 0; device < devices[platform].size(); device++)//Context is ok, now do devices. 
+ { + auto q = cl::CommandQueue(context, devices[platform][device], 0, &err);//At least one GPU device is present, so create a command queue. + + if (CheckCL(err, "cl::CommandQueue()")) + { + if (!platformOk) + { + m_Platforms.push_back(platforms[platform]); + m_PlatformNames.push_back(platforms[platform].getInfo(nullptr) + " " + platforms[platform].getInfo(nullptr) + " " + platforms[platform].getInfo(nullptr)); + workingPlatformIndex++; + platformOk = true; + } + + if (!deviceOk) + { + m_Devices.push_back(vector()); + m_DeviceNames.push_back(vector()); + m_Devices.back().reserve(devices[platform].size()); + m_DeviceNames.back().reserve(devices[platform].size()); + deviceOk = true; + } + + m_Devices.back().push_back(devices[platform][device]); + m_DeviceNames.back().push_back(devices[platform][device].getInfo(nullptr) + " " + devices[platform][device].getInfo(nullptr));// + " " + devices[platform][device].getInfo()); + m_AllDeviceNames.push_back(m_DeviceNames.back().back()); + m_DeviceIndices.push_back(pair(workingPlatformIndex, workingDeviceIndex++)); + m_Init = true;//If at least one platform and device succeeded, OpenCL is ok. It's now ok to begin building and running programs. + } + } + } + } +} + +/// +/// Get a const reference to the vector of available platforms. +/// +/// A const reference to the vector of available platforms +const vector& OpenCLInfo::Platforms() const +{ + return m_Platforms; +} + +/// +/// Get a const reference to the platform name at the specified index. +/// +/// The platform index to get the name of +/// The platform name if found, else empty string +const string& OpenCLInfo::PlatformName(size_t platform) const +{ + static string s; + return platform < m_PlatformNames.size() ? m_PlatformNames[platform] : s; +} + +/// +/// Get a const reference to a vector of all available platform names on the system as a vector of strings. 
+/// +/// All available platform names on the system as a vector of strings +const vector& OpenCLInfo::PlatformNames() const +{ + return m_PlatformNames; +} + +/// +/// Get a const reference to a vector of vectors of all available devices on the system. +/// Each element of the outer vector is a different platform. +/// +/// All available devices on the system, grouped by platform. +const vector>& OpenCLInfo::Devices() const +{ + return m_Devices; +} + +/// +/// Get a const reference to the device name at the specified index on the platform +/// at the specified index. +/// +/// The platform index of the device +/// The device index +/// The name of the device if found, else empty string +const string& OpenCLInfo::DeviceName(size_t platform, size_t device) const +{ + static string s; + + if (platform < m_Platforms.size() && platform < m_Devices.size()) + if (device < m_Devices[platform].size()) + return m_DeviceNames[platform][device]; + + return s; +} + +/// +/// Get a const reference to a vector of pairs of uints which contain the platform,device +/// indices of all available devices on the system. +/// +/// All available devices on the system as platform,device index pairs +const vector>& OpenCLInfo::DeviceIndices() const +{ + return m_DeviceIndices; +} + +/// +/// Get a const reference to a vector of all available device names on the system as a vector of strings. +/// +/// All available device names on the system as a vector of strings +const vector& OpenCLInfo::AllDeviceNames() const +{ + return m_AllDeviceNames; +} + +/// +/// Get a const reference to a vector of all available device names on the platform +/// at the specified index as a vector of strings. +/// +/// The platform index whose device names will be returned +/// All available device names on the platform at the specified index as a vector of strings if within range, else empty vector. 
+const vector<string>& OpenCLInfo::DeviceNames(size_t platform) const +{ + static vector<string> v; + + if (platform < m_DeviceNames.size()) + return m_DeviceNames[platform]; + + return v; +} + +/// +/// Get the total device index at the specified platform and device index. +/// +/// The platform index of the device +/// The device index within the platform +/// The total device index if found, else 0 +size_t OpenCLInfo::TotalDeviceIndex(size_t platform, size_t device) const +{ + size_t index = 0; + pair<size_t, size_t> p{ platform, device }; + + for (size_t i = 0; i < m_DeviceIndices.size(); i++) + { + if (p == m_DeviceIndices[i]) + { + index = i; + break; + } + } + + return index; +} + +/// +/// Create a context that is optionally shared with OpenGL and place it in the +/// passed in context ref parameter. +/// +/// The platform object to create the context on +/// The context object to store the result in +/// True if shared with OpenGL, else not shared. +/// True if success, else false. +bool OpenCLInfo::CreateContext(const cl::Platform& platform, cl::Context& context, bool shared) +{ + cl_int err; + + if (shared) + { + //Define OS-specific context properties and create the OpenCL context. + #if defined (__APPLE__) || defined(MACOSX) + CGLContextObj kCGLContext = CGLGetCurrentContext(); + CGLShareGroupObj kCGLShareGroup = CGLGetShareGroup(kCGLContext); + cl_context_properties props[] = + { + CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE, (cl_context_properties)kCGLShareGroup, + 0 + }; + + context = cl::Context(CL_DEVICE_TYPE_GPU, props, nullptr, nullptr, &err);//May need to tinker with this on Mac.
+ #else + #if defined WIN32 + cl_context_properties props[] = + { + CL_GL_CONTEXT_KHR, (cl_context_properties)wglGetCurrentContext(), + CL_WGL_HDC_KHR, (cl_context_properties)wglGetCurrentDC(), + CL_CONTEXT_PLATFORM, reinterpret_cast((platform)()), + 0 + }; + + context = cl::Context(CL_DEVICE_TYPE_GPU, props, nullptr, nullptr, &err); + #else + cl_context_properties props[] = + { + CL_GL_CONTEXT_KHR, cl_context_properties(glXGetCurrentContext()), + CL_GLX_DISPLAY_KHR, cl_context_properties(glXGetCurrentDisplay()), + CL_CONTEXT_PLATFORM, reinterpret_cast((platform)()), + 0 + }; + + context = cl::Context(CL_DEVICE_TYPE_GPU, props, nullptr, nullptr, &err); + #endif + #endif + } + else + { + cl_context_properties props[3] = + { + CL_CONTEXT_PLATFORM, + reinterpret_cast((platform)()), + 0 + }; + + context = cl::Context(CL_DEVICE_TYPE_ALL, props, nullptr, nullptr, &err); + } + + return CheckCL(err, "cl::Context()"); +} + +/// +/// Return whether at least one device has been found and properly initialized. +/// +/// True if success, else false. +bool OpenCLInfo::Ok() const +{ + return m_Init; +} + +/// +/// Get all information about all platforms and devices. 
+/// +/// A string with all information about all platforms and devices +string OpenCLInfo::DumpInfo() const +{ + ostringstream os; + vector sizes; + + os.imbue(locale("")); + + for (size_t platform = 0; platform < m_Platforms.size(); platform++) + { + os << "Platform " << platform << ": " << PlatformName(platform) << endl; + + for (size_t device = 0; device < m_Devices[platform].size(); device++) + { + os << "Device " << device << ": " << DeviceName(platform, device) << endl; + os << "CL_DEVICE_OPENCL_C_VERSION: " << GetInfo(platform, device, CL_DEVICE_OPENCL_C_VERSION) << endl; + os << "CL_DEVICE_LOCAL_MEM_SIZE: " << GetInfo(platform, device, CL_DEVICE_LOCAL_MEM_SIZE) << endl; + os << "CL_DEVICE_LOCAL_MEM_TYPE: " << GetInfo(platform, device, CL_DEVICE_LOCAL_MEM_TYPE) << endl; + os << "CL_DEVICE_MAX_COMPUTE_UNITS: " << GetInfo(platform, device, CL_DEVICE_MAX_COMPUTE_UNITS) << endl; + os << "CL_DEVICE_MAX_READ_IMAGE_ARGS: " << GetInfo(platform, device, CL_DEVICE_MAX_READ_IMAGE_ARGS) << endl; + os << "CL_DEVICE_MAX_WRITE_IMAGE_ARGS: " << GetInfo(platform, device, CL_DEVICE_MAX_WRITE_IMAGE_ARGS) << endl; + os << "CL_DEVICE_MAX_MEM_ALLOC_SIZE: " << GetInfo(platform, device, CL_DEVICE_MAX_MEM_ALLOC_SIZE) << endl; + os << "CL_DEVICE_ADDRESS_BITS: " << GetInfo(platform, device, CL_DEVICE_ADDRESS_BITS) << endl; + + os << "CL_DEVICE_GLOBAL_MEM_CACHE_TYPE: " << GetInfo(platform, device, CL_DEVICE_GLOBAL_MEM_CACHE_TYPE) << endl; + os << "CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: " << GetInfo(platform, device, CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE) << endl; + os << "CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: " << GetInfo(platform, device, CL_DEVICE_GLOBAL_MEM_CACHE_SIZE) << endl; + os << "CL_DEVICE_GLOBAL_MEM_SIZE: " << GetInfo(platform, device, CL_DEVICE_GLOBAL_MEM_SIZE) << endl; + os << "CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: " << GetInfo(platform, device, CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE) << endl; + + os << "CL_DEVICE_MAX_CONSTANT_ARGS: " << GetInfo(platform, device, 
CL_DEVICE_MAX_CONSTANT_ARGS) << endl; + os << "CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: " << GetInfo(platform, device, CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS) << endl; + os << "CL_DEVICE_MAX_WORK_GROUP_SIZE: " << GetInfo(platform, device, CL_DEVICE_MAX_WORK_GROUP_SIZE) << endl; + + sizes = GetInfo>(platform, device, CL_DEVICE_MAX_WORK_ITEM_SIZES); + os << "CL_DEVICE_MAX_WORK_ITEM_SIZES: " << sizes[0] << ", " << sizes[1] << ", " << sizes[2] << endl << endl; + + if (device != m_Devices[platform].size() - 1 && platform != m_Platforms.size() - 1) + os << endl; + } + + os << endl; + } + + return os.str(); +} + +/// +/// Check an OpenCL return value for errors. +/// +/// The error code to inspect +/// A description of where the value was gotten from +/// True if success, else false. +bool OpenCLInfo::CheckCL(cl_int err, const char* name) +{ + if (err != CL_SUCCESS) + { + ostringstream ss; + ss << "ERROR: " << ErrorToStringCL(err) << " in " << name << "." << endl; + m_ErrorReport.push_back(ss.str()); + } + + return err == CL_SUCCESS; +} + +/// +/// Translate an OpenCL error code into a human readable string. 
+/// +/// The error code to translate +/// A human readable description of the error passed in +string OpenCLInfo::ErrorToStringCL(cl_int err) +{ + switch (err) + { + case CL_SUCCESS: return "Success"; + case CL_DEVICE_NOT_FOUND: return "Device not found"; + case CL_DEVICE_NOT_AVAILABLE: return "Device not available"; + case CL_COMPILER_NOT_AVAILABLE: return "Compiler not available"; + case CL_MEM_OBJECT_ALLOCATION_FAILURE: return "Memory object allocation failure"; + case CL_OUT_OF_RESOURCES: return "Out of resources"; + case CL_OUT_OF_HOST_MEMORY: return "Out of host memory"; + case CL_PROFILING_INFO_NOT_AVAILABLE: return "Profiling information not available"; + case CL_MEM_COPY_OVERLAP: return "Memory copy overlap"; + case CL_IMAGE_FORMAT_MISMATCH: return "Image format mismatch"; + case CL_IMAGE_FORMAT_NOT_SUPPORTED: return "Image format not supported"; + case CL_BUILD_PROGRAM_FAILURE: return "Program build failure"; + case CL_MAP_FAILURE: return "Map failure"; + case CL_MISALIGNED_SUB_BUFFER_OFFSET: return "Misaligned sub buffer offset"; + case CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST: return "Exec status error for events in wait list"; + case CL_INVALID_VALUE: return "Invalid value"; + case CL_INVALID_DEVICE_TYPE: return "Invalid device type"; + case CL_INVALID_PLATFORM: return "Invalid platform"; + case CL_INVALID_DEVICE: return "Invalid device"; + case CL_INVALID_CONTEXT: return "Invalid context"; + case CL_INVALID_QUEUE_PROPERTIES: return "Invalid queue properties"; + case CL_INVALID_COMMAND_QUEUE: return "Invalid command queue"; + case CL_INVALID_HOST_PTR: return "Invalid host pointer"; + case CL_INVALID_MEM_OBJECT: return "Invalid memory object"; + case CL_INVALID_IMAGE_FORMAT_DESCRIPTOR: return "Invalid image format descriptor"; + case CL_INVALID_IMAGE_SIZE: return "Invalid image size"; + case CL_INVALID_SAMPLER: return "Invalid sampler"; + case CL_INVALID_BINARY: return "Invalid binary"; + case CL_INVALID_BUILD_OPTIONS: return "Invalid build 
options"; + case CL_INVALID_PROGRAM: return "Invalid program"; + case CL_INVALID_PROGRAM_EXECUTABLE: return "Invalid program executable"; + case CL_INVALID_KERNEL_NAME: return "Invalid kernel name"; + case CL_INVALID_KERNEL_DEFINITION: return "Invalid kernel definition"; + case CL_INVALID_KERNEL: return "Invalid kernel"; + case CL_INVALID_ARG_INDEX: return "Invalid argument index"; + case CL_INVALID_ARG_VALUE: return "Invalid argument value"; + case CL_INVALID_ARG_SIZE: return "Invalid argument size"; + case CL_INVALID_KERNEL_ARGS: return "Invalid kernel arguments"; + case CL_INVALID_WORK_DIMENSION: return "Invalid work dimension"; + case CL_INVALID_WORK_GROUP_SIZE: return "Invalid work group size"; + case CL_INVALID_WORK_ITEM_SIZE: return "Invalid work item size"; + case CL_INVALID_GLOBAL_OFFSET: return "Invalid global offset"; + case CL_INVALID_EVENT_WAIT_LIST: return "Invalid event wait list"; + case CL_INVALID_EVENT: return "Invalid event"; + case CL_INVALID_OPERATION: return "Invalid operation"; + case CL_INVALID_GL_OBJECT: return "Invalid OpenGL object"; + case CL_INVALID_BUFFER_SIZE: return "Invalid buffer size"; + case CL_INVALID_MIP_LEVEL: return "Invalid mip-map level"; + case CL_INVALID_GLOBAL_WORK_SIZE: return "Invalid global work size"; + case CL_INVALID_PROPERTY: return "Invalid property"; + default: + { + ostringstream ss; + ss << " " << err; + return ss.str(); + } + } +} +} diff --git a/Source/EmberCL/OpenCLInfo.h b/Source/EmberCL/OpenCLInfo.h new file mode 100644 index 0000000..eceedb0 --- /dev/null +++ b/Source/EmberCL/OpenCLInfo.h @@ -0,0 +1,69 @@ +#pragma once + +#include "EmberCLPch.h" + +/// +/// OpenCLInfo class. +/// + +namespace EmberCLns +{ +/// +/// Keeps information about all valid OpenCL devices on this system. +/// Devices which do not successfully create a test command queue are not +/// added to the list. 
+/// The class is a singleton, so there is only one instance per program, +/// retrievable by reference via the Instance() function. +/// This class derives from EmberReport, so the caller is able +/// to retrieve a text dump of error information if any errors occur. +/// +class EMBERCL_API OpenCLInfo : public EmberReport +{ +public: + static OpenCLInfo& Instance(); + const vector<cl::Platform>& Platforms() const; + const string& PlatformName(size_t platform) const; + const vector<string>& PlatformNames() const; + const vector<vector<cl::Device>>& Devices() const; + const string& DeviceName(size_t platform, size_t device) const; + const vector<pair<size_t, size_t>>& DeviceIndices() const; + const vector<string>& AllDeviceNames() const; + const vector<string>& DeviceNames(size_t platform) const; + size_t TotalDeviceIndex(size_t platform, size_t device) const; + string DumpInfo() const; + bool Ok() const; + bool CreateContext(const cl::Platform& platform, cl::Context& context, bool shared); + bool CheckCL(cl_int err, const char* name); + string ErrorToStringCL(cl_int err); + + /// + /// Get device information for the specified field.
+ /// Template argument expected to be cl_ulong, cl_uint or cl_int. + /// + /// The index of the platform to use + /// The index of the device to use + /// The device field/feature to query + /// The value of the field + template <typename T> + T GetInfo(size_t platform, size_t device, cl_device_info name) const + { + T val = T(); + + if (platform < m_Devices.size() && device < m_Devices[platform].size()) + m_Devices[platform][device].getInfo(name, &val); + + return val; + } + +private: + OpenCLInfo(); + + bool m_Init; + vector<cl::Platform> m_Platforms; + vector<vector<cl::Device>> m_Devices; + vector<string> m_PlatformNames; + vector<vector<string>> m_DeviceNames; + vector<pair<size_t, size_t>> m_DeviceIndices; + vector<string> m_AllDeviceNames; +}; +} diff --git a/Source/EmberCL/OpenCLWrapper.cpp b/Source/EmberCL/OpenCLWrapper.cpp index a73d96d..d816b82 100644 --- a/Source/EmberCL/OpenCLWrapper.cpp +++ b/Source/EmberCL/OpenCLWrapper.cpp @@ -5,33 +5,23 @@ namespace EmberCLns { /// /// Constructor that sets everything to an uninitialized state. -/// No OpenCL setup is done here, the caller must explicitly do it. +/// No OpenCL setup is done here other than what's done in the +/// global OpenCLInfo object. The caller must explicitly do it. /// OpenCLWrapper::OpenCLWrapper() + : m_Info(OpenCLInfo::Instance()) { m_Init = false; m_Shared = false; m_PlatformIndex = 0; m_DeviceIndex = 0; m_LocalMemSize = 0; - cl::Platform::get(&m_Platforms); - m_Devices.resize(m_Platforms.size()); - for (size_t i = 0; i < m_Platforms.size(); i++) - m_Platforms[i].getDevices(CL_DEVICE_TYPE_ALL, &m_Devices[i]); -} - -/// -/// Determine if OpenCL is available on the system. -/// -/// True if any OpenCL platform and at least 1 device within that platform exists on the system, else false. -bool OpenCLWrapper::CheckOpenCL() -{ - for (size_t i = 0; i < m_Platforms.size(); i++) - for (size_t j = 0; j < m_Devices[i].size(); j++) - return true; - - return false; + //Pre-allocate some space to avoid temporary copying.
+ m_Programs.reserve(4); + m_Buffers.reserve(4); + m_Images.reserve(4); + m_GLImages.reserve(4); } /// @@ -42,35 +32,40 @@ bool OpenCLWrapper::CheckOpenCL() /// The index device of the device to use /// True if shared with OpenGL, else false. /// True if success, else false. -bool OpenCLWrapper::Init(uint platform, uint device, bool shared) +bool OpenCLWrapper::Init(size_t platformIndex, size_t deviceIndex, bool shared) { cl_int err; + auto& platforms = m_Info.Platforms(); + auto& devices = m_Info.Devices(); m_Init = false; m_ErrorReport.clear(); - if (m_Platforms.size() > 0) + if (m_Info.Ok()) { - if (platform < m_Platforms.size() && platform < m_Devices.size()) + if (platformIndex < platforms.size() && platformIndex < devices.size()) { - m_PlatformIndex = platform;//Platform is ok, now do context. + cl::Context context; - if (CreateContext(shared)) + if (m_Info.CreateContext(platforms[platformIndex], context, shared))//Platform index is within range, now do context. { - //Context is ok, now do device. - if (device < m_Devices[m_PlatformIndex].size()) + if (deviceIndex < devices[platformIndex].size())//Context is ok, now do device. { - //At least one GPU device is present, so create a command queue. - m_Queue = cl::CommandQueue(m_Context, m_Devices[m_PlatformIndex][device], 0, &err); + auto q = cl::CommandQueue(context, devices[platformIndex][deviceIndex], 0, &err);//At least one GPU device is present, so create a command queue. - if (CheckCL(err, "cl::CommandQueue()")) + if (m_Info.CheckCL(err, "cl::CommandQueue()"))//Everything was successful so assign temporaries to members. 
{ - m_DeviceIndex = device; - m_Platform = m_Platforms[m_PlatformIndex]; - m_Device = m_Devices[m_PlatformIndex][device]; + m_Platform = platforms[platformIndex]; + m_Device = devices[platformIndex][deviceIndex]; + m_Context = context; + m_Queue = q; + m_PlatformIndex = platformIndex; + m_DeviceIndex = deviceIndex; m_DeviceVec.clear(); m_DeviceVec.push_back(m_Device); - m_LocalMemSize = uint(GetInfo(m_PlatformIndex, m_DeviceIndex, CL_DEVICE_LOCAL_MEM_SIZE)); + m_LocalMemSize = size_t(m_Info.GetInfo(m_PlatformIndex, m_DeviceIndex, CL_DEVICE_LOCAL_MEM_SIZE)); + m_GlobalMemSize = size_t(m_Info.GetInfo(m_PlatformIndex, m_DeviceIndex, CL_DEVICE_GLOBAL_MEM_SIZE)); + m_MaxAllocSize = size_t(m_Info.GetInfo(m_PlatformIndex, m_DeviceIndex, CL_DEVICE_MAX_MEM_ALLOC_SIZE)); m_Shared = shared; m_Init = true;//Command queue is ok, it's now ok to begin building and running programs. } @@ -96,11 +91,11 @@ bool OpenCLWrapper::AddProgram(const string& name, const string& program, const if (CreateSPK(name, program, entryPoint, spk, doublePrecision)) { - for (auto& program : m_Programs) + for (auto& p : m_Programs) { - if (name == program.m_Name) + if (name == p.m_Name) { - program = spk; + p = spk; return true; } } @@ -144,7 +139,7 @@ bool OpenCLWrapper::AddBuffer(const string& name, size_t size, cl_mem_flags flag { cl::Buffer buff(m_Context, flags, size, nullptr, &err); - if (!CheckCL(err, "cl::Buffer()")) + if (!m_Info.CheckCL(err, "cl::Buffer()")) return false; NamedBuffer nb(buff, name); @@ -157,7 +152,7 @@ bool OpenCLWrapper::AddBuffer(const string& name, size_t size, cl_mem_flags flag cl::Buffer buff(m_Context, flags, size, nullptr, &err);//Create the new buffer. - if (!CheckCL(err, "cl::Buffer()")) + if (!m_Info.CheckCL(err, "cl::Buffer()")) return false; NamedBuffer nb(buff, name);//Make a named buffer out of the new buffer. 
@@ -215,7 +210,7 @@ bool OpenCLWrapper::WriteBuffer(const string& name, void* data, size_t size) /// A pointer to the buffer /// The size in bytes of the buffer /// True if success, else false. -bool OpenCLWrapper::WriteBuffer(uint bufferIndex, void* data, size_t size) +bool OpenCLWrapper::WriteBuffer(size_t bufferIndex, void* data, size_t size) { if (m_Init && (bufferIndex < m_Buffers.size()) && (GetBufferSize(bufferIndex) == size)) { @@ -225,7 +220,7 @@ bool OpenCLWrapper::WriteBuffer(uint bufferIndex, void* data, size_t size) e.wait(); m_Queue.finish(); - if (CheckCL(err, "cl::CommandQueue::enqueueWriteBuffer()")) + if (m_Info.CheckCL(err, "cl::CommandQueue::enqueueWriteBuffer()")) return true; } @@ -253,7 +248,7 @@ bool OpenCLWrapper::ReadBuffer(const string& name, void* data, size_t size) /// A pointer to a buffer to copy the data to /// The size in bytes of the buffer /// True if success, else false. -bool OpenCLWrapper::ReadBuffer(uint bufferIndex, void* data, size_t size) +bool OpenCLWrapper::ReadBuffer(size_t bufferIndex, void* data, size_t size) { if (m_Init && (bufferIndex < m_Buffers.size()) && (GetBufferSize(bufferIndex) == size)) { @@ -263,7 +258,7 @@ bool OpenCLWrapper::ReadBuffer(uint bufferIndex, void* data, size_t size) e.wait(); m_Queue.finish(); - if (CheckCL(err, "cl::CommandQueue::enqueueReadBuffer()")) + if (m_Info.CheckCL(err, "cl::CommandQueue::enqueueReadBuffer()")) return true; } @@ -289,7 +284,7 @@ int OpenCLWrapper::FindBufferIndex(const string& name) /// /// The name of the buffer to search for /// The size of the buffer if found, else 0. -uint OpenCLWrapper::GetBufferSize(const string& name) +size_t OpenCLWrapper::GetBufferSize(const string& name) { int bufferIndex = FindBufferIndex(name); @@ -301,10 +296,10 @@ uint OpenCLWrapper::GetBufferSize(const string& name) /// /// The index of the buffer to get the size of /// The size of the buffer if found, else 0. 
-uint OpenCLWrapper::GetBufferSize(uint bufferIndex) +size_t OpenCLWrapper::GetBufferSize(size_t bufferIndex) { if (m_Init && (bufferIndex < m_Buffers.size())) - return uint(m_Buffers[bufferIndex].m_Buffer.getInfo(nullptr)); + return m_Buffers[bufferIndex].m_Buffer.getInfo(nullptr); return 0; } @@ -350,12 +345,12 @@ bool OpenCLWrapper::AddAndWriteImage(const string& name, cl_mem_flags flags, con IMAGEGL2D imageGL(m_Context, flags, GL_TEXTURE_2D, 0, texName, &err); NamedImage2DGL namedImageGL(imageGL, name); - if (CheckCL(err, "cl::ImageGL()")) + if (m_Info.CheckCL(err, "cl::ImageGL()")) { m_GLImages.push_back(namedImageGL); if (data) - return WriteImage2D(uint(m_GLImages.size() - 1), true, width, height, row_pitch, data);//OpenGL images/textures require a separate write. + return WriteImage2D(m_GLImages.size() - 1, true, width, height, row_pitch, data);//OpenGL images/textures require a separate write. else return true; } @@ -364,7 +359,7 @@ bool OpenCLWrapper::AddAndWriteImage(const string& name, cl_mem_flags flags, con { NamedImage2D namedImage(cl::Image2D(m_Context, flags, format, width, height, row_pitch, data, &err), name); - if (CheckCL(err, "cl::Image2D()")) + if (m_Info.CheckCL(err, "cl::Image2D()")) { m_Images.push_back(namedImage); return true; @@ -381,7 +376,7 @@ bool OpenCLWrapper::AddAndWriteImage(const string& name, cl_mem_flags flags, con { NamedImage2DGL namedImageGL(IMAGEGL2D(m_Context, flags, GL_TEXTURE_2D, 0, texName, &err), name);//Sizes are different, so create new. 
- if (CheckCL(err, "cl::ImageGL()")) + if (m_Info.CheckCL(err, "cl::ImageGL()")) { m_GLImages[imageIndex] = namedImageGL; } @@ -403,7 +398,7 @@ bool OpenCLWrapper::AddAndWriteImage(const string& name, cl_mem_flags flags, con NamedImage2D namedImage(cl::Image2D(m_Context, flags, format, width, height, row_pitch, data, &err), name); - if (CheckCL(err, "cl::Image2D()")) + if (m_Info.CheckCL(err, "cl::Image2D()")) { m_Images[imageIndex] = namedImage; return true; @@ -430,7 +425,7 @@ bool OpenCLWrapper::AddAndWriteImage(const string& name, cl_mem_flags flags, con /// The row pitch (usually zero) /// The image data /// True if success, else false. -bool OpenCLWrapper::WriteImage2D(uint index, bool shared, ::size_t width, ::size_t height, ::size_t row_pitch, void* data) +bool OpenCLWrapper::WriteImage2D(size_t index, bool shared, ::size_t width, ::size_t height, ::size_t row_pitch, void* data) { if (m_Init) { @@ -457,7 +452,7 @@ bool OpenCLWrapper::WriteImage2D(uint index, bool shared, ::size_t width, ::size m_Queue.finish(); bool b = EnqueueReleaseGLObjects(imageGL); - return CheckCL(err, "cl::enqueueWriteImage()") && b; + return m_Info.CheckCL(err, "cl::enqueueWriteImage()") && b; } } else if (!shared && index < m_Images.size()) @@ -465,7 +460,7 @@ bool OpenCLWrapper::WriteImage2D(uint index, bool shared, ::size_t width, ::size err = m_Queue.enqueueWriteImage(m_Images[index].m_Image, CL_TRUE, origin, region, row_pitch, 0, data, nullptr, &e); e.wait(); m_Queue.finish(); - return CheckCL(err, "cl::enqueueWriteImage()"); + return m_Info.CheckCL(err, "cl::enqueueWriteImage()"); } } @@ -505,7 +500,7 @@ bool OpenCLWrapper::ReadImage(const string& name, ::size_t width, ::size_t heigh /// True if shared with an OpenGL texture, else false. /// A pointer to a buffer to copy the data to /// True if success, else false. 
-bool OpenCLWrapper::ReadImage(uint imageIndex, ::size_t width, ::size_t height, ::size_t row_pitch, bool shared, void* data) +bool OpenCLWrapper::ReadImage(size_t imageIndex, ::size_t width, ::size_t height, ::size_t row_pitch, bool shared, void* data) { if (m_Init) { @@ -529,13 +524,13 @@ bool OpenCLWrapper::ReadImage(uint imageIndex, ::size_t width, ::size_t height, { err = m_Queue.enqueueReadImage(m_GLImages[imageIndex].m_Image, true, origin, region, row_pitch, 0, data); bool b = EnqueueReleaseGLObjects(m_GLImages[imageIndex].m_Image); - return CheckCL(err, "cl::enqueueReadImage()") && b; + return m_Info.CheckCL(err, "cl::enqueueReadImage()") && b; } } else if (!shared && imageIndex < m_Images.size()) { err = m_Queue.enqueueReadImage(m_Images[imageIndex].m_Image, true, origin, region, row_pitch, 0, data); - return CheckCL(err, "cl::enqueueReadImage()"); + return m_Info.CheckCL(err, "cl::enqueueReadImage()"); } } @@ -572,7 +567,7 @@ int OpenCLWrapper::FindImageIndex(const string& name, bool shared) /// The name of the image to search for /// True if shared with an OpenGL texture, else false. /// The size of the 2D image if found, else 0. -uint OpenCLWrapper::GetImageSize(const string& name, bool shared) +size_t OpenCLWrapper::GetImageSize(const string& name, bool shared) { int imageIndex = FindImageIndex(name, shared); return GetImageSize(imageIndex, shared); @@ -584,7 +579,7 @@ uint OpenCLWrapper::GetImageSize(const string& name, bool shared) /// Index of the image to search for /// True if shared with an OpenGL texture, else false. /// The size of the 2D image if found, else 0. 
-uint OpenCLWrapper::GetImageSize(uint imageIndex, bool shared) +size_t OpenCLWrapper::GetImageSize(size_t imageIndex, bool shared) { size_t size = 0; @@ -593,6 +588,7 @@ uint OpenCLWrapper::GetImageSize(uint imageIndex, bool shared) if (shared && imageIndex < m_GLImages.size()) { vector images; + images.push_back(m_GLImages[imageIndex].m_Image); IMAGEGL2D image = m_GLImages[imageIndex].m_Image; @@ -608,7 +604,7 @@ uint OpenCLWrapper::GetImageSize(uint imageIndex, bool shared) } } - return uint(size); + return size; } /// @@ -671,7 +667,7 @@ bool OpenCLWrapper::CreateImage2D(cl::Image2D& image2D, cl_mem_flags flags, cl:: data, &err); - return CheckCL(err, "cl::Image2D()"); + return m_Info.CheckCL(err, "cl::Image2D()"); } return false; @@ -699,7 +695,7 @@ bool OpenCLWrapper::CreateImage2DGL(IMAGEGL2D& image2DGL, cl_mem_flags flags, GL texobj, &err); - return CheckCL(err, "cl::ImageGL()"); + return m_Info.CheckCL(err, "cl::ImageGL()"); } return false; @@ -734,7 +730,7 @@ bool OpenCLWrapper::EnqueueAcquireGLObjects(IMAGEGL2D& image) images.push_back(image); cl_int err = m_Queue.enqueueAcquireGLObjects(&images); m_Queue.finish(); - return CheckCL(err, "cl::CommandQueue::enqueueAcquireGLObjects()"); + return m_Info.CheckCL(err, "cl::CommandQueue::enqueueAcquireGLObjects()"); } return false; @@ -769,7 +765,7 @@ bool OpenCLWrapper::EnqueueReleaseGLObjects(IMAGEGL2D& image) images.push_back(image); cl_int err = m_Queue.enqueueReleaseGLObjects(&images); m_Queue.finish(); - return CheckCL(err, "cl::CommandQueue::enqueueReleaseGLObjects()"); + return m_Info.CheckCL(err, "cl::CommandQueue::enqueueReleaseGLObjects()"); } return false; @@ -787,7 +783,7 @@ bool OpenCLWrapper::EnqueueAcquireGLObjects(const VECTOR_CLASS* memO cl_int err = m_Queue.enqueueAcquireGLObjects(memObjects); m_Queue.finish(); - return CheckCL(err, "cl::CommandQueue::enqueueAcquireGLObjects()"); + return m_Info.CheckCL(err, "cl::CommandQueue::enqueueAcquireGLObjects()"); } return false; @@ -805,7 +801,7 @@ 
bool OpenCLWrapper::EnqueueReleaseGLObjects(const VECTOR_CLASS* memO cl_int err = m_Queue.enqueueReleaseGLObjects(memObjects); m_Queue.finish(); - return CheckCL(err, "cl::CommandQueue::enqueueReleaseGLObjects()"); + return m_Info.CheckCL(err, "cl::CommandQueue::enqueueReleaseGLObjects()"); } return false; @@ -829,7 +825,7 @@ bool OpenCLWrapper::CreateSampler(cl::Sampler& sampler, cl_bool normalizedCoords filterMode, &err); - return CheckCL(err, "cl::Sampler()"); + return m_Info.CheckCL(err, "cl::Sampler()"); } /// @@ -840,7 +836,7 @@ bool OpenCLWrapper::CreateSampler(cl::Sampler& sampler, cl_bool normalizedCoords /// Index of the argument /// The name of the buffer /// True if success, else false. -bool OpenCLWrapper::SetBufferArg(uint kernelIndex, uint argIndex, const string& name) +bool OpenCLWrapper::SetBufferArg(size_t kernelIndex, cl_uint argIndex, const string& name) { int bufferIndex = OpenCLWrapper::FindBufferIndex(name); @@ -855,7 +851,7 @@ bool OpenCLWrapper::SetBufferArg(uint kernelIndex, uint argIndex, const string& /// Index of the argument /// Index of the buffer /// True if success, else false. -bool OpenCLWrapper::SetBufferArg(uint kernelIndex, uint argIndex, uint bufferIndex) +bool OpenCLWrapper::SetBufferArg(size_t kernelIndex, cl_uint argIndex, size_t bufferIndex) { if (m_Init && bufferIndex < m_Buffers.size()) return SetArg(kernelIndex, argIndex, m_Buffers[bufferIndex].m_Buffer); @@ -872,7 +868,7 @@ bool OpenCLWrapper::SetBufferArg(uint kernelIndex, uint argIndex, uint bufferInd /// True if shared with an OpenGL texture, else false /// The name of the 2D image /// True if success, else false. 
-bool OpenCLWrapper::SetImageArg(uint kernelIndex, uint argIndex, bool shared, const string& name) +bool OpenCLWrapper::SetImageArg(size_t kernelIndex, cl_uint argIndex, bool shared, const string& name) { if (m_Init) { @@ -892,7 +888,7 @@ bool OpenCLWrapper::SetImageArg(uint kernelIndex, uint argIndex, bool shared, co /// True if shared with an OpenGL texture, else false /// Index of the 2D image /// True if success, else false. -bool OpenCLWrapper::SetImageArg(uint kernelIndex, uint argIndex, bool shared, uint imageIndex) +bool OpenCLWrapper::SetImageArg(size_t kernelIndex, cl_uint argIndex, bool shared, size_t imageIndex) { cl_int err; @@ -901,12 +897,12 @@ bool OpenCLWrapper::SetImageArg(uint kernelIndex, uint argIndex, bool shared, ui if (shared && imageIndex < m_GLImages.size()) { err = m_Programs[kernelIndex].m_Kernel.setArg(argIndex, m_GLImages[imageIndex].m_Image); - return CheckCL(err, "cl::Kernel::setArg()"); + return m_Info.CheckCL(err, "cl::Kernel::setArg()"); } else if (!shared && imageIndex < m_Images.size()) { err = m_Programs[kernelIndex].m_Kernel.setArg(argIndex, m_Images[imageIndex].m_Image); - return CheckCL(err, "cl::Kernel::setArg()"); + return m_Info.CheckCL(err, "cl::Kernel::setArg()"); } } @@ -938,8 +934,8 @@ int OpenCLWrapper::FindKernelIndex(const string& name) /// Height of each block /// Depth of each block /// True if success, else false. 
-bool OpenCLWrapper::RunKernel(uint kernelIndex, uint totalGridWidth, uint totalGridHeight, uint totalGridDepth, - uint blockWidth, uint blockHeight, uint blockDepth) +bool OpenCLWrapper::RunKernel(size_t kernelIndex, size_t totalGridWidth, size_t totalGridHeight, size_t totalGridDepth, + size_t blockWidth, size_t blockHeight, size_t blockDepth) { if (m_Init && kernelIndex < m_Programs.size()) { @@ -953,183 +949,24 @@ bool OpenCLWrapper::RunKernel(uint kernelIndex, uint totalGridWidth, uint totalG e.wait(); m_Queue.finish(); - return CheckCL(err, "cl::CommandQueue::enqueueNDRangeKernel()"); + return m_Info.CheckCL(err, "cl::CommandQueue::enqueueNDRangeKernel()"); } return false; } -/// -/// Get device information for the specified field. -/// Template argument expected to be cl_ulong, cl_uint or cl_int; -/// -/// The device field/feature to query -/// The value of the field -template -T OpenCLWrapper::GetInfo(size_t platform, size_t device, cl_device_info name) const -{ - T val; - - if (platform < m_Devices.size() && device < m_Devices[platform].size()) - m_Devices[platform][device].getInfo(name, &val); - - return val; -} - -/// -/// Get the platform name at the specified index. -/// -/// The platform index to get the name of -/// The platform name if found, else empty string -string OpenCLWrapper::PlatformName(size_t platform) -{ - if (platform < m_Platforms.size()) - return m_Platforms[platform].getInfo(nullptr) + " " + m_Platforms[platform].getInfo(nullptr) + " " + m_Platforms[platform].getInfo(nullptr); - else - return ""; -} - -/// -/// Get all available platform names on the system as a vector of strings. 
-/// -/// All available platform names on the system as a vector of strings -vector OpenCLWrapper::PlatformNames() -{ - vector platforms; - - platforms.reserve(m_Platforms.size()); - - for (size_t i = 0; i < m_Platforms.size(); i++) - platforms.push_back(PlatformName(i)); - - return platforms; -} - -/// -/// Get the device name at the specified index on the platform -/// at the specified index. -/// -/// The platform index of the device -/// The device index -/// The name of the device if found, else empty string -string OpenCLWrapper::DeviceName(size_t platform, size_t device) -{ - string s; - - if (platform < m_Platforms.size() && platform < m_Devices.size()) - if (device < m_Devices[platform].size()) - s = m_Devices[platform][device].getInfo(nullptr) + " " + m_Devices[platform][device].getInfo(nullptr);// + " " + m_Devices[platform][device].getInfo(); - - return s; -} - -/// -/// Get all available device names on the platform at the specified index as a vector of strings. -/// -/// The platform index of the devices to query -/// All available device names on the platform at the specified index as a vector of strings -vector OpenCLWrapper::DeviceNames(size_t platform) -{ - uint i = 0; - string s; - vector devices; - - do - { - s = DeviceName(platform, i); - - if (s != "") - devices.push_back(s); - - i++; - } while (s != ""); - - return devices; -} - -/// -/// Get all availabe device and platform names as one contiguous string. -/// -/// A string with all available device and platform names -string OpenCLWrapper::DeviceAndPlatformNames() -{ - ostringstream os; - vector deviceNames; - - for (size_t platform = 0; platform < m_Platforms.size(); platform++) - { - os << PlatformName(platform) << endl; - - deviceNames = DeviceNames(platform); - - for (size_t device = 0; device < m_Devices[platform].size(); device++) - os << "\t" << deviceNames[device] << endl; - } - - return os.str(); -} - -/// -/// Get all information about the currently used device. 
-/// -/// A string with all information about the currently used device -string OpenCLWrapper::DumpInfo() -{ - ostringstream os; - vector sizes; - - os.imbue(std::locale("")); - - for (size_t platform = 0; platform < m_Platforms.size(); platform++) - { - os << "Platform " << platform << ": " << PlatformName(platform) << endl; - - for (size_t device = 0; device < m_Devices[platform].size(); device++) - { - os << "Device " << device << ": " << DeviceName(platform, device) << endl; - os << "CL_DEVICE_OPENCL_C_VERSION: " << GetInfo (platform, device, CL_DEVICE_OPENCL_C_VERSION) << endl; - os << "CL_DEVICE_LOCAL_MEM_SIZE: " << GetInfo(platform, device, CL_DEVICE_LOCAL_MEM_SIZE) << endl; - os << "CL_DEVICE_LOCAL_MEM_TYPE: " << GetInfo (platform, device, CL_DEVICE_LOCAL_MEM_TYPE) << endl; - os << "CL_DEVICE_MAX_COMPUTE_UNITS: " << GetInfo (platform, device, CL_DEVICE_MAX_COMPUTE_UNITS) << endl; - os << "CL_DEVICE_MAX_READ_IMAGE_ARGS: " << GetInfo (platform, device, CL_DEVICE_MAX_READ_IMAGE_ARGS) << endl; - os << "CL_DEVICE_MAX_WRITE_IMAGE_ARGS: " << GetInfo (platform, device, CL_DEVICE_MAX_WRITE_IMAGE_ARGS) << endl; - os << "CL_DEVICE_MAX_MEM_ALLOC_SIZE: " << GetInfo(platform, device, CL_DEVICE_MAX_MEM_ALLOC_SIZE) << endl; - os << "CL_DEVICE_ADDRESS_BITS: " << GetInfo (platform, device, CL_DEVICE_ADDRESS_BITS) << endl; - - os << "CL_DEVICE_GLOBAL_MEM_CACHE_TYPE: " << GetInfo (platform, device, CL_DEVICE_GLOBAL_MEM_CACHE_TYPE) << endl; - os << "CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: " << GetInfo (platform, device, CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE) << endl; - os << "CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: " << GetInfo(platform, device, CL_DEVICE_GLOBAL_MEM_CACHE_SIZE) << endl; - os << "CL_DEVICE_GLOBAL_MEM_SIZE: " << GetInfo(platform, device, CL_DEVICE_GLOBAL_MEM_SIZE) << endl; - os << "CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: " << GetInfo(platform, device, CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE) << endl; - - os << "CL_DEVICE_MAX_CONSTANT_ARGS: " << GetInfo (platform, device, 
CL_DEVICE_MAX_CONSTANT_ARGS) << endl; - os << "CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: " << GetInfo (platform, device, CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS) << endl; - os << "CL_DEVICE_MAX_WORK_GROUP_SIZE: " << GetInfo<::size_t>(platform, device, CL_DEVICE_MAX_WORK_GROUP_SIZE) << endl; - - sizes = GetInfo>(platform, device, CL_DEVICE_MAX_WORK_ITEM_SIZES); - os << "CL_DEVICE_MAX_WORK_ITEM_SIZES: " << sizes[0] << ", " << sizes[1] << ", " << sizes[2] << endl << endl; - - if (device != m_Devices[platform].size() - 1 && platform != m_Platforms.size() - 1) - os << endl; - } - - os << endl; - } - - return os.str(); -} - /// /// OpenCL properties, getters only. /// bool OpenCLWrapper::Ok() const { return m_Init; } bool OpenCLWrapper::Shared() const { return m_Shared; } -cl::Context OpenCLWrapper::Context() const { return m_Context; } -uint OpenCLWrapper::PlatformIndex() const { return m_PlatformIndex; } -uint OpenCLWrapper::DeviceIndex() const { return m_DeviceIndex; } -size_t OpenCLWrapper::GlobalMemSize() const { return GetInfo(PlatformIndex(), DeviceIndex(), CL_DEVICE_GLOBAL_MEM_SIZE); } -uint OpenCLWrapper::LocalMemSize() const { return m_LocalMemSize; } -size_t OpenCLWrapper::MaxAllocSize() const { return GetInfo(PlatformIndex(), DeviceIndex(), CL_DEVICE_MAX_MEM_ALLOC_SIZE); } +const cl::Context& OpenCLWrapper::Context() const { return m_Context; } +size_t OpenCLWrapper::PlatformIndex() const { return m_PlatformIndex; } +size_t OpenCLWrapper::DeviceIndex() const { return m_DeviceIndex; } +const string& OpenCLWrapper::DeviceName() const { return m_Info.DeviceName(m_PlatformIndex, m_DeviceIndex); } +size_t OpenCLWrapper::LocalMemSize() const { return m_LocalMemSize; } +size_t OpenCLWrapper::GlobalMemSize() const { return m_GlobalMemSize; } +size_t OpenCLWrapper::MaxAllocSize() const { return m_MaxAllocSize; } /// /// Makes the even grid dims. @@ -1138,7 +975,7 @@ size_t OpenCLWrapper::MaxAllocSize() const { return GetInfo(PlatformIn /// The block h. /// The grid w. 
/// The grid h. -void OpenCLWrapper::MakeEvenGridDims(uint blockW, uint blockH, uint& gridW, uint& gridH) +void OpenCLWrapper::MakeEvenGridDims(size_t blockW, size_t blockH, size_t& gridW, size_t& gridH) { if (gridW % blockW != 0) gridW += (blockW - (gridW % blockW)); @@ -1147,67 +984,6 @@ void OpenCLWrapper::MakeEvenGridDims(uint blockW, uint blockH, uint& gridW, uint gridH += (blockH - (gridH % blockH)); } -/// -/// Create a context that is optionall shared with OpenGL. -/// -/// True if shared with OpenGL, else not shared. -/// True if success, else false. -bool OpenCLWrapper::CreateContext(bool shared) -{ - cl_int err; - - if (shared) - { - //Define OS-specific context properties and create the OpenCL context. - #if defined (__APPLE__) || defined(MACOSX) - CGLContextObj kCGLContext = CGLGetCurrentContext(); - CGLShareGroupObj kCGLShareGroup = CGLGetShareGroup(kCGLContext); - cl_context_properties props[] = - { - CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE, (cl_context_properties)kCGLShareGroup, - 0 - }; - - m_Context = cl::Context(CL_DEVICE_TYPE_GPU, props, nullptr, nullptr, &err);//May need to tinker with this on Mac. 
- #else - #if defined WIN32 - cl_context_properties props[] = - { - CL_GL_CONTEXT_KHR, (cl_context_properties)wglGetCurrentContext(), - CL_WGL_HDC_KHR, (cl_context_properties)wglGetCurrentDC(), - CL_CONTEXT_PLATFORM, reinterpret_cast((m_Platforms[m_PlatformIndex])()), - 0 - }; - - m_Context = cl::Context(CL_DEVICE_TYPE_GPU, props, nullptr, nullptr, &err); - #else - cl_context_properties props[] = - { - CL_GL_CONTEXT_KHR, cl_context_properties(glXGetCurrentContext()), - CL_GLX_DISPLAY_KHR, cl_context_properties(glXGetCurrentDisplay()), - CL_CONTEXT_PLATFORM, reinterpret_cast((m_Platforms[m_PlatformIndex])()), - 0 - }; - - m_Context = cl::Context(CL_DEVICE_TYPE_GPU, props, nullptr, nullptr, &err); - #endif - #endif - } - else - { - cl_context_properties props[3] = - { - CL_CONTEXT_PLATFORM, - reinterpret_cast((m_Platforms[m_PlatformIndex])()), - 0 - }; - - m_Context = cl::Context(CL_DEVICE_TYPE_ALL, props, nullptr, nullptr, &err); - } - - return CheckCL(err, "cl::Context()"); -} - /// /// Create an Spk object created by compiling the program arguments passed in. /// @@ -1235,107 +1011,21 @@ bool OpenCLWrapper::CreateSPK(const string& name, const string& program, const s //err = spk.m_Program.build(m_DeviceVec, "-cl-mad-enable -cl-no-signed-zeros -cl-fast-relaxed-math -cl-single-precision-constant");//This can cause some rounding. //err = spk.m_Program.build(m_DeviceVec, "-cl-mad-enable -cl-single-precision-constant"); - if (CheckCL(err, "cl::Program::build()")) + if (m_Info.CheckCL(err, "cl::Program::build()")) { //Building of program is ok, now create kernel with the specified entry point. spk.m_Kernel = cl::Kernel(spk.m_Program, entryPoint.c_str(), &err); - if (CheckCL(err, "cl::Kernel()")) + if (m_Info.CheckCL(err, "cl::Kernel()")) return true;//Everything is ok. 
} else { for (auto& i : m_DeviceVec) - m_ErrorReport.push_back(spk.m_Program.getBuildInfo(i)); + m_ErrorReport.push_back(spk.m_Program.getBuildInfo(i, nullptr)); } } return false; } - -/// -/// Check an OpenCL return value for errors. -/// -/// The error code to inspect -/// A description of where the value was gotten from -/// True if success, else false. -bool OpenCLWrapper::CheckCL(cl_int err, const char* name) -{ - if (err != CL_SUCCESS) - { - ostringstream ss; - ss << "ERROR: " << ErrorToStringCL(err) << " in " << name << "." << std::endl; - m_ErrorReport.push_back(ss.str()); - } - - return err == CL_SUCCESS; -} - -/// -/// Translate an OpenCL error code into a human readable string. -/// -/// The error code to translate -/// A human readable description of the error passed in -std::string OpenCLWrapper::ErrorToStringCL(cl_int err) -{ - switch (err) - { - case CL_SUCCESS: return "Success"; - case CL_DEVICE_NOT_FOUND: return "Device not found"; - case CL_DEVICE_NOT_AVAILABLE: return "Device not available"; - case CL_COMPILER_NOT_AVAILABLE: return "Compiler not available"; - case CL_MEM_OBJECT_ALLOCATION_FAILURE: return "Memory object allocation failure"; - case CL_OUT_OF_RESOURCES: return "Out of resources"; - case CL_OUT_OF_HOST_MEMORY: return "Out of host memory"; - case CL_PROFILING_INFO_NOT_AVAILABLE: return "Profiling information not available"; - case CL_MEM_COPY_OVERLAP: return "Memory copy overlap"; - case CL_IMAGE_FORMAT_MISMATCH: return "Image format mismatch"; - case CL_IMAGE_FORMAT_NOT_SUPPORTED: return "Image format not supported"; - case CL_BUILD_PROGRAM_FAILURE: return "Program build failure"; - case CL_MAP_FAILURE: return "Map failure"; - case CL_MISALIGNED_SUB_BUFFER_OFFSET: return "Misaligned sub buffer offset"; - case CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST: return "Exec status error for events in wait list"; - case CL_INVALID_VALUE: return "Invalid value"; - case CL_INVALID_DEVICE_TYPE: return "Invalid device type"; - case 
CL_INVALID_PLATFORM: return "Invalid platform"; - case CL_INVALID_DEVICE: return "Invalid device"; - case CL_INVALID_CONTEXT: return "Invalid context"; - case CL_INVALID_QUEUE_PROPERTIES: return "Invalid queue properties"; - case CL_INVALID_COMMAND_QUEUE: return "Invalid command queue"; - case CL_INVALID_HOST_PTR: return "Invalid host pointer"; - case CL_INVALID_MEM_OBJECT: return "Invalid memory object"; - case CL_INVALID_IMAGE_FORMAT_DESCRIPTOR: return "Invalid image format descriptor"; - case CL_INVALID_IMAGE_SIZE: return "Invalid image size"; - case CL_INVALID_SAMPLER: return "Invalid sampler"; - case CL_INVALID_BINARY: return "Invalid binary"; - case CL_INVALID_BUILD_OPTIONS: return "Invalid build options"; - case CL_INVALID_PROGRAM: return "Invalid program"; - case CL_INVALID_PROGRAM_EXECUTABLE: return "Invalid program executable"; - case CL_INVALID_KERNEL_NAME: return "Invalid kernel name"; - case CL_INVALID_KERNEL_DEFINITION: return "Invalid kernel definition"; - case CL_INVALID_KERNEL: return "Invalid kernel"; - case CL_INVALID_ARG_INDEX: return "Invalid argument index"; - case CL_INVALID_ARG_VALUE: return "Invalid argument value"; - case CL_INVALID_ARG_SIZE: return "Invalid argument size"; - case CL_INVALID_KERNEL_ARGS: return "Invalid kernel arguments"; - case CL_INVALID_WORK_DIMENSION: return "Invalid work dimension"; - case CL_INVALID_WORK_GROUP_SIZE: return "Invalid work group size"; - case CL_INVALID_WORK_ITEM_SIZE: return "Invalid work item size"; - case CL_INVALID_GLOBAL_OFFSET: return "Invalid global offset"; - case CL_INVALID_EVENT_WAIT_LIST: return "Invalid event wait list"; - case CL_INVALID_EVENT: return "Invalid event"; - case CL_INVALID_OPERATION: return "Invalid operation"; - case CL_INVALID_GL_OBJECT: return "Invalid OpenGL object"; - case CL_INVALID_BUFFER_SIZE: return "Invalid buffer size"; - case CL_INVALID_MIP_LEVEL: return "Invalid mip-map level"; - case CL_INVALID_GLOBAL_WORK_SIZE: return "Invalid global work size"; - case 
CL_INVALID_PROPERTY: return "Invalid property"; - default: - { - ostringstream ss; - ss << " " << err; - return ss.str(); - } - } -} } diff --git a/Source/EmberCL/OpenCLWrapper.h b/Source/EmberCL/OpenCLWrapper.h index 551416b..4a2659a 100644 --- a/Source/EmberCL/OpenCLWrapper.h +++ b/Source/EmberCL/OpenCLWrapper.h @@ -1,6 +1,7 @@ #pragma once #include "EmberCLPch.h" +#include "OpenCLInfo.h" /// /// OpenCLWrapper, Spk, NamedBuffer, NamedImage2D, NamedImage2DGL classes. @@ -91,7 +92,7 @@ public: /// /// Running kernels in OpenCL can require quite a bit of setup, tear down and /// general housekeeping. This class helps shield the user from such hassles. -/// It's main utility is in holding collections of programs, buffers and images +/// Its main utility is in holding collections of programs, buffers and images /// all identified by names. That way, a user can access them as needed without /// having to pollute their code. /// In addition, writing to an existing object by name determines if the object @@ -103,8 +104,7 @@ class EMBERCL_API OpenCLWrapper : public EmberReport { public: OpenCLWrapper(); - bool CheckOpenCL(); - bool Init(uint platform, uint device, bool shared = false); + bool Init(size_t platformIndex, size_t deviceIndex, bool shared = false); //Programs. 
bool AddProgram(const string& name, const string& program, const string& entryPoint, bool doublePrecision); @@ -114,22 +114,22 @@ public: bool AddBuffer(const string& name, size_t size, cl_mem_flags flags = CL_MEM_READ_WRITE); bool AddAndWriteBuffer(const string& name, void* data, size_t size, cl_mem_flags flags = CL_MEM_READ_WRITE); bool WriteBuffer(const string& name, void* data, size_t size); - bool WriteBuffer(uint bufferIndex, void* data, size_t size); + bool WriteBuffer(size_t bufferIndex, void* data, size_t size); bool ReadBuffer(const string& name, void* data, size_t size); - bool ReadBuffer(uint bufferIndex, void* data, size_t size); + bool ReadBuffer(size_t bufferIndex, void* data, size_t size); int FindBufferIndex(const string& name); - uint GetBufferSize(const string& name); - uint GetBufferSize(uint bufferIndex); + size_t GetBufferSize(const string& name); + size_t GetBufferSize(size_t bufferIndex); void ClearBuffers(); //Images. bool AddAndWriteImage(const string& name, cl_mem_flags flags, const cl::ImageFormat& format, ::size_t width, ::size_t height, ::size_t row_pitch, void* data = NULL, bool shared = false, GLuint texName = 0); - bool WriteImage2D(uint index, bool shared, ::size_t width, ::size_t height, ::size_t row_pitch, void* data); + bool WriteImage2D(size_t index, bool shared, size_t width, size_t height, size_t row_pitch, void* data); bool ReadImage(const string& name, ::size_t width, ::size_t height, ::size_t row_pitch, bool shared, void* data); - bool ReadImage(uint imageIndex, ::size_t width, ::size_t height, ::size_t row_pitch, bool shared, void* data); + bool ReadImage(size_t imageIndex, ::size_t width, ::size_t height, ::size_t row_pitch, bool shared, void* data); int FindImageIndex(const string& name, bool shared); - uint GetImageSize(const string& name, bool shared); - uint GetImageSize(uint imageIndex, bool shared); + size_t GetImageSize(const string& name, bool shared); + size_t GetImageSize(size_t imageIndex, bool shared); bool 
CompareImageParams(cl::Image& image, cl_mem_flags flags, const cl::ImageFormat& format, ::size_t width, ::size_t height, ::size_t row_pitch); void ClearImages(bool shared); bool CreateImage2D(cl::Image2D& image2D, cl_mem_flags flags, cl::ImageFormat format, ::size_t width, ::size_t height, ::size_t row_pitch = 0, void* data = NULL); @@ -143,10 +143,10 @@ public: bool CreateSampler(cl::Sampler& sampler, cl_bool normalizedCoords, cl_addressing_mode addressingMode, cl_filter_mode filterMode); //Arguments. - bool SetBufferArg(uint kernelIndex, uint argIndex, const string& name); - bool SetBufferArg(uint kernelIndex, uint argIndex, uint bufferIndex); - bool SetImageArg(uint kernelIndex, uint argIndex, bool shared, const string& name); - bool SetImageArg(uint kernelIndex, uint argIndex, bool shared, uint imageIndex); + bool SetBufferArg(size_t kernelIndex, cl_uint argIndex, const string& name); + bool SetBufferArg(size_t kernelIndex, cl_uint argIndex, size_t bufferIndex); + bool SetImageArg(size_t kernelIndex, cl_uint argIndex, bool shared, const string& name); + bool SetImageArg(size_t kernelIndex, cl_uint argIndex, bool shared, size_t imageIndex); /// /// Set an argument in the specified kernel, at the specified argument index. @@ -157,13 +157,13 @@ public: /// The argument value to set /// True if success, else false template - bool SetArg(uint kernelIndex, uint argIndex, T arg) + bool SetArg(size_t kernelIndex, cl_uint argIndex, T arg) { if (m_Init && kernelIndex < m_Programs.size()) { cl_int err = m_Programs[kernelIndex].m_Kernel.setArg(argIndex, arg); - return CheckCL(err, "cl::Kernel::setArg()"); + return m_Info.CheckCL(err, "cl::Kernel::setArg()"); } return false; @@ -171,47 +171,37 @@ public: //Kernels. int FindKernelIndex(const string& name); - bool RunKernel(uint kernelIndex, uint totalGridWidth, uint totalGridHeight, uint totalGridDepth, uint blockWidth, uint blockHeight, uint blockDepth); - - //Info. 
- template - T GetInfo(size_t platform, size_t device, cl_device_info name) const; - string PlatformName(size_t platform); - vector PlatformNames(); - string DeviceName(size_t platform, size_t device); - vector DeviceNames(size_t platform); - string DeviceAndPlatformNames(); - string DumpInfo(); + bool RunKernel(size_t kernelIndex, size_t totalGridWidth, size_t totalGridHeight, size_t totalGridDepth, size_t blockWidth, size_t blockHeight, size_t blockDepth); //Accessors. bool Ok() const; bool Shared() const; - cl::Context Context() const; - uint PlatformIndex() const; - uint DeviceIndex() const; - uint LocalMemSize() const; + const cl::Context& Context() const; + size_t PlatformIndex() const; + size_t DeviceIndex() const; + const string& DeviceName() const; + size_t TotalDeviceIndex() const; + size_t LocalMemSize() const; size_t GlobalMemSize() const; size_t MaxAllocSize() const; - static void MakeEvenGridDims(uint blockW, uint blockH, uint& gridW, uint& gridH); + static void MakeEvenGridDims(size_t blockW, size_t blockH, size_t& gridW, size_t& gridH); private: - bool CreateContext(bool shared); bool CreateSPK(const string& name, const string& program, const string& entryPoint, Spk& spk, bool doublePrecision); - bool CheckCL(cl_int err, const char* name); - std::string ErrorToStringCL(cl_int err); bool m_Init; bool m_Shared; - uint m_PlatformIndex; - uint m_DeviceIndex; - uint m_LocalMemSize; + size_t m_PlatformIndex; + size_t m_DeviceIndex; + size_t m_LocalMemSize; + size_t m_GlobalMemSize; + size_t m_MaxAllocSize; cl::Platform m_Platform; cl::Context m_Context; cl::Device m_Device; cl::CommandQueue m_Queue; - std::vector m_Platforms; - std::vector> m_Devices; + OpenCLInfo& m_Info; std::vector m_DeviceVec; std::vector m_Programs; std::vector m_Buffers; diff --git a/Source/EmberCL/RendererCL.cpp b/Source/EmberCL/RendererCL.cpp index c7faf68..b9d0df5 100644 --- a/Source/EmberCL/RendererCL.cpp +++ b/Source/EmberCL/RendererCL.cpp @@ -5,44 +5,61 @@ namespace EmberCLns 
{ /// /// Constructor that inintializes various buffer names, block dimensions, image formats -/// and finally initializes OpenCL using the passed in parameters. +/// and finally initializes one or more OpenCL devices using the passed in parameters. +/// When running with multiple devices, the first device is considered the "primary", while +/// others are "secondary". +/// The differences are: +/// -Only the primary device will report progress, however the progress count will contain the combined progress of all devices. +/// -The primary device runs in this thread, while others run on their own threads. +/// -The primary device does density filtering and final accumulation, while the others only iterate. +/// -Upon completion of iteration, the histograms from the secondary devices are: +/// Copied to a temporary host side buffer. +/// Copied from the host side buffer to the primary device's density filtering buffer as a temporary device storage area. +/// Summed from the density filtering buffer, to the primary device's histogram. +/// When this process happens for the last device, the density filtering buffer is cleared since it will be used shortly. /// Kernel creators are set to be non-nvidia by default. Will be properly set in Init(). /// -/// The index platform of the platform to use. Default: 0. -/// The index device of the device to use. Default: 0. +/// A vector of the platform,device index pairs to use. The first device will be the primary and will run non-threaded. /// True if shared with OpenGL, else false. Default: false. /// The texture ID of the shared OpenGL texture if shared. Default: 0. 
template -RendererCL::RendererCL(uint platform, uint device, bool shared, GLuint outputTexID) +RendererCL::RendererCL(const vector>& devices, bool shared, GLuint outputTexID) : m_IterOpenCLKernelCreator(), m_DEOpenCLKernelCreator(typeid(T) == typeid(double), false), m_FinalAccumOpenCLKernelCreator(typeid(T) == typeid(double)) +{ + Init(); + Init(devices, shared, outputTexID); +} + +/// +/// Initialization of fields, no OpenCL initialization is done here. +template +void RendererCL::Init() { m_Init = false; - m_NVidia = false; m_DoublePrecision = typeid(T) == typeid(double); m_NumChannels = 4; - m_Calls = 0; //Buffer names. - m_EmberBufferName = "Ember"; - m_XformsBufferName = "Xforms"; - m_ParVarsBufferName = "ParVars"; - m_SeedsBufferName = "Seeds"; - m_DistBufferName = "Dist"; - m_CarToRasBufferName = "CarToRas"; - m_DEFilterParamsBufferName = "DEFilterParams"; + m_EmberBufferName = "Ember"; + m_XformsBufferName = "Xforms"; + m_ParVarsBufferName = "ParVars"; + m_SeedsBufferName = "Seeds"; + m_DistBufferName = "Dist"; + m_CarToRasBufferName = "CarToRas"; + m_DEFilterParamsBufferName = "DEFilterParams"; m_SpatialFilterParamsBufferName = "SpatialFilterParams"; - m_DECoefsBufferName = "DECoefs"; - m_DEWidthsBufferName = "DEWidths"; - m_DECoefIndicesBufferName = "DECoefIndices"; - m_SpatialFilterCoefsBufferName = "SpatialFilterCoefs"; - m_CurvesCsaName = "CurvesCsa"; - m_HistBufferName = "Hist"; - m_AccumBufferName = "Accum"; - m_FinalImageName = "Final"; - m_PointsBufferName = "Points"; + m_DECoefsBufferName = "DECoefs"; + m_DEWidthsBufferName = "DEWidths"; + m_DECoefIndicesBufferName = "DECoefIndices"; + m_SpatialFilterCoefsBufferName = "SpatialFilterCoefs"; + m_CurvesCsaName = "CurvesCsa"; + m_HistBufferName = "Hist"; + m_AccumBufferName = "Accum"; + m_FinalImageName = "Final"; + m_PointsBufferName = "Points"; //It's critical that these numbers never change. 
They are //based on the cuburn model of each kernel launch containing @@ -58,9 +75,6 @@ RendererCL::RendererCL(uint platform, uint device, bool shared, GLui m_PaletteFormat.image_channel_data_type = CL_FLOAT; m_FinalFormat.image_channel_order = CL_RGBA; m_FinalFormat.image_channel_data_type = CL_UNORM_INT8;//Change if this ever supports 2BPC outputs for PNG. - - FillSeeds(); - Init(platform, device, shared, outputTexID);//Init OpenCL upon construction and create programs that will not change. } /// @@ -82,56 +96,101 @@ RendererCL::~RendererCL() /// compilation works. Further compilation will be done later for iteration, density filtering, /// and final accumulation. /// -/// The index platform of the platform to use -/// The index device of the device to use +/// A vector of the platform,device index pairs to use. The first device will be the primary and will run non-threaded. /// True if shared with OpenGL, else false. /// The texture ID of the shared OpenGL texture if shared /// True if success, else false. template -bool RendererCL::Init(uint platform, uint device, bool shared, GLuint outputTexID) +bool RendererCL::Init(const vector>& devices, bool shared, GLuint outputTexID) { - //Timing t; - bool b = true; - m_OutputTexID = outputTexID; + if (devices.empty()) + return false; + + bool b = false; const char* loc = __FUNCTION__; + auto& zeroizeProgram = m_IterOpenCLKernelCreator.ZeroizeKernel(); + auto& sumHistProgram = m_IterOpenCLKernelCreator.SumHistKernel(); + ostringstream os; - if (!m_Wrapper.Ok() || PlatformIndex() != platform || DeviceIndex() != device) + m_Init = false; + m_Devices.clear(); + m_Devices.reserve(devices.size()); + m_OutputTexID = outputTexID; + + for (size_t i = 0; i < devices.size(); i++) { - m_Init = false; - b = m_Wrapper.Init(platform, device, shared); + try + { + unique_ptr cld(new RendererClDevice(typeid(T) == typeid(double), devices[i].first, devices[i].second, i == 0 ? 
shared : false)); + + if ((b = cld->Init()))//Build a simple program to ensure OpenCL is working right. + { + if (b && !(b = cld->m_Wrapper.AddProgram(m_IterOpenCLKernelCreator.ZeroizeEntryPoint(), zeroizeProgram, m_IterOpenCLKernelCreator.ZeroizeEntryPoint(), m_DoublePrecision))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = cld->m_Wrapper.AddAndWriteImage("Palette", CL_MEM_READ_ONLY, m_PaletteFormat, 256, 1, 0, nullptr))) { this->m_ErrorReport.push_back(loc); } + + if (b) + { + m_Devices.push_back(std::move(cld));//Success, so move to the vector, else it will go out of scope and be deleted. + } + else + { + os << loc << ": failed to init platform " << devices[i].first << ", device " << devices[i].second; + this->m_ErrorReport.push_back(loc); + break; + } + } + } + catch (const std::exception& e) + { + os << loc << ": failed to init platform " << devices[i].first << ", device " << devices[i].second << ": " << e.what(); + this->m_ErrorReport.push_back(os.str()); + } + catch (...) + { + os << loc << ": failed to init platform " << devices[i].first << ", device " << devices[i].second; + this->m_ErrorReport.push_back(os.str()); + } } - if (b && m_Wrapper.Ok() && !m_Init) + if (b && m_Devices.size() == devices.size()) { - m_NVidia = ToLower(m_Wrapper.DeviceAndPlatformNames()).find_first_of("nvidia") != string::npos && m_Wrapper.LocalMemSize() > (32 * 1024); - m_WarpSize = m_NVidia ? 32 : 64; - m_IterOpenCLKernelCreator = IterOpenCLKernelCreator(); - m_DEOpenCLKernelCreator = DEOpenCLKernelCreator(m_DoublePrecision, m_NVidia); + auto& firstWrapper = m_Devices[0]->m_Wrapper; + m_DEOpenCLKernelCreator = DEOpenCLKernelCreator(m_DoublePrecision, m_Devices[0]->Nvidia()); + + //Build a simple program to ensure OpenCL is working right. 
+ if (b && !(b = firstWrapper.AddProgram(m_DEOpenCLKernelCreator.LogScaleAssignDEEntryPoint(), m_DEOpenCLKernelCreator.LogScaleAssignDEKernel(), m_DEOpenCLKernelCreator.LogScaleAssignDEEntryPoint(), m_DoublePrecision))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = firstWrapper.AddProgram(m_IterOpenCLKernelCreator.SumHistEntryPoint(), sumHistProgram, m_IterOpenCLKernelCreator.SumHistEntryPoint(), m_DoublePrecision))) { this->m_ErrorReport.push_back(loc); } - string zeroizeProgram = m_IterOpenCLKernelCreator.ZeroizeKernel(); - string logAssignProgram = m_DEOpenCLKernelCreator.LogScaleAssignDEKernel();//Build a couple of simple programs to ensure OpenCL is working right. + if (b) + { + //This is the maximum box dimension for density filtering which consists of (blockSize * blockSize) + (2 * filterWidth). + //These blocks must be square, and ideally, 32x32. + //Sadly, at the moment, Fermi runs out of resources at that block size because the DE filter function is so complex. + //The next best block size seems to be 24x24. + //AMD is further limited because of less local memory so these have to be 16 on AMD. + m_MaxDEBlockSizeW = m_Devices[0]->Nvidia() ? 24 : 16;//These *must* both be divisible by 8 or else pixels will go missing. + m_MaxDEBlockSizeH = m_Devices[0]->Nvidia() ? 
24 : 16; - if (b && !(b = m_Wrapper.AddProgram(m_IterOpenCLKernelCreator.ZeroizeEntryPoint(), zeroizeProgram, m_IterOpenCLKernelCreator.ZeroizeEntryPoint(), m_DoublePrecision))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddProgram(m_DEOpenCLKernelCreator.LogScaleAssignDEEntryPoint(), logAssignProgram, m_DEOpenCLKernelCreator.LogScaleAssignDEEntryPoint(), m_DoublePrecision))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddAndWriteImage("Palette", CL_MEM_READ_ONLY, m_PaletteFormat, 256, 1, 0, nullptr))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_SeedsBufferName, reinterpret_cast(m_Seeds.data()), SizeOf(m_Seeds)))) { this->m_ErrorReport.push_back(loc); } + FillSeeds(); - //This is the maximum box dimension for density filtering which consists of (blockSize * blockSize) + (2 * filterWidth). - //These blocks must be square, and ideally, 32x32. - //Sadly, at the moment, Fermi runs out of resources at that block size because the DE filter function is so complex. - //The next best block size seems to be 24x24. - //AMD is further limited because of less local memory so these have to be 16 on AMD. - m_MaxDEBlockSizeW = m_NVidia ? 24 : 16;//These *must* both be divisible by 16 or else pixels will go missing. - m_MaxDEBlockSizeH = m_NVidia ? 24 : 16; - m_Init = true; - //t.Toc(loc); + for (size_t device = 0; device < m_Devices.size(); device++) + if (b && !(b = m_Devices[device]->m_Wrapper.AddAndWriteBuffer(m_SeedsBufferName, reinterpret_cast(m_Seeds[device].data()), SizeOf(m_Seeds[device])))) { this->m_ErrorReport.push_back(loc); break; } + } + + m_Init = b; + } + else + { + m_Devices.clear(); + os << loc << ": failed to init all devices and platforms."; + this->m_ErrorReport.push_back(os.str()); } - return b; + return m_Init; } /// -/// Set the shared output texture where final accumulation will be written to. 
+/// Set the shared output texture of the primary device where final accumulation will be written to. /// /// The texture ID of the shared OpenGL texture if shared /// True if success, else false. @@ -141,19 +200,23 @@ bool RendererCL::SetOutputTexture(GLuint outputTexID) bool success = true; const char* loc = __FUNCTION__; - if (!m_Wrapper.Ok()) - return false; - - m_OutputTexID = outputTexID; - EnterResize(); - - if (!m_Wrapper.AddAndWriteImage(m_FinalImageName, CL_MEM_WRITE_ONLY, m_FinalFormat, FinalRasW(), FinalRasH(), 0, nullptr, m_Wrapper.Shared(), m_OutputTexID)) + if (!m_Devices.empty()) { - this->m_ErrorReport.push_back(loc); - success = false; - } + OpenCLWrapper& firstWrapper = m_Devices[0]->m_Wrapper; + m_OutputTexID = outputTexID; + EnterResize(); + + if (!firstWrapper.AddAndWriteImage(m_FinalImageName, CL_MEM_WRITE_ONLY, m_FinalFormat, FinalRasW(), FinalRasH(), 0, nullptr, firstWrapper.Shared(), m_OutputTexID)) + { + this->m_ErrorReport.push_back(loc); + success = false; + } + + LeaveResize(); + } + else + success = false; - LeaveResize(); return success; } @@ -162,38 +225,36 @@ bool RendererCL::SetOutputTexture(GLuint outputTexID) /// //Iters per kernel/block/grid. -template uint RendererCL::IterCountPerKernel() const { return m_IterCountPerKernel; } -template uint RendererCL::IterCountPerBlock() const { return IterCountPerKernel() * IterBlockKernelCount(); } -template uint RendererCL::IterCountPerGrid() const { return IterCountPerKernel() * IterGridKernelCount(); } +template size_t RendererCL::IterCountPerKernel() const { return m_IterCountPerKernel; } +template size_t RendererCL::IterCountPerBlock() const { return IterCountPerKernel() * IterBlockKernelCount(); } +template size_t RendererCL::IterCountPerGrid() const { return IterCountPerKernel() * IterGridKernelCount(); } //Kernels per block. 
-template uint RendererCL::IterBlockKernelWidth() const { return m_IterBlockWidth; } -template uint RendererCL::IterBlockKernelHeight() const { return m_IterBlockHeight; } -template uint RendererCL::IterBlockKernelCount() const { return IterBlockKernelWidth() * IterBlockKernelHeight(); } +template size_t RendererCL::IterBlockKernelWidth() const { return m_IterBlockWidth; } +template size_t RendererCL::IterBlockKernelHeight() const { return m_IterBlockHeight; } +template size_t RendererCL::IterBlockKernelCount() const { return IterBlockKernelWidth() * IterBlockKernelHeight(); } //Kernels per grid. -template uint RendererCL::IterGridKernelWidth() const { return IterGridBlockWidth() * IterBlockKernelWidth(); } -template uint RendererCL::IterGridKernelHeight() const { return IterGridBlockHeight() * IterBlockKernelHeight(); } -template uint RendererCL::IterGridKernelCount() const { return IterGridKernelWidth() * IterGridKernelHeight(); } +template size_t RendererCL::IterGridKernelWidth() const { return IterGridBlockWidth() * IterBlockKernelWidth(); } +template size_t RendererCL::IterGridKernelHeight() const { return IterGridBlockHeight() * IterBlockKernelHeight(); } +template size_t RendererCL::IterGridKernelCount() const { return IterGridKernelWidth() * IterGridKernelHeight(); } //Blocks per grid. 
-template uint RendererCL::IterGridBlockWidth() const { return m_IterBlocksWide; } -template uint RendererCL::IterGridBlockHeight() const { return m_IterBlocksHigh; } -template uint RendererCL::IterGridBlockCount() const { return IterGridBlockWidth() * IterGridBlockHeight(); } - -template uint RendererCL::PlatformIndex() { return m_Wrapper.PlatformIndex(); } -template uint RendererCL::DeviceIndex() { return m_Wrapper.DeviceIndex(); } +template size_t RendererCL::IterGridBlockWidth() const { return m_IterBlocksWide; } +template size_t RendererCL::IterGridBlockHeight() const { return m_IterBlocksHigh; } +template size_t RendererCL::IterGridBlockCount() const { return IterGridBlockWidth() * IterGridBlockHeight(); } /// -/// Read the histogram into the host side CPU buffer. -/// Used for debugging. +/// Read the histogram of the specified device into the host side CPU buffer. /// +/// The index of the device whose histogram will be read /// True if success, else false. template -bool RendererCL::ReadHist() +bool RendererCL::ReadHist(size_t device) { - if (Renderer::Alloc())//Allocate the memory to read into. - return m_Wrapper.ReadBuffer(m_HistBufferName, reinterpret_cast(HistBuckets()), SuperSize() * sizeof(v4bT)); + if (device < m_Devices.size()) + if (Renderer::Alloc(true))//Allocate the histogram memory to read into, other buffers not needed. + return m_Devices[device]->m_Wrapper.ReadBuffer(m_HistBufferName, reinterpret_cast(HistBuckets()), SuperSize() * sizeof(v4bT)); return false; } @@ -206,59 +267,89 @@ bool RendererCL::ReadHist() template bool RendererCL::ReadAccum() { - if (Renderer::Alloc())//Allocate the memory to read into. - return m_Wrapper.ReadBuffer(m_AccumBufferName, reinterpret_cast(AccumulatorBuckets()), SuperSize() * sizeof(v4bT)); + if (Renderer::Alloc() && !m_Devices.empty())//Allocate the memory to read into.
+ return m_Devices[0]->m_Wrapper.ReadBuffer(m_AccumBufferName, reinterpret_cast(AccumulatorBuckets()), SuperSize() * sizeof(v4bT)); return false; } /// -/// Read the temporary points buffer into a host side CPU buffer. +/// Read the temporary points buffer from a device into a host side CPU buffer. /// Used for debugging. /// +/// The index in the device buffer whose points will be read /// The host side buffer to read into /// True if success, else false. template -bool RendererCL::ReadPoints(vector>& vec) +bool RendererCL::ReadPoints(size_t device, vector>& vec) { vec.resize(IterGridKernelCount());//Allocate the memory to read into. - if (vec.size() >= IterGridKernelCount()) - return m_Wrapper.ReadBuffer(m_PointsBufferName, reinterpret_cast(vec.data()), IterGridKernelCount() * sizeof(PointCL)); + if (vec.size() >= IterGridKernelCount() && device < m_Devices.size()) + return m_Devices[device]->m_Wrapper.ReadBuffer(m_PointsBufferName, reinterpret_cast(vec.data()), IterGridKernelCount() * sizeof(PointCL)); return false; } /// -/// Clear the histogram buffer with all zeroes. +/// Clear the histogram buffer for all devices with all zeroes. /// /// True if success, else false. template bool RendererCL::ClearHist() { - return ClearBuffer(m_HistBufferName, uint(SuperRasW()), uint(SuperRasH()), sizeof(v4bT)); + bool b = !m_Devices.empty(); + const char* loc = __FUNCTION__; + + for (size_t i = 0; i < m_Devices.size(); i++) + if (b && !(b = ClearBuffer(i, m_HistBufferName, uint(SuperRasW()), uint(SuperRasH()), sizeof(v4bT)))) { this->m_ErrorReport.push_back(loc); break; } + + return b; } /// -/// Clear the desnity filtering buffer with all zeroes. +/// Clear the histogram buffer for a single device with all zeroes. +/// +/// The index in the device buffer whose histogram will be cleared +/// True if success, else false. 
+template +bool RendererCL::ClearHist(size_t device) +{ + bool b = device < m_Devices.size(); + const char* loc = __FUNCTION__; + + if (b && !(b = ClearBuffer(device, m_HistBufferName, uint(SuperRasW()), uint(SuperRasH()), sizeof(v4bT)))) { this->m_ErrorReport.push_back(loc); } + + return b; +} + +/// +/// Clear the density filtering buffer with all zeroes. /// /// True if success, else false. template bool RendererCL::ClearAccum() { - return ClearBuffer(m_AccumBufferName, uint(SuperRasW()), uint(SuperRasH()), sizeof(v4bT)); + return ClearBuffer(0, m_AccumBufferName, uint(SuperRasW()), uint(SuperRasH()), sizeof(v4bT)); } /// -/// Write values from a host side CPU buffer into the temporary points buffer. +/// Write values from a host side CPU buffer into the temporary points buffer for the specified device. /// Used for debugging. /// +/// The index of the device whose points will be written to /// The host side buffer whose values to write /// True if success, else false. template -bool RendererCL::WritePoints(vector>& vec) +bool RendererCL::WritePoints(size_t device, vector>& vec) { - return m_Wrapper.WriteBuffer(m_PointsBufferName, reinterpret_cast(vec.data()), SizeOf(vec)); + bool b = false; + const char* loc = __FUNCTION__; + + if (device < m_Devices.size()) + if (!(b = m_Devices[device]->m_Wrapper.WriteBuffer(m_PointsBufferName, reinterpret_cast(vec.data()), SizeOf(vec)))) { this->m_ErrorReport.push_back(loc); } + + return b; } #ifdef TEST_CL @@ -288,7 +379,6 @@ bool RendererCL::WriteRandomPoints() template const string& RendererCL::IterKernel() const { return m_IterKernel; } - /// /// Get the kernel string for the last built density filtering program. /// @@ -308,7 +398,7 @@ const string& RendererCL::FinalAccumKernel() const { return m_FinalA /// /// -/// Read the final image buffer buffer into the host side CPU buffer. +/// Read the final image buffer from the primary device into the host side CPU buffer.
/// This must be called before saving the final output image to file. /// /// The host side buffer to read into @@ -316,14 +406,14 @@ const string& RendererCL::FinalAccumKernel() const { return m_FinalA template bool RendererCL::ReadFinal(byte* pixels) { - if (pixels) - return m_Wrapper.ReadImage(m_FinalImageName, FinalRasW(), FinalRasH(), 0, m_Wrapper.Shared(), pixels); + if (pixels && !m_Devices.empty()) + return m_Devices[0]->m_Wrapper.ReadImage(m_FinalImageName, FinalRasW(), FinalRasH(), 0, m_Devices[0]->m_Wrapper.Shared(), pixels); return false; } /// -/// Clear the final image output buffer with all zeroes by copying a host side buffer. +/// Clear the final image output buffer of the primary device with all zeroes by copying a host side buffer. /// Slow, but never used because the final output image is always completely overwritten. /// /// True if success, else false. @@ -331,16 +421,23 @@ template bool RendererCL::ClearFinal() { vector v; - uint index = m_Wrapper.FindImageIndex(m_FinalImageName, m_Wrapper.Shared()); - if (this->PrepFinalAccumVector(v)) + if (!m_Devices.empty()) { - bool b = m_Wrapper.WriteImage2D(index, m_Wrapper.Shared(), FinalRasW(), FinalRasH(), 0, v.data()); + auto& wrapper = m_Devices[0]->m_Wrapper; + uint index = wrapper.FindImageIndex(m_FinalImageName, wrapper.Shared()); - if (!b) - this->m_ErrorReport.push_back(__FUNCTION__); + if (this->PrepFinalAccumVector(v)) + { + bool b = wrapper.WriteImage2D(index, wrapper.Shared(), FinalRasW(), FinalRasH(), 0, v.data()); - return b; + if (!b) + this->m_ErrorReport.push_back(__FUNCTION__); + + return b; + } + else + return false; } else return false; @@ -351,13 +448,13 @@ bool RendererCL::ClearFinal() /// /// -/// The amount of video RAM available on the GPU to render with. +/// The amount of video RAM available on the first GPU to render with. /// /// An unsigned 64-bit integer specifying how much video memory is available template size_t RendererCL::MemoryAvailable() { - return Ok() ? 
m_Wrapper.GlobalMemSize() : 0ULL; + return Ok() ? m_Devices[0]->m_Wrapper.GlobalMemSize() : 0ULL; } /// @@ -367,7 +464,7 @@ size_t RendererCL::MemoryAvailable() template bool RendererCL::Ok() const { - return m_Init; + return !m_Devices.empty() && m_Init; } /// @@ -382,23 +479,15 @@ void RendererCL::NumChannels(size_t numChannels) } /// -/// Dump the error report for this class as well as the OpenCLWrapper member. -/// -template -void RendererCL::DumpErrorReport() -{ - EmberReport::DumpErrorReport(); - m_Wrapper.DumpErrorReport(); -} - -/// -/// Clear the error report for this class as well as the OpenCLWrapper member. +/// Clear the error report for this class as well as the OpenCLWrapper members of each device. /// template void RendererCL::ClearErrorReport() { EmberReport::ClearErrorReport(); - m_Wrapper.ClearErrorReport(); + + for (auto& device : m_Devices) + device->m_Wrapper.ClearErrorReport(); } /// @@ -426,7 +515,7 @@ size_t RendererCL::ThreadCount() const /// /// Create the density filter in the base class and copy the filter values -/// to the corresponding OpenCL buffers. +/// to the corresponding OpenCL buffers on the primary device. /// /// True if a new filter instance was created, else false. /// True if success, else false. @@ -435,16 +524,17 @@ bool RendererCL::CreateDEFilter(bool& newAlloc) { bool b = true; - if (Renderer::CreateDEFilter(newAlloc)) + if (!m_Devices.empty() && Renderer::CreateDEFilter(newAlloc)) { //Copy coefs and widths here. Convert and copy the other filter params right before calling the filtering kernel. 
if (newAlloc) { const char* loc = __FUNCTION__; + auto& wrapper = m_Devices[0]->m_Wrapper; - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_DECoefsBufferName, reinterpret_cast(const_cast(m_DensityFilter->Coefs())), m_DensityFilter->CoefsSizeBytes()))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_DEWidthsBufferName, reinterpret_cast(const_cast(m_DensityFilter->Widths())), m_DensityFilter->WidthsSizeBytes()))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_DECoefIndicesBufferName, reinterpret_cast(const_cast(m_DensityFilter->CoefIndices())), m_DensityFilter->CoefsIndicesSizeBytes()))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddAndWriteBuffer(m_DECoefsBufferName, reinterpret_cast(const_cast(m_DensityFilter->Coefs())), m_DensityFilter->CoefsSizeBytes()))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddAndWriteBuffer(m_DEWidthsBufferName, reinterpret_cast(const_cast(m_DensityFilter->Widths())), m_DensityFilter->WidthsSizeBytes()))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddAndWriteBuffer(m_DECoefIndicesBufferName, reinterpret_cast(const_cast(m_DensityFilter->CoefIndices())), m_DensityFilter->CoefsIndicesSizeBytes()))) { this->m_ErrorReport.push_back(loc); } } } else @@ -455,7 +545,7 @@ bool RendererCL::CreateDEFilter(bool& newAlloc) /// /// Create the spatial filter in the base class and copy the filter values -/// to the corresponding OpenCL buffers. +/// to the corresponding OpenCL buffers on the primary device. /// /// True if a new filter instance was created, else false. /// True if success, else false. 
@@ -464,11 +554,10 @@ bool RendererCL::CreateSpatialFilter(bool& newAlloc) { bool b = true; - if (Renderer::CreateSpatialFilter(newAlloc)) + if (!m_Devices.empty() && Renderer::CreateSpatialFilter(newAlloc)) { if (newAlloc) - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_SpatialFilterCoefsBufferName, reinterpret_cast(m_SpatialFilter->Filter()), m_SpatialFilter->BufferSizeBytes()))) { this->m_ErrorReport.push_back(__FUNCTION__); } - + if (!(b = m_Devices[0]->m_Wrapper.AddAndWriteBuffer(m_SpatialFilterCoefsBufferName, reinterpret_cast(m_SpatialFilter->Filter()), m_SpatialFilter->BufferSizeBytes()))) { this->m_ErrorReport.push_back(__FUNCTION__); } } else b = false; @@ -488,32 +577,41 @@ eRendererType RendererCL::RendererType() const /// /// Concatenate and return the error report for this class and the -/// OpenCLWrapper member as a single string. +/// OpenCLWrapper member of each device as a single string. /// /// The concatenated error report string template string RendererCL::ErrorReportString() { - return EmberReport::ErrorReportString() + m_Wrapper.ErrorReportString(); + auto s = EmberReport::ErrorReportString(); + + for (auto& device : m_Devices) + s += device->m_Wrapper.ErrorReportString(); + + return s; } /// /// Concatenate and return the error report for this class and the -/// OpenCLWrapper member as a vector of strings. +/// OpenCLWrapper member of each device as a vector of strings. /// /// The concatenated error report vector of strings template vector RendererCL::ErrorReport() { auto ours = EmberReport::ErrorReport(); - auto wrappers = m_Wrapper.ErrorReport(); - ours.insert(ours.end(), wrappers.begin(), wrappers.end()); + for (auto& device : m_Devices) + { + auto s = device->m_Wrapper.ErrorReport(); + ours.insert(ours.end(), s.begin(), s.end()); + } + return ours; } /// -/// Set the vector of random contexts. +/// Set the vector of random contexts on every device. /// Call the base, and reset the seeds vector. 
/// /// The vector of random contexts to assign @@ -524,11 +622,15 @@ bool RendererCL::RandVec(vector>& ran bool b = Renderer::RandVec(randVec); const char* loc = __FUNCTION__; - if (m_Wrapper.Ok()) + if (!m_Devices.empty()) { FillSeeds(); - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_SeedsBufferName, reinterpret_cast(m_Seeds.data()), SizeOf(m_Seeds)))) { this->m_ErrorReport.push_back(loc); } + + for (size_t device = 0; device < m_Devices.size(); device++) + if (b && !(b = m_Devices[device]->m_Wrapper.AddAndWriteBuffer(m_SeedsBufferName, reinterpret_cast(m_Seeds[device].data()), SizeOf(m_Seeds[device])))) { this->m_ErrorReport.push_back(loc); break; } } + else + b = false; return b; } @@ -537,30 +639,16 @@ bool RendererCL::RandVec(vector>& ran /// Protected virtual functions overridden from Renderer. /// -/// -/// Make the final palette used for iteration. -/// This override differs from the base in that it does not use -/// bucketT as the output palette type. This is because OpenCL -/// only supports floats for texture images. -/// -/// The color scalar to multiply the ember's palette by -template -void RendererCL::MakeDmap(T colorScalar) -{ - //m_Ember.m_Palette.MakeDmap(m_DmapCL, colorScalar); - m_Ember.m_Palette.MakeDmap(m_DmapCL, colorScalar); -} - - /// /// Allocate all buffers required for running as well as the final /// 2D image. +/// Note that only iteration-related buffers are allocated on secondary devices. /// /// True if success, else false. 
template -bool RendererCL::Alloc() +bool RendererCL::Alloc(bool histOnly) { - if (!m_Wrapper.Ok()) + if (!Ok()) return false; EnterResize(); @@ -571,17 +659,23 @@ bool RendererCL::Alloc() size_t accumLength = SuperSize() * sizeof(v4bT); const char* loc = __FUNCTION__; - if (b && !(b = m_Wrapper.AddBuffer(m_EmberBufferName, sizeof(m_EmberCL)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddBuffer(m_XformsBufferName, SizeOf(m_XformsCL)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddBuffer(m_ParVarsBufferName, 128 * sizeof(T)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddBuffer(m_DistBufferName, CHOOSE_XFORM_GRAIN))) { this->m_ErrorReport.push_back(loc); }//Will be resized for xaos. - if (b && !(b = m_Wrapper.AddBuffer(m_CarToRasBufferName, sizeof(m_CarToRasCL)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddBuffer(m_DEFilterParamsBufferName, sizeof(m_DensityFilterCL)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddBuffer(m_SpatialFilterParamsBufferName, sizeof(m_SpatialFilterCL)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddBuffer(m_CurvesCsaName, SizeOf(m_Csa.m_Entries)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddBuffer(m_HistBufferName, histLength))) { this->m_ErrorReport.push_back(loc); }//Histogram. Will memset to zero later. - if (b && !(b = m_Wrapper.AddBuffer(m_AccumBufferName, accumLength))) { this->m_ErrorReport.push_back(loc); }//Accum buffer. - if (b && !(b = m_Wrapper.AddBuffer(m_PointsBufferName, IterGridKernelCount() * sizeof(PointCL)))) { this->m_ErrorReport.push_back(loc); }//Points between iter calls. 
+ auto& wrapper = m_Devices[0]->m_Wrapper; + + if (b && !(b = wrapper.AddBuffer(m_DEFilterParamsBufferName, sizeof(m_DensityFilterCL)))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddBuffer(m_SpatialFilterParamsBufferName, sizeof(m_SpatialFilterCL)))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddBuffer(m_CurvesCsaName, SizeOf(m_Csa.m_Entries)))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddBuffer(m_AccumBufferName, accumLength))) { this->m_ErrorReport.push_back(loc); }//Accum buffer. + + for (auto& device : m_Devices) + { + if (b && !(b = device->m_Wrapper.AddBuffer(m_EmberBufferName, sizeof(m_EmberCL)))) { this->m_ErrorReport.push_back(loc); break; } + if (b && !(b = device->m_Wrapper.AddBuffer(m_XformsBufferName, SizeOf(m_XformsCL)))) { this->m_ErrorReport.push_back(loc); break; } + if (b && !(b = device->m_Wrapper.AddBuffer(m_ParVarsBufferName, 128 * sizeof(T)))) { this->m_ErrorReport.push_back(loc); break; } + if (b && !(b = device->m_Wrapper.AddBuffer(m_DistBufferName, CHOOSE_XFORM_GRAIN))) { this->m_ErrorReport.push_back(loc); break; }//Will be resized for xaos. + if (b && !(b = device->m_Wrapper.AddBuffer(m_CarToRasBufferName, sizeof(m_CarToRasCL)))) { this->m_ErrorReport.push_back(loc); break; } + if (b && !(b = device->m_Wrapper.AddBuffer(m_HistBufferName, histLength))) { this->m_ErrorReport.push_back(loc); break; }//Histogram. Will memset to zero later. + if (b && !(b = device->m_Wrapper.AddBuffer(m_PointsBufferName, IterGridKernelCount() * sizeof(PointCL)))) { this->m_ErrorReport.push_back(loc); break; }//Points between iter calls. + } LeaveResize(); @@ -591,7 +685,7 @@ bool RendererCL::Alloc() } /// -/// Clear OpenCL histogram and/or density filtering buffers to all zeroes. +/// Clear OpenCL histogram on all devices and/or density filtering buffer on the primary device to all zeroes. /// /// Clear histogram if true, else don't. /// Clear density filtering buffer if true, else don't. 
@@ -611,7 +705,7 @@ bool RendererCL::ResetBuckets(bool resetHist, bool resetAccum) } /// -/// Perform log scale density filtering. +/// Perform log scale density filtering on the primary device. /// /// True if success and not aborted, else false. template @@ -621,7 +715,7 @@ eRenderStatus RendererCL::LogScaleDensityFilter() } /// -/// Run gaussian density estimation filtering. +/// Run gaussian density estimation filtering on the primary device. /// /// True if success and not aborted, else false. template @@ -652,7 +746,7 @@ eRenderStatus RendererCL::GaussianDensityFilter() } /// -/// Run final accumulation. +/// Run final accumulation on the primary device. /// If pixels is nullptr, the output will remain in the OpenCL 2D image. /// However, if pixels is not nullptr, the output will be copied. This is /// useful when rendering in OpenCL, but saving the output to a file. @@ -665,7 +759,7 @@ eRenderStatus RendererCL::AccumulatorToFinalImage(byte* pixels, size { eRenderStatus status = RunFinalAccum(); - if (status == RENDER_OK && pixels != nullptr && !m_Wrapper.Shared()) + if (status == RENDER_OK && pixels != nullptr && !m_Devices.empty() && !m_Devices[0]->m_Wrapper.Shared()) { pixels += finalOffset; @@ -677,9 +771,10 @@ eRenderStatus RendererCL::AccumulatorToFinalImage(byte* pixels, size } /// -/// Run the iteration algorithm for the specified number of iterations. +/// Run the iteration algorithm for the specified number of iterations, splitting the work +/// across devices. /// This is only called after all other setup has been done. -/// This will recompile the OpenCL program if this ember differs significantly +/// This will recompile the OpenCL program on every device if this ember differs significantly /// from the previous run. /// Note that the bad value count is not recorded when running with OpenCL. If it's /// needed, run on the CPU. 
@@ -698,45 +793,52 @@ EmberStats RendererCL::Iterate(size_t iterCount, size_t temporalSamp if (m_LastIter == 0) { ConvertEmber(m_Ember, m_EmberCL, m_XformsCL); - ConvertCarToRas(*CoordMap()); + ConvertCarToRas(CoordMap()); - if (b && !(b = m_Wrapper.WriteBuffer(m_EmberBufferName, reinterpret_cast(&m_EmberCL), sizeof(m_EmberCL)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.WriteBuffer(m_XformsBufferName, reinterpret_cast(m_XformsCL.data()), sizeof(m_XformsCL[0]) * m_XformsCL.size()))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_DistBufferName, reinterpret_cast(const_cast(XformDistributions())), XformDistributionsSize()))) { this->m_ErrorReport.push_back(loc); }//Will be resized for xaos. - if (b && !(b = m_Wrapper.WriteBuffer(m_CarToRasBufferName, reinterpret_cast(&m_CarToRasCL), sizeof(m_CarToRasCL)))) { this->m_ErrorReport.push_back(loc); } + //Rebuilding is expensive, so only do it if it's required. + if (IterOpenCLKernelCreator::IsBuildRequired(m_Ember, m_LastBuiltEmber)) + b = BuildIterProgramForEmber(true); - if (b && !(b = m_Wrapper.AddAndWriteImage("Palette", CL_MEM_READ_ONLY, m_PaletteFormat, m_DmapCL.m_Entries.size(), 1, 0, m_DmapCL.m_Entries.data()))) { this->m_ErrorReport.push_back(loc); } - - if (b) + //Setup buffers on all devices. + for (auto& device : m_Devices) { - IterOpenCLKernelCreator::ParVarIndexDefines(m_Ember, m_Params, true, false);//Always do this to get the values (but no string), regardless of whether a rebuild is necessary. + auto& wrapper = device->m_Wrapper; - //Don't know the size of the parametric varations parameters buffer until the ember is examined. - //So set it up right before the run. 
- if (!m_Params.second.empty()) + if (b && !(b = wrapper.WriteBuffer (m_EmberBufferName, reinterpret_cast(&m_EmberCL), sizeof(m_EmberCL)))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.WriteBuffer (m_XformsBufferName, reinterpret_cast(m_XformsCL.data()), sizeof(m_XformsCL[0]) * m_XformsCL.size()))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddAndWriteBuffer(m_DistBufferName, reinterpret_cast(const_cast(XformDistributions())), XformDistributionsSize()))) { this->m_ErrorReport.push_back(loc); }//Will be resized for xaos. + if (b && !(b = wrapper.WriteBuffer (m_CarToRasBufferName, reinterpret_cast(&m_CarToRasCL), sizeof(m_CarToRasCL)))) { this->m_ErrorReport.push_back(loc); } + + if (b && !(b = wrapper.AddAndWriteImage("Palette", CL_MEM_READ_ONLY, m_PaletteFormat, m_Dmap.m_Entries.size(), 1, 0, m_Dmap.m_Entries.data()))) { this->m_ErrorReport.push_back(loc); } + + if (b) { - if (!m_Wrapper.AddAndWriteBuffer(m_ParVarsBufferName, m_Params.second.data(), m_Params.second.size() * sizeof(m_Params.second[0]))) + IterOpenCLKernelCreator::ParVarIndexDefines(m_Ember, m_Params, true, false);//Always do this to get the values (but no string), regardless of whether a rebuild is necessary. + + //Don't know the size of the parametric variations parameters buffer until the ember is examined. + //So set it up right before the run. + if (!m_Params.second.empty()) { - m_Abort = true; - this->m_ErrorReport.push_back(loc); - return stats; + if (!wrapper.AddAndWriteBuffer(m_ParVarsBufferName, m_Params.second.data(), m_Params.second.size() * sizeof(m_Params.second[0]))) + { + m_Abort = true; + this->m_ErrorReport.push_back(loc); + return stats; + } } } + else + return stats; } - else - return stats; } - //Rebuilding is expensive, so only do it if it's required.
- if (IterOpenCLKernelCreator::IsBuildRequired(m_Ember, m_LastBuiltEmber)) - b = BuildIterProgramForEmber(true); - if (b) { m_IterTimer.Tic();//Tic() here to avoid including build time in iter time measurement. - if (m_LastIter == 0)//Only reset the call count on the beginning of a new render. Do not reset on KEEP_ITERATING. - m_Calls = 0; + if (m_LastIter == 0 && m_ProcessAction != KEEP_ITERATING)//Only reset the call count on the beginning of a new render. Do not reset on KEEP_ITERATING. + for (auto& dev : m_Devices) + dev->m_Calls = 0; b = RunIter(iterCount, temporalSample, stats.m_Iters); @@ -759,38 +861,58 @@ EmberStats RendererCL::Iterate(size_t iterCount, size_t temporalSamp /// /// -/// Build the iteration program for the current ember. +/// Build the iteration program on every device for the current ember. +/// This is parallelized by placing the build for each device on its own thread. /// /// Whether to build in accumulation, only for debugging. Default: true. -/// True if success, else false. +/// True if successful for all devices, else false. template bool RendererCL::BuildIterProgramForEmber(bool doAccum) { //Timing t; + bool b = !m_Devices.empty(); const char* loc = __FUNCTION__; IterOpenCLKernelCreator::ParVarIndexDefines(m_Ember, m_Params, false, true);//Do with string and no vals. m_IterKernel = m_IterOpenCLKernelCreator.CreateIterKernelString(m_Ember, m_Params.first, m_LockAccum, doAccum); //cout << "Building: " << endl << iterProgram << endl; + vector threads; + std::function func = [&](RendererClDevice* dev) + { + if (!dev->m_Wrapper.AddProgram(m_IterOpenCLKernelCreator.IterEntryPoint(), m_IterKernel, m_IterOpenCLKernelCreator.IterEntryPoint(), m_DoublePrecision)) + { + m_ResizeCs.Enter();//Just use the resize CS for lack of a better one. 
+ b = false; + this->m_ErrorReport.push_back(string(loc) + "()\n" + dev->m_Wrapper.DeviceName() + ":\nBuilding the following program failed: \n" + m_IterKernel + "\n"); + m_ResizeCs.Leave(); + } + }; - //A program build is roughly .66s which will detract from the user experience. - //Need to experiment with launching this in a thread/task and returning once it's done.//TODO - if (m_Wrapper.AddProgram(m_IterOpenCLKernelCreator.IterEntryPoint(), m_IterKernel, m_IterOpenCLKernelCreator.IterEntryPoint(), m_DoublePrecision)) + threads.reserve(m_Devices.size() - 1); + + for (size_t device = m_Devices.size() - 1; device >= 0 && device < m_Devices.size(); device--)//Check both extents because size_t will wrap. + { + if (device != 0)//Secondary devices on their own threads. + threads.push_back(std::thread([&](RendererClDevice* dev) { func(dev); }, m_Devices[device].get())); + else//Primary device on this thread. + func(m_Devices[device].get()); + } + + for (auto& th : threads) + if (th.joinable()) + th.join(); + + if (b) { //t.Toc(__FUNCTION__ " program build"); //cout << string(loc) << "():\nBuilding the following program succeeded: \n" << iterProgram << endl; m_LastBuiltEmber = m_Ember; } - else - { - this->m_ErrorReport.push_back(string(loc) + "():\nBuilding the following program failed: \n" + m_IterKernel + "\n"); - return false; - } - return true; + return b; } /// -/// Run the iteration kernel. +/// Run the iteration kernel on all devices. /// Fusing on the CPU is done once per sub batch, usually 10,000 iters. Here, /// the same fusing frequency is kept, but is done per kernel thread.
/// @@ -801,92 +923,101 @@ bool RendererCL::BuildIterProgramForEmber(bool doAccum) template bool RendererCL::RunIter(size_t iterCount, size_t temporalSample, size_t& itersRan) { - Timing t;//, t2(4); - bool b = true; - uint fuse, argIndex; - uint iterCountPerKernel = IterCountPerKernel(); - uint iterCountPerBlock = IterCountPerBlock(); - uint supersize = uint(SuperSize()); - int kernelIndex = m_Wrapper.FindKernelIndex(m_IterOpenCLKernelCreator.IterEntryPoint()); - size_t fuseFreq = Renderer::SubBatchSize() / m_IterCountPerKernel;//Use the base sbs to determine when to fuse. - size_t itersRemaining; - double percent, etaMs; + //Timing t;//, t2(4); + bool success = !m_Devices.empty(); + uint histSuperSize = uint(SuperSize()); + size_t launches = size_t(ceil(double(iterCount) / IterCountPerGrid())); const char* loc = __FUNCTION__; + vector threadVec; + std::atomic atomLaunchesRan; + std::atomic atomItersRan, atomItersRemaining; + size_t adjustedIterCountPerKernel = m_IterCountPerKernel; itersRan = 0; + atomItersRan.store(0); + atomItersRemaining.store(iterCount); + atomLaunchesRan.store(0); + threadVec.reserve(m_Devices.size()); + + //If a very small number of iters is requested, and multiple devices + //are present, then try to spread the launches over the devices. + //Otherwise, only one device would get used. + //Note that this can lead to doing a few more iterations than requested + //due to rounding up to ~32k kernel threads per launch. + if (m_Devices.size() >= launches) + { + launches = m_Devices.size(); + adjustedIterCountPerKernel = size_t(ceil(ceil(double(iterCount) / m_Devices.size()) / IterGridKernelCount())); + } + + size_t fuseFreq = Renderer::SubBatchSize() / adjustedIterCountPerKernel;//Use the base sbs to determine when to fuse. + #ifdef TEST_CL m_Abort = false; #endif - if (kernelIndex != -1) + std::function iterFunc = [&](size_t dev, int kernelIndex) { - //If animating, treat each temporal sample as a newly started render for fusing purposes. 
- if (temporalSample > 0) - m_Calls = 0; + bool b = true; + auto& wrapper = m_Devices[dev]->m_Wrapper; + intmax_t itersRemaining; - while (b && itersRan < iterCount && !m_Abort) + while (atomLaunchesRan.fetch_add(1), (b && (atomLaunchesRan.load() <= launches) && ((itersRemaining = atomItersRemaining.load()) > 0) && !m_Abort)) { - argIndex = 0; + cl_uint argIndex = 0; #ifdef TEST_CL fuse = 0; #else - //fuse = 100; - //fuse = ((m_Calls % fuseFreq) == 0 ? (EarlyClip() ? 100u : 15u) : 0u); - fuse = uint((m_Calls % fuseFreq) == 0u ? FuseCount() : 0u); - //fuse = ((m_Calls % 4) == 0 ? 100u : 0u); + uint fuse = uint((m_Devices[dev]->m_Calls % fuseFreq) == 0u ? FuseCount() : 0u); #endif - itersRemaining = iterCount - itersRan; - uint gridW = uint(std::min(ceil(double(itersRemaining) / double(iterCountPerBlock)), double(IterGridBlockWidth()))); - uint gridH = uint(std::min(ceil(double(itersRemaining) / double(gridW * iterCountPerBlock)), double(IterGridBlockHeight()))); - uint iterCountThisLaunch = iterCountPerBlock * gridW * gridH; - //Similar to what's done in the base class. - //The number of iters per thread must be adjusted if they've requested less iters than is normally ran in a block (256 * 256). - if (iterCountThisLaunch > iterCount) - { - iterCountPerKernel = uint(ceil(double(iterCount) / double(gridW * gridH * IterBlockKernelCount()))); - iterCountThisLaunch = iterCountPerKernel * (gridW * gridH * IterBlockKernelCount()); - } + //The number of iters per thread must be adjusted if they've requested fewer iters than are normally run in a grid (256 iters on each of 256 * 64 * 2 = 32,768 kernel threads).
+ uint iterCountPerKernel = std::min(uint(adjustedIterCountPerKernel), uint(ceil(double(itersRemaining) / IterGridKernelCount()))); + size_t iterCountThisLaunch = iterCountPerKernel * IterGridKernelWidth() * IterGridKernelHeight(); + //cout << "itersRemaining " << itersRemaining << ", iterCountPerKernel " << iterCountPerKernel << ", iterCountThisLaunch " << iterCountThisLaunch << endl; - if (b && !(b = m_Wrapper.SetArg (kernelIndex, argIndex++, iterCountPerKernel))) { this->m_ErrorReport.push_back(loc); }//Number of iters for each thread to run. - if (b && !(b = m_Wrapper.SetArg (kernelIndex, argIndex++, fuse))) { this->m_ErrorReport.push_back(loc); }//Number of iters to fuse. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_SeedsBufferName))) { this->m_ErrorReport.push_back(loc); }//Seeds. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_EmberBufferName))) { this->m_ErrorReport.push_back(loc); }//Ember. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_XformsBufferName))) { this->m_ErrorReport.push_back(loc); }//Xforms. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_ParVarsBufferName))) { this->m_ErrorReport.push_back(loc); }//Parametric variation parameters. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_DistBufferName))) { this->m_ErrorReport.push_back(loc); }//Xform distributions. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_CarToRasBufferName))) { this->m_ErrorReport.push_back(loc); }//Coordinate converter. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_HistBufferName))) { this->m_ErrorReport.push_back(loc); }//Histogram. - if (b && !(b = m_Wrapper.SetArg (kernelIndex, argIndex++, supersize))) { this->m_ErrorReport.push_back(loc); }//Histogram size. - if (b && !(b = m_Wrapper.SetImageArg (kernelIndex, argIndex++, false, "Palette"))) { this->m_ErrorReport.push_back(loc); }//Palette. 
- if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_PointsBufferName))) { this->m_ErrorReport.push_back(loc); }//Random start points. + if (b && !(b = wrapper.SetArg (kernelIndex, argIndex++, iterCountPerKernel))) { this->m_ErrorReport.push_back(loc); }//Number of iters for each thread to run. + if (b && !(b = wrapper.SetArg (kernelIndex, argIndex++, fuse))) { this->m_ErrorReport.push_back(loc); }//Number of iters to fuse. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_SeedsBufferName))) { this->m_ErrorReport.push_back(loc); }//Seeds. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_EmberBufferName))) { this->m_ErrorReport.push_back(loc); }//Ember. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_XformsBufferName))) { this->m_ErrorReport.push_back(loc); }//Xforms. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_ParVarsBufferName))) { this->m_ErrorReport.push_back(loc); }//Parametric variation parameters. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_DistBufferName))) { this->m_ErrorReport.push_back(loc); }//Xform distributions. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_CarToRasBufferName))) { this->m_ErrorReport.push_back(loc); }//Coordinate converter. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_HistBufferName))) { this->m_ErrorReport.push_back(loc); }//Histogram. + if (b && !(b = wrapper.SetArg (kernelIndex, argIndex++, histSuperSize))) { this->m_ErrorReport.push_back(loc); }//Histogram size. + if (b && !(b = wrapper.SetImageArg (kernelIndex, argIndex++, false, "Palette"))) { this->m_ErrorReport.push_back(loc); }//Palette. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_PointsBufferName))) { this->m_ErrorReport.push_back(loc); }//Random start points. - if (b && !(b = m_Wrapper.RunKernel(kernelIndex, - gridW * IterBlockKernelWidth(),//Total grid dims. 
- gridH * IterBlockKernelHeight(), - 1, - IterBlockKernelWidth(),//Individual block dims. - IterBlockKernelHeight(), - 1))) + if (b && !(b = wrapper.RunKernel(kernelIndex, + IterGridKernelWidth(),//Total grid dims. + IterGridKernelHeight(), + 1, + IterBlockKernelWidth(),//Individual block dims. + IterBlockKernelHeight(), + 1))) { + success = false; m_Abort = true; this->m_ErrorReport.push_back(loc); + atomLaunchesRan.fetch_sub(1); break; } - itersRan += iterCountThisLaunch; - m_Calls++; + atomItersRan.fetch_add(iterCountThisLaunch); + atomItersRemaining.store(iterCount - atomItersRan.load()); + m_Devices[dev]->m_Calls++; - if (m_Callback) + if (m_Callback && !dev)//Will only do callback on the first device, however it will report the progress of all devices. { - percent = 100.0 * + double percent = 100.0 * double ( double ( double ( - double(m_LastIter + itersRan) / double(ItersPerTemporalSample()) + double(m_LastIter + atomItersRan.load()) / double(ItersPerTemporalSample()) ) + temporalSample ) / double(TemporalSamples()) ); @@ -896,7 +1027,7 @@ bool RendererCL::RunIter(size_t iterCount, size_t temporalSample, si if (percentDiff >= 10 || (toc > 1000 && percentDiff >= 1))//Call callback function if either 10% has passed, or one second (and 1%). { - etaMs = ((100.0 - percent) / percent) * m_RenderTimer.Toc(); + double etaMs = ((100.0 - percent) / percent) * m_RenderTimer.Toc(); if (!m_Callback->ProgressFunc(m_Ember, m_ProgressParameter, percent, 0, etaMs)) Abort(); @@ -906,71 +1037,105 @@ bool RendererCL::RunIter(size_t iterCount, size_t temporalSample, si } } } - } - else + }; + + //Iterate backward to run all secondary devices on threads first, then finally the primary device on this thread. + for (size_t device = m_Devices.size() - 1; device >= 0 && device < m_Devices.size(); device--)//Check both extents because size_t will wrap. 
{ - b = false; - this->m_ErrorReport.push_back(loc); + int index = m_Devices[device]->m_Wrapper.FindKernelIndex(m_IterOpenCLKernelCreator.IterEntryPoint()); + + if (index == -1) + { + success = false; + break; + } + + //If animating, treat each temporal sample as a newly started render for fusing purposes. + if (temporalSample > 0) + m_Devices[device]->m_Calls = 0; + + if (device != 0)//Secondary devices on their own threads. + threadVec.push_back(std::thread([&](size_t dev, int kernelIndex) { iterFunc(dev, kernelIndex); }, device, index)); + else//Primary device on this thread. + iterFunc(device, index); + } + + for (auto& th : threadVec) + if (th.joinable()) + th.join(); + + itersRan = atomItersRan.load(); + + if (m_Devices.size() > 1)//Determine whether/when to sum histograms of secondary devices with the primary. + { + if (((TemporalSamples() == 1) || (temporalSample == TemporalSamples() - 1)) &&//If there are no temporal samples (not animating), or the current one is the last... + ((m_LastIter + itersRan) >= ItersPerTemporalSample()))//...and the required number of iters for that sample have completed... + if (success && !(success = SumDeviceHist())) { this->m_ErrorReport.push_back(loc); }//...read the histogram from the secondary devices and sum them to the primary. } //t2.Toc(__FUNCTION__); - return b; + return success; } /// -/// Run the log scale filter. +/// Run the log scale filter on the primary device. /// /// True if success, else false. template eRenderStatus RendererCL::RunLogScaleFilter() { //Timing t(4); - bool b = true; - int kernelIndex = m_Wrapper.FindKernelIndex(m_DEOpenCLKernelCreator.LogScaleAssignDEEntryPoint()); - const char* loc = __FUNCTION__; + bool b = !m_Devices.empty(); - if (kernelIndex != -1) + if (b) { - ConvertDensityFilter(); - uint argIndex = 0; - uint blockW = m_WarpSize; - uint blockH = 4;//A height of 4 seems to run the fastest. 
- uint gridW = m_DensityFilterCL.m_SuperRasW; - uint gridH = m_DensityFilterCL.m_SuperRasH; + auto& wrapper = m_Devices[0]->m_Wrapper; + int kernelIndex = wrapper.FindKernelIndex(m_DEOpenCLKernelCreator.LogScaleAssignDEEntryPoint()); + const char* loc = __FUNCTION__; - OpenCLWrapper::MakeEvenGridDims(blockW, blockH, gridW, gridH); + if (kernelIndex != -1) + { + ConvertDensityFilter(); + cl_uint argIndex = 0; + size_t blockW = m_Devices[0]->WarpSize(); + size_t blockH = 4;//A height of 4 seems to run the fastest. + size_t gridW = m_DensityFilterCL.m_SuperRasW; + size_t gridH = m_DensityFilterCL.m_SuperRasH; - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_DEFilterParamsBufferName, reinterpret_cast(&m_DensityFilterCL), sizeof(m_DensityFilterCL)))) { this->m_ErrorReport.push_back(loc); } + OpenCLWrapper::MakeEvenGridDims(blockW, blockH, gridW, gridH); - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_HistBufferName))) { this->m_ErrorReport.push_back(loc); }//Histogram. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); }//Accumulator. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, m_DEFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); }//DensityFilterCL. + if (b && !(b = wrapper.AddAndWriteBuffer(m_DEFilterParamsBufferName, reinterpret_cast(&m_DensityFilterCL), sizeof(m_DensityFilterCL)))) { this->m_ErrorReport.push_back(loc); } - //t.Tic(); - if (b && !(b = m_Wrapper.RunKernel(kernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } - //t.Toc(loc); + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_HistBufferName))) { this->m_ErrorReport.push_back(loc); }//Histogram. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); }//Accumulator. 
+ if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_DEFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); }//DensityFilterCL. + + //t.Tic(); + if (b && !(b = wrapper.RunKernel(kernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } + //t.Toc(loc); + } + else + { + b = false; + this->m_ErrorReport.push_back(loc); + } + + if (b && m_Callback && m_LastIterPercent >= 99.0)//Only update progress if we've really reached the end, not via forced output. + m_Callback->ProgressFunc(m_Ember, m_ProgressParameter, 100.0, 1, 0.0); } - else - { - b = false; - this->m_ErrorReport.push_back(loc); - } - - if (b && m_Callback && m_LastIterPercent >= 99.0)//Only update progress if we've really reached the end, not via forced output. - m_Callback->ProgressFunc(m_Ember, m_ProgressParameter, 100.0, 1, 0.0); return b ? RENDER_OK : RENDER_ERROR; } /// -/// Run the Gaussian density filter. -/// Method 7: Each block processes a 16x16(AMD) or 32x32(Nvidia) block and exits. No column or row advancements happen. +/// Run the Gaussian density filter on the primary device. +/// Method 7: Each block processes a 16x16(AMD) or 24x24(Nvidia) block and exits. No column or row advancements happen. /// /// True if success and not aborted, else false. template eRenderStatus RendererCL::RunDensityFilter() { - bool b = true; + bool b = !m_Devices.empty(); Timing t(4);// , t2(4); ConvertDensityFilter(); int kernelIndex = MakeAndGetDensityFilterProgram(Supersample(), m_DensityFilterCL.m_FilterWidth); @@ -982,17 +1147,18 @@ eRenderStatus RendererCL::RunDensityFilter() uint rightBound = m_DensityFilterCL.m_SuperRasW - (m_DensityFilterCL.m_Supersample - 1); uint topBound = leftBound; uint botBound = m_DensityFilterCL.m_SuperRasH - (m_DensityFilterCL.m_Supersample - 1); - uint gridW = rightBound - leftBound; - uint gridH = botBound - topBound; - uint blockSizeW = m_MaxDEBlockSizeW;//These *must* both be divisible by 16 or else pixels will go missing. 
- uint blockSizeH = m_MaxDEBlockSizeH; + size_t gridW = rightBound - leftBound; + size_t gridH = botBound - topBound; + size_t blockSizeW = m_MaxDEBlockSizeW;//These *must* both be divisible by 16 or else pixels will go missing. + size_t blockSizeH = m_MaxDEBlockSizeH; + auto& wrapper = m_Devices[0]->m_Wrapper; //OpenCL runs out of resources when using double or a supersample of 2. //Remedy this by reducing the height of the block by 2. if (m_DoublePrecision || m_DensityFilterCL.m_Supersample > 1) blockSizeH -= 2; - //Can't just blindly pass in vals. Must adjust them first to evenly divide the block count + //Can't just blindly pass dimension in vals. Must adjust them first to evenly divide the block count //into the total grid dimensions. OpenCLWrapper::MakeEvenGridDims(blockSizeW, blockSizeH, gridW, gridH); @@ -1009,7 +1175,7 @@ eRenderStatus RendererCL::RunDensityFilter() uint chunkSizeH = gapH + 1; double totalChunks = chunkSizeW * chunkSizeH; - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_DEFilterParamsBufferName, reinterpret_cast(&m_DensityFilterCL), sizeof(m_DensityFilterCL)))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddAndWriteBuffer(m_DEFilterParamsBufferName, reinterpret_cast(&m_DensityFilterCL), sizeof(m_DensityFilterCL)))) { this->m_ErrorReport.push_back(loc); } #ifdef ROW_ONLY_DE blockSizeW = 64;//These *must* both be divisible by 16 or else pixels will go missing. @@ -1082,7 +1248,7 @@ eRenderStatus RendererCL::RunDensityFilter() } /// -/// Run final accumulation to the 2D output image. +/// Run final accumulation to the 2D output image on the primary device. /// /// True if success and not aborted, else false. 
template @@ -1093,21 +1259,23 @@ eRenderStatus RendererCL::RunFinalAccum() double alphaBase; double alphaScale; int accumKernelIndex = MakeAndGetFinalAccumProgram(alphaBase, alphaScale); - uint argIndex; - uint gridW; - uint gridH; - uint blockW; - uint blockH; + cl_uint argIndex; + size_t gridW; + size_t gridH; + size_t blockW; + size_t blockH; uint curvesSet = m_CurvesSet ? 1 : 0; const char* loc = __FUNCTION__; if (!m_Abort && accumKernelIndex != -1) { + auto& wrapper = m_Devices[0]->m_Wrapper; + //This is needed with or without early clip. ConvertSpatialFilter(); - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_SpatialFilterParamsBufferName, reinterpret_cast(&m_SpatialFilterCL), sizeof(m_SpatialFilterCL)))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.AddAndWriteBuffer(m_CurvesCsaName, m_Csa.m_Entries.data(), SizeOf(m_Csa.m_Entries)))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddAndWriteBuffer(m_SpatialFilterParamsBufferName, reinterpret_cast(&m_SpatialFilterCL), sizeof(m_SpatialFilterCL)))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.AddAndWriteBuffer(m_CurvesCsaName, m_Csa.m_Entries.data(), SizeOf(m_Csa.m_Entries)))) { this->m_ErrorReport.push_back(loc); } //Since early clip requires gamma correcting the entire accumulator first, //it can't be done inside of the normal final accumulation kernel, so @@ -1119,16 +1287,16 @@ eRenderStatus RendererCL::RunFinalAccum() if (gammaCorrectKernelIndex != -1) { argIndex = 0; - blockW = m_WarpSize; + blockW = m_Devices[0]->WarpSize(); blockH = 4;//A height of 4 seems to run the fastest. gridW = m_SpatialFilterCL.m_SuperRasW;//Using super dimensions because this processes the density filtering bufer. gridH = m_SpatialFilterCL.m_SuperRasH; OpenCLWrapper::MakeEvenGridDims(blockW, blockH, gridW, gridH); - if (b && !(b = m_Wrapper.SetBufferArg(gammaCorrectKernelIndex, argIndex++, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); }//Accumulator. 
- if (b && !(b = m_Wrapper.SetBufferArg(gammaCorrectKernelIndex, argIndex++, m_SpatialFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); }//SpatialFilterCL. + if (b && !(b = wrapper.SetBufferArg(gammaCorrectKernelIndex, argIndex++, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); }//Accumulator. + if (b && !(b = wrapper.SetBufferArg(gammaCorrectKernelIndex, argIndex++, m_SpatialFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); }//SpatialFilterCL. - if (b && !(b = m_Wrapper.RunKernel(gammaCorrectKernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.RunKernel(gammaCorrectKernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } } else { @@ -1138,29 +1306,29 @@ eRenderStatus RendererCL::RunFinalAccum() } argIndex = 0; - blockW = m_WarpSize; + blockW = m_Devices[0]->WarpSize(); blockH = 4;//A height of 4 seems to run the fastest. gridW = m_SpatialFilterCL.m_FinalRasW; gridH = m_SpatialFilterCL.m_FinalRasH; OpenCLWrapper::MakeEvenGridDims(blockW, blockH, gridW, gridH); - if (b && !(b = m_Wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); }//Accumulator. - if (b && !(b = m_Wrapper.SetImageArg (accumKernelIndex, argIndex++, m_Wrapper.Shared(), m_FinalImageName))) { this->m_ErrorReport.push_back(loc); }//Final image. - if (b && !(b = m_Wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_SpatialFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); }//SpatialFilterCL. - if (b && !(b = m_Wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_SpatialFilterCoefsBufferName))) { this->m_ErrorReport.push_back(loc); }//Filter coefs. - if (b && !(b = m_Wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_CurvesCsaName))) { this->m_ErrorReport.push_back(loc); }//Curve points. 
+ if (b && !(b = wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); }//Accumulator. + if (b && !(b = wrapper.SetImageArg(accumKernelIndex, argIndex++, wrapper.Shared(), m_FinalImageName))) { this->m_ErrorReport.push_back(loc); }//Final image. + if (b && !(b = wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_SpatialFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); }//SpatialFilterCL. + if (b && !(b = wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_SpatialFilterCoefsBufferName))) { this->m_ErrorReport.push_back(loc); }//Filter coefs. + if (b && !(b = wrapper.SetBufferArg(accumKernelIndex, argIndex++, m_CurvesCsaName))) { this->m_ErrorReport.push_back(loc); }//Curve points. - if (b && !(b = m_Wrapper.SetArg (accumKernelIndex, argIndex++, curvesSet))) { this->m_ErrorReport.push_back(loc); }//Do curves. - if (b && !(b = m_Wrapper.SetArg (accumKernelIndex, argIndex++, bucketT(alphaBase)))) { this->m_ErrorReport.push_back(loc); }//Alpha base. - if (b && !(b = m_Wrapper.SetArg (accumKernelIndex, argIndex++, bucketT(alphaScale)))) { this->m_ErrorReport.push_back(loc); }//Alpha scale. + if (b && !(b = wrapper.SetArg (accumKernelIndex, argIndex++, curvesSet))) { this->m_ErrorReport.push_back(loc); }//Do curves. + if (b && !(b = wrapper.SetArg (accumKernelIndex, argIndex++, bucketT(alphaBase)))) { this->m_ErrorReport.push_back(loc); }//Alpha base. + if (b && !(b = wrapper.SetArg (accumKernelIndex, argIndex++, bucketT(alphaScale)))) { this->m_ErrorReport.push_back(loc); }//Alpha scale. 
- if (b && m_Wrapper.Shared()) - if (b && !(b = m_Wrapper.EnqueueAcquireGLObjects(m_FinalImageName))) { this->m_ErrorReport.push_back(loc); } + if (b && wrapper.Shared()) + if (b && !(b = wrapper.EnqueueAcquireGLObjects(m_FinalImageName))) { this->m_ErrorReport.push_back(loc); } - if (b && !(b = m_Wrapper.RunKernel(accumKernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } + if (b && !(b = wrapper.RunKernel(accumKernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } - if (b && m_Wrapper.Shared()) - if (b && !(b = m_Wrapper.EnqueueReleaseGLObjects(m_FinalImageName))) { this->m_ErrorReport.push_back(loc); } + if (b && wrapper.Shared()) + if (b && !(b = wrapper.EnqueueReleaseGLObjects(m_FinalImageName))) { this->m_ErrorReport.push_back(loc); } //t.Toc((char*)loc); } @@ -1174,39 +1342,45 @@ eRenderStatus RendererCL::RunFinalAccum() } /// -/// Zeroize a buffer of the specified size. +/// Zeroize a buffer of the specified size on the specified device. /// +/// The index of the device on which to clear the buffer /// Name of the buffer to clear /// Width in elements /// Height in elements /// Size of each element /// True if success, else false. template -bool RendererCL::ClearBuffer(const string& bufferName, uint width, uint height, uint elementSize) +bool RendererCL::ClearBuffer(size_t device, const string& bufferName, uint width, uint height, uint elementSize) { - bool b = true; - int kernelIndex = m_Wrapper.FindKernelIndex(m_IterOpenCLKernelCreator.ZeroizeEntryPoint()); - uint argIndex = 0; - const char* loc = __FUNCTION__; + bool b = false; - if (kernelIndex != -1) + if (device < m_Devices.size()) { - uint blockW = m_NVidia ? 32 : 16;//Max work group size is 256 on AMD, which means 16x16. - uint blockH = m_NVidia ? 
32 : 16; - uint gridW = width * elementSize; - uint gridH = height; + auto& wrapper = m_Devices[device]->m_Wrapper; + int kernelIndex = wrapper.FindKernelIndex(m_IterOpenCLKernelCreator.ZeroizeEntryPoint()); + cl_uint argIndex = 0; + const char* loc = __FUNCTION__; - OpenCLWrapper::MakeEvenGridDims(blockW, blockH, gridW, gridH); + if (kernelIndex != -1) + { + size_t blockW = m_Devices[device]->Nvidia() ? 32 : 16;//Max work group size is 256 on AMD, which means 16x16. + size_t blockH = m_Devices[device]->Nvidia() ? 32 : 16; + size_t gridW = width * elementSize; + size_t gridH = height; - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex++, bufferName))) { this->m_ErrorReport.push_back(loc); }//Buffer of byte. - if (b && !(b = m_Wrapper.SetArg (kernelIndex, argIndex++, width * elementSize))) { this->m_ErrorReport.push_back(loc); }//Width. - if (b && !(b = m_Wrapper.SetArg (kernelIndex, argIndex++, height))) { this->m_ErrorReport.push_back(loc); }//Height. - if (b && !(b = m_Wrapper.RunKernel(kernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } - } - else - { - b = false; - this->m_ErrorReport.push_back(loc); + b = true; + OpenCLWrapper::MakeEvenGridDims(blockW, blockH, gridW, gridH); + + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, bufferName))) { this->m_ErrorReport.push_back(loc); }//Buffer of byte. + if (b && !(b = wrapper.SetArg(kernelIndex, argIndex++, width * elementSize))) { this->m_ErrorReport.push_back(loc); }//Width. + if (b && !(b = wrapper.SetArg(kernelIndex, argIndex++, height))) { this->m_ErrorReport.push_back(loc); }//Height. + if (b && !(b = wrapper.RunKernel(kernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); } + } + else + { + this->m_ErrorReport.push_back(loc); + } } return b; @@ -1227,34 +1401,41 @@ bool RendererCL::ClearBuffer(const string& bufferName, uint width, u /// Column parity /// True if success, else false. 
template -bool RendererCL::RunDensityFilterPrivate(uint kernelIndex, uint gridW, uint gridH, uint blockW, uint blockH, uint chunkSizeW, uint chunkSizeH, uint chunkW, uint chunkH) +bool RendererCL::RunDensityFilterPrivate(size_t kernelIndex, size_t gridW, size_t gridH, size_t blockW, size_t blockH, uint chunkSizeW, uint chunkSizeH, uint chunkW, uint chunkH) { //Timing t(4); bool b = true; - uint argIndex = 0; + cl_uint argIndex = 0; const char* loc = __FUNCTION__; - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex, m_HistBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Histogram. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Accumulator. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex, m_DEFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//FlameDensityFilterCL. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex, m_DECoefsBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Coefs. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex, m_DEWidthsBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Widths. - if (b && !(b = m_Wrapper.SetBufferArg(kernelIndex, argIndex, m_DECoefIndicesBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Coef indices. - if (b && !(b = m_Wrapper.SetArg( kernelIndex, argIndex, chunkSizeW))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Chunk size width (gapW + 1). - if (b && !(b = m_Wrapper.SetArg( kernelIndex, argIndex, chunkSizeH))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Chunk size height (gapH + 1). - if (b && !(b = m_Wrapper.SetArg( kernelIndex, argIndex, chunkW))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Column chunk. - if (b && !(b = m_Wrapper.SetArg( kernelIndex, argIndex, chunkH))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Row chunk. 
- //t.Toc(__FUNCTION__ " set args"); + if (!m_Devices.empty()) + { + auto& wrapper = m_Devices[0]->m_Wrapper; - //t.Tic(); - if (b && !(b = m_Wrapper.RunKernel(kernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); }//Method 7, accumulating to temp box area. - //t.Toc(__FUNCTION__ " RunKernel()"); + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex, m_HistBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Histogram. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex, m_AccumBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Accumulator. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex, m_DEFilterParamsBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//FlameDensityFilterCL. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex, m_DECoefsBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Coefs. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex, m_DEWidthsBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Widths. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex, m_DECoefIndicesBufferName))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Coef indices. + if (b && !(b = wrapper.SetArg(kernelIndex, argIndex, chunkSizeW))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Chunk size width (gapW + 1). + if (b && !(b = wrapper.SetArg(kernelIndex, argIndex, chunkSizeH))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Chunk size height (gapH + 1). + if (b && !(b = wrapper.SetArg(kernelIndex, argIndex, chunkW))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Column chunk. + if (b && !(b = wrapper.SetArg(kernelIndex, argIndex, chunkH))) { this->m_ErrorReport.push_back(loc); } argIndex++;//Row chunk. 
+ //t.Toc(__FUNCTION__ " set args"); - return b; + //t.Tic(); + if (b && !(b = wrapper.RunKernel(kernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { this->m_ErrorReport.push_back(loc); }//Method 7, accumulating to temp box area. + //t.Toc(__FUNCTION__ " RunKernel()"); + + return b; + } + + return false; } /// -/// Make the Gaussian density filter program and return its index. +/// Make the Gaussian density filter program on the primary device and return its index. /// /// The supersample being used for the current ember /// Width of the gaussian filter @@ -1262,22 +1443,22 @@ bool RendererCL::RunDensityFilterPrivate(uint kernelIndex, uint grid template int RendererCL::MakeAndGetDensityFilterProgram(size_t ss, uint filterWidth) { - auto& deEntryPoint = m_DEOpenCLKernelCreator.GaussianDEEntryPoint(ss, filterWidth); - int kernelIndex = m_Wrapper.FindKernelIndex(deEntryPoint); - const char* loc = __FUNCTION__; + int kernelIndex = -1; - if (kernelIndex == -1)//Has not been built yet. + if (!m_Devices.empty()) { - auto& kernel = m_DEOpenCLKernelCreator.GaussianDEKernel(ss, filterWidth); - bool b = m_Wrapper.AddProgram(deEntryPoint, kernel, deEntryPoint, m_DoublePrecision); + auto& wrapper = m_Devices[0]->m_Wrapper; + auto& deEntryPoint = m_DEOpenCLKernelCreator.GaussianDEEntryPoint(ss, filterWidth); + const char* loc = __FUNCTION__; - if (b) + if ((kernelIndex = wrapper.FindKernelIndex(deEntryPoint)) == -1)//Has not been built yet. { - kernelIndex = m_Wrapper.FindKernelIndex(deEntryPoint);//Try to find it again, it will be present if successfully built. - } - else - { - this->m_ErrorReport.push_back(string(loc) + "():\nBuilding the following program failed: \n" + kernel + "\n"); + auto& kernel = m_DEOpenCLKernelCreator.GaussianDEKernel(ss, filterWidth); + + if (wrapper.AddProgram(deEntryPoint, kernel, deEntryPoint, m_DoublePrecision)) + kernelIndex = wrapper.FindKernelIndex(deEntryPoint);//Try to find it again, it will be present if successfully built. 
+ else + this->m_ErrorReport.push_back(string(loc) + "():\nBuilding the following program failed: \n" + kernel + "\n"); } } @@ -1285,7 +1466,7 @@ int RendererCL::MakeAndGetDensityFilterProgram(size_t ss, uint filte } /// -/// Make the final accumulation program and return its index. +/// Make the final accumulation program on the primary device and return its index. /// There are many different kernels for final accum, depending on early clip, alpha channel, and transparency. /// Loading all of these in the beginning is too much, so only load the one for the current case being worked with. /// @@ -1295,47 +1476,122 @@ int RendererCL::MakeAndGetDensityFilterProgram(size_t ss, uint filte template int RendererCL::MakeAndGetFinalAccumProgram(double& alphaBase, double& alphaScale) { - auto& finalAccumEntryPoint = m_FinalAccumOpenCLKernelCreator.FinalAccumEntryPoint(EarlyClip(), Renderer::NumChannels(), Transparency(), alphaBase, alphaScale); - int kernelIndex = m_Wrapper.FindKernelIndex(finalAccumEntryPoint); - const char* loc = __FUNCTION__; + int kernelIndex = -1; - if (kernelIndex == -1)//Has not been built yet. + if (!m_Devices.empty()) { - auto& kernel = m_FinalAccumOpenCLKernelCreator.FinalAccumKernel(EarlyClip(), Renderer::NumChannels(), Transparency()); - bool b = m_Wrapper.AddProgram(finalAccumEntryPoint, kernel, finalAccumEntryPoint, m_DoublePrecision); + auto& wrapper = m_Devices[0]->m_Wrapper; + auto& finalAccumEntryPoint = m_FinalAccumOpenCLKernelCreator.FinalAccumEntryPoint(EarlyClip(), Renderer::NumChannels(), Transparency(), alphaBase, alphaScale); + const char* loc = __FUNCTION__; - if (b) - kernelIndex = m_Wrapper.FindKernelIndex(finalAccumEntryPoint);//Try to find it again, it will be present if successfully built. - else - this->m_ErrorReport.push_back(loc); + if ((kernelIndex = wrapper.FindKernelIndex(finalAccumEntryPoint)) == -1)//Has not been built yet. 
+ { + auto& kernel = m_FinalAccumOpenCLKernelCreator.FinalAccumKernel(EarlyClip(), Renderer::NumChannels(), Transparency()); + if (wrapper.AddProgram(finalAccumEntryPoint, kernel, finalAccumEntryPoint, m_DoublePrecision)) + kernelIndex = wrapper.FindKernelIndex(finalAccumEntryPoint);//Try to find it again, it will be present if successfully built. + else + this->m_ErrorReport.push_back(loc); + } } return kernelIndex; } /// -/// Make the gamma correction program for early clipping and return its index. +/// Make the gamma correction program on the primary device for early clipping and return its index. /// /// The kernel index if successful, else -1. template int RendererCL::MakeAndGetGammaCorrectionProgram() { - auto& gammaEntryPoint = m_FinalAccumOpenCLKernelCreator.GammaCorrectionEntryPoint(Renderer::NumChannels(), Transparency()); - int kernelIndex = m_Wrapper.FindKernelIndex(gammaEntryPoint); - const char* loc = __FUNCTION__; - - if (kernelIndex == -1)//Has not been built yet. + if (!m_Devices.empty()) { - auto& kernel = m_FinalAccumOpenCLKernelCreator.GammaCorrectionKernel(Renderer::NumChannels(), Transparency()); - bool b = m_Wrapper.AddProgram(gammaEntryPoint, kernel, gammaEntryPoint, m_DoublePrecision); + auto& wrapper = m_Devices[0]->m_Wrapper; + auto& gammaEntryPoint = m_FinalAccumOpenCLKernelCreator.GammaCorrectionEntryPoint(Renderer::NumChannels(), Transparency()); + int kernelIndex = wrapper.FindKernelIndex(gammaEntryPoint); + const char* loc = __FUNCTION__; - if (b) - kernelIndex = m_Wrapper.FindKernelIndex(gammaEntryPoint);//Try to find it again, it will be present if successfully built. - else - this->m_ErrorReport.push_back(loc); + if (kernelIndex == -1)//Has not been built yet. 
+ { + auto& kernel = m_FinalAccumOpenCLKernelCreator.GammaCorrectionKernel(Renderer::NumChannels(), Transparency()); + bool b = wrapper.AddProgram(gammaEntryPoint, kernel, gammaEntryPoint, m_DoublePrecision); + + if (b) + kernelIndex = wrapper.FindKernelIndex(gammaEntryPoint);//Try to find it again, it will be present if successfully built. + else + this->m_ErrorReport.push_back(loc); + } + + return kernelIndex; } - return kernelIndex; + return -1; +} + +/// +/// Sum all histograms from the secondary devices with the histogram on the primary device. +/// +/// True if success, else false. +template +bool RendererCL::SumDeviceHist() +{ + if (m_Devices.size() > 1) + { + //Timing t; + bool b = true; + auto& wrapper = m_Devices[0]->m_Wrapper; + const char* loc = __FUNCTION__; + size_t blockW = m_Devices[0]->Nvidia() ? 32 : 16;//Max work group size is 256 on AMD, which means 16x16. + size_t blockH = m_Devices[0]->Nvidia() ? 32 : 16; + size_t gridW = SuperRasW(); + size_t gridH = SuperRasH(); + OpenCLWrapper::MakeEvenGridDims(blockW, blockH, gridW, gridH); + int kernelIndex = wrapper.FindKernelIndex(m_IterOpenCLKernelCreator.SumHistEntryPoint()); + + if ((b = (kernelIndex != -1))) + { + for (size_t device = 1; device < m_Devices.size(); device++) + { + if ((b = (ReadHist(device) && ClearHist(device))))//Must clear hist on secondary devices after reading and summing because they'll be reused on a quality increase (KEEP_ITERATING). + { + if ((b = wrapper.WriteBuffer(m_AccumBufferName, reinterpret_cast(HistBuckets()), SuperSize() * sizeof(v4bT)))) + { + cl_uint argIndex = 0; + + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_AccumBufferName))) { break; }//Source buffer of v4bT. + if (b && !(b = wrapper.SetBufferArg(kernelIndex, argIndex++, m_HistBufferName))) { break; }//Dest buffer of v4bT. + if (b && !(b = wrapper.SetArg (kernelIndex, argIndex++, uint(SuperRasW())))) { break; }//Width in pixels. 
+ if (b && !(b = wrapper.SetArg (kernelIndex, argIndex++, uint(SuperRasH())))) { break; }//Height in pixels. + if (b && !(b = wrapper.SetArg (kernelIndex, argIndex++, (device == m_Devices.size() - 1) ? 1 : 0))) { break; }//Clear the source buffer on the last device. + if (b && !(b = wrapper.RunKernel (kernelIndex, gridW, gridH, 1, blockW, blockH, 1))) { break; } + } + else + { + break; + } + } + else + { + break; + } + } + } + + if (!b) + { + ostringstream os; + + os << loc << ": failed to sum histograms from the secondary device(s) to the primary device."; + this->m_ErrorReport.push_back(os.str()); + } + + //t.Toc(loc); + return b; + } + else + { + return m_Devices.size() == 1; + } } /// @@ -1343,20 +1599,22 @@ int RendererCL::MakeAndGetGammaCorrectionProgram() /// /// -/// Convert the currently used host side DensityFilter object into a DensityFilterCL object +/// Convert the currently used host side DensityFilter object into the DensityFilterCL member /// for passing to OpenCL. +/// Some of the values are not populated when the filter object is null. This will be the case +/// when only log scaling is needed. 
/// -/// The DensityFilterCL object template void RendererCL::ConvertDensityFilter() { + m_DensityFilterCL.m_Supersample = uint(Supersample()); + m_DensityFilterCL.m_SuperRasW = uint(SuperRasW()); + m_DensityFilterCL.m_SuperRasH = uint(SuperRasH()); + m_DensityFilterCL.m_K1 = K1(); + m_DensityFilterCL.m_K2 = K2(); + if (m_DensityFilter.get()) { - m_DensityFilterCL.m_Supersample = uint(Supersample()); - m_DensityFilterCL.m_SuperRasW = uint(SuperRasW()); - m_DensityFilterCL.m_SuperRasH = uint(SuperRasH()); - m_DensityFilterCL.m_K1 = K1(); - m_DensityFilterCL.m_K2 = K2(); m_DensityFilterCL.m_Curve = m_DensityFilter->Curve(); m_DensityFilterCL.m_KernelSize = uint(m_DensityFilter->KernelSize()); m_DensityFilterCL.m_MaxFilterIndex = uint(m_DensityFilter->MaxFilterIndex()); @@ -1366,10 +1624,9 @@ void RendererCL::ConvertDensityFilter() } /// -/// Convert the currently used host side SpatialFilter object into a SpatialFilterCL object +/// Convert the currently used host side SpatialFilter object into the SpatialFilterCL member /// for passing to OpenCL. /// -/// The SpatialFilterCL object template void RendererCL::ConvertSpatialFilter() { @@ -1425,7 +1682,7 @@ void RendererCL::ConvertEmber(Ember& ember, EmberCL& emberCL, emberCL.m_CamDepthBlur = ember.m_CamDepthBlur; emberCL.m_BlurCoef = ember.BlurCoef(); - for (uint i = 0; i < ember.TotalXformCount() && i < xformsCL.size(); i++) + for (size_t i = 0; i < ember.TotalXformCount() && i < xformsCL.size(); i++) { Xform* xform = ember.GetTotalXform(i); @@ -1449,17 +1706,16 @@ void RendererCL::ConvertEmber(Ember& ember, EmberCL& emberCL, xformsCL[i].m_Opacity = xform->m_Opacity; xformsCL[i].m_VizAdjusted = xform->VizAdjusted(); - for (uint varIndex = 0; varIndex < xform->TotalVariationCount() && varIndex < MAX_CL_VARS; varIndex++)//Assign all variation weights for this xform, with a max of MAX_CL_VARS. 
+ for (size_t varIndex = 0; varIndex < xform->TotalVariationCount() && varIndex < MAX_CL_VARS; varIndex++)//Assign all variation weights for this xform, with a max of MAX_CL_VARS. xformsCL[i].m_VariationWeights[varIndex] = xform->GetVariation(varIndex)->m_Weight; } } /// -/// Convert the host side CarToRas object into a CarToRasCL object +/// Convert the host side CarToRas object into the CarToRasCL member /// for passing to OpenCL. /// /// The CarToRas object to convert -/// The CarToRasCL object template void RendererCL::ConvertCarToRas(const CarToRas& carToRas) { @@ -1475,7 +1731,8 @@ void RendererCL::ConvertCarToRas(const CarToRas& carToRas) } /// -/// Fill seeds buffer which gets passed to the iteration kernel. +/// Fill a seeds buffer for all devices, each of which gets passed to its +/// respective device when launching the iteration kernel. /// The range of each seed will be spaced to ensure no duplicates are added. /// Note, WriteBuffer() must be called after this to actually copy the /// data from the host to the device. 
@@ -1483,16 +1740,24 @@ void RendererCL::ConvertCarToRas(const CarToRas& carToRas) template void RendererCL::FillSeeds() { - double start, delta = std::floor((double)std::numeric_limits::max() / (IterGridKernelCount() * 2)); - m_Seeds.resize(IterGridKernelCount()); - start = delta; - - for (auto& seed : m_Seeds) + if (!m_Devices.empty()) { - seed.x = (uint)m_Rand[0].template Frand(start, start + delta); - start += delta; - seed.y = (uint)m_Rand[0].template Frand(start, start + delta); - start += delta; + double start, delta = std::floor(double(std::numeric_limits::max()) / (IterGridKernelCount() * 2 * m_Devices.size())); + m_Seeds.resize(m_Devices.size()); + start = delta; + + for (size_t device = 0; device < m_Devices.size(); device++) + { + m_Seeds[device].resize(IterGridKernelCount()); + + for (auto& seed : m_Seeds[device]) + { + seed.x = uint(m_Rand[0].template Frand(start, start + delta)); + start += delta; + seed.y = uint(m_Rand[0].template Frand(start, start + delta)); + start += delta; + } + } } } diff --git a/Source/EmberCL/RendererCL.h b/Source/EmberCL/RendererCL.h index bf714ba..507bb4c 100644 --- a/Source/EmberCL/RendererCL.h +++ b/Source/EmberCL/RendererCL.h @@ -2,9 +2,9 @@ #include "EmberCLPch.h" #include "OpenCLWrapper.h" -#include "IterOpenCLKernelCreator.h" #include "DEOpenCLKernelCreator.h" #include "FinalAccumOpenCLKernelCreator.h" +#include "RendererClDevice.h" /// /// RendererCLBase and RendererCL classes. @@ -26,12 +26,17 @@ public: /// /// RendererCL is a derivation of the basic CPU renderer which /// overrides various functions to render on the GPU using OpenCL. +/// This supports multi-GPU rendering and is done in the following manner: +/// -When rendering a single image, the iterations will be split between devices in sub batches. +/// -When animating, a renderer for each device will be created by the calling code, +/// and the frames will each be rendered by a single device as available. 
+/// The synchronization across devices is done through a single atomic counter. /// Since this class derives from EmberReport and also contains an /// OpenCLWrapper member which also derives from EmberReport, the /// reporting functions are overridden to aggregate the errors from /// both sources. -/// It does not support different types for T and bucketT, so it only has one template argument -/// and uses both for the base. +/// Template argument T expected to be float or double. +/// Template argument bucketT must always be float. /// template class EMBERCL_API RendererCL : public Renderer, public RendererCLBase @@ -65,6 +70,8 @@ using EmberNs::Renderer::RendererBase::m_RenderTimer; using EmberNs::Renderer::RendererBase::m_IterTimer; using EmberNs::Renderer::RendererBase::m_ProgressTimer; using EmberNs::Renderer::RendererBase::EmberReport::m_ErrorReport; +using EmberNs::Renderer::RendererBase::m_ResizeCs; +using EmberNs::Renderer::RendererBase::m_ProcessAction; using EmberNs::Renderer::m_RotMat; using EmberNs::Renderer::m_Ember; using EmberNs::Renderer::m_Csa; @@ -82,45 +89,45 @@ using EmberNs::Renderer::GetSpatialFilter; using EmberNs::Renderer::CoordMap; using EmberNs::Renderer::XformDistributions; using EmberNs::Renderer::XformDistributionsSize; +using EmberNs::Renderer::m_Dmap; using EmberNs::Renderer::m_DensityFilter; using EmberNs::Renderer::m_SpatialFilter; public: - RendererCL(uint platform = 0, uint device = 0, bool shared = false, GLuint outputTexID = 0); + RendererCL(const vector>& devices, bool shared = false, GLuint outputTexID = 0); ~RendererCL(); //Non-virtual member functions for OpenCL specific tasks. - bool Init(uint platform, uint device, bool shared, GLuint outputTexID); + bool Init(const vector>& devices, bool shared, GLuint outputTexID); bool SetOutputTexture(GLuint outputTexID); //Iters per kernel/block/grid. 
- inline uint IterCountPerKernel() const; - inline uint IterCountPerBlock() const; - inline uint IterCountPerGrid() const; + inline size_t IterCountPerKernel() const; + inline size_t IterCountPerBlock() const; + inline size_t IterCountPerGrid() const; //Kernels per block. - inline uint IterBlockKernelWidth() const; - inline uint IterBlockKernelHeight() const; - inline uint IterBlockKernelCount() const; + inline size_t IterBlockKernelWidth() const; + inline size_t IterBlockKernelHeight() const; + inline size_t IterBlockKernelCount() const; //Kernels per grid. - inline uint IterGridKernelWidth() const; - inline uint IterGridKernelHeight() const; - inline uint IterGridKernelCount() const; + inline size_t IterGridKernelWidth() const; + inline size_t IterGridKernelHeight() const; + inline size_t IterGridKernelCount() const; //Blocks per grid. - inline uint IterGridBlockWidth() const; - inline uint IterGridBlockHeight() const; - inline uint IterGridBlockCount() const; + inline size_t IterGridBlockWidth() const; + inline size_t IterGridBlockHeight() const; + inline size_t IterGridBlockCount() const; - uint PlatformIndex(); - uint DeviceIndex(); - bool ReadHist(); + bool ReadHist(size_t device); bool ReadAccum(); - bool ReadPoints(vector>& vec); + bool ReadPoints(size_t device, vector>& vec); bool ClearHist(); + bool ClearHist(size_t device); bool ClearAccum(); - bool WritePoints(vector>& vec); + bool WritePoints(size_t device, vector>& vec); #ifdef TEST_CL bool WriteRandomPoints(); #endif @@ -136,7 +143,6 @@ public: virtual size_t MemoryAvailable() override; virtual bool Ok() const override; virtual void NumChannels(size_t numChannels) override; - virtual void DumpErrorReport() override; virtual void ClearErrorReport() override; virtual size_t SubBatchSize() const override; virtual size_t ThreadCount() const override; @@ -151,8 +157,7 @@ public: protected: #endif //Protected virtual functions overridden from Renderer. 
- virtual void MakeDmap(T colorScalar) override; - virtual bool Alloc() override; + virtual bool Alloc(bool histOnly = false) override; virtual bool ResetBuckets(bool resetHist = true, bool resetAccum = true) override; virtual eRenderStatus LogScaleDensityFilter() override; virtual eRenderStatus GaussianDensityFilter() override; @@ -162,17 +167,19 @@ protected: #ifndef TEST_CL private: #endif + void Init(); //Private functions for making and running OpenCL programs. bool BuildIterProgramForEmber(bool doAccum = true); bool RunIter(size_t iterCount, size_t temporalSample, size_t& itersRan); eRenderStatus RunLogScaleFilter(); eRenderStatus RunDensityFilter(); eRenderStatus RunFinalAccum(); - bool ClearBuffer(const string& bufferName, uint width, uint height, uint elementSize); - bool RunDensityFilterPrivate(uint kernelIndex, uint gridW, uint gridH, uint blockW, uint blockH, uint chunkSizeW, uint chunkSizeH, uint chunkW, uint chunkH); + bool ClearBuffer(size_t device, const string& bufferName, uint width, uint height, uint elementSize); + bool RunDensityFilterPrivate(size_t kernelIndex, size_t gridW, size_t gridH, size_t blockW, size_t blockH, uint chunkSizeW, uint chunkSizeH, uint chunkW, uint chunkH); int MakeAndGetDensityFilterProgram(size_t ss, uint filterWidth); int MakeAndGetFinalAccumProgram(double& alphaBase, double& alphaScale); int MakeAndGetGammaCorrectionProgram(); + bool SumDeviceHist(); void FillSeeds(); //Private functions passing data to OpenCL programs. 
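The SumDeviceHist() declaration above exists because each device accumulates its own histogram when a single image is split across devices, and those histograms have to be combined before filtering. A host-only sketch of that combination step follows. It is an illustration under assumptions, not the patch's implementation: the commit does the summing with an OpenCL kernel, and `Bucket` and `SumHistograms` are names invented here.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Assumed bucket layout: four float channels per histogram entry.
using Bucket = std::array<float, 4>;

// Add the histogram of every secondary device into the first one (index 0).
// Returns false, leaving the data untouched, if there are no histograms or their sizes differ.
bool SumHistograms(std::vector<std::vector<Bucket>>& hists)
{
    if (hists.empty())
        return false;

    for (const auto& hist : hists)
        if (hist.size() != hists[0].size())
            return false;

    auto& primary = hists[0];

    for (std::size_t device = 1; device < hists.size(); device++)
        for (std::size_t i = 0; i < primary.size(); i++)
            for (std::size_t c = 0; c < primary[i].size(); c++)
                primary[i][c] += hists[device][i][c];

    return true;
}
```

Summing into the first histogram mirrors the idea of a primary device that runs the later filtering stages on the combined result.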
@@ -182,15 +189,12 @@ private: void ConvertCarToRas(const CarToRas& carToRas); bool m_Init; - bool m_NVidia; bool m_DoublePrecision; - uint m_IterCountPerKernel; - uint m_IterBlocksWide, m_IterBlockWidth; - uint m_IterBlocksHigh, m_IterBlockHeight; - uint m_MaxDEBlockSizeW; - uint m_MaxDEBlockSizeH; - uint m_WarpSize; - size_t m_Calls; + size_t m_IterCountPerKernel; + size_t m_IterBlocksWide, m_IterBlockWidth; + size_t m_IterBlocksHigh, m_IterBlockHeight; + size_t m_MaxDEBlockSizeW; + size_t m_MaxDEBlockSizeH; //Buffer names. string m_EmberBufferName; @@ -214,7 +218,6 @@ private: //Kernels. string m_IterKernel; - OpenCLWrapper m_Wrapper; cl::ImageFormat m_PaletteFormat; cl::ImageFormat m_FinalFormat; cl::Image2D m_Palette; @@ -222,8 +225,7 @@ private: GLuint m_OutputTexID; EmberCL m_EmberCL; vector> m_XformsCL; - vector m_Seeds; - Palette m_DmapCL;//Used instead of the base class' m_Dmap because OpenCL only supports float textures. Likely not needed if we switch to float only hist. + vector> m_Seeds; CarToRasCL m_CarToRasCL; DensityFilterCL m_DensityFilterCL; SpatialFilterCL m_SpatialFilterCL; @@ -231,6 +233,7 @@ private: DEOpenCLKernelCreator m_DEOpenCLKernelCreator; FinalAccumOpenCLKernelCreator m_FinalAccumOpenCLKernelCreator; pair> m_Params; + vector> m_Devices; Ember m_LastBuiltEmber; }; } diff --git a/Source/EmberCL/RendererClDevice.cpp b/Source/EmberCL/RendererClDevice.cpp new file mode 100644 index 0000000..b3f860b --- /dev/null +++ b/Source/EmberCL/RendererClDevice.cpp @@ -0,0 +1,60 @@ +#include "EmberCLPch.h" +#include "RendererClDevice.h" + +namespace EmberCLns +{ +/// +/// Constructor that assigns members. +/// The object is not fully initialized at this point, the caller +/// must manually call Init(). +/// +/// The index of the platform to use +/// The index of the device to use +/// True if shared with OpenGL, else false. +/// True if success, else false. 
+RendererClDevice::RendererClDevice(bool doublePrec, size_t platform, size_t device, bool shared) + : m_Info(OpenCLInfo::Instance()) +{ + m_Init = false; + m_Shared = shared; + m_NVidia = false; + m_WarpSize = 0; + m_Calls = 0; + m_PlatformIndex = platform; + m_DeviceIndex = device; +} + +/// +/// Initialization of the OpenCLWrapper member. +/// +/// True if success, else false. +bool RendererClDevice::Init() +{ + bool b = true; + + if (!m_Wrapper.Ok()) + { + m_Init = false; + b = m_Wrapper.Init(m_PlatformIndex, m_DeviceIndex, m_Shared); + } + + if (b && m_Wrapper.Ok() && !m_Init) + { + m_NVidia = ToLower(m_Info.PlatformName(m_PlatformIndex)).find("nvidia") != string::npos && m_Wrapper.LocalMemSize() > (32 * 1024); + m_WarpSize = m_NVidia ? 32 : 64; + m_Init = true; + } + + return b; +} + +/// +/// OpenCL property accessors, getters only. +/// +bool RendererClDevice::Ok() const { return m_Init; } +bool RendererClDevice::Shared() const { return m_Shared; } +bool RendererClDevice::Nvidia() const { return m_NVidia; } +size_t RendererClDevice::WarpSize() const { return m_WarpSize; } +size_t RendererClDevice::PlatformIndex() const { return m_PlatformIndex; } +size_t RendererClDevice::DeviceIndex() const { return m_DeviceIndex; } +} diff --git a/Source/EmberCL/RendererClDevice.h b/Source/EmberCL/RendererClDevice.h new file mode 100644 index 0000000..bc22104 --- /dev/null +++ b/Source/EmberCL/RendererClDevice.h @@ -0,0 +1,42 @@ +#pragma once + +#include "EmberCLPch.h" +#include "OpenCLWrapper.h" +#include "IterOpenCLKernelCreator.h" + +/// +/// RendererClDevice class. +/// + +namespace EmberCLns +{ +/// +/// Class to manage a device that does the iteration portion of +/// the rendering process. Having a separate class for this purpose +/// enables multi-GPU support. 
+/// +class EMBERCL_API RendererClDevice : public EmberReport +{ +public: + RendererClDevice(bool doublePrec, size_t platform, size_t device, bool shared); + bool Init(); + bool Ok() const; + bool Shared() const; + bool Nvidia() const; + size_t WarpSize() const; + size_t PlatformIndex() const; + size_t DeviceIndex() const; + + size_t m_Calls; + OpenCLWrapper m_Wrapper; + +private: + bool m_Init; + bool m_Shared; + bool m_NVidia; + size_t m_WarpSize; + size_t m_PlatformIndex; + size_t m_DeviceIndex; + OpenCLInfo& m_Info; +}; +} diff --git a/Source/EmberCommon/EmberCommon.h b/Source/EmberCommon/EmberCommon.h index b8dd006..0f641c7 100644 --- a/Source/EmberCommon/EmberCommon.h +++ b/Source/EmberCommon/EmberCommon.h @@ -84,7 +84,7 @@ private: /// True to use defaults if they are not present in the file, else false to use invalid values as placeholders to indicate the values were not present. Default: true. /// True if success, else false. template -static bool ParseEmberFile(XmlToEmber& parser, string filename, vector>& embers, bool useDefaults = true) +static bool ParseEmberFile(XmlToEmber& parser, const string& filename, vector>& embers, bool useDefaults = true) { if (!parser.Parse(filename.c_str(), embers, useDefaults)) { @@ -138,7 +138,7 @@ static void RgbaToRgb(vector& rgba, vector& rgb, size_t width, size_ if (rgba.data() != rgb.data())//Only resize the destination buffer if they are different. rgb.resize(width * height * 3); - for (uint i = 0, j = 0; i < (width * height * 4); i += 4, j += 3) + for (size_t i = 0, j = 0; i < (width * height * 4); i += 4, j += 3) { rgb[j] = rgba[i]; rgb[j + 1] = rgba[i + 1]; @@ -231,35 +231,52 @@ static T NextLowestEvenDiv(T numerator, T denominator) return result; } +/// +/// Wrapper for converting a vector of absolute device indices to a vector +/// of platform,device index pairs. 
+/// +/// The vector of absolute device indices to convert +/// The converted vector of platform,device index pairs +static vector> Devices(const vector& selectedDevices) +{ + vector> vec; + OpenCLInfo& info = OpenCLInfo::Instance(); + auto& devices = info.DeviceIndices(); + + vec.reserve(selectedDevices.size()); + + for (size_t i = 0; i < selectedDevices.size(); i++) + { + auto index = selectedDevices[i]; + + if (index < devices.size()) + vec.push_back(devices[index]); + } + + return vec; +} + /// /// Wrapper for creating a renderer of the specified type. -/// First template argument expected to be float or double for CPU renderer, -/// Second argument expected to be float or double for CPU renderer, and only float for OpenCL renderer. /// /// Type of renderer to create -/// The index platform of the platform to use -/// The index device of the device to use +/// The vector of platform/device indices to use /// True if shared with OpenGL, else false. /// The texture ID of the shared OpenGL texture if shared /// The error report for holding errors if anything goes wrong /// A pointer to the created renderer if successful, else nullptr. 
-template -static Renderer* CreateRenderer(eRendererType renderType, uint platform, uint device, bool shared, GLuint texId, EmberReport& errorReport) +template +static Renderer* CreateRenderer(eRendererType renderType, const vector>& devices, bool shared, GLuint texId, EmberReport& errorReport) { string s; - unique_ptr> renderer; + unique_ptr> renderer; try { - if (renderType == CPU_RENDERER) - { - s = "CPU"; - renderer = unique_ptr>(new Renderer()); - } - else if (renderType == OPENCL_RENDERER) + if (renderType == OPENCL_RENDERER && !devices.empty()) { s = "OpenCL"; - renderer = unique_ptr>(new RendererCL(platform, device, shared, texId)); + renderer = unique_ptr>(new RendererCL(devices, shared, texId)); if (!renderer.get() || !renderer->Ok()) { @@ -267,9 +284,18 @@ static Renderer* CreateRenderer(eRendererType renderType, uint platf errorReport.AddToReport(renderer->ErrorReport()); errorReport.AddToReport("Error initializing OpenCL renderer, using CPU renderer instead."); - renderer = unique_ptr>(new Renderer()); + renderer = unique_ptr>(new Renderer()); } } + else + { + s = "CPU"; + renderer = unique_ptr>(new Renderer()); + } + } + catch (const std::exception& e) + { + errorReport.AddToReport("Error creating " + s + " renderer: " + e.what() + "\n"); } catch (...) { @@ -279,6 +305,100 @@ static Renderer* CreateRenderer(eRendererType renderType, uint platf return renderer.release(); } +/// +/// Wrapper for creating a vector of renderers of the specified type for each passed in device. +/// If shared is true, only the first renderer will be shared with OpenGL. +/// Although a fallback CPU renderer will be created if a failure occurs, it doesn't really +/// make sense since the concept of devices only applies to OpenCL renderers. +/// +/// Type of renderer to create +/// The vector of platform/device indices to use +/// True if shared with OpenGL, else false. 
+/// The texture ID of the shared OpenGL texture if shared +/// The error report for holding errors if anything goes wrong +/// The vector of created renderers if successful, else false. +template +static vector>> CreateRenderers(eRendererType renderType, const vector>& devices, bool shared, GLuint texId, EmberReport& errorReport) +{ + string s; + vector>> v; + + try + { + if (renderType == OPENCL_RENDERER && !devices.empty()) + { + s = "OpenCL"; + v.reserve(devices.size()); + + for (size_t i = 0; i < devices.size(); i++) + { + vector> tempDevices{ devices[i] }; + auto renderer = unique_ptr>(new RendererCL(tempDevices, !i ? shared : false, texId)); + + if (!renderer.get() || !renderer->Ok()) + { + ostringstream os; + + if (renderer.get()) + errorReport.AddToReport(renderer->ErrorReport()); + + os << "Error initializing OpenCL renderer for platform " << devices[i].first << ", device " << devices[i].second; + errorReport.AddToReport(os.str()); + } + else + v.push_back(std::move(renderer)); + } + } + else + { + s = "CPU"; + v.push_back(std::move(unique_ptr>(::CreateRenderer(CPU_RENDERER, devices, shared, texId, errorReport)))); + } + } + catch (const std::exception& e) + { + errorReport.AddToReport("Error creating " + s + " renderer: " + e.what() + "\n"); + } + catch (...) + { + errorReport.AddToReport("Error creating " + s + " renderer.\n"); + } + + if (v.empty() && s != "CPU")//OpenCL creation failed and CPU creation has not been attempted, so just create one CPU renderer and place it in the vector. + { + try + { + s = "CPU"; + v.push_back(std::move(unique_ptr>(::CreateRenderer(CPU_RENDERER, devices, shared, texId, errorReport)))); + } + catch (const std::exception& e) + { + errorReport.AddToReport("Error creating fallback " + s + " renderer: " + e.what() + "\n"); + } + catch (...) + { + errorReport.AddToReport("Error creating fallback " + s + " renderer.\n"); + } + } + + return v; +} + +/// +/// Perform a render which allows for using strips or not. 
+/// If an error occurs while rendering any strip, the rendering process stops. +/// +/// The renderer to use +/// The ember to render +/// The vector to place the final output in +/// The time position to use, only valid for animation +/// The number of strips to use. This must be validated before calling this function. +/// True to flip the Y axis, else false. +/// Function called before the start of the rendering of each strip +/// Function called after the end of the rendering of each strip +/// Function called if there is an error rendering a strip +/// Function called when all strips successfully finish rendering +/// True if all rendering was successful, else false. template static bool StripsRender(RendererBase* renderer, Ember& ember, vector& finalImage, double time, size_t strips, bool yAxisUp, std::function perStripStart, @@ -354,6 +474,17 @@ static bool StripsRender(RendererBase* renderer, Ember& ember, vector& return success; } +/// +/// Verify that the specified number of strips is valid for the given height. +/// The passed in error functions will be called if the number of strips needs +/// to be modified for the given height. 
+/// +/// The height in pixels of the image to be rendered +/// The number of strips to split the render into +/// Function called if the number of strips exceeds the height of the image +/// Function called if the number of strips does not divide evenly into the height of the image +/// Called if for any reason the number of strips used will differ from the value passed in +/// The actual number of strips that will be used static size_t VerifyStrips(size_t height, size_t strips, std::function stripError1, std::function stripError2, diff --git a/Source/EmberCommon/EmberCommonPch.h b/Source/EmberCommon/EmberCommonPch.h index 242476e..ccb43b4 100644 --- a/Source/EmberCommon/EmberCommonPch.h +++ b/Source/EmberCommon/EmberCommonPch.h @@ -1,4 +1,6 @@ -#pragma once +#ifdef WIN32 + #pragma once +#endif /// /// Precompiled header file. Place all system includes here with appropriate #defines for different operating systems and compilers. diff --git a/Source/EmberCommon/EmberOptions.h b/Source/EmberCommon/EmberOptions.h index 87e2790..b3d83dd 100644 --- a/Source/EmberCommon/EmberOptions.h +++ b/Source/EmberCommon/EmberOptions.h @@ -56,9 +56,7 @@ enum eOptionIDs OPT_DUMP_KERNEL, //Value args. - OPT_OPENCL_PLATFORM,//Int value args. - OPT_OPENCL_DEVICE, - OPT_SEED, + OPT_SEED,//Int value args. OPT_NTHREADS, OPT_STRIPS, OPT_SUPERSAMPLE, @@ -94,7 +92,8 @@ enum eOptionIDs OPT_USEMEM, OPT_LOOPS, - OPT_ISAAC_SEED,//String value args. + OPT_OPENCL_DEVICE,//String value args. 
+ OPT_ISAAC_SEED, OPT_IN, OPT_OUT, OPT_PREFIX, @@ -158,7 +157,7 @@ public: /// The default value to use if the option was not given on the command line /// The format the argument should be given in /// The documentation string describing what the argument means - EmberOptionEntry(eOptionUse optUsage, eOptionIDs optId, const CharT* arg, T defaultVal, ESOArgType argType, string docString) + EmberOptionEntry(eOptionUse optUsage, eOptionIDs optId, const CharT* arg, T defaultVal, ESOArgType argType, const string& docString) { m_OptionUse = optUsage; m_Option.nId = int(optId); @@ -235,25 +234,25 @@ private: break //Int. -#define Eoi EmberOptionEntry +#define Eoi EmberOptionEntry #define INITINTOPTION(member, option) \ member = option; \ m_IntArgs.push_back(&member) #define PARSEINTOPTION(opt, member) \ case (opt): \ - sscanf_s(args.OptionArg(), "%d", &member.m_Val); \ + sscanf_s(args.OptionArg(), "%ld", &member.m_Val); \ break //Uint. -#define Eou EmberOptionEntry +#define Eou EmberOptionEntry #define INITUINTOPTION(member, option) \ member = option; \ m_UintArgs.push_back(&member) #define PARSEUINTOPTION(opt, member) \ case (opt): \ - sscanf_s(args.OptionArg(), "%u", &member.m_Val); \ + sscanf_s(args.OptionArg(), "%lu", &member.m_Val); \ break //Double. 
@@ -318,7 +317,7 @@ public: INITBOOLOPTION(JpegComments, Eob(OPT_RENDER_ANIM, OPT_JPEG_COMMENTS, _T("--enable_jpeg_comments"), true, SO_OPT, "\t--enable_jpeg_comments Enables comments in the jpeg header [default: true].\n")); INITBOOLOPTION(PngComments, Eob(OPT_RENDER_ANIM, OPT_PNG_COMMENTS, _T("--enable_png_comments"), true, SO_OPT, "\t--enable_png_comments Enables comments in the png header [default: true].\n")); INITBOOLOPTION(WriteGenome, Eob(OPT_USE_ANIMATE, OPT_WRITE_GENOME, _T("--write_genome"), false, SO_NONE, "\t--write_genome Write out flame associated with center of motion blur window [default: false].\n")); - INITBOOLOPTION(ThreadedWrite, Eob(OPT_RENDER_ANIM, OPT_THREADED_WRITE, _T("--threaded_write"), true, SO_OPT, "\t--threaded_write Use a separate thread to write images to disk. This doubles the memory required for the final output buffer. [default: true].\n")); + INITBOOLOPTION(ThreadedWrite, Eob(OPT_RENDER_ANIM, OPT_THREADED_WRITE, _T("--threaded_write"), true, SO_OPT, "\t--threaded_write Use a separate thread to write images to disk. This gives better performance, but doubles the memory required for the final output buffer. [default: true].\n")); INITBOOLOPTION(Enclosed, Eob(OPT_USE_GENOME, OPT_ENCLOSED, _T("--enclosed"), true, SO_OPT, "\t--enclosed Use enclosing XML tags [default: true].\n")); INITBOOLOPTION(NoEdits, Eob(OPT_USE_GENOME, OPT_NO_EDITS, _T("--noedits"), false, SO_NONE, "\t--noedits Exclude edit tags when writing Xml [default: false].\n")); INITBOOLOPTION(UnsmoothEdge, Eob(OPT_USE_GENOME, OPT_UNSMOOTH_EDGE, _T("--unsmoother"), false, SO_NONE, "\t--unsmoother Do not use smooth blending for sheep edges [default: false].\n")); @@ -336,8 +335,6 @@ public: #endif //Uint. 
- INITUINTOPTION(Platform, Eou(OPT_USE_ALL, OPT_OPENCL_PLATFORM, _T("--platform"), 0, SO_REQ_SEP, "\t--platform The OpenCL platform index to use [default: 0].\n")); - INITUINTOPTION(Device, Eou(OPT_USE_ALL, OPT_OPENCL_DEVICE, _T("--device"), 0, SO_REQ_SEP, "\t--device The OpenCL device index within the specified platform to use [default: 0].\n")); INITUINTOPTION(Seed, Eou(OPT_USE_ALL, OPT_SEED, _T("--seed"), 0, SO_REQ_SEP, "\t--seed= Integer seed to use for the random number generator [default: random].\n")); INITUINTOPTION(ThreadCount, Eou(OPT_USE_ALL, OPT_NTHREADS, _T("--nthreads"), 0, SO_REQ_SEP, "\t--nthreads= The number of threads to use [default: use all available cores].\n")); INITUINTOPTION(Strips, Eou(OPT_USE_RENDER, OPT_STRIPS, _T("--nstrips"), 1, SO_REQ_SEP, "\t--nstrips= The number of fractions to split a single render frame into. Useful for print size renders or low memory systems [default: 1].\n")); @@ -378,9 +375,10 @@ public: INITDOUBLEOPTION(Loops, Eod(OPT_USE_GENOME, OPT_LOOPS, _T("--loops"), 1.0, SO_REQ_SEP, "\t--loops= Number of times to rotate each control point in sequence [default: 1].\n")); //String. + INITSTRINGOPTION(Device, Eos(OPT_USE_ALL, OPT_OPENCL_DEVICE, _T("--device"), "0", SO_REQ_SEP, "\t--device The comma-separated OpenCL device indices to use. Single device: 0 Multi device: 0,1,3,4 [default: 0].\n")); INITSTRINGOPTION(IsaacSeed, Eos(OPT_USE_ALL, OPT_ISAAC_SEED, _T("--isaac_seed"), "", SO_REQ_SEP, "\t--isaac_seed= Character-based seed for the random number generator [default: random].\n")); INITSTRINGOPTION(Input, Eos(OPT_RENDER_ANIM, OPT_IN, _T("--in"), "", SO_REQ_SEP, "\t--in= Name of the input file.\n")); - INITSTRINGOPTION(Out, Eos(OPT_RENDER_ANIM, OPT_OUT, _T("--out"), "", SO_REQ_SEP, "\t--out= Name of a single output file. Not recommended when rendering more than one image.\n")); + INITSTRINGOPTION(Out, Eos(OPT_USE_RENDER, OPT_OUT, _T("--out"), "", SO_REQ_SEP, "\t--out= Name of a single output file. 
Not recommended when rendering more than one image.\n")); INITSTRINGOPTION(Prefix, Eos(OPT_RENDER_ANIM, OPT_PREFIX, _T("--prefix"), "", SO_REQ_SEP, "\t--prefix= Prefix to prepend to all output files.\n")); INITSTRINGOPTION(Suffix, Eos(OPT_RENDER_ANIM, OPT_SUFFIX, _T("--suffix"), "", SO_REQ_SEP, "\t--suffix= Suffix to append to all output files.\n")); INITSTRINGOPTION(Format, Eos(OPT_RENDER_ANIM, OPT_FORMAT, _T("--format"), "png", SO_REQ_SEP, "\t--format= Format of the output file. Valid values are: bmp, jpg, png, ppm [default: png].\n")); @@ -470,9 +468,7 @@ public: PARSEINTOPTION(OPT_SHEEP_GEN, SheepGen); PARSEINTOPTION(OPT_SHEEP_ID, SheepId); PARSEINTOPTION(OPT_PRIORITY, Priority); - PARSEUINTOPTION(OPT_OPENCL_PLATFORM, Platform);//uint args. - PARSEUINTOPTION(OPT_OPENCL_DEVICE, Device); - PARSEUINTOPTION(OPT_SEED, Seed); + PARSEUINTOPTION(OPT_SEED, Seed);//uint args. PARSEUINTOPTION(OPT_NTHREADS, ThreadCount); PARSEUINTOPTION(OPT_STRIPS, Strips); PARSEUINTOPTION(OPT_SUPERSAMPLE, Supersample); @@ -504,7 +500,8 @@ public: PARSEDOUBLEOPTION(OPT_USEMEM, UseMem); PARSEDOUBLEOPTION(OPT_LOOPS, Loops); - PARSESTRINGOPTION(OPT_ISAAC_SEED, IsaacSeed);//String args. + PARSESTRINGOPTION(OPT_OPENCL_DEVICE, Device);//String args. + PARSESTRINGOPTION(OPT_ISAAC_SEED, IsaacSeed); PARSESTRINGOPTION(OPT_IN, Input); PARSESTRINGOPTION(OPT_OUT, Out); PARSESTRINGOPTION(OPT_PREFIX, Prefix); @@ -545,9 +542,36 @@ public: } } + auto strings = Split(Device(), ','); + + if (!strings.empty()) + { + for (auto& s : strings) + { + size_t device = 0; + istringstream istr(s); + + istr >> device; + + if (!istr.bad() && !istr.fail()) + m_Devices.push_back(device); + else + cout << "Failed to parse device index " << s << endl; + } + } + return false; } + /// + /// Return a const ref to m_Devices. + /// + /// A const ref to the vector of absolute device indices to be used + const vector& Devices() + { + return m_Devices; + } + /// /// Return a vector of all available options for the specified program. 
/// @@ -656,103 +680,103 @@ public: //Break from the usual m_* notation for members here because //each of these is a functor, so it looks nicer and is less typing //to just say opt.Member(). - EmberOptionEntry Help;//Diagnostic bool. - EmberOptionEntry Version; - EmberOptionEntry Verbose; - EmberOptionEntry Debug; - EmberOptionEntry DumpArgs; - EmberOptionEntry DoProgress; - EmberOptionEntry OpenCLInfo; + Eob Help;//Diagnostic bool. + Eob Version; + Eob Verbose; + Eob Debug; + Eob DumpArgs; + Eob DoProgress; + Eob OpenCLInfo; - EmberOptionEntry EmberCL;//Value bool. - EmberOptionEntry EarlyClip; - EmberOptionEntry YAxisUp; - EmberOptionEntry Transparency; - EmberOptionEntry NameEnable; - EmberOptionEntry IntPalette; - EmberOptionEntry HexPalette; - EmberOptionEntry InsertPalette; - EmberOptionEntry JpegComments; - EmberOptionEntry PngComments; - EmberOptionEntry WriteGenome; - EmberOptionEntry ThreadedWrite; - EmberOptionEntry Enclosed; - EmberOptionEntry NoEdits; - EmberOptionEntry UnsmoothEdge; - EmberOptionEntry LockAccum; - EmberOptionEntry DumpKernel; + Eob EmberCL;//Value bool. + Eob EarlyClip; + Eob YAxisUp; + Eob Transparency; + Eob NameEnable; + Eob IntPalette; + Eob HexPalette; + Eob InsertPalette; + Eob JpegComments; + Eob PngComments; + Eob WriteGenome; + Eob ThreadedWrite; + Eob Enclosed; + Eob NoEdits; + Eob UnsmoothEdge; + Eob LockAccum; + Eob DumpKernel; - EmberOptionEntry Symmetry;//Value int. - EmberOptionEntry SheepGen; - EmberOptionEntry SheepId; - EmberOptionEntry Priority; - EmberOptionEntry Platform;//Value uint. 
- EmberOptionEntry Device; - EmberOptionEntry Seed; - EmberOptionEntry ThreadCount; - EmberOptionEntry Strips; - EmberOptionEntry Supersample; - EmberOptionEntry BitsPerChannel; - EmberOptionEntry SubBatchSize; - EmberOptionEntry Bits; - EmberOptionEntry PrintEditDepth; - EmberOptionEntry JpegQuality; - EmberOptionEntry FirstFrame; - EmberOptionEntry LastFrame; - EmberOptionEntry Frame; - EmberOptionEntry Time; - EmberOptionEntry Dtime; - EmberOptionEntry Frames; - EmberOptionEntry Repeat; - EmberOptionEntry Tries; - EmberOptionEntry MaxXforms; + Eoi Symmetry;//Value int. + Eoi SheepGen; + Eoi SheepId; + Eoi Priority; + Eou Seed;//Value uint. + Eou ThreadCount; + Eou Strips; + Eou Supersample; + Eou BitsPerChannel; + Eou SubBatchSize; + Eou Bits; + Eou PrintEditDepth; + Eou JpegQuality; + Eou FirstFrame; + Eou LastFrame; + Eou Frame; + Eou Time; + Eou Dtime; + Eou Frames; + Eou Repeat; + Eou Tries; + Eou MaxXforms; - EmberOptionEntry SizeScale;//Value double. - EmberOptionEntry QualityScale; - EmberOptionEntry AspectRatio; - EmberOptionEntry Stagger; - EmberOptionEntry AvgThresh; - EmberOptionEntry BlackThresh; - EmberOptionEntry WhiteLimit; - EmberOptionEntry Speed; - EmberOptionEntry OffsetX; - EmberOptionEntry OffsetY; - EmberOptionEntry UseMem; - EmberOptionEntry Loops; + Eod SizeScale;//Value double. + Eod QualityScale; + Eod AspectRatio; + Eod Stagger; + Eod AvgThresh; + Eod BlackThresh; + Eod WhiteLimit; + Eod Speed; + Eod OffsetX; + Eod OffsetY; + Eod UseMem; + Eod Loops; - EmberOptionEntry IsaacSeed;//Value string. 
- EmberOptionEntry Input; - EmberOptionEntry Out; - EmberOptionEntry Prefix; - EmberOptionEntry Suffix; - EmberOptionEntry Format; - EmberOptionEntry PalettePath; - //EmberOptionEntry PaletteImage; - EmberOptionEntry Id; - EmberOptionEntry Url; - EmberOptionEntry Nick; - EmberOptionEntry Comment; - EmberOptionEntry TemplateFile; - EmberOptionEntry Clone; - EmberOptionEntry CloneAll; - EmberOptionEntry CloneAction; - EmberOptionEntry Animate; - EmberOptionEntry Mutate; - EmberOptionEntry Cross0; - EmberOptionEntry Cross1; - EmberOptionEntry Method; - EmberOptionEntry Inter; - EmberOptionEntry Rotate; - EmberOptionEntry Strip; - EmberOptionEntry Sequence; - EmberOptionEntry UseVars; - EmberOptionEntry DontUseVars; - EmberOptionEntry Extras; + Eos Device;//Value string. + Eos IsaacSeed; + Eos Input; + Eos Out; + Eos Prefix; + Eos Suffix; + Eos Format; + Eos PalettePath; + //Eos PaletteImage; + Eos Id; + Eos Url; + Eos Nick; + Eos Comment; + Eos TemplateFile; + Eos Clone; + Eos CloneAll; + Eos CloneAction; + Eos Animate; + Eos Mutate; + Eos Cross0; + Eos Cross1; + Eos Method; + Eos Inter; + Eos Rotate; + Eos Strip; + Eos Sequence; + Eos UseVars; + Eos DontUseVars; + Eos Extras; private: - vector*> m_BoolArgs; - vector*> m_IntArgs; - vector*> m_UintArgs; - vector*> m_DoubleArgs; - vector*> m_StringArgs; + vector m_Devices; + vector m_BoolArgs; + vector m_IntArgs; + vector m_UintArgs; + vector m_DoubleArgs; + vector m_StringArgs; }; diff --git a/Source/EmberCommon/JpegUtils.h b/Source/EmberCommon/JpegUtils.h index 5fcc05c..580c438 100644 --- a/Source/EmberCommon/JpegUtils.h +++ b/Source/EmberCommon/JpegUtils.h @@ -44,7 +44,7 @@ static bool WritePpm(const char* filename, byte* image, size_t width, size_t hei /// Url of the author /// Nickname of the author /// True if success, else false -static bool WriteJpeg(const char* filename, byte* image, size_t width, size_t height, int quality, bool enableComments, EmberImageComments& comments, string id, string url, string nick) 
+static bool WriteJpeg(const char* filename, byte* image, size_t width, size_t height, int quality, bool enableComments, const EmberImageComments& comments, const string& id, const string& url, const string& nick) { bool b = false; FILE* file; @@ -135,7 +135,7 @@ static bool WriteJpeg(const char* filename, byte* image, size_t width, size_t he /// Url of the author /// Nickname of the author /// True if success, else false -static bool WritePng(const char* filename, byte* image, size_t width, size_t height, size_t bytesPerChannel, bool enableComments, EmberImageComments& comments, string id, string url, string nick) +static bool WritePng(const char* filename, byte* image, size_t width, size_t height, size_t bytesPerChannel, bool enableComments, const EmberImageComments& comments, const string& id, const string& url, const string& nick) { bool b = false; FILE* file; diff --git a/Source/EmberCommon/SimpleGlob.h b/Source/EmberCommon/SimpleGlob.h index f3352f5..2885da8 100644 --- a/Source/EmberCommon/SimpleGlob.h +++ b/Source/EmberCommon/SimpleGlob.h @@ -601,7 +601,7 @@ private: CSimpleGlobTempl(const CSimpleGlobTempl &); // disabled CSimpleGlobTempl & operator=(const CSimpleGlobTempl &); // disabled - /*! @brief The argv array has it's members stored as either an offset into + /*! @brief The argv array has its members stored as either an offset into the string buffer, or as pointers to their string in the buffer. The offsets are used because if the string buffer is dynamically resized, all pointers into that buffer would become invalid. 
diff --git a/Source/EmberGenome/EmberGenome.cpp b/Source/EmberGenome/EmberGenome.cpp index 21f7612..09d8077 100644 --- a/Source/EmberGenome/EmberGenome.cpp +++ b/Source/EmberGenome/EmberGenome.cpp @@ -1,4 +1,5 @@ #include "EmberCommonPch.h" + #include "EmberGenome.h" #include "JpegUtils.h" @@ -39,10 +40,11 @@ void SetDefaultTestValues(Ember& ember) /// /// A populated EmberOptions object which specifies all program options to be used /// True if success, else false. -template +template bool EmberGenome(EmberOptions& opt) { - OpenCLWrapper wrapper; + OpenCLInfo& info(OpenCLInfo::Instance()); + std::cout.imbue(std::locale("")); if (opt.DumpArgs()) @@ -51,15 +53,15 @@ bool EmberGenome(EmberOptions& opt) if (opt.OpenCLInfo()) { cerr << "\nOpenCL Info: " << endl; - cerr << wrapper.DumpInfo(); + cerr << info.DumpInfo(); return true; } //Regular variables. Timing t; bool exactTimeMatch, randomMode, didColor, seqFlag; - uint i, j, i0, i1, rep, val, frame, frameCount, count = 0; - uint ftime, firstFrame, lastFrame; + size_t i, j, i0, i1, rep, val, frame, frameCount, count = 0; + size_t ftime, firstFrame, lastFrame; size_t n, tot, totb, totw; T avgPix, fractionBlack, fractionWhite, blend, spread, mix0, mix1; string token, filename; @@ -76,8 +78,9 @@ bool EmberGenome(EmberOptions& opt) EmberToXml emberToXml; VariationList varList; EmberReport emberReport, emberReport2; + const vector> devices = Devices(opt.Devices()); unique_ptr> progress(new RenderProgress()); - unique_ptr> renderer(CreateRenderer(opt.EmberCL() ? OPENCL_RENDERER : CPU_RENDERER, opt.Platform(), opt.Device(), false, 0, emberReport)); + unique_ptr> renderer(CreateRenderer(opt.EmberCL() ? 
OPENCL_RENDERER : CPU_RENDERER, devices, false, 0, emberReport)); QTIsaac rand(ISAAC_INT(t.Tic()), ISAAC_INT(t.Tic() * 2), ISAAC_INT(t.Tic() * 3)); vector errorReport = emberReport.ErrorReport(); @@ -107,13 +110,16 @@ bool EmberGenome(EmberOptions& opt) if (opt.Verbose()) { - cerr << "Platform: " << wrapper.PlatformName(opt.Platform()) << endl; - cerr << "Device: " << wrapper.DeviceName(opt.Platform(), opt.Device()) << endl; + for (auto& device : devices) + { + cerr << "Platform: " << info.PlatformName(device.first) << endl; + cerr << "Device: " << info.DeviceName(device.first, device.second) << endl; + } } } //SheepTools will own the created renderer and will take care of cleaning it up. - SheepTools tools(opt.PalettePath(), CreateRenderer(opt.EmberCL() ? OPENCL_RENDERER : CPU_RENDERER, opt.Platform(), opt.Device(), false, 0, emberReport2)); + SheepTools tools(opt.PalettePath(), CreateRenderer(opt.EmberCL() ? OPENCL_RENDERER : CPU_RENDERER, devices, false, 0, emberReport2)); tools.SetSpinParams(!opt.UnsmoothEdge(), T(opt.Stagger()), @@ -169,7 +175,7 @@ bool EmberGenome(EmberOptions& opt) while (std::getline(iss, token, ',')) { - if (parser.Atoi(token.c_str(), val)) + if (parser.Aton(token.c_str(), val)) { if (val < varList.Size()) vars.push_back(static_cast(val)); @@ -182,7 +188,7 @@ bool EmberGenome(EmberOptions& opt) while (std::getline(iss, token, ',')) { - if (parser.Atoi(token.c_str(), val)) + if (parser.Aton(token.c_str(), val)) { if (val < varList.Size()) noVars.push_back(static_cast(val)); @@ -787,18 +793,18 @@ int _tmain(int argc, _TCHAR* argv[]) #ifdef DO_DOUBLE if (opt.Bits() == 64) { - b = EmberGenome(opt); + b = EmberGenome(opt); } else #endif if (opt.Bits() == 33) { - b = EmberGenome(opt); + b = EmberGenome(opt); } else if (opt.Bits() == 32) { cerr << "Bits 32/int histogram no longer supported. Using bits == 33 (float)." 
<< endl; - b = EmberGenome(opt); + b = EmberGenome(opt); } } diff --git a/Source/EmberRender/EmberRender.cpp b/Source/EmberRender/EmberRender.cpp index e189864..582aa37 100644 --- a/Source/EmberRender/EmberRender.cpp +++ b/Source/EmberRender/EmberRender.cpp @@ -8,10 +8,10 @@ /// /// A populated EmberOptions object which specifies all program options to be used /// True if success, else false. -template +template bool EmberRender(EmberOptions& opt) { - OpenCLWrapper wrapper; + EmberCLns::OpenCLInfo& info(EmberCLns::OpenCLInfo::Instance()); std::cout.imbue(std::locale("")); @@ -21,7 +21,7 @@ bool EmberRender(EmberOptions& opt) if (opt.OpenCLInfo()) { cout << "\nOpenCL Info: " << endl; - cout << wrapper.DumpInfo(); + cout << info.DumpInfo(); return true; } @@ -44,8 +44,9 @@ bool EmberRender(EmberOptions& opt) XmlToEmber parser; EmberToXml emberToXml; vector> randVec; + const vector> devices = Devices(opt.Devices()); unique_ptr> progress(new RenderProgress()); - unique_ptr> renderer(CreateRenderer(opt.EmberCL() ? OPENCL_RENDERER : CPU_RENDERER, opt.Platform(), opt.Device(), false, 0, emberReport)); + unique_ptr> renderer(CreateRenderer(opt.EmberCL() ? OPENCL_RENDERER : CPU_RENDERER, devices, false, 0, emberReport)); vector errorReport = emberReport.ErrorReport(); if (!errorReport.empty()) @@ -86,8 +87,11 @@ bool EmberRender(EmberOptions& opt) if (opt.Verbose()) { - cout << "Platform: " << wrapper.PlatformName(opt.Platform()) << endl; - cout << "Device: " << wrapper.DeviceName(opt.Platform(), opt.Device()) << endl; + for (auto& device : devices) + { + cout << "Platform: " << info.PlatformName(device.first) << endl; + cout << "Device: " << info.DeviceName(device.first, device.second) << endl; + } } if (opt.ThreadCount() > 1) @@ -145,7 +149,7 @@ bool EmberRender(EmberOptions& opt) //Final setup steps before running. 
os.imbue(std::locale("")); - padding = uint(log10((double)embers.size())) + 1; + padding = uint(log10(double(embers.size()))) + 1; renderer->EarlyClip(opt.EarlyClip()); renderer->YAxisUp(opt.YAxisUp()); renderer->LockAccum(opt.LockAccum()); @@ -154,7 +158,7 @@ bool EmberRender(EmberOptions& opt) renderer->Transparency(opt.Transparency()); renderer->NumChannels(channels); renderer->BytesPerChannel(opt.BitsPerChannel() / 8); - renderer->Priority((eThreadPriority)Clamp((int)eThreadPriority::LOWEST, (int)eThreadPriority::HIGHEST, opt.Priority())); + renderer->Priority(eThreadPriority(Clamp(int(opt.Priority()), int(eThreadPriority::LOWEST), int(eThreadPriority::HIGHEST)))); renderer->Callback(opt.DoProgress() ? progress.get() : nullptr); for (i = 0; i < embers.size(); i++) @@ -276,7 +280,7 @@ bool EmberRender(EmberOptions& opt) os << comments.m_NumIters << " / " << iterCount << " (" << std::fixed << std::setprecision(2) << ((double(stats.m_Iters) / double(iterCount)) * 100) << "%)"; VerbosePrint("\nIters ran/requested: " + os.str()); - VerbosePrint("Bad values: " << stats.m_Badvals); + if (!opt.EmberCL()) VerbosePrint("Bad values: " << stats.m_Badvals); VerbosePrint("Render time: " + t.Format(stats.m_RenderMs)); VerbosePrint("Pure iter time: " + t.Format(stats.m_IterMs)); VerbosePrint("Iters/sec: " << size_t(stats.m_Iters / (stats.m_IterMs / 1000.0)) << endl); @@ -291,7 +295,7 @@ bool EmberRender(EmberOptions& opt) if (opt.Format() == "png") writeSuccess = WritePng(filename.c_str(), finalImagep, finalEmber.m_FinalRasW, finalEmber.m_FinalRasH, opt.BitsPerChannel() / 8, opt.PngComments(), comments, opt.Id(), opt.Url(), opt.Nick()); else if (opt.Format() == "jpg") - writeSuccess = WriteJpeg(filename.c_str(), finalImagep, finalEmber.m_FinalRasW, finalEmber.m_FinalRasH, opt.JpegQuality(), opt.JpegComments(), comments, opt.Id(), opt.Url(), opt.Nick()); + writeSuccess = WriteJpeg(filename.c_str(), finalImagep, finalEmber.m_FinalRasW, finalEmber.m_FinalRasH, 
int(opt.JpegQuality()), opt.JpegComments(), comments, opt.Id(), opt.Url(), opt.Nick()); else if (opt.Format() == "ppm") writeSuccess = WritePpm(filename.c_str(), finalImagep, finalEmber.m_FinalRasW, finalEmber.m_FinalRasH); else if (opt.Format() == "bmp") @@ -303,7 +307,7 @@ bool EmberRender(EmberOptions& opt) if (opt.EmberCL() && opt.DumpKernel()) { - if (auto rendererCL = dynamic_cast*>(renderer.get())) + if (auto rendererCL = dynamic_cast*>(renderer.get())) { cout << "Iteration kernel: \n" << rendererCL->IterKernel() << "\n\n" << @@ -315,8 +319,7 @@ bool EmberRender(EmberOptions& opt) VerbosePrint("Done."); } - if (opt.Verbose()) - t.Toc("\nTotal time: ", true); + t.Toc("\nFinished in: ", true); return true; } @@ -347,18 +350,18 @@ int _tmain(int argc, _TCHAR* argv[]) #ifdef DO_DOUBLE if (opt.Bits() == 64) { - b = EmberRender(opt); + b = EmberRender(opt); } else #endif if (opt.Bits() == 33) { - b = EmberRender(opt); + b = EmberRender(opt); } else if (opt.Bits() == 32) { cout << "Bits 32/int histogram no longer supported. Using bits == 33 (float)." 
<< endl; - b = EmberRender(opt); + b = EmberRender(opt); } } diff --git a/Source/EmberTester/EmberTester.cpp b/Source/EmberTester/EmberTester.cpp index 21ba90b..dd8bd82 100644 --- a/Source/EmberTester/EmberTester.cpp +++ b/Source/EmberTester/EmberTester.cpp @@ -714,7 +714,7 @@ bool TestParVars() names.reserve(parVar->ParamCount()); addresses.reserve(parVar->ParamCount()); - for (uint j = 0; j < parVar->ParamCount(); j++) + for (size_t j = 0; j < parVar->ParamCount(); j++) { if (std::find(names.begin(), names.end(), params[j].Name()) != names.end()) { @@ -1449,13 +1449,13 @@ void TestVarsSimilar() if (parVar) { - for (uint v = 0; v < parVar->ParamCount(); v++) + for (size_t v = 0; v < parVar->ParamCount(); v++) parVar->SetParamVal(v, (T)iter); } if (parVarComp) { - for (uint v = 0; v < parVarComp->ParamCount(); v++) + for (size_t v = 0; v < parVarComp->ParamCount(); v++) parVarComp->SetParamVal(v, (T)iter); } @@ -1878,6 +1878,46 @@ double RandD(QTIsaac& rand) // double points[4]; //}; +void TestThreadedKernel() +{ + OpenCLWrapper wrapper1, wrapper2; + + if (wrapper1.Init(1, 0) && wrapper2.Init(2, 0)) + { + string k = ConstantDefinesString(false) + "\n__kernel void Kern()\n" + "{\n" + " int gid = GLOBAL_ID_X + GLOBAL_ID_Y;\n" + "}\n" + "\n"; + + if (wrapper1.AddProgram("prog1", k, "Kern", false) && + wrapper2.AddProgram("prog1", k, "Kern", false)) + { + cout << "Builds ok, now run..." << endl; + + std::thread th1([&]() + { + if (wrapper1.RunKernel(0, 256, 16, 1, 16, 16, 1)) + { + cout << "Successful run inside thread 1..." << endl; + } + }); + + std::thread th2([&]() + { + if (wrapper2.RunKernel(0, 256, 16, 1, 16, 16, 1)) + { + cout << "Successful run inside thread 2..." << endl; + } + }); + + th1.join(); + th2.join(); + cout << "Successful join of kernel thread..." 
<< endl; + } + } +} + int _tmain(int argc, _TCHAR* argv[]) { //int i; @@ -1885,12 +1925,14 @@ int _tmain(int argc, _TCHAR* argv[]) QTIsaac rand(1, 2, 3); mt19937 meow(1729); - PaletteList palf; + TestThreadedKernel(); + + /*PaletteList palf; Palette* pal = palf.GetRandomPalette(); cout << pal->Size() << endl; - /*double d = 1; + double d = 1; for (int i = 0; i < 10; i++) { diff --git a/Source/Fractorium/AboutDialog.ui b/Source/Fractorium/AboutDialog.ui index 5384bf2..20ef315 100644 --- a/Source/Fractorium/AboutDialog.ui +++ b/Source/Fractorium/AboutDialog.ui @@ -6,12 +6,12 @@ 0 0 - 488 - 595 + 596 + 622 - + 0 0 @@ -31,223 +31,248 @@ About - - - 6 + + false + + + + + 6 + 5 + 583 + 151 + - - 6 + + + 0 + 0 + - - 6 + + + 12 + - - 6 + + QFrame::NoFrame - - - - - 0 - 0 - - - - - 12 - - - - <html><head/><body><p align="center"><br/>Fractorium 0.4.1.9 Beta</p><p align="center"><span style=" font-size:10pt;"><br/>A Qt-based fractal flame editor which uses a C++ re-write of the flam3 algorithm named Ember and a GPU capable version named EmberCL which implements a portion of the cuburn algorithm in OpenCL.</span></p><p align="center"><span style=" font-size:10pt;">Lead: Matt Feemster</span></p><p align="center"><span style=" font-size:10pt;">Contributors: Simon Detheridge</span></p></body></html> - - - Qt::RichText - - - false - - - Qt::AlignCenter - - - true - - - - - - - - 0 - 0 - - - - - 0 - 0 - - - - Code Copied - - - - 4 + + <html><head/><body><p align="center"><br/>Fractorium 0.4.1.9 Beta</p><p align="center"><span style=" font-size:10pt;">A Qt-based fractal flame editor which uses a C++ re-write of the flam3 algorithm named Ember and a GPU capable version named EmberCL which implements a portion of the cuburn algorithm in OpenCL.</span></p><p align="center"><span style=" font-size:10pt;">Lead: Matt Feemster<br/>Contributors: Simon Detheridge</span></p></body></html> + + + Qt::RichText + + + false + + + Qt::AlignCenter + + + true + + + 1 + + + + + + 5 + 156 + 585 + 458 
+ + + + + + + + 0 + 0 + - + + + 0 + 0 + + + + Code Copied + + + + 4 + + + 6 + + + + + + 0 + 0 + + + + QFrame::NoFrame + + + <html><head/><body><p><a href="http://code.google.com/p/flam3"><span style=" text-decoration: underline; color:#0000ff;">flam3</span></a>: Scott Draves, Erik Reckase (GPL v2)<br/><a href="http://github.com/stevenrobertson/cuburn"><span style=" text-decoration: underline; color:#0000ff;">cuburn</span></a>: Steven Robertson, Michael Semeniuk, Matthew Znoj, Nicolas Mejia (GPL v3)<br/><a href="http://fractron9000.sourceforge.net"><span style=" text-decoration: underline; color:#0000ff;">Fractron 9000</span></a>: Mike Thiesen (GPL)<br/><a href="http://sourceforge.net/projects/apophysis7x"><span style=" text-decoration: underline; color:#0000ff;">Apophysis</span></a>: Mark Townsend, Ronald Hordijk, Peter Sdobnov, Piotr Borys, Georg Kiehne (GPL)<br/><a href="http://jwildfire.org/"><span style=" text-decoration: underline; color:#0000ff;">JWildfire</span></a>: Andreas Maschke (LGPL)<br/>Numerous Apophysis plugin developers (GPL)</p></body></html> + + + Qt::RichText + + + 1 + + + true + + + Qt::LinksAccessibleByKeyboard|Qt::LinksAccessibleByMouse|Qt::TextBrowserInteraction|Qt::TextSelectableByKeyboard|Qt::TextSelectableByMouse + + + + + + + + + + + 0 + 0 + + + + + 0 + 0 + + + + Libraries Linked + + + + 4 + + + 6 + + + + + + 0 + 0 + + + + QFrame::NoFrame + + + <html><head/><body><p><a href="http://qt-project.org"><span style=" text-decoration: underline; color:#0000ff;">Qt</span></a>: Digia Plc (GPL v3, LGPL v2)<br/><a href="http://g-truc.net"><span style=" text-decoration: underline; color:#0000ff;">glm</span></a>: Christophe Riccio (MIT License)<br/><a href="http://threadingbuildingblocks.org"><span style=" text-decoration: underline; color:#0000ff;">Threading Building Blocks</span></a>: Intel Corporation (GPLv2)<br/><a href="http://libjpeg.sourceforge.net"><span style=" text-decoration: underline; color:#0000ff;">libjpeg</span></a>: Independent JPEG 
Group (Free Software License)<br/><a href="http://libpng.org"><span style=" text-decoration: underline; color:#0000ff;">libpng</span></a>: Glenn Randers-Pehrson et al (Libpng License)<br/><a href="http://xmlsoft.org"><span style=" text-decoration: underline; color:#0000ff;">libxml2</span></a>: Daniel Veillard (MIT License)<br/><a href="http://zlib.net"><span style=" text-decoration: underline; color:#0000ff;">zlib</span></a>: Jean-loup Gailly, Mark Adler (Zlib License)<br/><a href="http://burtleburtle.net/bob/rand/isaac.html"><span style=" text-decoration: underline; color:#0000ff;">QTIsaac</span></a>: Robert J. Jenkins, Quinn Tyler Jackson (Public Domain)<br/><a href="http://cas.ee.ic.ac.uk/people/dt10/index.html"><span style=" text-decoration: underline; color:#0000ff;">MWC64X Random Number Generator</span></a>: David Thomas (Public Domain)<br/><a href="http://code.jellycan.com/simpleopt/"><span style=" text-decoration: underline; color:#0000ff;">SimpleOpt</span></a>: Brodie Thiesfield (MIT License)</p></body></html> + + + Qt::RichText + + + 1 + + + true + + + Qt::LinksAccessibleByKeyboard|Qt::LinksAccessibleByMouse|Qt::TextBrowserInteraction|Qt::TextSelectableByKeyboard|Qt::TextSelectableByMouse + + + + + + + + + + + 0 + 0 + + + + Icons Used + + + + 4 + + + 6 + + + + + + 0 + 0 + + + + QFrame::NoFrame + + + <html><head/><body><p><a href="http://famfamfam.com"><span style=" text-decoration: underline; color:#0000ff;">Silk</span></a>: Mark James (Creative Commons Attribution 2.5 License)<br/><a href="http://momentumdesignlab.com"><span style=" text-decoration: underline; color:#0000ff;">Momentum</span></a>: Momentum Design Lab (Creative Commons Attribution-ShareAlike 3.5 License)<br/><a href="http://everaldo.com"><span style=" text-decoration: underline; color:#0000ff;">Crystal Clear</span></a>: Everaldo Coelho (LGPL)<br/><a href="http://openiconlibrary.sourceforge.net"><span style=" text-decoration: underline; color:#0000ff;">Open Icon Library</span></a>: Jeff 
Israel (GPL, LGPL, Creative Commons, Public Domain)<br/><a href="http://icons.mysitemyway.com/category/3d-transparent-glass-icons/"><span style=" text-decoration: underline; color:#0000ff;">3D Transparent Glass</span></a>: iconsETC (Public Domain)<br/><a href="http://p.yusukekamiyamane.com"><span style=" text-decoration: underline; color:#0000ff;">Fugue</span></a>: Yusuke Kamiyamane (Creative Commons Attribution 3.0 License)</p></body></html> + + + Qt::RichText + + + 1 + + + true + + + Qt::LinksAccessibleByKeyboard|Qt::LinksAccessibleByMouse|Qt::TextBrowserInteraction|Qt::TextSelectableByKeyboard|Qt::TextSelectableByMouse + + + + + + + + + 6 - - - - - 0 - 0 - - + + - 0 + 100 0 + + + 100 + 16777215 + + - <html><head/><body><p><a href="http://code.google.com/p/flam3"><span style=" text-decoration: underline; color:#0000ff;">flam3</span></a>: Scott Draves, Erik Reckase (GPL v2)<br/><a href="http://github.com/stevenrobertson/cuburn"><span style=" text-decoration: underline; color:#0000ff;">cuburn</span></a>: Steven Robertson, Michael Semeniuk, Matthew Znoj, Nicolas Mejia (GPL v3)<br/><a href="http://fractron9000.sourceforge.net"><span style=" text-decoration: underline; color:#0000ff;">Fractron 9000</span></a>: Mike Thiesen (GPL)<br/><a href="http://sourceforge.net/projects/apophysis7x"><span style=" text-decoration: underline; color:#0000ff;">Apophysis</span></a>: Mark Townsend, Ronald Hordijk, Peter Sdobnov, Piotr Borys, Georg Kiehne (GPL)<br/><a href="http://jwildfire.org/"><span style=" text-decoration: underline; color:#0000ff;">JWildfire</span></a>: Andreas Maschke (LGPL)<br/>Numerous Apophysis plugin developers (GPL)</p></body></html> - - - Qt::RichText - - - true - - - Qt::LinksAccessibleByKeyboard|Qt::LinksAccessibleByMouse|Qt::TextBrowserInteraction|Qt::TextSelectableByKeyboard|Qt::TextSelectableByMouse + OK - - - - - - - 0 - 0 - - - - - 0 - 0 - - - - Libraries Linked - - - - 4 - - - 6 - - - - - - 0 - 0 - - - - <html><head/><body><p><a 
href="http://qt-project.org"><span style=" text-decoration: underline; color:#0000ff;">Qt</span></a>: Digia Plc (GPL v3, LGPL v2)<br/><a href="http://g-truc.net"><span style=" text-decoration: underline; color:#0000ff;">glm</span></a>: Christophe Riccio (MIT License)<br/><a href="http://threadingbuildingblocks.org"><span style=" text-decoration: underline; color:#0000ff;">Threading Building Blocks</span></a>: Intel Corporation (GPLv2)<br/><a href="http://libjpeg.sourceforge.net"><span style=" text-decoration: underline; color:#0000ff;">libjpeg</span></a>: Independent JPEG Group (Free Software License)<br/><a href="http://libpng.org"><span style=" text-decoration: underline; color:#0000ff;">libpng</span></a>: Glenn Randers-Pehrson et al (Libpng License)<br/><a href="http://xmlsoft.org"><span style=" text-decoration: underline; color:#0000ff;">libxml2</span></a>: Daniel Veillard (MIT License)<br/><a href="http://zlib.net"><span style=" text-decoration: underline; color:#0000ff;">zlib</span></a>: Jean-loup Gailly, Mark Adler (Zlib License)<br/><a href="http://burtleburtle.net/bob/rand/isaac.html"><span style=" text-decoration: underline; color:#0000ff;">QTIsaac</span></a>: Robert J. 
Jenkins, Quinn Tyler Jackson (Public Domain)<br/><a href="http://cas.ee.ic.ac.uk/people/dt10/index.html"><span style=" text-decoration: underline; color:#0000ff;">MWC64X Random Number Generator</span></a>: David Thomas (Public Domain)<br/><a href="http://code.jellycan.com/simpleopt/"><span style=" text-decoration: underline; color:#0000ff;">SimpleOpt</span></a>: Brodie Thiesfield (MIT License)</p></body></html> - - - Qt::RichText - - - true - - - Qt::LinksAccessibleByKeyboard|Qt::LinksAccessibleByMouse|Qt::TextBrowserInteraction|Qt::TextSelectableByKeyboard|Qt::TextSelectableByMouse - - - - - - - - - - - 0 - 0 - - - - Icons Used - - - - 4 - - - 6 - - - - - - 0 - 0 - - - - <html><head/><body><p><a href="http://famfamfam.com"><span style=" text-decoration: underline; color:#0000ff;">Silk</span></a>: Mark James (Creative Commons Attribution 2.5 License)<br/><a href="http://momentumdesignlab.com"><span style=" text-decoration: underline; color:#0000ff;">Momentum</span></a>: Momentum Design Lab (Creative Commons Attribution-ShareAlike 3.5 License)<br/><a href="http://everaldo.com"><span style=" text-decoration: underline; color:#0000ff;">Crystal Clear</span></a>: Everaldo Coelho (LGPL)<br/><a href="http://openiconlibrary.sourceforge.net"><span style=" text-decoration: underline; color:#0000ff;">Open Icon Library</span></a>: Jeff Israel (GPL, LGPL, Creative Commons, Public Domain)<br/><a href="http://icons.mysitemyway.com/category/3d-transparent-glass-icons/"><span style=" text-decoration: underline; color:#0000ff;">3D Transparent Glass</span></a>: iconsETC (Public Domain)<br/><a href="http://p.yusukekamiyamane.com"><span style=" text-decoration: underline; color:#0000ff;">Fugue</span></a>: Yusuke Kamiyamane (Creative Commons Attribution 3.0 License)</p></body></html> - - - Qt::RichText - - - true - - - Qt::LinksAccessibleByKeyboard|Qt::LinksAccessibleByMouse|Qt::TextBrowserInteraction|Qt::TextSelectableByKeyboard|Qt::TextSelectableByMouse - - - - - - - - - - 6 - - - - - 
- 100 - 0 - - - - - 100 - 16777215 - - - - OK - - - - - - + + + diff --git a/Source/Fractorium/CurvesGraphicsView.cpp b/Source/Fractorium/CurvesGraphicsView.cpp index b237aaa..c923f8b 100644 --- a/Source/Fractorium/CurvesGraphicsView.cpp +++ b/Source/Fractorium/CurvesGraphicsView.cpp @@ -124,7 +124,7 @@ void CurvesGraphicsView::Set(int curveIndex, int pointIndex, const QPointF& poin /// The curve to set void CurvesGraphicsView::SetTop(CurveIndex curveIndex) { - int index; + size_t index; switch (curveIndex) { @@ -142,7 +142,7 @@ void CurvesGraphicsView::SetTop(CurveIndex curveIndex) index = 3; } - for (int i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { if (i == index) { diff --git a/Source/Fractorium/DoubleSpinBox.h b/Source/Fractorium/DoubleSpinBox.h index 305bf00..15d4ac2 100644 --- a/Source/Fractorium/DoubleSpinBox.h +++ b/Source/Fractorium/DoubleSpinBox.h @@ -76,7 +76,7 @@ public: /// The name of the parameter this is for /// The height of the spin box. Default: 16. /// The step used to increment/decrement the spin box when using the mouse wheel. Default: 0.05. - explicit VariationTreeDoubleSpinBox(QWidget* p, VariationTreeWidgetItem* widgetItem, eVariationId id, string param, int h = 16, double step = 0.05) + explicit VariationTreeDoubleSpinBox(QWidget* p, VariationTreeWidgetItem* widgetItem, eVariationId id, const string& param, int h = 16, double step = 0.05) : DoubleSpinBox(p, h, step) { m_WidgetItem = widgetItem; diff --git a/Source/Fractorium/FinalRenderDialog.cpp b/Source/Fractorium/FinalRenderDialog.cpp index 5fa02bd..f22e510 100644 --- a/Source/Fractorium/FinalRenderDialog.cpp +++ b/Source/Fractorium/FinalRenderDialog.cpp @@ -11,7 +11,8 @@ /// The parent widget /// The window flags. Default: 0. 
FractoriumFinalRenderDialog::FractoriumFinalRenderDialog(FractoriumSettings* settings, QWidget* p, Qt::WindowFlags f) - : QDialog(p, f) + : QDialog(p, f), + m_Info(OpenCLInfo::Instance()) { ui.setupUi(this); @@ -29,18 +30,18 @@ FractoriumFinalRenderDialog::FractoriumFinalRenderDialog(FractoriumSettings* set connect(ui.FinalRenderTransparencyCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnTransparencyCheckBoxStateChanged(int)), Qt::QueuedConnection); connect(ui.FinalRenderOpenCLCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnOpenCLCheckBoxStateChanged(int)), Qt::QueuedConnection); connect(ui.FinalRenderDoublePrecisionCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnDoublePrecisionCheckBoxStateChanged(int)), Qt::QueuedConnection); - connect(ui.FinalRenderPlatformCombo, SIGNAL(currentIndexChanged(int)), this, SLOT(OnPlatformComboCurrentIndexChanged(int)), Qt::QueuedConnection); connect(ui.FinalRenderDoAllCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnDoAllCheckBoxStateChanged(int)), Qt::QueuedConnection); connect(ui.FinalRenderDoSequenceCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnDoSequenceCheckBoxStateChanged(int)), Qt::QueuedConnection); - connect(ui.FinalRenderCurrentSpin, SIGNAL(valueChanged(int)), this, SLOT(OnFinalRenderCurrentSpinChanged(int)), Qt::QueuedConnection); + connect(ui.FinalRenderCurrentSpin, SIGNAL(valueChanged(int)), this, SLOT(OnCurrentSpinChanged(int)), Qt::QueuedConnection); connect(ui.FinalRenderApplyToAllCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnApplyAllCheckBoxStateChanged(int)), Qt::QueuedConnection); connect(ui.FinalRenderKeepAspectCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnKeepAspectCheckBoxStateChanged(int)), Qt::QueuedConnection); connect(ui.FinalRenderScaleNoneRadioButton, SIGNAL(toggled(bool)), this, SLOT(OnScaleRadioButtonChanged(bool)), Qt::QueuedConnection); connect(ui.FinalRenderScaleWidthRadioButton, SIGNAL(toggled(bool)), this, SLOT(OnScaleRadioButtonChanged(bool)), Qt::QueuedConnection); 
connect(ui.FinalRenderScaleHeightRadioButton, SIGNAL(toggled(bool)), this, SLOT(OnScaleRadioButtonChanged(bool)), Qt::QueuedConnection); - - SetupSpinner(ui.FinalRenderSizeTable, this, row, 1, m_WidthScaleSpin, spinHeight, 0.001, 99.99, 0.1, SIGNAL(valueChanged(double)), SLOT(OnFinalRenderWidthScaleChanged(double)), true, 1.0, 1.0, 1.0); - SetupSpinner(ui.FinalRenderSizeTable, this, row, 1, m_HeightScaleSpin, spinHeight, 0.001, 99.99, 0.1, SIGNAL(valueChanged(double)), SLOT(OnFinalRenderHeightScaleChanged(double)), true, 1.0, 1.0, 1.0); + connect(ui.DeviceTable, SIGNAL(cellChanged(int, int)), this, SLOT(OnDeviceTableCellChanged(int, int)), Qt::QueuedConnection); + + SetupSpinner(ui.FinalRenderSizeTable, this, row, 1, m_WidthScaleSpin, spinHeight, 0.001, 99.99, 0.1, SIGNAL(valueChanged(double)), SLOT(OnWidthScaleChanged(double)), true, 1.0, 1.0, 1.0); + SetupSpinner(ui.FinalRenderSizeTable, this, row, 1, m_HeightScaleSpin, spinHeight, 0.001, 99.99, 0.1, SIGNAL(valueChanged(double)), SLOT(OnHeightScaleChanged(double)), true, 1.0, 1.0, 1.0); m_WidthScaleSpin->setDecimals(3); m_HeightScaleSpin->setDecimals(3); m_WidthScaleSpin->setSuffix(" ( )"); @@ -68,43 +69,35 @@ FractoriumFinalRenderDialog::FractoriumFinalRenderDialog(FractoriumSettings* set table->item(row++, 1)->setTextAlignment(Qt::AlignRight | Qt::AlignVCenter); connect(m_Tbcw->m_Button1, SIGNAL(clicked(bool)), this, SLOT(OnFileButtonClicked(bool)), Qt::QueuedConnection); connect(m_Tbcw->m_Button2, SIGNAL(clicked(bool)), this, SLOT(OnShowFolderButtonClicked(bool)), Qt::QueuedConnection); - connect(m_Tbcw->m_Combo, SIGNAL(currentIndexChanged(int)), this, SLOT(OnFinalRenderExtIndexChanged(int)), Qt::QueuedConnection); + connect(m_Tbcw->m_Combo, SIGNAL(currentIndexChanged(int)), this, SLOT(OnExtIndexChanged(int)), Qt::QueuedConnection); m_PrefixEdit = new QLineEdit(table); table->setCellWidget(row++, 1, m_PrefixEdit); m_SuffixEdit = new QLineEdit(table); table->setCellWidget(row++, 1, m_SuffixEdit); - 
connect(m_PrefixEdit, SIGNAL(textChanged(const QString&)), this, SLOT(OnFinalRenderPrefixChanged(const QString&)), Qt::QueuedConnection); - connect(m_SuffixEdit, SIGNAL(textChanged(const QString&)), this, SLOT(OnFinalRenderSuffixChanged(const QString&)), Qt::QueuedConnection); + connect(m_PrefixEdit, SIGNAL(textChanged(const QString&)), this, SLOT(OnPrefixChanged(const QString&)), Qt::QueuedConnection); + connect(m_SuffixEdit, SIGNAL(textChanged(const QString&)), this, SLOT(OnSuffixChanged(const QString&)), Qt::QueuedConnection); ui.StartRenderButton->disconnect(SIGNAL(clicked(bool))); connect(ui.StartRenderButton, SIGNAL(clicked(bool)), this, SLOT(OnRenderClicked(bool)), Qt::QueuedConnection); connect(ui.StopRenderButton, SIGNAL(clicked(bool)), this, SLOT(OnCancelRenderClicked(bool)), Qt::QueuedConnection); - if (m_Wrapper.CheckOpenCL()) + table = ui.DeviceTable; + + if (m_Info.Ok() && !m_Info.Devices().empty()) { - vector platforms = m_Wrapper.PlatformNames(); + SetupDeviceTable(table, m_Settings->FinalDevices()); - //Populate combo boxes with available OpenCL platforms and devices. - for (i = 0; i < platforms.size(); i++) - ui.FinalRenderPlatformCombo->addItem(QString::fromStdString(platforms[i])); + for (int i = 0; i < table->rowCount(); i++) + if (auto radio = qobject_cast(table->cellWidget(i, 1))) + connect(radio, SIGNAL(toggled(bool)), this, SLOT(OnDeviceTableRadioToggled(bool)), Qt::QueuedConnection); - //If init succeeds, set the selected platform and device combos to match what was saved in the settings. 
- if (m_Wrapper.Init(m_Settings->FinalPlatformIndex(), m_Settings->FinalDeviceIndex())) - { - ui.FinalRenderOpenCLCheckBox->setChecked( m_Settings->FinalOpenCL()); - ui.FinalRenderPlatformCombo->setCurrentIndex(m_Settings->FinalPlatformIndex()); - ui.FinalRenderDeviceCombo->setCurrentIndex( m_Settings->FinalDeviceIndex()); - } - else - { - OnPlatformComboCurrentIndexChanged(0); - ui.FinalRenderOpenCLCheckBox->setChecked(false); - } + ui.FinalRenderOpenCLCheckBox->setChecked(m_Settings->FinalOpenCL()); } else { + table->setEnabled(false); ui.FinalRenderOpenCLCheckBox->setChecked(false); ui.FinalRenderOpenCLCheckBox->setEnabled(false); } @@ -128,7 +121,7 @@ FractoriumFinalRenderDialog::FractoriumFinalRenderDialog(FractoriumSettings* set else if (m_Settings->FinalThreadPriority() == THREAD_PRIORITY_HIGHEST) ui.FinalRenderThreadPriorityComboBox->setCurrentIndex(tpc); else - ui.FinalRenderThreadPriorityComboBox->setCurrentIndex(Clamp(0, tpc, m_Settings->FinalThreadPriority() / 25)); + ui.FinalRenderThreadPriorityComboBox->setCurrentIndex(Clamp(m_Settings->FinalThreadPriority() / 25, 0, tpc)); #endif m_QualitySpin->setValue(m_Settings->FinalQuality()); @@ -167,8 +160,7 @@ FractoriumFinalRenderDialog::FractoriumFinalRenderDialog(FractoriumSettings* set w = SetTabOrder(this, w, ui.FinalRenderDoAllCheckBox); w = SetTabOrder(this, w, ui.FinalRenderDoSequenceCheckBox); w = SetTabOrder(this, w, ui.FinalRenderCurrentSpin); - w = SetTabOrder(this, w, ui.FinalRenderPlatformCombo); - w = SetTabOrder(this, w, ui.FinalRenderDeviceCombo); + w = SetTabOrder(this, w, ui.DeviceTable); w = SetTabOrder(this, w, ui.FinalRenderThreadCountSpin); w = SetTabOrder(this, w, ui.FinalRenderThreadPriorityComboBox); w = SetTabOrder(this, w, ui.FinalRenderApplyToAllCheckBox); @@ -214,8 +206,6 @@ void FractoriumFinalRenderDialog::Path(const QString& s) { ui.FinalRenderParamsT QString FractoriumFinalRenderDialog::Prefix() { return m_PrefixEdit->text(); } QString FractoriumFinalRenderDialog::Suffix() { 
return m_SuffixEdit->text(); } uint FractoriumFinalRenderDialog::Current() { return ui.FinalRenderCurrentSpin->value(); } -uint FractoriumFinalRenderDialog::PlatformIndex() { return ui.FinalRenderPlatformCombo->currentIndex(); } -uint FractoriumFinalRenderDialog::DeviceIndex() { return ui.FinalRenderDeviceCombo->currentIndex(); } uint FractoriumFinalRenderDialog::ThreadCount() { return ui.FinalRenderThreadCountSpin->value(); } #ifdef _WIN32 int FractoriumFinalRenderDialog::ThreadPriority() { return ui.FinalRenderThreadPriorityComboBox->currentIndex() - 2; } @@ -236,6 +226,7 @@ double FractoriumFinalRenderDialog::Quality() { return m_QualitySpin->value(); } uint FractoriumFinalRenderDialog::TemporalSamples() { return m_TemporalSamplesSpin->value(); } uint FractoriumFinalRenderDialog::Supersample() { return m_SupersampleSpin->value(); } uint FractoriumFinalRenderDialog::Strips() { return m_StripsSpin->value(); } +QList FractoriumFinalRenderDialog::Devices() { return DeviceTableToSettings(ui.DeviceTable); } /// /// Capture the current state of the Gui. @@ -260,8 +251,7 @@ FinalRenderGuiState FractoriumFinalRenderDialog::State() state.m_Ext = Ext(); state.m_Prefix = Prefix(); state.m_Suffix = Suffix(); - state.m_PlatformIndex = PlatformIndex(); - state.m_DeviceIndex = DeviceIndex(); + state.m_Devices = Devices(); state.m_ThreadCount = ThreadCount(); state.m_ThreadPriority = ThreadPriority(); state.m_WidthScale = WidthScale(); @@ -348,14 +338,14 @@ void FractoriumFinalRenderDialog::OnTransparencyCheckBoxStateChanged(int state) /// /// Set whether to use OpenCL in the rendering process or not. +/// Also disable or enable the CPU and OpenCL related controls based on the state passed in. /// /// Use OpenCL if state == Qt::Checked, else don't. 
void FractoriumFinalRenderDialog::OnOpenCLCheckBoxStateChanged(int state) { bool checked = state == Qt::Checked; - ui.FinalRenderPlatformCombo->setEnabled(checked); - ui.FinalRenderDeviceCombo->setEnabled(checked); + ui.DeviceTable->setEnabled(checked); ui.FinalRenderThreadCountSpin->setEnabled(!checked); ui.FinalRenderThreadPriorityComboBox->setEnabled(!checked); SetMemory(); @@ -379,6 +369,9 @@ void FractoriumFinalRenderDialog::OnDoublePrecisionCheckBoxStateChanged(int stat /// The state of the checkbox void FractoriumFinalRenderDialog::OnDoAllCheckBoxStateChanged(int state) { + if (!state) + ui.FinalRenderDoSequenceCheckBox->setChecked(false); + ui.FinalRenderDoSequenceCheckBox->setEnabled(ui.FinalRenderDoAllCheckBox->isChecked()); } @@ -390,36 +383,28 @@ void FractoriumFinalRenderDialog::OnDoAllCheckBoxStateChanged(int state) /// The state of the checkbox void FractoriumFinalRenderDialog::OnDoSequenceCheckBoxStateChanged(int state) { - m_TemporalSamplesSpin->setEnabled(ui.FinalRenderDoSequenceCheckBox->isChecked()); + bool checked = ui.FinalRenderDoSequenceCheckBox->isChecked(); + + m_TemporalSamplesSpin->setEnabled(checked); + + if (checked) + m_StripsSpin->setValue(1); + + m_StripsSpin->setEnabled(!checked); + SetMemory(); } /// /// The current ember spinner was changed, update fields. /// /// Ignored -void FractoriumFinalRenderDialog::OnFinalRenderCurrentSpinChanged(int d) +void FractoriumFinalRenderDialog::OnCurrentSpinChanged(int d) { m_Controller->SetEmber(d - 1); m_Controller->SyncCurrentToGui(); SetMemory(); } -/// -/// Populate the the device combo box with all available -/// OpenCL devices for the selected platform. -/// Called when the platform combo box index changes. 
-/// -/// The selected index of the combo box -void FractoriumFinalRenderDialog::OnPlatformComboCurrentIndexChanged(int index) -{ - vector devices = m_Wrapper.DeviceNames(index); - - ui.FinalRenderDeviceCombo->clear(); - - for (auto& device : devices) - ui.FinalRenderDeviceCombo->addItem(QString::fromStdString(device)); -} - /// /// The apply all checkbox was changed. /// If checked, set values for all embers in the file to the values specified in the GUI. @@ -437,7 +422,7 @@ void FractoriumFinalRenderDialog::OnApplyAllCheckBoxStateChanged(int state) /// the height spinner as well to be in proportion. /// /// Ignored -void FractoriumFinalRenderDialog::OnFinalRenderWidthScaleChanged(double d) +void FractoriumFinalRenderDialog::OnWidthScaleChanged(double d) { if (ui.FinalRenderKeepAspectCheckBox->isChecked() && m_Controller.get()) m_HeightScaleSpin->SetValueStealth(m_WidthScaleSpin->value()); @@ -452,7 +437,7 @@ void FractoriumFinalRenderDialog::OnFinalRenderWidthScaleChanged(double d) /// the width spinner as well to be in proportion. /// /// Ignored -void FractoriumFinalRenderDialog::OnFinalRenderHeightScaleChanged(double d) +void FractoriumFinalRenderDialog::OnHeightScaleChanged(double d) { if (ui.FinalRenderKeepAspectCheckBox->isChecked() && m_Controller.get()) m_WidthScaleSpin->SetValueStealth(m_HeightScaleSpin->value()); @@ -484,6 +469,50 @@ void FractoriumFinalRenderDialog::OnScaleRadioButtonChanged(bool checked) SetMemory(); } +/// +/// The check state of one of the OpenCL devices was changed. +/// This does a special check to always ensure at least one device, +/// as well as one primary is checked. +/// +/// The row of the cell +/// The column of the cell +void FractoriumFinalRenderDialog::OnDeviceTableCellChanged(int row, int col) +{ + if (auto item = ui.DeviceTable->item(row, col)) + { + HandleDeviceTableCheckChanged(ui.DeviceTable, row, col); + SetMemory(); + } +} + +/// +/// The primary device radio button selection was changed. 
+/// If the device was specified as primary, but was not selected +/// for inclusion, it will automatically be selected for inclusion. +/// +/// The state of the radio button +void FractoriumFinalRenderDialog::OnDeviceTableRadioToggled(bool checked) +{ + int row; + auto s = sender(); + auto table = ui.DeviceTable; + QRadioButton* radio = nullptr; + + if (s) + { + for (row = 0; row < table->rowCount(); row++) + if (radio = qobject_cast(table->cellWidget(row, 1))) + if (s == radio) + { + HandleDeviceTableCheckChanged(ui.DeviceTable, row, 1); + break; + } + } + + if (checked) + SetMemory(); +} + /// /// The quality spinner was changed, recompute required memory. /// @@ -558,7 +587,7 @@ void FractoriumFinalRenderDialog::OnShowFolderButtonClicked(bool checked) /// number of channels used in the final output buffer. /// /// Ignored -void FractoriumFinalRenderDialog::OnFinalRenderExtIndexChanged(int d) +void FractoriumFinalRenderDialog::OnExtIndexChanged(int d) { if (SetMemory()) Path(m_Controller->ComposePath(m_Controller->Name())); @@ -568,7 +597,7 @@ void FractoriumFinalRenderDialog::OnFinalRenderExtIndexChanged(int d) /// Change the prefix prepended to the output file name. /// /// Ignored -void FractoriumFinalRenderDialog::OnFinalRenderPrefixChanged(const QString& s) +void FractoriumFinalRenderDialog::OnPrefixChanged(const QString& s) { Path(m_Controller->ComposePath(m_Controller->Name())); } @@ -577,7 +606,7 @@ void FractoriumFinalRenderDialog::OnFinalRenderPrefixChanged(const QString& s) /// Change the suffix appended to the output file name. 
/// /// Ignored -void FractoriumFinalRenderDialog::OnFinalRenderSuffixChanged(const QString& s) +void FractoriumFinalRenderDialog::OnSuffixChanged(const QString& s) { Path(m_Controller->ComposePath(m_Controller->Name())); } @@ -637,7 +666,7 @@ void FractoriumFinalRenderDialog::showEvent(QShowEvent* e) ui.FinalRenderCurrentSpin->blockSignals(true); ui.FinalRenderCurrentSpin->setValue(index);//Set the currently selected ember to the one that was being edited. ui.FinalRenderCurrentSpin->blockSignals(false); - OnFinalRenderCurrentSpinChanged(index);//Force update in case the ember was new, but at the same index as the previous one. + OnCurrentSpinChanged(index);//Force update in case the ember was new, but at the same index as the previous one. m_Controller->m_ImageCount = 0; SetMemory(); m_Controller->ResetProgress(); @@ -655,7 +684,7 @@ void FractoriumFinalRenderDialog::showEvent(QShowEvent* e) /// /// Close the dialog without running, or if running, cancel and exit. /// Settings will not be saved. -/// Control will be returned to Fractorium::OnActionFinalRender(). +/// Control will be returned to Fractorium::OnActiOn(). /// void FractoriumFinalRenderDialog::reject() { @@ -736,48 +765,58 @@ bool FractoriumFinalRenderDialog::CreateControllerFromGUI(bool createRenderer) /// /// Compute the amount of memory needed via call to SyncAndComputeMemory(), then /// assign the result to the table cell as text. +/// Report errors if not enough memory is available for any of the selected devices. /// +/// True devices and a controller is present, else false. 
bool FractoriumFinalRenderDialog::SetMemory() { if (isVisible() && CreateControllerFromGUI(true)) { bool error = false; tuple p = m_Controller->SyncAndComputeMemory(); + QString s; ui.FinalRenderParamsTable->item(m_MemoryCellIndex, 1)->setText(ToString(get<1>(p))); ui.FinalRenderParamsTable->item(m_ItersCellIndex, 1)->setText(ToString(get<2>(p))); - - if (OpenCL()) + + if (OpenCL() && !m_Wrappers.empty()) { - if (!m_Wrapper.Ok() || PlatformIndex() != m_Wrapper.PlatformIndex() || DeviceIndex() != m_Wrapper.DeviceIndex()) - m_Wrapper.Init(PlatformIndex(), DeviceIndex()); + auto devices = Devices(); - if (m_Wrapper.Ok()) + for (size_t i = 0; i < m_Wrappers.size(); i++) { - size_t histSize = get<0>(p); - size_t totalSize = get<1>(p); - size_t maxAlloc = m_Wrapper.MaxAllocSize(); - size_t totalAvail = m_Wrapper.GlobalMemSize(); - QString s; - - if (histSize > maxAlloc) + if (devices.contains(int(i))) { - s = "Histogram/Accumulator memory size of " + ToString(histSize) + - " is greater than the max OpenCL allocation size of " + ToString(maxAlloc); - } + size_t histSize = get<0>(p); + size_t totalSize = get<1>(p); + size_t maxAlloc = m_Wrappers[i].MaxAllocSize(); + size_t totalAvail = m_Wrappers[i].GlobalMemSize(); + QString temp; - if (totalSize > totalAvail) - { - s += "\n\nTotal required memory size of " + ToString(totalSize) + - " is greater than the max OpenCL available memory of " + ToString(totalAvail); - } + if (histSize > maxAlloc) + { + temp = "Histogram/Accumulator memory size of " + ToString(histSize) + + " is greater than the max OpenCL allocation size of " + ToString(maxAlloc); + } - if (!s.isEmpty()) - { - error = true; - ui.FinalRenderTextOutput->setText(s + ".\n\nRendering will most likely fail."); + if (totalSize > totalAvail) + { + temp += "\n\nTotal required memory size of " + ToString(totalSize) + + " is greater than the max OpenCL available memory of " + ToString(totalAvail); + } + + if (!temp.isEmpty()) + { + error = true; + s += 
QString::fromStdString(m_Wrappers[i].DeviceName()) + ":\n" + temp + "\n\n"; + } } } + + if (!s.isEmpty()) + s += "Rendering will most likely fail."; + + ui.FinalRenderTextOutput->setText(s); } if (!error) diff --git a/Source/Fractorium/FinalRenderDialog.h b/Source/Fractorium/FinalRenderDialog.h index 57f1eb5..4833843 100644 --- a/Source/Fractorium/FinalRenderDialog.h +++ b/Source/Fractorium/FinalRenderDialog.h @@ -65,8 +65,6 @@ public: QString Prefix(); QString Suffix(); uint Current(); - uint PlatformIndex(); - uint DeviceIndex(); uint ThreadCount(); int ThreadPriority(); double WidthScale(); @@ -75,6 +73,7 @@ public: uint TemporalSamples(); uint Supersample(); uint Strips(); + QList Devices(); FinalRenderGuiState State(); public slots: @@ -86,22 +85,23 @@ public slots: void OnDoublePrecisionCheckBoxStateChanged(int state); void OnDoAllCheckBoxStateChanged(int state); void OnDoSequenceCheckBoxStateChanged(int state); - void OnFinalRenderCurrentSpinChanged(int d); - void OnPlatformComboCurrentIndexChanged(int index); + void OnCurrentSpinChanged(int d); void OnApplyAllCheckBoxStateChanged(int state); - void OnFinalRenderWidthScaleChanged(double d); - void OnFinalRenderHeightScaleChanged(double d); + void OnWidthScaleChanged(double d); + void OnHeightScaleChanged(double d); void OnKeepAspectCheckBoxStateChanged(int state); void OnScaleRadioButtonChanged(bool checked); + void OnDeviceTableCellChanged(int row, int col); + void OnDeviceTableRadioToggled(bool checked); void OnQualityChanged(double d); void OnTemporalSamplesChanged(int d); void OnSupersampleChanged(int d); void OnStripsChanged(int d); void OnFileButtonClicked(bool checked); void OnShowFolderButtonClicked(bool checked); - void OnFinalRenderExtIndexChanged(int d); - void OnFinalRenderPrefixChanged(const QString& s); - void OnFinalRenderSuffixChanged(const QString& s); + void OnExtIndexChanged(int d); + void OnPrefixChanged(const QString& s); + void OnSuffixChanged(const QString& s); void 
OnRenderClicked(bool checked); void OnCancelRenderClicked(bool checked); virtual void reject() override; @@ -116,7 +116,6 @@ private: int m_MemoryCellIndex; int m_ItersCellIndex; int m_PathCellIndex; - OpenCLWrapper m_Wrapper; Timing m_RenderTimer; DoubleSpinBox* m_WidthScaleSpin; DoubleSpinBox* m_HeightScaleSpin; @@ -129,6 +128,8 @@ private: QLineEdit* m_SuffixEdit; FractoriumSettings* m_Settings; Fractorium* m_Fractorium; + OpenCLInfo& m_Info; + vector m_Wrappers; unique_ptr m_Controller; Ui::FinalRenderDialog ui; }; diff --git a/Source/Fractorium/FinalRenderDialog.ui b/Source/Fractorium/FinalRenderDialog.ui index 058657e..cc8c78e 100644 --- a/Source/Fractorium/FinalRenderDialog.ui +++ b/Source/Fractorium/FinalRenderDialog.ui @@ -7,7 +7,7 @@ 0 0 519 - 897 + 899 @@ -64,7 +64,7 @@ 0 0 507 - 885 + 887 @@ -240,9 +240,9 @@ - + - + 0 0 @@ -250,37 +250,81 @@ 0 - 0 + 91 16777215 - 16777215 + 91 - - - - - - - 0 - 0 - + + Qt::NoFocus - - - 0 - 0 - + + QAbstractItemView::NoEditTriggers - - - 16777215 - 16777215 - + + QAbstractItemView::NoSelection + + QAbstractItemView::SelectRows + + + true + + + 3 + + + 60 + + + true + + + false + + + 22 + + + false + + + 22 + + + + AMD + + + + + Nvidia + + + + + Intel + + + + + Use + + + + + Primary + + + + + Device + + @@ -1119,8 +1163,6 @@ FinalRenderYAxisUpCheckBox FinalRenderTransparencyCheckBox FinalRenderOpenCLCheckBox - FinalRenderPlatformCombo - FinalRenderDeviceCombo FinalRenderParamsTable FinalRenderTextOutput StartRenderButton diff --git a/Source/Fractorium/FinalRenderEmberController.cpp b/Source/Fractorium/FinalRenderEmberController.cpp index ec5060a..83a2ff0 100644 --- a/Source/Fractorium/FinalRenderEmberController.cpp +++ b/Source/Fractorium/FinalRenderEmberController.cpp @@ -15,7 +15,7 @@ FinalRenderEmberControllerBase::FinalRenderEmberControllerBase(FractoriumFinalRe m_Run = false; m_PreviewRun = false; m_ImageCount = 0; - m_FinishedImageCount = 0; + m_FinishedImageCount.store(0); m_FinalRenderDialog = finalRenderDialog; 
m_Settings = m_Fractorium->m_Settings; } @@ -26,7 +26,8 @@ FinalRenderEmberControllerBase::FinalRenderEmberControllerBase(FractoriumFinalRe /// It should never take longer than a few milliseconds because the /// renderer checks the m_Abort flag in many places during the process. /// -void FinalRenderEmberControllerBase::CancelRender() +template +void FinalRenderEmberController::CancelRender() { if (m_Result.isRunning()) { @@ -48,6 +49,21 @@ void FinalRenderEmberControllerBase::CancelRender() m_Renderer->LeaveFinalAccum(); m_Renderer->LeaveRender(); } + else + { + for (auto& renderer : m_Renderers) + { + renderer->Abort(); + + while (renderer->InRender()) + QApplication::processEvents(); + + renderer->EnterRender(); + renderer->EnterFinalAccum(); + renderer->LeaveFinalAccum(); + renderer->LeaveRender(); + } + } }); g.wait(); @@ -66,11 +82,11 @@ void FinalRenderEmberControllerBase::CancelRender() /// True if a valid renderer is created or if no action is taken, else false. bool FinalRenderEmberControllerBase::CreateRendererFromGUI() { - bool useOpenCL = m_Wrapper.CheckOpenCL() && m_FinalRenderDialog->OpenCL(); + bool useOpenCL = m_Info.Ok() && m_FinalRenderDialog->OpenCL(); + auto v = Devices(m_FinalRenderDialog->Devices()); - return CreateRenderer(useOpenCL ? OPENCL_RENDERER : CPU_RENDERER, - m_FinalRenderDialog->PlatformIndex(), - m_FinalRenderDialog->DeviceIndex(), + return CreateRenderer((useOpenCL && !v.empty()) ? OPENCL_RENDERER : CPU_RENDERER, + v, false);//Not shared. } @@ -102,9 +118,9 @@ FinalRenderEmberController::FinalRenderEmberController(FractoriumFinalRenderD m_PreviewRun = true; m_FinalPreviewRenderer->Abort(); - QLabel* widget = m_FinalRenderDialog->ui.FinalRenderPreviewLabel; - size_t maxDim = 100; T scalePercentage; + size_t maxDim = 100; + QLabel* widget = m_FinalRenderDialog->ui.FinalRenderPreviewLabel; //Determine how to scale the scaled ember to fit in the label with a max of 100x100. 
if (m_Ember->m_FinalRasW >= m_Ember->m_FinalRasH) @@ -125,7 +141,7 @@ FinalRenderEmberController::FinalRenderEmberController(FractoriumFinalRenderD m_FinalPreviewRenderer->SetEmber(m_PreviewEmber); m_FinalPreviewRenderer->PrepFinalAccumVector(m_PreviewFinalImage);//Must manually call this first because it could be erroneously made smaller due to strips if called inside Renderer::Run(). - uint strips = VerifyStrips(m_PreviewEmber.m_FinalRasH, m_FinalRenderDialog->Strips(), + auto strips = VerifyStrips(m_PreviewEmber.m_FinalRasH, m_FinalRenderDialog->Strips(), [&](const string& s) { }, [&](const string& s) { }, [&](const string& s) { }); StripsRender(m_FinalPreviewRenderer.get(), m_PreviewEmber, m_PreviewFinalImage, 0, strips, m_FinalRenderDialog->YAxisUp(), @@ -152,11 +168,10 @@ FinalRenderEmberController::FinalRenderEmberController(FractoriumFinalRenderD m_Run = true; m_TotalTimer.Tic();//Begin timing for progress of all operations. m_GuiState = m_FinalRenderDialog->State();//Cache render settings from the GUI before running. - m_FinalImageIndex = 0; size_t i; bool doAll = m_GuiState.m_DoAll && m_EmberFile.Size() > 1; - uint currentStripForProgress = 0;//Sort of a hack to get the strip value to the progress function. + size_t currentStripForProgress = 0;//Sort of a hack to get the strip value to the progress function. QString path = doAll ? 
ComposePath(QString::fromStdString(m_EmberFile.m_Embers[0].m_Name)) : ComposePath(Name()); QString backup = path + "_backup.flame"; @@ -166,32 +181,23 @@ FinalRenderEmberController::FinalRenderEmberController(FractoriumFinalRenderD else m_XmlWriter.Save(backup.toStdString().c_str(), *m_Ember, 0, true, false, true); - m_FinishedImageCount = 0; - m_Renderer->EarlyClip(m_GuiState.m_EarlyClip); - m_Renderer->YAxisUp(m_GuiState.m_YAxisUp); - m_Renderer->ThreadCount(m_GuiState.m_ThreadCount); - m_Renderer->Priority((eThreadPriority)m_GuiState.m_ThreadPriority); - m_Renderer->Transparency(m_GuiState.m_Transparency); - m_Renderer->m_ProgressParameter = reinterpret_cast(&currentStripForProgress); - - if (path.endsWith(".png", Qt::CaseInsensitive) || m_Renderer->RendererType() == OPENCL_RENDERER) - m_Renderer->NumChannels(4); - else - m_Renderer->NumChannels(3); + m_FinishedImageCount.store(0); + SyncGuiToRenderer(); + FirstOrDefaultRenderer()->m_ProgressParameter = reinterpret_cast(&currentStripForProgress);//When animating, only the first (primary) device has a progress parameter. m_GuiState.m_Strips = VerifyStrips(m_Ember->m_FinalRasH, m_GuiState.m_Strips, [&](const string& s) { Output(QString::fromStdString(s)); },//Greater than height. [&](const string& s) { Output(QString::fromStdString(s)); },//Mod height != 0. [&](const string& s) { Output(QString::fromStdString(s) + "\n"); });//Final strips value to be set. + ResetProgress(); //The rendering process is different between doing a single image, and doing multiple. if (doAll) { m_ImageCount = m_EmberFile.Size(); - ResetProgress(); //Different action required for rendering as animation or not. 
- if (m_GuiState.m_DoSequence) + if (m_GuiState.m_DoSequence && !m_Renderers.empty()) { Ember* firstEmber = &m_EmberFile.m_Embers[0]; @@ -213,58 +219,107 @@ FinalRenderEmberController::FinalRenderEmberController(FractoriumFinalRenderD m_EmberFile.m_Embers[i].m_TemporalSamples = m_GuiState.m_TemporalSamples; } - //Not supporting strips with motion blur. - //Shouldn't be a problem because animations will be at max 4k x 4k which will take about 1.1GB - //even when using double precision, which most cards at the time of this writing already exceed. + std::atomic atomfTime; + vector threadVec; + + //Not supporting strips with animation. + //Shouldn't be a problem because animations will be at max 4k x 2k which will take about 1GB + //even when using double precision, which most cards at the time of this writing already exceed. m_GuiState.m_Strips = 1; - m_Renderer->SetEmber(m_EmberFile.m_Embers);//Copy all embers to the local storage inside the renderer. - uint finalImageIndex = m_FinalImageIndex; + atomfTime.store(0); - //Render each image, cancelling if m_Run ever gets set to false. - for (i = 0; i < m_EmberFile.Size() && m_Run; i++) + std::function iterFunc = [&](size_t index) { - Output("Image " + ToString(m_FinishedImageCount) + ":\n" + ComposePath(QString::fromStdString(m_EmberFile.m_Embers[i].m_Name))); - m_Renderer->Reset();//Have to manually set this since the ember is not set each time through. - m_RenderTimer.Tic();//Toc() is called in RenderComplete(). + size_t ftime; + size_t finalImageIndex = 0; + std::thread writeThread; + vector finalImages[2]; + EmberStats stats; + EmberImageComments comments; + Timing renderTimer; + auto renderer = m_Renderers[index].get(); + renderer->SetEmber(m_EmberFile.m_Embers);//Copy all embers to the local storage inside the renderer. - //Can't use strips render here. Run() must be called directly for animation. 
- if (m_Renderer->Run(m_FinalImage[finalImageIndex], i) != RENDER_OK) + //Render each image, cancelling if m_Run ever gets set to false. + while (atomfTime.fetch_add(1), ((ftime = atomfTime.load() - 1) < m_EmberFile.Size()) && m_Run)//Needed to set 1 to claim this iter from other threads, so decrement it to be zero-indexed here. { - Output("Rendering failed.\n"); - m_Fractorium->ErrorReportToQTextEdit(m_Renderer->ErrorReport(), m_FinalRenderDialog->ui.FinalRenderTextOutput, false);//Internally calls invoke. - } - else - { - if (m_WriteThread.joinable()) - m_WriteThread.join(); + T localTime = T(ftime); - SetProgressComplete(100); - m_Stats = m_Renderer->Stats(); - m_FinalImageIndex = finalImageIndex;//Will be used inside of RenderComplete(). Set here when no threads are running. - //RenderComplete(m_EmberFile.m_Embers[i]);//Non-threaded version for testing. - m_WriteThread = std::thread([&] { RenderComplete(m_EmberFile.m_Embers[i]); }); + Output("Image " + ToString(ftime + 1ULL) + ":\n" + ComposePath(QString::fromStdString(m_EmberFile.m_Embers[ftime].m_Name))); + renderer->Reset();//Have to manually set this since the ember is not set each time through. + renderTimer.Tic();//Toc() is called in RenderComplete(). + + //Can't use strips render here. Run() must be called directly for animation. + if (renderer->Run(finalImages[finalImageIndex], localTime) != RENDER_OK) + { + Output("Rendering failed.\n"); + m_Fractorium->ErrorReportToQTextEdit(renderer->ErrorReport(), m_FinalRenderDialog->ui.FinalRenderTextOutput, false);//Internally calls invoke. + atomfTime.store(m_EmberFile.Size() + 1);//Abort all threads if any of them encounter an error. 
+ break; + } + else + { + if (writeThread.joinable()) + writeThread.join(); + + stats = renderer->Stats(); + comments = renderer->ImageComments(stats, 0, false, true); + + writeThread = std::thread([&](size_t tempTime, size_t threadFinalImageIndex) + { + SaveCurrentRender(m_EmberFile.m_Embers[tempTime], + comments,//These all don't change during the renders, so it's ok to access them in the thread. + finalImages[threadFinalImageIndex], + renderer->FinalRasW(), + renderer->FinalRasH(), + renderer->NumChannels(), + renderer->BytesPerChannel()); + }, ftime, finalImageIndex); + + m_FinishedImageCount.fetch_add(1); + RenderComplete(m_EmberFile.m_Embers[ftime], stats, renderTimer); + + if (!index)//Only first device has a progress callback, so it also makes sense to only manually set the progress on the first device as well. + HandleFinishedProgress(); + } + + finalImageIndex ^= 1;//Toggle the index. } - finalImageIndex ^= 1;//Toggle the index. + if (writeThread.joinable())//One final check to make sure all writing is done before exiting this thread. + writeThread.join(); + }; + + threadVec.reserve(m_Renderers.size()); + + for (size_t r = 0; r < m_Renderers.size(); r++) + { + threadVec.push_back(std::thread([&](size_t index) + { + iterFunc(index); + }, r)); } - if (m_WriteThread.joinable()) - m_WriteThread.join(); + for (auto& th : threadVec) + if (th.joinable()) + th.join(); + + HandleFinishedProgress();//One final check that all images were finished. } - else//Render all images, but not as an animation sequence (without temporal samples motion blur). + else if (m_Renderer.get())//Make sure a renderer was created and render all images, but not as an animation sequence (without temporal samples motion blur). { //Render each image, cancelling if m_Run ever gets set to false. 
for (i = 0; i < m_EmberFile.Size() && m_Run; i++) { - Output("Image " + ToString(m_FinishedImageCount) + ":\n" + ComposePath(QString::fromStdString(m_EmberFile.m_Embers[i].m_Name))); + Output("Image " + ToString(m_FinishedImageCount.load() + 1) + ":\n" + ComposePath(QString::fromStdString(m_EmberFile.m_Embers[i].m_Name))); m_EmberFile.m_Embers[i].m_TemporalSamples = 1;//No temporal sampling. m_Renderer->SetEmber(m_EmberFile.m_Embers[i]); - m_Renderer->PrepFinalAccumVector(m_FinalImage[m_FinalImageIndex]);//Must manually call this first because it could be erroneously made smaller due to strips if called inside Renderer::Run(). + m_Renderer->PrepFinalAccumVector(m_FinalImage);//Must manually call this first because it could be erroneously made smaller due to strips if called inside Renderer::Run(). m_Stats.Clear(); - Memset(m_FinalImage[m_FinalImageIndex]); + Memset(m_FinalImage); m_RenderTimer.Tic();//Toc() is called in RenderComplete(). - StripsRender(m_Renderer.get(), m_EmberFile.m_Embers[i], m_FinalImage[m_FinalImageIndex], 0, m_GuiState.m_Strips, m_GuiState.m_YAxisUp, + StripsRender(m_Renderer.get(), m_EmberFile.m_Embers[i], m_FinalImage, 0, m_GuiState.m_Strips, m_GuiState.m_YAxisUp, [&](size_t strip) { currentStripForProgress = strip; },//Pre strip. [&](size_t strip) { m_Stats += m_Renderer->Stats(); },//Post strip. [&](size_t strip)//Error. @@ -272,23 +327,32 @@ FinalRenderEmberController::FinalRenderEmberController(FractoriumFinalRenderD Output("Rendering failed.\n"); m_Fractorium->ErrorReportToQTextEdit(m_Renderer->ErrorReport(), m_FinalRenderDialog->ui.FinalRenderTextOutput, false);//Internally calls invoke. }, - [&](Ember& finalEmber) { RenderComplete(finalEmber); });//Final strip. + [&](Ember& finalEmber) + { + m_FinishedImageCount.fetch_add(1); + SaveCurrentRender(finalEmber); + RenderComplete(finalEmber); + HandleFinishedProgress(); + });//Final strip. } } + else + { + Output("No renderer present, aborting."); + } } - else//Render a single image. 
+ else if (m_Renderer.get())//Render a single image. { m_ImageCount = 1; - ResetProgress(); m_Ember->m_TemporalSamples = 1; m_Renderer->SetEmber(*m_Ember); - m_Renderer->PrepFinalAccumVector(m_FinalImage[m_FinalImageIndex]);//Must manually call this first because it could be erroneously made smaller due to strips if called inside Renderer::Run(). + m_Renderer->PrepFinalAccumVector(m_FinalImage);//Must manually call this first because it could be erroneously made smaller due to strips if called inside Renderer::Run(). m_Stats.Clear(); - Memset(m_FinalImage[m_FinalImageIndex]); + Memset(m_FinalImage); Output(ComposePath(QString::fromStdString(m_Ember->m_Name))); m_RenderTimer.Tic();//Toc() is called in RenderComplete(). - StripsRender(m_Renderer.get(), *m_Ember, m_FinalImage[m_FinalImageIndex], 0, m_GuiState.m_Strips, m_GuiState.m_YAxisUp, + StripsRender(m_Renderer.get(), *m_Ember, m_FinalImage, 0, m_GuiState.m_Strips, m_GuiState.m_YAxisUp, [&](size_t strip) { currentStripForProgress = strip; },//Pre strip. [&](size_t strip) { m_Stats += m_Renderer->Stats(); },//Post strip. [&](size_t strip)//Error. @@ -296,10 +360,19 @@ FinalRenderEmberController::FinalRenderEmberController(FractoriumFinalRenderD Output("Rendering failed.\n"); m_Fractorium->ErrorReportToQTextEdit(m_Renderer->ErrorReport(), m_FinalRenderDialog->ui.FinalRenderTextOutput, false);//Internally calls invoke. }, - [&](Ember& finalEmber) { RenderComplete(finalEmber); });//Final strip. + [&](Ember& finalEmber) + { + m_FinishedImageCount.fetch_add(1); + SaveCurrentRender(finalEmber); + RenderComplete(finalEmber); + HandleFinishedProgress(); + });//Final strip. 
+ } + else + { + Output("No renderer present, aborting."); } - m_FinalImageIndex = 0; QString totalTimeString = "All renders completed in: " + QString::fromStdString(m_TotalTimer.Format(m_TotalTimer.Toc())) + "."; Output(totalTimeString); @@ -403,35 +476,43 @@ bool FinalRenderEmberController::Render() /// Stop rendering and initialize a new renderer, using the specified type and the options on the final render dialog. /// /// The type of render to create -/// The index platform of the platform to use -/// The index device of the device to use -/// The texture ID of the shared OpenGL texture if shared -/// True if shared with OpenGL, else false. Default: true. +/// The platform,device index pairs of the devices to use +/// True if shared with OpenGL, else false. Always false in this case. /// True if nothing went wrong, else false. template -bool FinalRenderEmberController::CreateRenderer(eRendererType renderType, uint platform, uint device, bool shared) +bool FinalRenderEmberController::CreateRenderer(eRendererType renderType, const vector>& devices, bool shared) { bool ok = true; - uint channels = m_FinalRenderDialog->Ext() == "png" ? 4 : 3; + bool deviceDiff = false; + //uint channels = m_FinalRenderDialog->Ext().endsWith("png", Qt::CaseInsensitive) ? 4 : 3; + bool renderTypeMismatch = (m_Renderer.get() && (m_Renderer->RendererType() != renderType)) || + (!m_Renderers.empty() && (m_Renderers[0]->RendererType() != renderType)); CancelRender(); - if (!m_Renderer.get() || - !m_Renderer->Ok() || - m_Renderer->RendererType() != renderType || - m_Platform != platform || - m_Device != device || - m_Shared != shared) + if ((!m_FinalRenderDialog->DoSequence() && (!m_Renderer.get() || !m_Renderer->Ok())) || + (m_FinalRenderDialog->DoSequence() && m_Renderers.empty()) || + renderTypeMismatch || + !Equal(m_Devices, devices)) { EmberReport emberReport; vector errorReport; - m_Platform = platform;//Store values for re-creation later on. 
- m_Device = device; + m_Devices = devices;//Store values for re-creation later on. m_OutputTexID = 0;//Don't care about tex ID when doing final render. - m_Shared = shared; + m_Shared = shared;//So shared is of course false. + + if (m_FinalRenderDialog->DoSequence()) + { + m_Renderer.reset(); + m_Renderers = ::CreateRenderers(renderType, m_Devices, shared, m_OutputTexID, emberReport); + } + else + { + m_Renderers.clear(); + m_Renderer = unique_ptr(::CreateRenderer(renderType, m_Devices, shared, m_OutputTexID, emberReport)); + } - m_Renderer = unique_ptr(::CreateRenderer(renderType, platform, device, shared, m_OutputTexID, emberReport)); errorReport = emberReport.ErrorReport(); if (!errorReport.empty()) @@ -442,30 +523,13 @@ bool FinalRenderEmberController::CreateRenderer(eRendererType renderType, uin } } - if (m_Renderer.get()) - { - if (m_Renderer->RendererType() == OPENCL_RENDERER) - channels = 4;//Always using 4 since the GL texture is RGBA. - - m_Renderer->Callback(this); - m_Renderer->NumChannels(channels); - m_Renderer->EarlyClip(m_FinalRenderDialog->EarlyClip()); - m_Renderer->YAxisUp(m_FinalRenderDialog->YAxisUp()); - m_Renderer->ThreadCount(m_FinalRenderDialog->ThreadCount()); - m_Renderer->Transparency(m_FinalRenderDialog->Transparency()); - } - else - { - ok = false; - m_Fractorium->ShowCritical("Renderer Creation Error", "Could not create renderer, aborting. See info tab for details."); - } - - return ok; + return SyncGuiToRenderer() && ok; } /// /// Progress function. /// Take special action to sync options upon finishing. +/// Note this is only called on the primary renderer. 
/// /// The ember currently being rendered /// An extra dummy parameter @@ -477,7 +541,7 @@ template int FinalRenderEmberController::ProgressFunc(Ember& ember, void* foo, double fraction, int stage, double etaMs) { static int count = 0; - uint strip = *(reinterpret_cast(m_Renderer->m_ProgressParameter)); + size_t strip = *(reinterpret_cast(FirstOrDefaultRenderer()->m_ProgressParameter)); double fracPerStrip = ceil(100.0 / m_GuiState.m_Strips); double stripsfrac = ceil(fracPerStrip * strip) + ceil(fraction / m_GuiState.m_Strips); int intFract = int(stripsfrac); @@ -489,7 +553,7 @@ int FinalRenderEmberController::ProgressFunc(Ember& ember, void* foo, doub else if (stage == 2) QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderAccumProgress, "setValue", Qt::QueuedConnection, Q_ARG(int, intFract)); - QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderImageCountLabel, "setText", Qt::QueuedConnection, Q_ARG(const QString&, ToString(m_FinishedImageCount) + " / " + ToString(m_ImageCount) + " Eta: " + QString::fromStdString(m_RenderTimer.Format(etaMs)))); + QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderImageCountLabel, "setText", Qt::QueuedConnection, Q_ARG(const QString&, ToString(m_FinishedImageCount.load() + 1) + " / " + ToString(m_ImageCount) + " Eta: " + QString::fromStdString(m_RenderTimer.Format(etaMs)))); QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderTextOutput, "update", Qt::QueuedConnection); return m_Run ? 1 : 0; @@ -533,6 +597,53 @@ void FinalRenderEmberController::SyncGuiToEmbers(size_t widthOverride, size_t } } +/// +/// Copy GUI values to the renderers. +/// +template +bool FinalRenderEmberController::SyncGuiToRenderer() +{ + bool ok = true; + uint channels = m_FinalRenderDialog->Ext().endsWith("png", Qt::CaseInsensitive) ? 4 : 3; + + if (m_Renderer.get()) + { + if (m_Renderer->RendererType() == OPENCL_RENDERER) + channels = 4;//Always using 4 since the GL texture is RGBA. 
+ + m_Renderer->Callback(this); + m_Renderer->NumChannels(channels); + m_Renderer->EarlyClip(m_FinalRenderDialog->EarlyClip()); + m_Renderer->YAxisUp(m_FinalRenderDialog->YAxisUp()); + m_Renderer->ThreadCount(m_FinalRenderDialog->ThreadCount()); + m_Renderer->Priority((eThreadPriority)m_FinalRenderDialog->ThreadPriority()); + m_Renderer->Transparency(m_FinalRenderDialog->Transparency()); + } + else if (!m_Renderers.empty()) + { + for (size_t i = 0; i < m_Renderers.size(); i++) + { + if (m_Renderers[i]->RendererType() == OPENCL_RENDERER) + channels = 4;//Always using 4 since the GL texture is RGBA. + + m_Renderers[i]->Callback(!i ? this : nullptr); + m_Renderers[i]->NumChannels(channels); + m_Renderers[i]->EarlyClip(m_FinalRenderDialog->EarlyClip()); + m_Renderers[i]->YAxisUp(m_FinalRenderDialog->YAxisUp()); + m_Renderers[i]->ThreadCount(m_FinalRenderDialog->ThreadCount()); + m_Renderers[i]->Priority((eThreadPriority)m_FinalRenderDialog->ThreadPriority()); + m_Renderers[i]->Transparency(m_FinalRenderDialog->Transparency()); + } + } + else + { + ok = false; + m_Fractorium->ShowCritical("Renderer Creation Error", "No renderer present, aborting. See info tab for details."); + } + + return ok; +} + /// /// Set values for scale spinners based on the ratio of the original dimensions to the current dimensions /// of the current ember. Also update the size suffix text. 
@@ -564,7 +675,7 @@ void FinalRenderEmberController::ResetProgress(bool total) { if (total) { - QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderImageCountLabel, "setText", Qt::QueuedConnection, Q_ARG(const QString&, "0 / " + ToString(m_ImageCount))); + QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderImageCountLabel, "setText", Qt::QueuedConnection, Q_ARG(const QString&, "0 / " + ToString(m_ImageCount))); QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderTotalProgress, "setValue", Qt::QueuedConnection, Q_ARG(int, 0)); } @@ -574,7 +685,7 @@ void FinalRenderEmberController::ResetProgress(bool total) } /// -/// Set various parameters in the renderer and current ember with the values +/// Set various parameters in the renderers and current ember with the values /// specified in the widgets and compute the amount of memory required to render. /// This includes the memory needed for the final output image. /// @@ -584,16 +695,17 @@ tuple FinalRenderEmberController::SyncAndComputeMemor { size_t iterCount; pair p(0, 0); - + size_t strips; + bool b = false; + uint channels = m_FinalRenderDialog->Ext() == "png" ? 4 : 3;//4 channels for Png, else 3. + + SyncGuiToEmbers(); + if (m_Renderer.get()) { - bool b = false; - uint channels = m_FinalRenderDialog->Ext() == "png" ? 4 : 3;//4 channels for Png, else 3. 
- size_t strips = VerifyStrips(m_Ember->m_FinalRasH, m_FinalRenderDialog->Strips(), - [&](const string& s) { }, [&](const string& s) { }, [&](const string& s) { }); + strips = VerifyStrips(m_Ember->m_FinalRasH, m_FinalRenderDialog->Strips(), + [&](const string& s) {}, [&](const string& s) {}, [&](const string& s) {}); - SyncGuiToEmbers(); - m_FinalRenderDialog->m_StripsSpin->setSuffix(" (" + ToString(strips) + ")"); m_Renderer->SetEmber(*m_Ember); m_Renderer->CreateSpatialFilter(b); m_Renderer->CreateTemporalFilter(b); @@ -607,6 +719,28 @@ tuple FinalRenderEmberController::SyncAndComputeMemor p = m_Renderer->MemoryRequired(strips, true, m_FinalRenderDialog->DoSequence()); iterCount = m_Renderer->TotalIterCount(strips); } + else if (!m_Renderers.empty()) + { + for (auto& renderer : m_Renderers) + { + renderer->SetEmber(*m_Ember); + renderer->CreateSpatialFilter(b); + renderer->CreateTemporalFilter(b); + renderer->NumChannels(channels); + renderer->ComputeBounds(); + renderer->ComputeQuality(); + renderer->ComputeCamera(); + } + + CancelPreviewRender(); + m_FinalPreviewRenderFunc(); + + strips = 1; + p = m_Renderers[0]->MemoryRequired(1, true, m_FinalRenderDialog->DoSequence()); + iterCount = m_Renderers[0]->TotalIterCount(strips); + } + + m_FinalRenderDialog->m_StripsSpin->setSuffix(" (" + ToString(strips) +")"); return tuple(p.first, p.second, iterCount); } @@ -631,6 +765,25 @@ QString FinalRenderEmberController::ComposePath(const QString& name) /// Non-virtual functions declared in FinalRenderEmberController. /// +/// +/// Return either m_Renderer in the case of running a CPU renderer, else +/// m_Renderers[0] in the case of running OpenCL. 
+/// +/// The primary renderer +template +EmberNs::Renderer* FinalRenderEmberController::FirstOrDefaultRenderer() +{ + if (m_Renderer.get()) + return dynamic_cast*>(m_Renderer.get()); + else if (!m_Renderers.empty()) + return dynamic_cast*>(m_Renderers[0].get()); + else + { + throw "No final renderer, exiting."; + return nullptr; + } +} + /// /// Stop the preview renderer. /// This is meant to only be called programatically and never by the user. @@ -645,40 +798,80 @@ void FinalRenderEmberController::CancelPreviewRender() while (m_FinalPreviewResult.isRunning()) { QApplication::processEvents(); } } +/// +/// Save the output of the render. +/// +/// The ember whose rendered output will be saved +template +void FinalRenderEmberController::SaveCurrentRender(Ember& ember) +{ + auto comments = m_Renderer->ImageComments(m_Stats, 0, false, true); + SaveCurrentRender(ember, comments, m_FinalImage, m_Renderer->FinalRasW(), m_Renderer->FinalRasH(), m_Renderer->NumChannels(), m_Renderer->BytesPerChannel()); +} + +/// +/// Save the output of the render. +/// +/// The ember whose rendered output will be saved +/// The comments to save in the png or jpg +/// The buffer containing the pixels +/// The width in pixels of the image +/// The height in pixels of the image +/// The number of channels, 3 or 4. +/// The bytes per channel, almost always 1. +template +void FinalRenderEmberController::SaveCurrentRender(Ember& ember, const EmberImageComments& comments, vector& pixels, size_t width, size_t height, size_t channels, size_t bpc) +{ + QString filename = ComposePath(QString::fromStdString(ember.m_Name)); + FractoriumEmberControllerBase::SaveCurrentRender(filename, comments, pixels, width, height, channels, bpc); +} + /// /// Action to take when rendering an image completes. +/// Thin wrapper around the function of the same name that takes more arguments. +/// Just passes m_Renderer and m_FinalImage. 
/// /// The ember currently being rendered template void FinalRenderEmberController::RenderComplete(Ember& ember) { - string renderTimeString = m_RenderTimer.Format(m_RenderTimer.Toc()), totalTimeString; + if (auto renderer = dynamic_cast*>(m_Renderer.get())) + RenderComplete(ember, m_Stats, m_RenderTimer); +} + +/// +/// Handle setting the appropriate progress bar values when an image render has finished. +/// This handles single image, animations, and strips. +/// +template +void FinalRenderEmberController::HandleFinishedProgress() +{ + auto finishedCountCached = m_FinishedImageCount.load();//Make sure to use the same value throughout this function even if the atomic is changing. + + if (m_FinishedImageCount.load() != m_ImageCount) + ResetProgress(false); + else + SetProgressComplete(100);//Just to be safe. + + QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderTotalProgress, "setValue", Qt::QueuedConnection, Q_ARG(int, int((float(finishedCountCached) / float(m_ImageCount)) * 100))); + QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderImageCountLabel, "setText", Qt::QueuedConnection, Q_ARG(const QString&, ToString(finishedCountCached) + " / " + ToString(m_ImageCount))); +} + +/// +/// Action to take when rendering an image completes. +/// +/// The ember currently being rendered +/// The renderer stats +/// The timer which was started at the beginning of the render +template +void FinalRenderEmberController::RenderComplete(Ember& ember, const EmberStats& stats, Timing& renderTimer) +{ + m_ProgressCs.Enter(); + string renderTimeString = renderTimer.Format(renderTimer.Toc()), totalTimeString; QString status, filename = ComposePath(QString::fromStdString(ember.m_Name)); - QString itersString = ToString(m_Stats.m_Iters); - QString itersPerSecString = ToString(size_t(m_Stats.m_Iters / (m_Stats.m_IterMs / 1000.0))); - - //Save whatever options were specified on the GUI to the settings. 
- m_Settings->FinalEarlyClip(m_GuiState.m_EarlyClip); - m_Settings->FinalYAxisUp(m_GuiState.m_YAxisUp); - m_Settings->FinalTransparency(m_GuiState.m_Transparency); - m_Settings->FinalOpenCL(m_GuiState.m_OpenCL); - m_Settings->FinalDouble(m_GuiState.m_Double); - m_Settings->FinalPlatformIndex(m_GuiState.m_PlatformIndex); - m_Settings->FinalDeviceIndex(m_GuiState.m_DeviceIndex); - m_Settings->FinalSaveXml(m_GuiState.m_SaveXml); - m_Settings->FinalDoAll(m_GuiState.m_DoAll); - m_Settings->FinalDoSequence(m_GuiState.m_DoSequence); - m_Settings->FinalKeepAspect(m_GuiState.m_KeepAspect); - m_Settings->FinalScale(m_GuiState.m_Scale); - m_Settings->FinalExt(m_GuiState.m_Ext); - m_Settings->FinalThreadCount(m_GuiState.m_ThreadCount); - m_Settings->FinalThreadPriority(m_GuiState.m_ThreadPriority); - m_Settings->FinalQuality(m_GuiState.m_Quality); - m_Settings->FinalTemporalSamples(m_GuiState.m_TemporalSamples); - m_Settings->FinalSupersample(m_GuiState.m_Supersample); - m_Settings->FinalStrips(m_GuiState.m_Strips); - SaveCurrentRender(filename, false);//Don't pull from the card, the rendering process already did it. - + QString itersString = ToString(stats.m_Iters); + QString itersPerSecString = ToString(size_t(stats.m_Iters / (stats.m_IterMs / 1000.0))); + if (m_GuiState.m_SaveXml) { QFileInfo xmlFileInfo(filename);//Create another one in case it was modified for batch rendering. @@ -692,31 +885,38 @@ void FinalRenderEmberController::RenderComplete(Ember& ember) xmlFreeDoc(tempEdit); } - m_FinishedImageCount++; - - //In a thread if animating, so don't set to complete because it'll be out of sync with the rest of the progress bars. - if (!m_GuiState.m_DoSequence) - { - SetProgressComplete(100);//Just to be safe. 
- } - - QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderTotalProgress, "setValue", Qt::QueuedConnection, Q_ARG(int, int((float(m_FinishedImageCount) / float(m_ImageCount)) * 100))); - QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderImageCountLabel, "setText", Qt::QueuedConnection, Q_ARG(const QString&, ToString(m_FinishedImageCount) + " / " + ToString(m_ImageCount))); - status = "Pure render time: " + QString::fromStdString(renderTimeString); Output(status); - totalTimeString = m_RenderTimer.Format(m_RenderTimer.Toc()); + totalTimeString = renderTimer.Format(renderTimer.Toc()); status = "Total time: " + QString::fromStdString(totalTimeString) + "\nTotal iters: " + itersString + "\nIters/second: " + itersPerSecString + "\n"; Output(status); QMetaObject::invokeMethod(m_FinalRenderDialog, "MoveCursorToEnd", Qt::QueuedConnection); - if (m_FinishedImageCount != m_ImageCount) + if (m_FinishedImageCount.load() == m_ImageCount)//Finished, save whatever options were specified on the GUI to the settings. 
{ - ResetProgress(false); + m_Settings->FinalEarlyClip(m_GuiState.m_EarlyClip); + m_Settings->FinalYAxisUp(m_GuiState.m_YAxisUp); + m_Settings->FinalTransparency(m_GuiState.m_Transparency); + m_Settings->FinalOpenCL(m_GuiState.m_OpenCL); + m_Settings->FinalDouble(m_GuiState.m_Double); + m_Settings->FinalDevices(m_GuiState.m_Devices); + m_Settings->FinalSaveXml(m_GuiState.m_SaveXml); + m_Settings->FinalDoAll(m_GuiState.m_DoAll); + m_Settings->FinalDoSequence(m_GuiState.m_DoSequence); + m_Settings->FinalKeepAspect(m_GuiState.m_KeepAspect); + m_Settings->FinalScale(m_GuiState.m_Scale); + m_Settings->FinalExt(m_GuiState.m_Ext); + m_Settings->FinalThreadCount(m_GuiState.m_ThreadCount); + m_Settings->FinalThreadPriority(m_GuiState.m_ThreadPriority); + m_Settings->FinalQuality(m_GuiState.m_Quality); + m_Settings->FinalTemporalSamples(m_GuiState.m_TemporalSamples); + m_Settings->FinalSupersample(m_GuiState.m_Supersample); + m_Settings->FinalStrips(m_GuiState.m_Strips); } QMetaObject::invokeMethod(m_FinalRenderDialog->ui.FinalRenderTextOutput, "update", Qt::QueuedConnection); + m_ProgressCs.Leave(); } /// diff --git a/Source/Fractorium/FinalRenderEmberController.h b/Source/Fractorium/FinalRenderEmberController.h index 41233a1..7cacf24 100644 --- a/Source/Fractorium/FinalRenderEmberController.h +++ b/Source/Fractorium/FinalRenderEmberController.h @@ -35,8 +35,7 @@ struct FinalRenderGuiState QString m_Ext; QString m_Prefix; QString m_Suffix; - uint m_PlatformIndex; - uint m_DeviceIndex; + QList m_Devices; uint m_ThreadCount; int m_ThreadPriority; double m_WidthScale; @@ -68,16 +67,16 @@ public: virtual tuple SyncAndComputeMemory() { return tuple(0, 0, 0); } virtual double OriginalAspect() { return 1; } virtual QString ComposePath(const QString& name) { return ""; } + virtual void CancelRender() { } - void CancelRender(); bool CreateRendererFromGUI(); void Output(const QString& s); protected: bool m_Run; bool m_PreviewRun; - uint m_ImageCount; - uint m_FinishedImageCount; + 
size_t m_ImageCount; + std::atomic m_FinishedImageCount; QFuture m_Result; QFuture m_FinalPreviewResult; @@ -87,15 +86,14 @@ protected: FractoriumSettings* m_Settings; FractoriumFinalRenderDialog* m_FinalRenderDialog; FinalRenderGuiState m_GuiState; - OpenCLWrapper m_Wrapper; - CriticalSection m_PreviewCs; + CriticalSection m_PreviewCs, m_ProgressCs; Timing m_RenderTimer; Timing m_TotalTimer; }; /// /// Templated derived class which implements all interaction functionality between the embers -/// of a specific template type and the final render dialog; +/// of a specific template type and the final render dialog. /// template class FinalRenderEmberController : public FinalRenderEmberControllerBase @@ -113,7 +111,7 @@ public: #endif virtual void SetEmber(size_t index) override; virtual bool Render() override; - virtual bool CreateRenderer(eRendererType renderType, uint platform, uint device, bool shared = true) override; + virtual bool CreateRenderer(eRendererType renderType, const vector>& devices, bool shared = true) override; virtual int ProgressFunc(Ember& ember, void* foo, double fraction, int stage, double etaMs) override; virtual size_t Index() const override { return m_Ember->m_Index; } virtual uint SizeOfT() const override { return sizeof(T); } @@ -127,11 +125,20 @@ public: virtual double OriginalAspect() override { return double(m_Ember->m_OrigFinalRasW) / m_Ember->m_OrigFinalRasH; } virtual QString Name() const override { return QString::fromStdString(m_Ember->m_Name); } virtual QString ComposePath(const QString& name) override; - + virtual void CancelRender() override; + + //Non Virtual functions. 
+ EmberNs::Renderer* FirstOrDefaultRenderer(); + protected: void CancelPreviewRender(); + void HandleFinishedProgress(); + void SaveCurrentRender(Ember& ember); + void SaveCurrentRender(Ember& ember, const EmberImageComments& comments, vector& pixels, size_t width, size_t height, size_t channels, size_t bpc); void RenderComplete(Ember& ember); + void RenderComplete(Ember& ember, const EmberStats& stats, Timing& renderTimer); void SyncGuiToEmber(Ember& ember, size_t widthOverride = 0, size_t heightOverride = 0); + bool SyncGuiToRenderer(); void SetProgressComplete(int val); Ember* m_Ember; @@ -139,5 +146,6 @@ protected: EmberFile m_EmberFile; EmberToXml m_XmlWriter; unique_ptr> m_FinalPreviewRenderer; + vector>> m_Renderers; }; diff --git a/Source/Fractorium/Fractorium.cpp b/Source/Fractorium/Fractorium.cpp index 8c8ba93..8cad6df 100644 --- a/Source/Fractorium/Fractorium.cpp +++ b/Source/Fractorium/Fractorium.cpp @@ -14,7 +14,8 @@ /// /// The parent widget of this item Fractorium::Fractorium(QWidget* p) - : QMainWindow(p) + : QMainWindow(p), + m_Info(OpenCLInfo::Instance()) { int spinHeight = 20, iconSize_ = 9; size_t i = 0; @@ -58,9 +59,6 @@ Fractorium::Fractorium(QWidget* p) const QRect screen = QApplication::desktop()->screenGeometry(); m_AboutDialog->move(screen.center() - m_AboutDialog->rect().center()); - //The options dialog should be a fixed size without a size grip, however even if it's here, it still shows up. Perhaps Qt will fix it some day. 
- m_OptionsDialog->layout()->setSizeConstraint(QLayout::SetFixedSize); - m_OptionsDialog->setSizeGripEnabled(false); connect(m_ColorDialog, SIGNAL(colorSelected(const QColor&)), this, SLOT(OnColorSelected(const QColor&)), Qt::QueuedConnection); m_XformComboColors[i++] = QColor(0XFF, 0X00, 0X00); @@ -119,8 +117,8 @@ Fractorium::Fractorium(QWidget* p) m_Controller->SetupVariationTree(); m_Controller->FilteredVariations(); - if (m_Wrapper.CheckOpenCL() && m_Settings->OpenCL() && m_QualitySpin->value() < 30) - m_QualitySpin->setValue(30); + if (m_Info.Ok() && m_Settings->OpenCL() && m_QualitySpin->value() < (30 * m_Settings->Devices().size())) + m_QualitySpin->setValue(30 * m_Settings->Devices().size()); int statusBarHeight = 20 * devicePixelRatio(); ui.statusBar->setMinimumHeight(statusBarHeight); diff --git a/Source/Fractorium/Fractorium.h b/Source/Fractorium/Fractorium.h index 6de433b..d94d379 100644 --- a/Source/Fractorium/Fractorium.h +++ b/Source/Fractorium/Fractorium.h @@ -369,7 +369,7 @@ private: void ShutdownAndRecreateFromOptions(); bool CreateRendererFromOptions(); bool CreateControllerFromOptions(); - + //Dialogs. QStringList SetupOpenXmlDialog(); QString SetupSaveXmlDialog(const QString& defaultFilename); @@ -495,7 +495,7 @@ private: int m_VarSortMode; int m_PaletteSortMode; int m_PreviousPaletteRow; - OpenCLWrapper m_Wrapper; + OpenCLInfo& m_Info; unique_ptr m_Controller; Ui::FractoriumClass ui; }; diff --git a/Source/Fractorium/FractoriumCommon.h b/Source/Fractorium/FractoriumCommon.h index 0e318ce..fbe7516 100644 --- a/Source/Fractorium/FractoriumCommon.h +++ b/Source/Fractorium/FractoriumCommon.h @@ -185,4 +185,175 @@ static intmax_t IsXformLinked(Ember& ember, Xform* xform) } return linked; -} \ No newline at end of file +} + +/// +/// Convert the passed in QList of absolute device indices to a vector> of platform,device +/// index pairs. 
+/// +/// The absolute device indices +/// The converted device vector of platform,device index pairs +static vector> Devices(const QList& selectedDevices) +{ + vector> vec; + OpenCLInfo& info = OpenCLInfo::Instance(); + auto& devices = info.DeviceIndices(); + + vec.reserve(selectedDevices.size()); + + for (size_t i = 0; i < selectedDevices.size(); i++) + { + auto index = selectedDevices[i].toUInt(); + + if (index < devices.size()) + vec.push_back(devices[index]); + } + + return vec; +} + +/// +/// Setup a table showing all available OpenCL devices on the system. +/// Create checkboxes and radio buttons which allow the user to specify +/// which devices to use, and which one to make the primary device. +/// Used in the options dialog and the final render dialog. +/// +/// The QTableWidget to setup +/// The absolute indices of the devices to use, with the first being the primary. +static void SetupDeviceTable(QTableWidget* table, const QList& settingsDevices) +{ + bool primary = false; + auto& deviceNames = OpenCLInfo::Instance().AllDeviceNames(); + + table->clearContents(); + table->setRowCount(deviceNames.size()); + + for (size_t i = 0; i < deviceNames.size(); i++) + { + auto checkItem = new QTableWidgetItem(); + auto radio = new QRadioButton(); + auto deviceItem = new QTableWidgetItem(QString::fromStdString(deviceNames[i])); + + table->setItem(i, 0, checkItem); + table->setCellWidget(i, 1, radio); + table->setItem(i, 2, deviceItem); + + if (settingsDevices.contains(QVariant::fromValue(i))) + { + checkItem->setCheckState(Qt::Checked); + + if (!primary) + { + radio->setChecked(true); + primary = true; + } + } + else + checkItem->setCheckState(Qt::Unchecked); + } + + if (!primary && table->rowCount() > 0)//Primary was never set, so just default to the first device and hope it was the one detected as the main display. 
+ { + table->item(0, 0)->setCheckState(Qt::Checked); + qobject_cast(table->cellWidget(0, 1))->setChecked(true); + } +} + +/// +/// Copy the passed in selected absolute device indices to the controls on the passed in table. +/// Used in the options dialog and the final render dialog. +/// +/// The QTableWidget to copy values to +/// The absolute indices of the devices to use, with the first being the primary. +static void SettingsToDeviceTable(QTableWidget* table, QList& settingsDevices) +{ + if (settingsDevices.empty() && table->rowCount() > 0) + { + table->item(0, 0)->setCheckState(Qt::Checked); + qobject_cast(table->cellWidget(0, 1))->setChecked(true); + + for (int row = 1; row < table->rowCount(); row++) + if (auto item = table->item(row, 0)) + item->setCheckState(Qt::Unchecked); + } + else + { + for (int row = 0; row < table->rowCount(); row++) + { + if (auto item = table->item(row, 0)) + { + if (settingsDevices.contains(row)) + { + item->setCheckState(Qt::Checked); + + if (!settingsDevices.indexOf(QVariant::fromValue(row))) + if (auto radio = qobject_cast(table->cellWidget(row, 1))) + radio->setChecked(true); + } + else + { + item->setCheckState(Qt::Unchecked); + } + } + } + } +} + +/// +/// Copy the values of the controls on the passed in table to a list of absolute device indices. +/// Used in the options dialog and the final render dialog. +/// +/// The QTableWidget to copy values from +/// The list of absolute device indices +static QList DeviceTableToSettings(QTableWidget* table) +{ + QList devices; + auto rows = table->rowCount(); + + for (int row = 0; row < rows; row++) + { + auto checkItem = table->item(row, 0); + auto radio = qobject_cast(table->cellWidget(row, 1)); + + if (checkItem->checkState() == Qt::Checked) + { + if (radio && radio->isChecked()) + devices.push_front(row); + else + devices.push_back(row); + } + } + + return devices; +} + +/// +/// Ensure device selection on the passed in table make sense. 
+/// +/// The QTableWidget to setup +/// The row of the cell +/// The column of the cell +static void HandleDeviceTableCheckChanged(QTableWidget* table, int row, int col) +{ + int primaryRow = -1; + QRadioButton* primaryRadio = nullptr; + + for (int i = 0; i < table->rowCount(); i++) + { + if (auto radio = qobject_cast(table->cellWidget(i, 1))) + { + if (radio->isChecked()) + { + primaryRow = i; + primaryRadio = radio; + break; + } + } + } + + if (primaryRow == -1) primaryRow = 0; + + if (auto primaryItem = table->item(primaryRow, 0)) + if (primaryItem->checkState() == Qt::Unchecked) + primaryItem->setCheckState(Qt::Checked); +} diff --git a/Source/Fractorium/FractoriumEmberController.cpp b/Source/Fractorium/FractoriumEmberController.cpp index c3c0aa2..3e4d5b3 100644 --- a/Source/Fractorium/FractoriumEmberController.cpp +++ b/Source/Fractorium/FractoriumEmberController.cpp @@ -11,19 +11,17 @@ /// /// Pointer to the main window. FractoriumEmberControllerBase::FractoriumEmberControllerBase(Fractorium* fractorium) + : m_Info(OpenCLInfo::Instance()) { Timing t; m_Rendering = false; m_Shared = true; - m_Platform = 0; - m_Device = 0; m_FailedRenders = 0; m_UndoIndex = 0; m_RenderType = CPU_RENDERER; m_OutputTexID = 0; m_SubBatchCount = 1;//Will be ovewritten by the options on first render. 
- m_FinalImageIndex = 0; m_Fractorium = fractorium; m_RenderTimer = nullptr; m_RenderRestartTimer = nullptr; diff --git a/Source/Fractorium/FractoriumEmberController.h b/Source/Fractorium/FractoriumEmberController.h index 152f048..167f2f7 100644 --- a/Source/Fractorium/FractoriumEmberController.h +++ b/Source/Fractorium/FractoriumEmberController.h @@ -72,16 +72,16 @@ public: virtual size_t TotalXformCount() const { return 0; } virtual QString Name() const { return ""; } virtual void Name(const string& s) { } - virtual uint FinalRasW() const { return 0; } - virtual void FinalRasW(uint w) { } - virtual uint FinalRasH() const { return 0; } - virtual void FinalRasH(uint h) { } + virtual size_t FinalRasW() const { return 0; } + virtual void FinalRasW(size_t w) { } + virtual size_t FinalRasH() const { return 0; } + virtual void FinalRasH(size_t h) { } virtual size_t Index() const { return 0; } virtual void AddSymmetry(int sym, QTIsaac& rand) { } virtual void CalcNormalizedWeights() { } //Menu. - virtual void NewFlock(uint count) { }//File. + virtual void NewFlock(size_t count) { }//File. virtual void NewEmptyFlameInCurrentFile() { } virtual void NewRandomFlameInCurrentFile() { } virtual void CopyFlameInCurrentFile() { } @@ -209,7 +209,7 @@ public: //Rendering/progress. 
virtual bool Render() { return false; } - virtual bool CreateRenderer(eRendererType renderType, uint platform, uint device, bool shared = true) { return false; } + virtual bool CreateRenderer(eRendererType renderType, const vector>& devices, bool shared = true) { return false; } virtual uint SizeOfT() const { return 0; } virtual void ClearUndo() { } virtual GLEmberControllerBase* GLController() { return nullptr; } @@ -221,10 +221,11 @@ public: void Shutdown(); void UpdateRender(eProcessAction action = FULL_RENDER); void DeleteRenderer(); - void SaveCurrentRender(const QString& filename, bool forcePull); + void SaveCurrentRender(const QString& filename, const EmberImageComments& comments, vector& pixels, size_t width, size_t height, size_t channels, size_t bpc); RendererBase* Renderer() { return m_Renderer.get(); } - vector* FinalImage() { return &(m_FinalImage[m_FinalImageIndex]); } + vector* FinalImage() { return &(m_FinalImage); } vector* PreviewFinalImage() { return &m_PreviewFinalImage; } + EmberStats Stats() { return m_Stats; } protected: //Rendering/progress. 
@@ -236,9 +237,7 @@ protected: bool m_Rendering; bool m_Shared; bool m_LastEditWasUndoRedo; - uint m_FinalImageIndex; - uint m_Platform; - uint m_Device; + vector> m_Devices; uint m_SubBatchCount; uint m_FailedRenders; uint m_UndoIndex; @@ -253,7 +252,7 @@ protected: string m_CurrentPaletteFilePath; CriticalSection m_Cs; std::thread m_WriteThread; - vector m_FinalImage[2]; + vector m_FinalImage; vector m_PreviewFinalImage; vector m_ProcessActions; vector m_FilteredVariations; @@ -262,6 +261,7 @@ protected: Fractorium* m_Fractorium; QTimer* m_RenderTimer; QTimer* m_RenderRestartTimer; + OpenCLInfo& m_Info; }; /// @@ -306,10 +306,10 @@ public: virtual size_t TotalXformCount() const override { return m_Ember.TotalXformCount(); } virtual QString Name() const override { return QString::fromStdString(m_Ember.m_Name); } virtual void Name(const string& s) override { m_Ember.m_Name = s; } - virtual uint FinalRasW() const override { return m_Ember.m_FinalRasW; } - virtual void FinalRasW(uint w) override { m_Ember.m_FinalRasW = w; } - virtual uint FinalRasH() const override { return m_Ember.m_FinalRasH; } - virtual void FinalRasH(uint h) override { m_Ember.m_FinalRasH = h; } + virtual size_t FinalRasW() const override { return m_Ember.m_FinalRasW; } + virtual void FinalRasW(size_t w) override { m_Ember.m_FinalRasW = w; } + virtual size_t FinalRasH() const override { return m_Ember.m_FinalRasH; } + virtual void FinalRasH(size_t h) override { m_Ember.m_FinalRasH = h; } virtual size_t Index() const override { return m_Ember.m_Index; } virtual void AddSymmetry(int sym, QTIsaac& rand) override { m_Ember.AddSymmetry(sym, rand); } virtual void CalcNormalizedWeights() override { m_Ember.CalcNormalizedWeights(m_NormalizedWeights); } @@ -317,7 +317,7 @@ public: Ember* CurrentEmber(); //Menu. 
- virtual void NewFlock(uint count) override; + virtual void NewFlock(size_t count) override; virtual void NewEmptyFlameInCurrentFile() override; virtual void NewRandomFlameInCurrentFile() override; virtual void CopyFlameInCurrentFile() override; @@ -447,7 +447,7 @@ public: //Rendering/progress. virtual bool Render() override; - virtual bool CreateRenderer(eRendererType renderType, uint platform, uint device, bool shared = true) override; + virtual bool CreateRenderer(eRendererType renderType, const vector>& devices, bool shared = true) override; virtual uint SizeOfT() const override { return sizeof(T); } virtual int ProgressFunc(Ember& ember, void* foo, double fraction, int stage, double etaMs) override; virtual void ClearUndo() override; diff --git a/Source/Fractorium/FractoriumInfo.cpp b/Source/Fractorium/FractoriumInfo.cpp index 69374a3..ca99b6d 100644 --- a/Source/Fractorium/FractoriumInfo.cpp +++ b/Source/Fractorium/FractoriumInfo.cpp @@ -33,7 +33,7 @@ void Fractorium::InitInfoUI() /// Ignored void Fractorium::OnSummaryTableHeaderResized(int logicalIndex, int oldSize, int newSize) { - QPixmap pixmap = QPixmap::fromImage(m_Controller->FinalPaletteImage());//Create a QPixmap out of the QImage. + QPixmap pixmap = QPixmap::fromImage(m_Controller->FinalPaletteImage());//Create a QPixmap out of the QImage, will be empty on startup. 
SetPaletteTableItem(&pixmap, ui.SummaryTableWidget, m_InfoPaletteItem, 1, 0); } diff --git a/Source/Fractorium/FractoriumLibrary.cpp b/Source/Fractorium/FractoriumLibrary.cpp index 789ac72..e595440 100644 --- a/Source/Fractorium/FractoriumLibrary.cpp +++ b/Source/Fractorium/FractoriumLibrary.cpp @@ -240,7 +240,7 @@ void FractoriumEmberController::EmberTreeItemChanged(QTreeWidgetItem* item, i } } } - catch(std::exception& e) + catch(const std::exception& e) { qDebug() << "FractoriumEmberController::EmberTreeItemChanged() : Exception thrown: " << e.what(); } diff --git a/Source/Fractorium/FractoriumMenus.cpp b/Source/Fractorium/FractoriumMenus.cpp index 534ffc7..d44dc8d 100644 --- a/Source/Fractorium/FractoriumMenus.cpp +++ b/Source/Fractorium/FractoriumMenus.cpp @@ -19,13 +19,13 @@ void Fractorium::InitMenusUI() connect(ui.ActionExit, SIGNAL(triggered(bool)), this, SLOT(OnActionExit(bool)), Qt::QueuedConnection); //Edit menu. - connect(ui.ActionUndo, SIGNAL(triggered(bool)), this, SLOT(OnActionUndo(bool)), Qt::QueuedConnection); - connect(ui.ActionRedo, SIGNAL(triggered(bool)), this, SLOT(OnActionRedo(bool)), Qt::QueuedConnection); - connect(ui.ActionCopyXml, SIGNAL(triggered(bool)), this, SLOT(OnActionCopyXml(bool)), Qt::QueuedConnection); - connect(ui.ActionCopyAllXml, SIGNAL(triggered(bool)), this, SLOT(OnActionCopyAllXml(bool)), Qt::QueuedConnection); - connect(ui.ActionPasteXmlAppend, SIGNAL(triggered(bool)), this, SLOT(OnActionPasteXmlAppend(bool)), Qt::QueuedConnection); - connect(ui.ActionPasteXmlOver, SIGNAL(triggered(bool)), this, SLOT(OnActionPasteXmlOver(bool)), Qt::QueuedConnection); - connect(ui.ActionCopySelectedXforms, SIGNAL(triggered(bool)), this, SLOT(OnActionCopySelectedXforms(bool)), Qt::QueuedConnection); + connect(ui.ActionUndo, SIGNAL(triggered(bool)), this, SLOT(OnActionUndo(bool)), Qt::QueuedConnection); + connect(ui.ActionRedo, SIGNAL(triggered(bool)), this, SLOT(OnActionRedo(bool)), Qt::QueuedConnection); + connect(ui.ActionCopyXml, 
SIGNAL(triggered(bool)), this, SLOT(OnActionCopyXml(bool)), Qt::QueuedConnection); + connect(ui.ActionCopyAllXml, SIGNAL(triggered(bool)), this, SLOT(OnActionCopyAllXml(bool)), Qt::QueuedConnection); + connect(ui.ActionPasteXmlAppend, SIGNAL(triggered(bool)), this, SLOT(OnActionPasteXmlAppend(bool)), Qt::QueuedConnection); + connect(ui.ActionPasteXmlOver, SIGNAL(triggered(bool)), this, SLOT(OnActionPasteXmlOver(bool)), Qt::QueuedConnection); + connect(ui.ActionCopySelectedXforms, SIGNAL(triggered(bool)), this, SLOT(OnActionCopySelectedXforms(bool)), Qt::QueuedConnection); connect(ui.ActionPasteSelectedXforms, SIGNAL(triggered(bool)), this, SLOT(OnActionPasteSelectedXforms(bool)), Qt::QueuedConnection); ui.ActionPasteSelectedXforms->setEnabled(false); @@ -54,7 +54,7 @@ void Fractorium::InitMenusUI() /// /// The number of embers to include in the flock template -void FractoriumEmberController::NewFlock(uint count) +void FractoriumEmberController::NewFlock(size_t count) { Ember ember; @@ -63,12 +63,12 @@ void FractoriumEmberController::NewFlock(uint count) m_EmberFile.m_Embers.reserve(count); m_EmberFile.m_Filename = EmberFile::DefaultFilename(); - for (uint i = 0; i < count; i++) + for (size_t i = 0; i < count; i++) { - m_SheepTools->Random(ember, m_FilteredVariations, static_cast(QTIsaac::GlobalRand->Frand(-2, 2)), 0); + m_SheepTools->Random(ember, m_FilteredVariations, static_cast(QTIsaac::GlobalRand->Frand(-2, 2)), 0); ParamsToEmber(ember); ember.m_Index = i; - ember.m_Name = m_EmberFile.m_Filename.toStdString() + "-" + ToString(i + 1).toStdString(); + ember.m_Name = m_EmberFile.m_Filename.toStdString() + "-" + ToString(i + 1ULL).toStdString(); m_EmberFile.m_Embers.push_back(ember); } @@ -353,8 +353,22 @@ void Fractorium::OnActionSaveEntireFileAsXml(bool checked) { m_Controller->SaveE void Fractorium::OnActionSaveCurrentScreen(bool checked) { QString filename = SetupSaveImageDialog(m_Controller->Name()); + auto renderer = m_Controller->Renderer(); + auto& pixels = 
*m_Controller->FinalImage(); + RendererCLBase* rendererCL = dynamic_cast(m_Controller->Renderer()); + auto stats = m_Controller->Stats(); + EmberImageComments comments = renderer->ImageComments(stats, 0, false, true); - m_Controller->SaveCurrentRender(filename, true); + if (rendererCL && renderer->PrepFinalAccumVector(pixels)) + { + if (!rendererCL->ReadFinal(pixels.data())) + { + ShowCritical("GPU Read Error", "Could not read image from the GPU, aborting image save.", false); + return; + } + } + + m_Controller->SaveCurrentRender(filename, comments, pixels, renderer->FinalRasW(), renderer->FinalRasH(), renderer->NumChannels(), renderer->BytesPerChannel()); } /// diff --git a/Source/Fractorium/FractoriumPalette.cpp b/Source/Fractorium/FractoriumPalette.cpp index 678a0af..de9f1f1 100644 --- a/Source/Fractorium/FractoriumPalette.cpp +++ b/Source/Fractorium/FractoriumPalette.cpp @@ -341,7 +341,7 @@ void Fractorium::OnPaletteFilterLineEditTextChanged(const QString& text) table->setUpdatesEnabled(false); - for (uint i = 0; i < uint(table->rowCount()); i++) + for (int i = 0; i < table->rowCount(); i++) { if (auto item = table->item(i, 0)) { diff --git a/Source/Fractorium/FractoriumRender.cpp b/Source/Fractorium/FractoriumRender.cpp index e4ef869..371eb25 100644 --- a/Source/Fractorium/FractoriumRender.cpp +++ b/Source/Fractorium/FractoriumRender.cpp @@ -78,8 +78,7 @@ void FractoriumEmberControllerBase::Shutdown() /// void FractoriumEmberControllerBase::ClearFinalImages() { - Memset(m_FinalImage[0]); - Memset(m_FinalImage[1]); + Memset(m_FinalImage); //Unsure if we should also call RendererCL::ClearFinal() as well. At the moment it seems unnecessary. } @@ -114,48 +113,42 @@ void FractoriumEmberControllerBase::DeleteRenderer() /// /// Save the current render results to a file. -/// This could benefit from QImageWriter, however it's compression capabilities are +/// This could benefit from QImageWriter, however its compression capabilities are /// severely lacking. 
A Png file comes out larger than a bitmap, so instead use the /// Png and Jpg wrapper functions from the command line programs. /// This will embed the id, url and nick fields from the options in the image comments. /// /// The full path and filename -void FractoriumEmberControllerBase::SaveCurrentRender(const QString& filename, bool forcePull) +/// The comments to save in the png or jpg +/// The buffer containing the pixels +/// The width in pixels of the image +/// The height in pixels of the image +/// The number of channels, 3 or 4. +/// The bytes per channel, almost always 1. +void FractoriumEmberControllerBase::SaveCurrentRender(const QString& filename, const EmberImageComments& comments, vector& pixels, size_t width, size_t height, size_t channels, size_t bpc) { if (filename != "") { bool b = false; uint i, j; - uint width = m_Renderer->FinalRasW(); - uint height = m_Renderer->FinalRasH(); byte* data = nullptr; vector vecRgb; QFileInfo fileInfo(filename); QString suffix = fileInfo.suffix(); FractoriumSettings* settings = m_Fractorium->m_Settings; - RendererCLBase* rendererCL = dynamic_cast(m_Renderer.get()); - - if (forcePull && rendererCL && m_Renderer->PrepFinalAccumVector(m_FinalImage[m_FinalImageIndex])) - { - if (!rendererCL->ReadFinal(m_FinalImage[m_FinalImageIndex].data())) - { - m_Fractorium->ShowCritical("GPU Read Error", "Could not read image from the GPU, aborting image save.", true); - return; - } - } - + //Ensure dimensions are valid. - if (m_FinalImage[m_FinalImageIndex].size() < (width * height * m_Renderer->NumChannels() * m_Renderer->BytesPerChannel())) + if (pixels.size() < (width * height * channels * bpc)) { m_Fractorium->ShowCritical("Save Failed", "Dimensions didn't match, not saving.", true); return; } - data = m_FinalImage[m_FinalImageIndex].data();//Png and channels == 4. + data = pixels.data();//Png and channels == 4. 
- if ((suffix == "jpg" || suffix == "bmp") && m_Renderer->NumChannels() == 4) + if ((suffix == "jpg" || suffix == "bmp") && channels == 4) { - RgbaToRgb(m_FinalImage[m_FinalImageIndex], vecRgb, width, height); + RgbaToRgb(pixels, vecRgb, width, height); data = vecRgb.data(); } @@ -164,7 +157,6 @@ void FractoriumEmberControllerBase::SaveCurrentRender(const QString& filename, b string id = settings->Id().toStdString(); string url = settings->Url().toStdString(); string nick = settings->Nick().toStdString(); - EmberImageComments comments = m_Renderer->ImageComments(m_Stats, 0, false, true); if (suffix == "png") b = WritePng(s.c_str(), data, width, height, 1, true, comments, id, url, nick); @@ -369,14 +361,14 @@ bool FractoriumEmberController::Render() if (ProcessState() != ACCUM_DONE) { //if (m_Renderer->Run(m_FinalImage, 0) == RENDER_OK)//Full, non-incremental render for debugging. - if (m_Renderer->Run(m_FinalImage[m_FinalImageIndex], 0, m_SubBatchCount, (iterBegin || m_Fractorium->m_Settings->ContinuousUpdate())) == RENDER_OK)//Force output on iterBegin or if the settings specify to always do it. + if (m_Renderer->Run(m_FinalImage, 0, m_SubBatchCount, (iterBegin || m_Fractorium->m_Settings->ContinuousUpdate())) == RENDER_OK)//Force output on iterBegin or if the settings specify to always do it. { //The amount to increment sub batch while rendering proceeds is purely empirical. //Change later if better values can be derived/observed. if (m_Renderer->RendererType() == OPENCL_RENDERER) { - if (m_SubBatchCount < 3)//More than 3 with OpenCL gives a sluggish UI. - m_SubBatchCount++; + if (m_SubBatchCount < (4 * m_Devices.size()))//More than 3 with OpenCL gives a sluggish UI. + m_SubBatchCount += m_Devices.size(); } else { @@ -445,7 +437,7 @@ bool FractoriumEmberController::Render() //Update it on finish because the rendering process is completely done. 
if (iterBegin || ProcessState() == ACCUM_DONE) { - if (m_FinalImage[m_FinalImageIndex].size() == m_Renderer->FinalBufferSize())//Make absolutely sure the correct amount of data is passed. + if (m_FinalImage.size() == m_Renderer->FinalBufferSize())//Make absolutely sure the correct amount of data is passed. gl->update(); //gl->repaint(); @@ -504,19 +496,17 @@ bool FractoriumEmberController::Render() /// Rendering will be left in a stopped state. The caller is responsible for restarting the render loop again. /// /// The type of render to create -/// The index platform of the platform to use -/// The index device of the device to use -/// The texture ID of the shared OpenGL texture if shared +/// The platform,device index pairs of the devices to use /// True if shared with OpenGL, else false. Default: true. /// True if nothing went wrong, else false. template -bool FractoriumEmberController::CreateRenderer(eRendererType renderType, uint platform, uint device, bool shared) +bool FractoriumEmberController::CreateRenderer(eRendererType renderType, const vector>& devices, bool shared) { bool ok = true; FractoriumSettings* s = m_Fractorium->m_Settings; GLWidget* gl = m_Fractorium->ui.GLDisplay; - if (!m_Renderer.get() || (m_Renderer->RendererType() != renderType) || (m_Platform != platform) || (m_Device != device)) + if (!m_Renderer.get() || (m_Renderer->RendererType() != renderType) || !Equal(m_Devices, devices)) { EmberReport emberReport; vector errorReport; @@ -524,13 +514,12 @@ bool FractoriumEmberController::CreateRenderer(eRendererType renderType, uint DeleteRenderer();//Delete the renderer and refresh the textures. //Before starting, must take care of allocations. gl->Allocate(true);//Forcing a realloc of the texture is necessary on AMD, but not on nVidia. - m_Renderer = unique_ptr(::CreateRenderer(renderType, platform, device, shared, gl->OutputTexID(), emberReport));//Always make bucket type float. 
+ m_Renderer = unique_ptr(::CreateRenderer(renderType, devices, shared, gl->OutputTexID(), emberReport));//Always make bucket type float. errorReport = emberReport.ErrorReport(); if (errorReport.empty()) { - m_Platform = platform;//Store values for re-creation later on. - m_Device = device; + m_Devices = devices; m_OutputTexID = gl->OutputTexID(); m_Shared = shared; } @@ -548,11 +537,13 @@ bool FractoriumEmberController::CreateRenderer(eRendererType renderType, uint if (m_RenderType == OPENCL_RENDERER) { - m_Fractorium->m_QualitySpin->DoubleClickZero(30); - m_Fractorium->m_QualitySpin->DoubleClickNonZero(30); + auto val = 30 * m_Fractorium->m_Settings->Devices().size(); - if (m_Fractorium->m_QualitySpin->value() < 30) - m_Fractorium->m_QualitySpin->setValue(30); + m_Fractorium->m_QualitySpin->DoubleClickZero(val); + m_Fractorium->m_QualitySpin->DoubleClickNonZero(val); + + if (m_Fractorium->m_QualitySpin->value() < val) + m_Fractorium->m_QualitySpin->setValue(val); } else { @@ -618,12 +609,11 @@ void Fractorium::ShutdownAndRecreateFromOptions() bool Fractorium::CreateRendererFromOptions() { bool ok = true; - bool useOpenCL = m_Wrapper.CheckOpenCL() && m_Settings->OpenCL(); - + bool useOpenCL = m_Info.Ok() && m_Settings->OpenCL(); + auto v = Devices(m_Settings->Devices()); + //The most important option to process is what kind of renderer is desired, so do it first. - if (!m_Controller->CreateRenderer(useOpenCL ? OPENCL_RENDERER : CPU_RENDERER, - m_Settings->PlatformIndex(), - m_Settings->DeviceIndex())) + if (!m_Controller->CreateRenderer((useOpenCL && !v.empty()) ? OPENCL_RENDERER : CPU_RENDERER, v)) { //If using OpenCL, will only get here if creating RendererCL failed, but creating a backup CPU Renderer succeeded. ShowCritical("Renderer Creation Error", "Error creating renderer, most likely a GPU problem. 
Using CPU instead."); diff --git a/Source/Fractorium/FractoriumSettings.cpp b/Source/Fractorium/FractoriumSettings.cpp index 227f61e..efe1182 100644 --- a/Source/Fractorium/FractoriumSettings.cpp +++ b/Source/Fractorium/FractoriumSettings.cpp @@ -21,7 +21,7 @@ void FractoriumSettings::EnsureDefaults() FinalQuality(1000); if (FinalTemporalSamples() == 0) - FinalTemporalSamples(1000); + FinalTemporalSamples(100); if (FinalSupersample() == 0) FinalSupersample(2); @@ -30,7 +30,7 @@ void FractoriumSettings::EnsureDefaults() FinalStrips(1); if (XmlTemporalSamples() == 0) - XmlTemporalSamples(1000); + XmlTemporalSamples(100); if (XmlQuality() == 0) XmlQuality(1000); @@ -44,7 +44,7 @@ void FractoriumSettings::EnsureDefaults() if (FinalThreadCount() == 0 || FinalThreadCount() > Timing::ProcessorCount()) FinalThreadCount(Timing::ProcessorCount()); - FinalThreadPriority(Clamp((int)eThreadPriority::LOWEST, (int)eThreadPriority::HIGHEST, FinalThreadPriority())); + FinalThreadPriority(Clamp(FinalThreadPriority(), (int)eThreadPriority::LOWEST, (int)eThreadPriority::HIGHEST)); if (CpuSubBatch() < 1) CpuSubBatch(1); @@ -52,24 +52,9 @@ void FractoriumSettings::EnsureDefaults() if (OpenCLSubBatch() < 1) OpenCLSubBatch(1); - //There normally wouldn't be any more than 10 OpenCL platforms and devices - //on the system, so if a value greater than that is read, then the settings file - //was corrupted. - if (PlatformIndex() > 10) - PlatformIndex(0); - - if (DeviceIndex() > 10) - DeviceIndex(0); - if (FinalScale() > SCALE_HEIGHT) FinalScale(0); - if (FinalPlatformIndex() > 10) - FinalPlatformIndex(0); - - if (FinalDeviceIndex() > 10) - FinalDeviceIndex(0); - if (OpenXmlExt() == "") OpenXmlExt("Flame (*.flame)"); @@ -101,155 +86,149 @@ void FractoriumSettings::EnsureDefaults() /// Interactive renderer settings. 
/// -bool FractoriumSettings::EarlyClip() { return value(EARLYCLIP).toBool(); } -void FractoriumSettings::EarlyClip(bool b) { setValue(EARLYCLIP, b); } +bool FractoriumSettings::EarlyClip() { return value(EARLYCLIP).toBool(); } +void FractoriumSettings::EarlyClip(bool b) { setValue(EARLYCLIP, b); } + +bool FractoriumSettings::YAxisUp() { return value(YAXISUP).toBool(); } +void FractoriumSettings::YAxisUp(bool b) { setValue(YAXISUP, b); } + +bool FractoriumSettings::Transparency() { return value(TRANSPARENCY).toBool(); } +void FractoriumSettings::Transparency(bool b) { setValue(TRANSPARENCY, b); } + +bool FractoriumSettings::OpenCL() { return value(OPENCL).toBool(); } +void FractoriumSettings::OpenCL(bool b) { setValue(OPENCL, b); } + +bool FractoriumSettings::Double() { return value(DOUBLEPRECISION).toBool(); } +void FractoriumSettings::Double(bool b) { setValue(DOUBLEPRECISION, b); } + +bool FractoriumSettings::ShowAllXforms() { return value(SHOWALLXFORMS).toBool(); } +void FractoriumSettings::ShowAllXforms(bool b) { setValue(SHOWALLXFORMS, b); } + +bool FractoriumSettings::ContinuousUpdate() { return value(CONTUPDATE).toBool(); } +void FractoriumSettings::ContinuousUpdate(bool b) { setValue(CONTUPDATE, b); } -bool FractoriumSettings::YAxisUp() { return value(YAXISUP).toBool(); } -void FractoriumSettings::YAxisUp(bool b) { setValue(YAXISUP, b); } +QList FractoriumSettings::Devices() { return value(DEVICES).toList(); } +void FractoriumSettings::Devices(const QList& devices) { setValue(DEVICES, devices); } -bool FractoriumSettings::Transparency() { return value(TRANSPARENCY).toBool(); } -void FractoriumSettings::Transparency(bool b) { setValue(TRANSPARENCY, b); } - -bool FractoriumSettings::OpenCL() { return value(OPENCL).toBool(); } -void FractoriumSettings::OpenCL(bool b) { setValue(OPENCL, b); } - -bool FractoriumSettings::Double() { return value(DOUBLEPRECISION).toBool(); } -void FractoriumSettings::Double(bool b) { setValue(DOUBLEPRECISION, b); } - -bool 
FractoriumSettings::ShowAllXforms() { return value(SHOWALLXFORMS).toBool(); } -void FractoriumSettings::ShowAllXforms(bool b) { setValue(SHOWALLXFORMS, b); } - -bool FractoriumSettings::ContinuousUpdate() { return value(CONTUPDATE).toBool(); } -void FractoriumSettings::ContinuousUpdate(bool b) { setValue(CONTUPDATE, b); } - -uint FractoriumSettings::PlatformIndex() { return value(PLATFORMINDEX).toUInt(); } -void FractoriumSettings::PlatformIndex(uint i) { setValue(PLATFORMINDEX, i); } - -uint FractoriumSettings::DeviceIndex() { return value(DEVICEINDEX).toUInt(); } -void FractoriumSettings::DeviceIndex(uint i) { setValue(DEVICEINDEX, i); } - -uint FractoriumSettings::ThreadCount() { return value(THREADCOUNT).toUInt(); } -void FractoriumSettings::ThreadCount(uint i) { setValue(THREADCOUNT, i); } - -bool FractoriumSettings::CpuDEFilter() { return value(CPUDEFILTER).toBool(); } -void FractoriumSettings::CpuDEFilter(bool b) { setValue(CPUDEFILTER, b); } - -bool FractoriumSettings::OpenCLDEFilter() { return value(OPENCLDEFILTER).toBool(); } -void FractoriumSettings::OpenCLDEFilter(bool b) { setValue(OPENCLDEFILTER, b); } - -uint FractoriumSettings::CpuSubBatch() { return value(CPUSUBBATCH).toUInt(); } -void FractoriumSettings::CpuSubBatch(uint b) { setValue(CPUSUBBATCH, b); } - -uint FractoriumSettings::OpenCLSubBatch() { return value(OPENCLSUBBATCH).toUInt(); } -void FractoriumSettings::OpenCLSubBatch(uint b) { setValue(OPENCLSUBBATCH, b); } +uint FractoriumSettings::ThreadCount() { return value(THREADCOUNT).toUInt(); } +void FractoriumSettings::ThreadCount(uint i) { setValue(THREADCOUNT, i); } + +bool FractoriumSettings::CpuDEFilter() { return value(CPUDEFILTER).toBool(); } +void FractoriumSettings::CpuDEFilter(bool b) { setValue(CPUDEFILTER, b); } + +bool FractoriumSettings::OpenCLDEFilter() { return value(OPENCLDEFILTER).toBool(); } +void FractoriumSettings::OpenCLDEFilter(bool b) { setValue(OPENCLDEFILTER, b); } + +uint FractoriumSettings::CpuSubBatch() { return 
value(CPUSUBBATCH).toUInt(); } +void FractoriumSettings::CpuSubBatch(uint i) { setValue(CPUSUBBATCH, i); } + +uint FractoriumSettings::OpenCLSubBatch() { return value(OPENCLSUBBATCH).toUInt(); } +void FractoriumSettings::OpenCLSubBatch(uint i) { setValue(OPENCLSUBBATCH, i); } /// /// Final render settings. /// -bool FractoriumSettings::FinalEarlyClip() { return value(FINALEARLYCLIP).toBool(); } -void FractoriumSettings::FinalEarlyClip(bool b) { setValue(FINALEARLYCLIP, b); } - -bool FractoriumSettings::FinalYAxisUp() { return value(FINALYAXISUP).toBool(); } -void FractoriumSettings::FinalYAxisUp(bool b) { setValue(FINALYAXISUP, b); } - -bool FractoriumSettings::FinalTransparency() { return value(FINALTRANSPARENCY).toBool(); } -void FractoriumSettings::FinalTransparency(bool b) { setValue(FINALTRANSPARENCY, b); } - -bool FractoriumSettings::FinalOpenCL() { return value(FINALOPENCL).toBool(); } -void FractoriumSettings::FinalOpenCL(bool b) { setValue(FINALOPENCL, b); } +bool FractoriumSettings::FinalEarlyClip() { return value(FINALEARLYCLIP).toBool(); } +void FractoriumSettings::FinalEarlyClip(bool b) { setValue(FINALEARLYCLIP, b); } + +bool FractoriumSettings::FinalYAxisUp() { return value(FINALYAXISUP).toBool(); } +void FractoriumSettings::FinalYAxisUp(bool b) { setValue(FINALYAXISUP, b); } + +bool FractoriumSettings::FinalTransparency() { return value(FINALTRANSPARENCY).toBool(); } +void FractoriumSettings::FinalTransparency(bool b) { setValue(FINALTRANSPARENCY, b); } + +bool FractoriumSettings::FinalOpenCL() { return value(FINALOPENCL).toBool(); } +void FractoriumSettings::FinalOpenCL(bool b) { setValue(FINALOPENCL, b); } + +bool FractoriumSettings::FinalDouble() { return value(FINALDOUBLEPRECISION).toBool(); } +void FractoriumSettings::FinalDouble(bool b) { setValue(FINALDOUBLEPRECISION, b); } + +bool FractoriumSettings::FinalSaveXml() { return value(FINALSAVEXML).toBool(); } +void FractoriumSettings::FinalSaveXml(bool b) { setValue(FINALSAVEXML, b); } + +bool 
FractoriumSettings::FinalDoAll() { return value(FINALDOALL).toBool(); } +void FractoriumSettings::FinalDoAll(bool b) { setValue(FINALDOALL, b); } + +bool FractoriumSettings::FinalDoSequence() { return value(FINALDOSEQUENCE).toBool(); } +void FractoriumSettings::FinalDoSequence(bool b) { setValue(FINALDOSEQUENCE, b); } + +bool FractoriumSettings::FinalKeepAspect() { return value(FINALKEEPASPECT).toBool(); } +void FractoriumSettings::FinalKeepAspect(bool b) { setValue(FINALKEEPASPECT, b); } + +uint FractoriumSettings::FinalScale() { return value(FINALSCALE).toUInt(); } +void FractoriumSettings::FinalScale(uint i) { setValue(FINALSCALE, i); } + +QString FractoriumSettings::FinalExt() { return value(FINALEXT).toString(); } +void FractoriumSettings::FinalExt(const QString& s) { setValue(FINALEXT, s); } -bool FractoriumSettings::FinalDouble() { return value(FINALDOUBLEPRECISION).toBool(); } -void FractoriumSettings::FinalDouble(bool b) { setValue(FINALDOUBLEPRECISION, b); } +QList FractoriumSettings::FinalDevices() { return value(FINALDEVICES).toList(); } +void FractoriumSettings::FinalDevices(const QList& devices) { setValue(FINALDEVICES, devices); } -bool FractoriumSettings::FinalSaveXml() { return value(FINALSAVEXML).toBool(); } -void FractoriumSettings::FinalSaveXml(bool b) { setValue(FINALSAVEXML, b); } - -bool FractoriumSettings::FinalDoAll() { return value(FINALDOALL).toBool(); } -void FractoriumSettings::FinalDoAll(bool b) { setValue(FINALDOALL, b); } - -bool FractoriumSettings::FinalDoSequence() { return value(FINALDOSEQUENCE).toBool(); } -void FractoriumSettings::FinalDoSequence(bool b) { setValue(FINALDOSEQUENCE, b); } - -bool FractoriumSettings::FinalKeepAspect() { return value(FINALKEEPASPECT).toBool(); } -void FractoriumSettings::FinalKeepAspect(bool b) { setValue(FINALKEEPASPECT, b); } - -uint FractoriumSettings::FinalScale() { return value(FINALSCALE).toUInt(); } -void FractoriumSettings::FinalScale(uint i) { setValue(FINALSCALE, i); } - -QString 
FractoriumSettings::FinalExt() { return value(FINALEXT).toString(); } -void FractoriumSettings::FinalExt(const QString& s) { setValue(FINALEXT, s); } - -uint FractoriumSettings::FinalPlatformIndex() { return value(FINALPLATFORMINDEX).toUInt(); } -void FractoriumSettings::FinalPlatformIndex(uint i) { setValue(FINALPLATFORMINDEX, i); } - -uint FractoriumSettings::FinalDeviceIndex() { return value(FINALDEVICEINDEX).toUInt(); } -void FractoriumSettings::FinalDeviceIndex(uint i) { setValue(FINALDEVICEINDEX, i); } - -uint FractoriumSettings::FinalThreadCount() { return value(FINALTHREADCOUNT).toUInt(); } -void FractoriumSettings::FinalThreadCount(uint i) { setValue(FINALTHREADCOUNT, i); } - -uint FractoriumSettings::FinalThreadPriority() { return value(FINALTHREADPRIORITY).toInt(); } -void FractoriumSettings::FinalThreadPriority(int i) { setValue(FINALTHREADPRIORITY, i); } - -uint FractoriumSettings::FinalQuality() { return value(FINALQUALITY).toUInt(); } -void FractoriumSettings::FinalQuality(uint i) { setValue(FINALQUALITY, i); } - -uint FractoriumSettings::FinalTemporalSamples() { return value(FINALTEMPORALSAMPLES).toUInt(); } -void FractoriumSettings::FinalTemporalSamples(uint i) { setValue(FINALTEMPORALSAMPLES, i); } - -uint FractoriumSettings::FinalSupersample() { return value(FINALSUPERSAMPLE).toUInt(); } -void FractoriumSettings::FinalSupersample(uint i) { setValue(FINALSUPERSAMPLE, i); } - -uint FractoriumSettings::FinalStrips() { return value(FINALSTRIPS).toUInt(); } -void FractoriumSettings::FinalStrips(uint i) { setValue(FINALSTRIPS, i); } +uint FractoriumSettings::FinalThreadCount() { return value(FINALTHREADCOUNT).toUInt(); } +void FractoriumSettings::FinalThreadCount(uint i) { setValue(FINALTHREADCOUNT, i); } + +int FractoriumSettings::FinalThreadPriority() { return value(FINALTHREADPRIORITY).toInt(); } +void FractoriumSettings::FinalThreadPriority(int i) { setValue(FINALTHREADPRIORITY, i); } + +uint FractoriumSettings::FinalQuality() { return 
value(FINALQUALITY).toUInt(); } +void FractoriumSettings::FinalQuality(uint i) { setValue(FINALQUALITY, i); } + +uint FractoriumSettings::FinalTemporalSamples() { return value(FINALTEMPORALSAMPLES).toUInt(); } +void FractoriumSettings::FinalTemporalSamples(uint i) { setValue(FINALTEMPORALSAMPLES, i); } + +uint FractoriumSettings::FinalSupersample() { return value(FINALSUPERSAMPLE).toUInt(); } +void FractoriumSettings::FinalSupersample(uint i) { setValue(FINALSUPERSAMPLE, i); } + +uint FractoriumSettings::FinalStrips() { return value(FINALSTRIPS).toUInt(); } +void FractoriumSettings::FinalStrips(uint i) { setValue(FINALSTRIPS, i); } /// /// Xml file saving settings. /// -uint FractoriumSettings::XmlTemporalSamples() { return value(XMLTEMPORALSAMPLES).toUInt(); } -void FractoriumSettings::XmlTemporalSamples(uint i) { setValue(XMLTEMPORALSAMPLES, i); } +uint FractoriumSettings::XmlTemporalSamples() { return value(XMLTEMPORALSAMPLES).toUInt(); } +void FractoriumSettings::XmlTemporalSamples(uint i) { setValue(XMLTEMPORALSAMPLES, i); } -uint FractoriumSettings::XmlQuality() { return value(XMLQUALITY).toUInt(); } -void FractoriumSettings::XmlQuality(uint i) { setValue(XMLQUALITY, i); } +uint FractoriumSettings::XmlQuality() { return value(XMLQUALITY).toUInt(); } +void FractoriumSettings::XmlQuality(uint i) { setValue(XMLQUALITY, i); } -uint FractoriumSettings::XmlSupersample() { return value(XMLSUPERSAMPLE).toUInt(); } -void FractoriumSettings::XmlSupersample(uint i) { setValue(XMLSUPERSAMPLE, i); } +uint FractoriumSettings::XmlSupersample() { return value(XMLSUPERSAMPLE).toUInt(); } +void FractoriumSettings::XmlSupersample(uint i) { setValue(XMLSUPERSAMPLE, i); } -QString FractoriumSettings::Id() { return value(IDENTITYID).toString(); } -void FractoriumSettings::Id(const QString& s) { setValue(IDENTITYID, s); } +QString FractoriumSettings::Id() { return value(IDENTITYID).toString(); } +void FractoriumSettings::Id(const QString& s) { setValue(IDENTITYID, s); } -QString 
FractoriumSettings::Url() { return value(IDENTITYURL).toString(); } -void FractoriumSettings::Url(const QString& s) { setValue(IDENTITYURL, s); } +QString FractoriumSettings::Url() { return value(IDENTITYURL).toString(); } +void FractoriumSettings::Url(const QString& s) { setValue(IDENTITYURL, s); } -QString FractoriumSettings::Nick() { return value(IDENTITYNICK).toString(); } -void FractoriumSettings::Nick(const QString& s) { setValue(IDENTITYNICK, s); } +QString FractoriumSettings::Nick() { return value(IDENTITYNICK).toString(); } +void FractoriumSettings::Nick(const QString& s) { setValue(IDENTITYNICK, s); } /// /// General operations settings. /// -QString FractoriumSettings::OpenFolder() { return value(OPENFOLDER).toString(); } -void FractoriumSettings::OpenFolder(const QString& s) { setValue(OPENFOLDER, s); } +QString FractoriumSettings::OpenFolder() { return value(OPENFOLDER).toString(); } +void FractoriumSettings::OpenFolder(const QString& s) { setValue(OPENFOLDER, s); } -QString FractoriumSettings::SaveFolder() { return value(SAVEFOLDER).toString(); } -void FractoriumSettings::SaveFolder(const QString& s) { setValue(SAVEFOLDER, s); } +QString FractoriumSettings::SaveFolder() { return value(SAVEFOLDER).toString(); } +void FractoriumSettings::SaveFolder(const QString& s) { setValue(SAVEFOLDER, s); } -QString FractoriumSettings::OpenXmlExt() { return value(OPENXMLEXT).toString(); } -void FractoriumSettings::OpenXmlExt(const QString& s) { setValue(OPENXMLEXT, s); } +QString FractoriumSettings::OpenXmlExt() { return value(OPENXMLEXT).toString(); } +void FractoriumSettings::OpenXmlExt(const QString& s) { setValue(OPENXMLEXT, s); } -QString FractoriumSettings::SaveXmlExt() { return value(SAVEXMLEXT).toString(); } -void FractoriumSettings::SaveXmlExt(const QString& s) { setValue(SAVEXMLEXT, s); } +QString FractoriumSettings::SaveXmlExt() { return value(SAVEXMLEXT).toString(); } +void FractoriumSettings::SaveXmlExt(const QString& s) { setValue(SAVEXMLEXT, s); } 
-QString FractoriumSettings::OpenImageExt() { return value(OPENIMAGEEXT).toString(); } -void FractoriumSettings::OpenImageExt(const QString& s) { setValue(OPENIMAGEEXT, s); } +QString FractoriumSettings::OpenImageExt() { return value(OPENIMAGEEXT).toString(); } +void FractoriumSettings::OpenImageExt(const QString& s) { setValue(OPENIMAGEEXT, s); } -QString FractoriumSettings::SaveImageExt() { return value(SAVEIMAGEEXT).toString(); } -void FractoriumSettings::SaveImageExt(const QString& s) { setValue(SAVEIMAGEEXT, s); } +QString FractoriumSettings::SaveImageExt() { return value(SAVEIMAGEEXT).toString(); } +void FractoriumSettings::SaveImageExt(const QString& s) { setValue(SAVEIMAGEEXT, s); } -bool FractoriumSettings::SaveAutoUnique() { return value(AUTOUNIQUE).toBool(); } -void FractoriumSettings::SaveAutoUnique(bool b) { setValue(AUTOUNIQUE, b); } +bool FractoriumSettings::SaveAutoUnique() { return value(AUTOUNIQUE).toBool(); } +void FractoriumSettings::SaveAutoUnique(bool b) { setValue(AUTOUNIQUE, b); } -QMap FractoriumSettings::Variations() { return value(UIVARIATIONS).toMap(); } -void FractoriumSettings::Variations(const QMap& m) { setValue(UIVARIATIONS, m); } \ No newline at end of file +QMap FractoriumSettings::Variations() { return value(UIVARIATIONS).toMap(); } +void FractoriumSettings::Variations(const QMap& m) { setValue(UIVARIATIONS, m); } diff --git a/Source/Fractorium/FractoriumSettings.h b/Source/Fractorium/FractoriumSettings.h index 7c2e250..71c2754 100644 --- a/Source/Fractorium/FractoriumSettings.h +++ b/Source/Fractorium/FractoriumSettings.h @@ -13,8 +13,7 @@ #define DOUBLEPRECISION "render/dp64" #define CONTUPDATE "render/continuousupdate" #define SHOWALLXFORMS "render/dragshowallxforms" -#define PLATFORMINDEX "render/platformindex" -#define DEVICEINDEX "render/deviceindex" +#define DEVICES "render/devices" #define THREADCOUNT "render/threadcount" #define CPUDEFILTER "render/cpudefilter" #define OPENCLDEFILTER "render/opencldefilter" @@ -32,8 
+31,7 @@ #define FINALKEEPASPECT "finalrender/keepaspect" #define FINALSCALE "finalrender/scale" #define FINALEXT "finalrender/ext" -#define FINALPLATFORMINDEX "finalrender/platformindex" -#define FINALDEVICEINDEX "finalrender/deviceindex" +#define FINALDEVICES "finalrender/devices" #define FINALTHREADCOUNT "finalrender/threadcount" #define FINALTHREADPRIORITY "finalrender/threadpriority" #define FINALQUALITY "finalrender/quality" @@ -95,14 +93,11 @@ public: bool ContinuousUpdate(); void ContinuousUpdate(bool b); - uint PlatformIndex(); - void PlatformIndex(uint b); - - uint DeviceIndex(); - void DeviceIndex(uint b); + QList Devices(); + void Devices(const QList& devices); uint ThreadCount(); - void ThreadCount(uint b); + void ThreadCount(uint i); bool CpuDEFilter(); void CpuDEFilter(bool b); @@ -111,10 +106,10 @@ public: void OpenCLDEFilter(bool b); uint CpuSubBatch(); - void CpuSubBatch(uint b); + void CpuSubBatch(uint i); uint OpenCLSubBatch(); - void OpenCLSubBatch(uint b); + void OpenCLSubBatch(uint i); bool FinalEarlyClip(); void FinalEarlyClip(bool b); @@ -149,16 +144,13 @@ public: QString FinalExt(); void FinalExt(const QString& s); - uint FinalPlatformIndex(); - void FinalPlatformIndex(uint b); - - uint FinalDeviceIndex(); - void FinalDeviceIndex(uint b); + QList FinalDevices(); + void FinalDevices(const QList& devices); uint FinalThreadCount(); - void FinalThreadCount(uint b); + void FinalThreadCount(uint i); - uint FinalThreadPriority(); + int FinalThreadPriority(); void FinalThreadPriority(int b); uint FinalQuality(); diff --git a/Source/Fractorium/FractoriumToolbar.cpp b/Source/Fractorium/FractoriumToolbar.cpp index 4e8f5e6..2d3f630 100644 --- a/Source/Fractorium/FractoriumToolbar.cpp +++ b/Source/Fractorium/FractoriumToolbar.cpp @@ -79,7 +79,14 @@ void Fractorium::OnActionDP(bool checked) /// void Fractorium::SyncOptionsToToolbar() { - if (m_Settings->OpenCL()) + static bool openCL = !OpenCLInfo::Instance().Devices().empty(); + + if (!openCL) + { + 
ui.ActionCL->setEnabled(false); + } + + if (openCL && m_Settings->OpenCL()) { ui.ActionCpu->setChecked(false); ui.ActionCL->setChecked(true); diff --git a/Source/Fractorium/FractoriumXformsColor.cpp b/Source/Fractorium/FractoriumXformsColor.cpp index 95efeaa..74076b4 100644 --- a/Source/Fractorium/FractoriumXformsColor.cpp +++ b/Source/Fractorium/FractoriumXformsColor.cpp @@ -240,9 +240,9 @@ void FractoriumEmberController::FillCurvesControl() { m_Fractorium->ui.CurvesView->blockSignals(true); - for (int i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) { - for (int j = 1; j < 3; j++)//Only do middle points. + for (size_t j = 1; j < 3; j++)//Only do middle points. { QPointF point(m_Ember.m_Curves.m_Points[i][j].x, m_Ember.m_Curves.m_Points[i][j].y); @@ -280,7 +280,7 @@ void FractoriumEmberController::FillColorWithXform(Xform* xform) /// The column of the cell void Fractorium::SetPaletteTableItem(QPixmap* pixmap, QTableWidget* table, QTableWidgetItem* item, int row, int col) { - if (pixmap) + if (pixmap && !pixmap->isNull()) { QSize size(table->columnWidth(col), table->rowHeight(row) + 1); item->setData(Qt::DecorationRole, pixmap->scaled(size, Qt::IgnoreAspectRatio, Qt::SmoothTransformation)); diff --git a/Source/Fractorium/FractoriumXformsVariations.cpp b/Source/Fractorium/FractoriumXformsVariations.cpp index fe38714..66160fa 100644 --- a/Source/Fractorium/FractoriumXformsVariations.cpp +++ b/Source/Fractorium/FractoriumXformsVariations.cpp @@ -50,7 +50,7 @@ void FractoriumEmberController::Filter(const QString& text) tree->setUpdatesEnabled(false); - for (uint i = 0; i < uint(tree->topLevelItemCount()); i++) + for (int i = 0; i < tree->topLevelItemCount(); i++) { if (auto item = dynamic_cast(tree->topLevelItem(i))) { @@ -181,14 +181,14 @@ void FractoriumEmberController::ClearVariationsTree() { QTreeWidget* tree = m_Fractorium->ui.VariationsTree; - for (uint i = 0; i < tree->topLevelItemCount(); i++) + for (int i = 0; i < tree->topLevelItemCount(); i++) { 
 QTreeWidgetItem* item = tree->topLevelItem(i); auto* spinBox = dynamic_cast<VariationTreeDoubleSpinBox*>(tree->itemWidget(item, 1)); spinBox->SetValueStealth(0); - for (uint j = 0; j < item->childCount(); j++)//Iterate through all of the children, which will be the params. + for (int j = 0; j < item->childCount(); j++)//Iterate through all of the children, which will be the params. { if ((spinBox = dynamic_cast<VariationTreeDoubleSpinBox*>(tree->itemWidget(item->child(j), 1))))//Cast the child widget to the VariationTreeDoubleSpinBox type. spinBox->SetValueStealth(0); @@ -301,7 +301,7 @@ void FractoriumEmberController::FillVariationTreeWithXform(Xform* xform) tree->blockSignals(true); m_Fractorium->Filter(); - for (uint i = 0; i < tree->topLevelItemCount(); i++) + for (int i = 0; i < tree->topLevelItemCount(); i++) { auto item = dynamic_cast(tree->topLevelItem(i)); auto var = xform->GetVariationById(item->Id());//See if this variation in the tree was contained in the xform. @@ -317,7 +317,7 @@ void FractoriumEmberController::FillVariationTreeWithXform(Xform* xform) //item->setBackgroundColor(0, var ? Qt::darkGray : Qt::lightGray);//Ensure background is always white if the value goes to zero, else gray if var present. item->setBackgroundColor(0, var ? QColor(200, 200, 200) : QColor(255, 255, 255));//Ensure background is always white if the value goes to zero, else gray if var present. - for (uint j = 0; j < item->childCount(); j++)//Iterate through all of the children, which will be the params if it was a parametric variation. + for (int j = 0; j < item->childCount(); j++)//Iterate through all of the children, which will be the params if it was a parametric variation. { T* param = nullptr; auto childItem = item->child(j);//Get the child. 
diff --git a/Source/Fractorium/GLEmberController.cpp b/Source/Fractorium/GLEmberController.cpp index 2ddeaed..d7ac26f 100644 --- a/Source/Fractorium/GLEmberController.cpp +++ b/Source/Fractorium/GLEmberController.cpp @@ -252,13 +252,13 @@ void GLEmberController::QueryMatrices(bool print) if (print) { - for (int i = 0; i < 4; i++) + for (size_t i = 0; i < 4; i++) qDebug() << "Viewport[" << i << "] = " << m_Viewport[i] << endl; - for (int i = 0; i < 16; i++) + for (size_t i = 0; i < 16; i++) qDebug() << "Modelview[" << i << "] = " << glm::value_ptr(m_Modelview)[i] << endl; - for (int i = 0; i < 16; i++) + for (size_t i = 0; i < 16; i++) qDebug() << "Projection[" << i << "] = " << glm::value_ptr(m_Projection)[i] << endl; } } diff --git a/Source/Fractorium/GLWidget.cpp b/Source/Fractorium/GLWidget.cpp index b0be67c..0768969 100644 --- a/Source/Fractorium/GLWidget.cpp +++ b/Source/Fractorium/GLWidget.cpp @@ -313,7 +313,7 @@ void GLEmberController::DrawAffines(bool pre, bool post) { if (pre && m_Fractorium->DrawAllPre())//Draw all pre affine if specified. { - for (uint i = 0; i < ember->TotalXformCount(); i++) + for (size_t i = 0; i < ember->TotalXformCount(); i++) { Xform* xform = ember->GetTotalXform(i); bool selected = dragging ? (m_SelectedXform == xform) : (m_HoverXform == xform); @@ -328,7 +328,7 @@ void GLEmberController::DrawAffines(bool pre, bool post) if (post && m_Fractorium->DrawAllPost())//Draw all post affine if specified. { - for (uint i = 0; i < ember->TotalXformCount(); i++) + for (size_t i = 0; i < ember->TotalXformCount(); i++) { Xform* xform = ember->GetTotalXform(i); bool selected = dragging ? (m_SelectedXform == xform) : (m_HoverXform == xform); @@ -1025,7 +1025,7 @@ void GLWidget::DrawAffineHelper(int index, bool selected, bool pre, bool final, if (selected) { - for (int i = 1; i <= 64; i++)//The circle. + for (size_t i = 1; i <= 64; i++)//The circle. 
{ float theta = float(M_PI) * 2.0f * float(i % 64) / 64.0f; float fx = float(cos(theta)); @@ -1100,7 +1100,7 @@ int GLEmberController::UpdateHover(v3T& glCoords) } //Check all xforms. - for (uint i = 0; i < ember->TotalXformCount(); i++) + for (size_t i = 0; i < ember->TotalXformCount(); i++) { Xform* xform = ember->GetTotalXform(i); diff --git a/Source/Fractorium/Main.cpp b/Source/Fractorium/Main.cpp index 999f2a5..806b586 100644 --- a/Source/Fractorium/Main.cpp +++ b/Source/Fractorium/Main.cpp @@ -90,6 +90,10 @@ int main(int argc, char *argv[]) a.installEventFilter(&w); rv = a.exec(); } + catch (const std::exception& e) + { + QMessageBox::critical(0, "Fatal Error", QString::fromStdString(e.what())); + } catch (const char* e) { QMessageBox::critical(0, "Fatal Error", e); diff --git a/Source/Fractorium/OptionsDialog.cpp b/Source/Fractorium/OptionsDialog.cpp index 4048c4c..f17276d 100644 --- a/Source/Fractorium/OptionsDialog.cpp +++ b/Source/Fractorium/OptionsDialog.cpp @@ -9,17 +9,17 @@ /// The parent widget. Default: nullptr. /// The window flags. Default: 0. 
FractoriumOptionsDialog::FractoriumOptionsDialog(FractoriumSettings* settings, QWidget* p, Qt::WindowFlags f) - : QDialog(p, f) + : QDialog(p, f), + m_Info(OpenCLInfo::Instance()) { - int row = 0, spinHeight = 20; - uint i; + int i, row = 0, spinHeight = 20; ui.setupUi(this); m_Settings = settings; QTableWidget* table = ui.OptionsXmlSavingTable; ui.ThreadCountSpin->setRange(1, Timing::ProcessorCount()); - connect(ui.OpenCLCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnOpenCLCheckBoxStateChanged(int)), Qt::QueuedConnection); - connect(ui.PlatformCombo, SIGNAL(currentIndexChanged(int)), this, SLOT(OnPlatformComboCurrentIndexChanged(int)), Qt::QueuedConnection); + connect(ui.OpenCLCheckBox, SIGNAL(stateChanged(int)), this, SLOT(OnOpenCLCheckBoxStateChanged(int)), Qt::QueuedConnection); + connect(ui.DeviceTable, SIGNAL(cellChanged(int, int)), this, SLOT(OnDeviceTableCellChanged(int, int)), Qt::QueuedConnection); SetupSpinner(table, this, row, 1, m_XmlTemporalSamplesSpin, spinHeight, 1, 1000, 100, "", "", true, 1000); SetupSpinner(table, this, row, 1, m_XmlQualitySpin, spinHeight, 1, 200000, 50, "", "", true, 1000); @@ -34,63 +34,28 @@ FractoriumOptionsDialog::FractoriumOptionsDialog(FractoriumSettings* settings, Q m_NickEdit = new QLineEdit(ui.OptionsIdentityTable); ui.OptionsIdentityTable->setCellWidget(2, 1, m_NickEdit); - m_IdEdit->setText(m_Settings->Id()); - m_UrlEdit->setText(m_Settings->Url()); - m_NickEdit->setText(m_Settings->Nick()); + table = ui.DeviceTable; - if (m_Wrapper.CheckOpenCL()) + if (m_Info.Ok() && !m_Info.Devices().empty()) { - vector platforms = m_Wrapper.PlatformNames(); + SetupDeviceTable(table, m_Settings->Devices()); - //Populate combo boxes with available OpenCL platforms and devices. - for (i = 0; i < platforms.size(); i++) - ui.PlatformCombo->addItem(QString::fromStdString(platforms[i])); - - //If init succeeds, set the selected platform and device combos to match what was saved in the settings. 
- if (m_Wrapper.Init(m_Settings->PlatformIndex(), m_Settings->DeviceIndex())) - { - ui.OpenCLCheckBox->setChecked( m_Settings->OpenCL()); - ui.PlatformCombo->setCurrentIndex(m_Settings->PlatformIndex()); - ui.DeviceCombo->setCurrentIndex( m_Settings->DeviceIndex()); - } - else - { - OnPlatformComboCurrentIndexChanged(0); - ui.OpenCLCheckBox->setChecked(false); - } + for (i = 0; i < table->rowCount(); i++) + if (auto radio = qobject_cast<QRadioButton*>(table->cellWidget(i, 1))) + connect(radio, SIGNAL(toggled(bool)), this, SLOT(OnDeviceTableRadioToggled(bool)), Qt::QueuedConnection); } else { + ui.DeviceTable->setEnabled(false); ui.OpenCLCheckBox->setChecked(false); ui.OpenCLCheckBox->setEnabled(false); + ui.OpenCLSubBatchSpin->setEnabled(false); + ui.OpenCLFilteringDERadioButton->setEnabled(false); + ui.OpenCLFilteringLogRadioButton->setEnabled(false); + ui.InteraciveGpuFilteringGroupBox->setEnabled(false); } - ui.EarlyClipCheckBox->setChecked( m_Settings->EarlyClip()); - ui.YAxisUpCheckBox->setChecked( m_Settings->YAxisUp()); - ui.TransparencyCheckBox->setChecked( m_Settings->Transparency()); - ui.ContinuousUpdateCheckBox->setChecked(m_Settings->ContinuousUpdate()); - ui.DoublePrecisionCheckBox->setChecked( m_Settings->Double()); - ui.ShowAllXformsCheckBox->setChecked( m_Settings->ShowAllXforms()); - ui.ThreadCountSpin->setValue( m_Settings->ThreadCount()); - - if (m_Settings->CpuDEFilter()) - ui.CpuFilteringDERadioButton->setChecked(true); - else - ui.CpuFilteringLogRadioButton->setChecked(true); - - if (m_Settings->OpenCLDEFilter()) - ui.OpenCLFilteringDERadioButton->setChecked(true); - else - ui.OpenCLFilteringLogRadioButton->setChecked(true); - - ui.CpuSubBatchSpin->setValue(m_Settings->CpuSubBatch()); - ui.OpenCLSubBatchSpin->setValue(m_Settings->OpenCLSubBatch()); - - m_XmlTemporalSamplesSpin->setValue(m_Settings->XmlTemporalSamples()); - m_XmlQualitySpin->setValue(m_Settings->XmlQuality()); - m_XmlSupersampleSpin->setValue(m_Settings->XmlSupersample()); - 
ui.AutoUniqueCheckBox->setChecked(m_Settings->SaveAutoUnique()); - + DataToGui(); OnOpenCLCheckBoxStateChanged(ui.OpenCLCheckBox->isChecked()); } @@ -106,12 +71,48 @@ bool FractoriumOptionsDialog::OpenCL() { return ui.OpenCLCheckBox->isChecked(); bool FractoriumOptionsDialog::Double() { return ui.DoublePrecisionCheckBox->isChecked(); } bool FractoriumOptionsDialog::ShowAllXforms() { return ui.ShowAllXformsCheckBox->isChecked(); } bool FractoriumOptionsDialog::AutoUnique() { return ui.AutoUniqueCheckBox->isChecked(); } -uint FractoriumOptionsDialog::PlatformIndex() { return ui.PlatformCombo->currentIndex(); } -uint FractoriumOptionsDialog::DeviceIndex() { return ui.DeviceCombo->currentIndex(); } uint FractoriumOptionsDialog::ThreadCount() { return ui.ThreadCountSpin->value(); } /// -/// Disable or enable the OpenCL related controls based on the state passed in. +/// The check state of one of the OpenCL devices was changed. +/// This does a special check to always ensure at least one device, +/// as well as one primary is checked. +/// +/// The row of the cell +/// The column of the cell +void FractoriumOptionsDialog::OnDeviceTableCellChanged(int row, int col) +{ + if (auto item = ui.DeviceTable->item(row, col)) + HandleDeviceTableCheckChanged(ui.DeviceTable, row, col); +} + +/// +/// The primary device radio button selection was changed. +/// If the device was specified as primary, but was not selected +/// for inclusion, it will automatically be selected for inclusion. 
+/// +/// The state of the radio button +void FractoriumOptionsDialog::OnDeviceTableRadioToggled(bool checked) +{ + int row; + auto s = sender(); + auto table = ui.DeviceTable; + QRadioButton* radio = nullptr; + + if (s) + { + for (row = 0; row < table->rowCount(); row++) + if (radio = qobject_cast<QRadioButton*>(table->cellWidget(row, 1))) + if (s == radio) + { + HandleDeviceTableCheckChanged(ui.DeviceTable, row, 1); + break; + } + } +} + +/// +/// Disable or enable the CPU and OpenCL related controls based on the state passed in. /// Called when the state of the OpenCL checkbox is changed. /// /// The state of the checkbox @@ -119,28 +120,15 @@ void FractoriumOptionsDialog::OnOpenCLCheckBoxStateChanged(int state) { bool checked = state == Qt::Checked; - ui.PlatformCombo->setEnabled(checked); - ui.DeviceCombo->setEnabled(checked); + ui.DeviceTable->setEnabled(checked); ui.ThreadCountSpin->setEnabled(!checked); -} - -/// -/// Populate the the device combo box with all available -/// OpenCL devices for the selected platform. -/// Called when the platform combo box index changes. 
-/// -/// The selected index of the combo box -void FractoriumOptionsDialog::OnPlatformComboCurrentIndexChanged(int index) -{ - vector devices = m_Wrapper.DeviceNames(index); - - ui.DeviceCombo->clear(); - - for (auto& device : devices) - ui.DeviceCombo->addItem(QString::fromStdString(device)); - - if (ui.PlatformCombo->currentIndex() == m_Settings->PlatformIndex()) - ui.DeviceCombo->setCurrentIndex(m_Settings->DeviceIndex()); + ui.CpuSubBatchSpin->setEnabled(!checked); + ui.OpenCLSubBatchSpin->setEnabled(checked); + ui.CpuFilteringDERadioButton->setEnabled(!checked); + ui.CpuFilteringLogRadioButton->setEnabled(!checked); + ui.OpenCLFilteringDERadioButton->setEnabled(checked); + ui.OpenCLFilteringLogRadioButton->setEnabled(checked); + ui.InteraciveGpuFilteringGroupBox->setEnabled(checked); } /// @@ -187,13 +175,12 @@ void FractoriumOptionsDialog::GuiToData() m_Settings->OpenCL(OpenCL()); m_Settings->Double(Double()); m_Settings->ShowAllXforms(ShowAllXforms()); - m_Settings->PlatformIndex(PlatformIndex()); - m_Settings->DeviceIndex(DeviceIndex()); m_Settings->ThreadCount(ThreadCount()); m_Settings->CpuSubBatch(ui.CpuSubBatchSpin->value()); m_Settings->OpenCLSubBatch(ui.OpenCLSubBatchSpin->value()); m_Settings->CpuDEFilter(ui.CpuFilteringDERadioButton->isChecked()); m_Settings->OpenCLDEFilter(ui.OpenCLFilteringDERadioButton->isChecked()); + m_Settings->Devices(DeviceTableToSettings(ui.DeviceTable)); //Xml saving. m_Settings->XmlTemporalSamples(m_XmlTemporalSamplesSpin->value()); @@ -213,6 +200,8 @@ void FractoriumOptionsDialog::GuiToData() void FractoriumOptionsDialog::DataToGui() { //Interactive rendering. 
+ auto devices = m_Settings->Devices(); + ui.EarlyClipCheckBox->setChecked(m_Settings->EarlyClip()); ui.YAxisUpCheckBox->setChecked(m_Settings->YAxisUp()); ui.TransparencyCheckBox->setChecked(m_Settings->Transparency()); @@ -220,13 +209,20 @@ void FractoriumOptionsDialog::DataToGui() ui.OpenCLCheckBox->setChecked(m_Settings->OpenCL()); ui.DoublePrecisionCheckBox->setChecked(m_Settings->Double()); ui.ShowAllXformsCheckBox->setChecked(m_Settings->ShowAllXforms()); - ui.PlatformCombo->setCurrentIndex(m_Settings->PlatformIndex()); - ui.DeviceCombo->setCurrentIndex(m_Settings->DeviceIndex()); ui.ThreadCountSpin->setValue(m_Settings->ThreadCount()); ui.CpuSubBatchSpin->setValue(m_Settings->CpuSubBatch()); ui.OpenCLSubBatchSpin->setValue(m_Settings->OpenCLSubBatch()); - ui.CpuFilteringDERadioButton->setChecked(m_Settings->CpuDEFilter()); - ui.OpenCLFilteringDERadioButton->setChecked(m_Settings->OpenCLDEFilter()); + SettingsToDeviceTable(ui.DeviceTable, devices); + + if (m_Settings->CpuDEFilter()) + ui.CpuFilteringDERadioButton->setChecked(true); + else + ui.CpuFilteringLogRadioButton->setChecked(true); + + if (m_Settings->OpenCLDEFilter()) + ui.OpenCLFilteringDERadioButton->setChecked(true); + else + ui.OpenCLFilteringLogRadioButton->setChecked(true); //Xml saving. 
m_XmlTemporalSamplesSpin->setValue(m_Settings->XmlTemporalSamples()); @@ -238,4 +234,4 @@ void FractoriumOptionsDialog::DataToGui() m_IdEdit->setText(m_Settings->Id()); m_UrlEdit->setText(m_Settings->Url()); m_NickEdit->setText(m_Settings->Nick()); -} \ No newline at end of file +} diff --git a/Source/Fractorium/OptionsDialog.h b/Source/Fractorium/OptionsDialog.h index 2e6eeb9..4fc02ca 100644 --- a/Source/Fractorium/OptionsDialog.h +++ b/Source/Fractorium/OptionsDialog.h @@ -28,7 +28,8 @@ public: public slots: void OnOpenCLCheckBoxStateChanged(int state); - void OnPlatformComboCurrentIndexChanged(int index); + void OnDeviceTableCellChanged(int row, int col); + void OnDeviceTableRadioToggled(bool checked); virtual void accept() override; virtual void reject() override; @@ -45,14 +46,12 @@ private: bool Double(); bool ShowAllXforms(); bool AutoUnique(); - uint PlatformIndex(); - uint DeviceIndex(); uint ThreadCount(); void DataToGui(); void GuiToData(); Ui::OptionsDialog ui; - OpenCLWrapper m_Wrapper; + OpenCLInfo& m_Info; SpinBox* m_XmlTemporalSamplesSpin; SpinBox* m_XmlQualitySpin; SpinBox* m_XmlSupersampleSpin; diff --git a/Source/Fractorium/OptionsDialog.ui b/Source/Fractorium/OptionsDialog.ui index a11eae5..3ca93c9 100644 --- a/Source/Fractorium/OptionsDialog.ui +++ b/Source/Fractorium/OptionsDialog.ui @@ -6,31 +6,34 @@ 0 0 - 300 - 368 + 427 + 415 - + 0 0 - 300 - 368 + 427 + 415 - 300 - 368 + 16777215 + 415 Options + + true + 6 @@ -72,6 +75,9 @@ Interactive Rendering + + QFormLayout::AllNonFixedFieldsGrow + 4 @@ -103,6 +109,36 @@ + + + + <html><head/><body><p>Use OpenCL to render if your video card supports it.</p><p>This is highly recommended as it will give fluid, real-time interactive editing.</p></body></html> + + + Use OpenCL + + + + + + + <html><head/><body><p>Checked: Positive Y direction is up.</p><p>Unchecked: Positive Y direction is down.</p></body></html> + + + Positive Y Up + + + + + + + <html><head/><body><p>Checked: use 64-bit double precision 
numbers (slower, but better image quality).</p><p>Unchecked: use 32-bit single precision numbers (faster, but worse image quality).</p></body></html> + + + Use Double Precision + + + @@ -113,13 +149,115 @@ + + + + <html><head/><body><p>Checked: show all xforms while dragging.</p><p>Unchecked: only show current xform while dragging.</p></body></html> + + + Show All Xforms + + + + + + + <html><head/><body><p>Continually update output image during interactive rendering.</p><p>This will slow down performance, but will give continuous updates on how the final render will look. Note that only log scale filtering is applied on each update. Full DE is not applied until iteration is complete.</p></body></html> + + + Continuous Update + + + - + + + + 0 + 0 + + + + + 0 + 91 + + + + + 16777215 + 91 + + + + Qt::NoFocus + + + QAbstractItemView::NoEditTriggers + + + QAbstractItemView::NoSelection + + + QAbstractItemView::SelectRows + + + true + + + 3 + + + 60 + + + true + + + false + + + 22 + + + false + + + 22 + + + + AMD + + + + + Nvidia + + + + + Intel + + + + + Use + + + + + Primary + + + + + Device + + + - - - - + <html><head/><body><p>The number of threads to use with CPU rendering.</p><p>Decrease for more responsive editing, increase for better performance.</p></body></html> @@ -135,7 +273,41 @@ - + + + + The number of 10,000 iteration chunks ran per thread on the CPU +in interactive mode for each mouse movement + + + CPU Sub Batch + + + 1 + + + 100 + + + + + + + The number of ~8M iteration chunks ran using OpenCL +in interactive mode for each mouse movement + + + OpenCL Sub Batch + + + 1 + + + 100 + + + + CPU Filtering @@ -182,7 +354,7 @@ - + OpenCL Filtering @@ -226,90 +398,6 @@ - - - - The number of 10,000 iteration chunks ran per thread on the CPU -in interactive mode for each mouse movement - - - CPU Sub Batch - - - 1 - - - 100 - - - - - - - The number of ~8M iteration chunks ran using OpenCL -in interactive mode for each mouse movement - - - OpenCL Sub Batch - - - 1 
- - - 100 - - - - - - - <html><head/><body><p>Checked: Positive Y direction is up.</p><p>Unchecked: Positive Y direction is down.</p></body></html> - - - Positive Y Up - - - - - - - <html><head/><body><p>Use OpenCL to render if your video card supports it.</p><p>This is highly recommended as it will give fluid, real-time interactive editing.</p></body></html> - - - Use OpenCL - - - - - - - <html><head/><body><p>Checked: use 64-bit double precision numbers (slower, but better image quality).</p><p>Unchecked: use 32-bit single precision numbers (faster, but worse image quality).</p></body></html> - - - Use Double Precision - - - - - - - <html><head/><body><p>Checked: show all xforms while dragging.</p><p>Unchecked: only show current xform while dragging.</p></body></html> - - - Show All Xforms - - - - - - - <html><head/><body><p>Continually update output image during interactive rendering.</p><p>This will slow down performance, but will give continuous updates on how the final render will look. Note that only log scale filtering is applied on each update. Full DE is not applied until iteration is complete.</p></body></html> - - - Continuous Update - - - @@ -732,8 +820,6 @@ in interactive mode for each mouse movement TransparencyCheckBox ShowAllXformsCheckBox ContinuousUpdateCheckBox - PlatformCombo - DeviceCombo ThreadCountSpin CpuSubBatchSpin OpenCLSubBatchSpin