Releases: RenderKit/oidn
Open Image Denoise v2.5.0
- Significantly improved performance and reduced memory usage on Intel GPUs with XMX support and CPUs with AMX-FP16 support
- Added API for importing external semaphores from graphics APIs (e.g. Vulkan, Direct3D 12). Currently this is supported only by CUDA (Windows and Linux) and HIP (Windows only) devices. SYCL device support will be added in a future version
- Added the `OIDN_EXTERNAL_MEMORY_TYPE_FLAG_DEDICATED` flag which must be combined with the handle type flag when importing external memory with dedicated allocation
- Fixed corrupted output on Apple M5 Pro/Max GPUs using Metal
- Fixed a crash caused by over-releasing the MTLDevice, which could occur after creating and destroying multiple devices
- Fixed device detection failure or crash on Windows if some old Intel integrated GPU drivers are installed (fix requires building with oneAPI DPC++ Compiler 6.1.0)
Open Image Denoise v2.4.1
- Added AMD RDNA 3.5 GFX1152 GPU support
Open Image Denoise v2.4.0
- Added Intel AMX-FP16 support, dramatically improving performance on Intel Granite Rapids CPUs
- Added Intel BMG-G31, Wildcat Lake, Nova Lake, and Crescent Island GPU support
- Added AMD RDNA 3.5 GPU support, extended RDNA 2 support
- Fixed integer overflow and out-of-bounds write issues in image loaders (only affects the `oidnDenoise` example application)
- Added support for compilation with CUDA 13
- Added support for compilation with ROCm 7
- Removed support for NVIDIA Volta GPUs
Open Image Denoise v2.3.3
- Added NVIDIA Blackwell GPU support
- Added AMD RDNA4 GPU support
- Improved performance for AMD RDNA3 GPUs
- Added `OIDN_DEPENDENTLOADFLAG` CMake option for setting the `DEPENDENTLOADFLAG` linker flag on Windows
- Added `OIDN_LIBRARY_VERSIONED` CMake option for toggling versioning in the Open Image Denoise library files
- Known issue: performance regression for AMD RDNA2 GPUs
Open Image Denoise v2.3.2
- Improved performance for Intel Lunar Lake and Battlemage GPUs
- Added Intel Panther Lake GPU support
- Fixed compile error when building with OpenImageIO 3.x
Open Image Denoise v2.3.1
- Fixed corrupted output when in-place denoising high-resolution (> 1080p) images where the input and output are stored in different shared buffer objects (created with `oidnNewSharedBuffer*`) that overlap in memory
- Fixed issues with cancellation through progress monitor callbacks:
  - Fixed cancellation requests not being fulfilled on CPU devices since v2.3.0
  - Fixed not calling the callback anymore after requesting cancellation, while the operation is still being executed
- Added support for creating shared buffers on Metal devices
- Enabled accessing system allocated memory for CUDA devices which support this feature (see `systemMemorySupported` device parameter)
- Added LUID support for HIP devices. Importing DX12 and Vulkan buffers is now functional when using recent AMD GPU drivers on Windows
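The in-place denoising fix above concerns input and output shared buffers that overlap in memory. As a minimal sketch (not part of the library), an application that wraps its own pointers in shared buffers could detect that situation itself with a plain range test. The helper name `ranges_overlap` is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Open Image Denoise function.
   Returns true if the byte ranges [a, a + aSize) and [b, b + bSize)
   share at least one byte. Empty ranges never overlap. */
static bool ranges_overlap(const void* a, size_t aSize,
                           const void* b, size_t bSize)
{
  if (aSize == 0 || bSize == 0)
    return false;

  uintptr_t aBegin = (uintptr_t)a;
  uintptr_t bBegin = (uintptr_t)b;

  /* Compare distances instead of computing begin + size,
     which could wrap around for ranges near the top of memory. */
  if (aBegin <= bBegin)
    return bBegin - aBegin < aSize;
  return aBegin - bBegin < bSize;
}
```

Such a check only tells the caller that two user-owned ranges overlap. How the library handles that case is described by the release note, not by this sketch.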
Open Image Denoise v2.3.0
- Significantly improved image quality of the `RT` filter in high quality mode for HDR denoising with prefiltering, i.e., the following combinations of input features and parameters:
  - HDR color + albedo + normal + `cleanAux`
  - albedo
  - normal

  In these cases a much more complex filter is used, which results in lower performance than before (about 2x). To revert to the previous performance behavior, please switch to the balanced quality mode.
- Added fast quality mode (`OIDN_QUALITY_FAST`) for even higher performance (about 1.5-2x) interactive/real-time previews and lower default memory usage at the cost of somewhat lower image quality. Currently this is implemented for the `RT` filter except prefiltering (albedo, normal). In other cases denoising implicitly falls back to balanced mode.
- Added Intel Arrow Lake, Lunar Lake, and Battlemage GPU support
- Execute `Async` functions asynchronously on CPU devices as well
- Load/initialize device modules lazily (improves stability)
- Added `oidnIsCPUDeviceSupported`, `oidnIsSYCLDeviceSupported`, `oidnIsCUDADeviceSupported`, `oidnIsHIPDeviceSupported`, and `oidnIsMetalDeviceSupported` API functions for checking whether a physical device of a particular type is supported
- Release the CUDA primary context when destroying the device object if using the CUDA driver API
- Added `OIDN_LIBRARY_NAME` CMake option for setting the base name of the Open Image Denoise library files
- Fixed device creation error with `oidnNewDevice` when the default device of the specified type (e.g. CUDA) is not supported but there are other supported non-default devices of that type in the system
- Fixed CMake error when building with Metal support using non-Apple Clang
- Fixed iOS build errors
- Added support for building with ROCm 6.x
- `oidnNewCUDADevice` and `oidnNewHIPDevice` no longer accept negative device IDs. If the goal is to use the current device, its actual ID needs to be passed.
- Upgraded to oneTBB 2021.12.0 in the official binaries
- Training:
  - Improved training performance on CUDA and MPS devices, added `--compile` option
  - Added `--quality` option (`high`, `balanced`, `fast`) for selecting the size of the model to train, changed the default from `balanced` to `high`
  - Added new models to the `--model` option (`unet_small`, `unet_large`, `unet_xl`)
  - Added support for training with prefiltered auxiliary features by passing `--aux_results` to `preprocess.py` and `train.py`
  - Added experimental support for depth (`z`)
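The quality-mode rules in the v2.3.0 notes above can be restated as a small decision function. This is only a sketch of how the release note reads, not the library's internal logic. The types and names used (`Quality`, `FilterInputs`, `effective_quality`, `uses_larger_filter`) are hypothetical stand-ins, and "prefiltering" is assumed to mean denoising an albedo or normal image on its own.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the library's types. */
typedef enum { QUALITY_FAST, QUALITY_BALANCED, QUALITY_HIGH } Quality;

typedef struct {
  bool hasColor;   /* a color (beauty) image is set */
  bool hasAlbedo;  /* an albedo image is set */
  bool hasNormal;  /* a normal image is set */
  bool hdr;        /* the color image is HDR */
  bool cleanAux;   /* auxiliary images are noise-free (e.g. prefiltered) */
} FilterInputs;

/* Assumption: prefiltering = an auxiliary image (albedo or normal)
   is denoised by itself, without a color image. */
static bool is_prefiltering(const FilterInputs* in)
{
  return !in->hasColor && (in->hasAlbedo != in->hasNormal);
}

/* Fast mode is described as implemented only for the RT filter,
   excluding prefiltering; otherwise it falls back to balanced. */
static Quality effective_quality(Quality requested, bool isRTFilter,
                                 const FilterInputs* in)
{
  if (requested == QUALITY_FAST && (!isRTFilter || is_prefiltering(in)))
    return QUALITY_BALANCED;
  return requested;
}

/* The larger (about 2x slower) filter is described for high quality mode
   with: HDR color + albedo + normal + cleanAux, albedo alone, normal alone. */
static bool uses_larger_filter(Quality requested, bool isRTFilter,
                               const FilterInputs* in)
{
  if (!isRTFilter || requested != QUALITY_HIGH)
    return false;
  if (is_prefiltering(in))
    return true;
  return in->hasColor && in->hdr && in->hasAlbedo && in->hasNormal &&
         in->cleanAux;
}
```

Under this reading, requesting balanced quality for any of the listed combinations avoids the larger filter, which matches the note's advice for restoring the previous performance.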
Open Image Denoise v2.3.0-beta
- Significantly improved image quality of the `RT` filter in high quality mode for HDR denoising with prefiltering, i.e., the following combinations of input features and parameters:
  - HDR color + albedo + normal + `cleanAux`
  - albedo
  - normal

  In these cases a much more complex filter is used, which results in lower performance than before (about 2x). To revert to the previous performance behavior, please switch to the balanced quality mode.
- Added fast quality mode (`OIDN_QUALITY_FAST`) for even higher performance (about 1.5-2x) interactive/real-time previews and lower default memory usage at the cost of somewhat lower image quality. Currently this is implemented for the `RT` filter except prefiltering (albedo, normal). In other cases denoising implicitly falls back to balanced mode.
- Execute `Async` functions asynchronously on CPU devices as well
- Load/initialize device modules lazily (improves stability)
- Added `oidnIsCPUDeviceSupported`, `oidnIsSYCLDeviceSupported`, `oidnIsCUDADeviceSupported`, `oidnIsHIPDeviceSupported`, and `oidnIsMetalDeviceSupported` API functions for checking whether a physical device of a particular type is supported
- Release the CUDA primary context when destroying the device object if using the CUDA driver API
- Fixed device creation error with `oidnNewDevice` when the default device of the specified type (e.g. CUDA) is not supported but there are other supported non-default devices of that type in the system
- Added support for building with ROCm 6.x
- `oidnNewCUDADevice` and `oidnNewHIPDevice` no longer accept negative device IDs. If the goal is to use the current device, its actual ID needs to be passed.
- Upgraded to oneTBB 2021.12.0 in the official binaries
Open Image Denoise v2.2.2
- Fully fixed GPU memory leak when releasing SYCL, CUDA and HIP device objects
- Fixed CUDA context error in some cases when using the CUDA driver API
- Fixed crash on systems with an unsupported AMD Vega integrated GPU and old driver
Open Image Denoise v2.2.1
- Fixed memory leak when releasing SYCL, CUDA and HIP device objects
- Fixed memory leak when initializing Metal filters